PyTorch 1.0 is out (github.com/pytorch)
470 points by kashifr on Dec 7, 2018 | 70 comments



What surprised me most is the elegance of the C++ API. Compared to its Python equivalent, the C++ version is almost the same if we discard the "auto" keyword [0]. As mentioned in the docs, they put user-friendliness over micro-optimizations, which also shows the expressiveness of modern C++ (at least when they want to prioritize user-friendliness!)

[0]: https://pytorch.org/cppdocs/frontend.html#end-to-end-example


i picked up c++ for some gpu stuff with the arrayfire api... felt the same. firstly, modern c++ takes no time to learn if you come from java / c/ c# / etc. secondly, things like operator overloading and type inference make for pretty seamless apis. E.g. want to add matrices? auto C = A + B.

A lot of things suck (closures, generator functions, first-class functions all suck in c++), but oh my does it all run fast when you get it working; plus wrapping it with Rcpp or Lua is easy as pie.
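
For anyone who hasn't seen it, the `auto C = A + B` style comes from overloading `operator+` on the matrix type. A minimal sketch, assuming a hypothetical `Matrix` struct with row-major storage (this is not ArrayFire's or PyTorch's actual type):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix type, for illustration only; this is not
// the ArrayFire or PyTorch API.
struct Matrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<double> data;  // row-major storage

  Matrix(std::size_t r, std::size_t c, double fill = 0.0)
      : rows(r), cols(c), data(r * c, fill) {}

  double& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
  double at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Element-wise addition; both operands must have the same shape.
Matrix operator+(const Matrix& a, const Matrix& b) {
  assert(a.rows == b.rows && a.cols == b.cols);
  Matrix out(a.rows, a.cols);
  for (std::size_t i = 0; i < a.data.size(); ++i) {
    out.data[i] = a.data[i] + b.data[i];
  }
  return out;
}
```

With that in place, `Matrix A(2, 2, 1.0), B(2, 2, 2.5); auto C = A + B;` gives a 2x2 matrix whose entries are all 3.5.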


I'm not sure how I feel about operator overloading. It starts out simple enough, but can lead to extremely confusing code.

Take this snippet of an API call[1] that parses ISO strings to chrono::time_point using date[2]:

    std::istringstream in(iso_string);
    in >> date::parse("%FT%TZ", tp);
At first glance my brain cannot comprehend the second statement. Why is there an input stream going into a function? Is that even valid C++? Why not just pass `in` as a function parameter and return time_point as the return value? When you actually dig into the source, it's a namespace-scope overloaded operator and it's extremely heavily templated to be generic. So now if I think of `>>` as simply a second function call using the result of `date::parse` it makes sense, but... I still don't understand all of the technical reasons behind that decision. I assume they're valid reasons though, since Howard Hinnant is the C++ datetime expert.

[1] https://stackoverflow.com/a/33438989/2516916

[2] https://github.com/HowardHinnant/date


> Why is there an input stream going into a function?

The function is constructing a temporary object and reading into that object. The object includes a reference to a named tp object, and the temporary parser object is parsing from the input stream, which writes the data into the tp object.
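
To make that concrete, here is a rough sketch of the same pattern with a much simpler parser. `IntParser` and `parse_after` are made-up names for illustration; the real `date::parse` is far more general and heavily templated.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-in for the parser object: it only remembers what
// text to expect and where to store the result.
struct IntParser {
  std::string prefix;  // literal text expected before the number
  int& target;         // reference to the caller's named variable
};

// Factory function, playing the role of date::parse in the snippet above.
IntParser parse_after(std::string prefix, int& target) {
  return IntParser{std::move(prefix), target};
}

// The stream is the left operand and the temporary parser the right one.
// Taking the parser by const reference lets it bind to a temporary.
std::istream& operator>>(std::istream& in, const IntParser& p) {
  for (char expected : p.prefix) {
    char c = '\0';
    if (!in.get(c) || c != expected) {
      in.setstate(std::ios::failbit);  // prefix mismatch: mark the stream failed
      return in;
    }
  }
  int value = 0;
  if (in >> value) {
    p.target = value;  // write through the stored reference
  }
  return in;
}
```

Then `std::istringstream in("id=42"); int id = 0; in >> parse_after("id=", id);` leaves 42 in `id`, which has the same shape as `in >> date::parse("%FT%TZ", tp)`.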

> but... I still don't understand all of the technical reasons behind that decision.

Because he wants to support input from an istream, and inputting from an istream necessarily means overloading '>>'. That's just how things are done in C++; if you want to support reading from or writing to an istream/ostream, you overload the '>>' and '<<' operators.

It's a trade off; he could have designed it so that the constructor of tp simply accepts the format string and the istream as parameters. But that has the downside of not looking like idiomatic istream code.

I personally think that the way iostreams are implemented in C++ was a mistake (because I don't like the way idiomatic istream code looks, and I think friend functions are generally smelly). Unfortunately, it's something like 35 years too late to correct it. (I'm aware that C++ is only 33 years old; iostream.h and cout << "blah" << endl; and cin >> foo; predate C++.) But here's the thing: it's badly designed iostreams that make the linked code confusing, not operator overloading.


Maybe because my background coming into C++ was Turbo Pascal, and I already knew OOP from 5.5 and 6.0, including a nice experience with Turbo Vision and its stream framework, introduced with Turbo Pascal 6.0.

I never grasped the hate iostreams get, as they are type safe and composable in ways that FILE will never be.

Sure, they might be a couple of ms slower than stdio calls, but unless I were writing an HPC trading application communicating over stdio, I hardly see the relevance.

And for the operators, oh well, as long as they are consistent.

Then again, it might just be my biased background as I got introduced to them.


Thanks for explaining that design logic a lot more clearly. I figured out what the code did, but understanding the why was very helpful.


I recently encountered code (at work no less) that overloaded the == operator.

a.) It was not a const overload. b.) Inside said overload, it modified FOUR member variables. c.) The 'new' keyword was used twice inside said == overload. d.) I learned when gdb fails, to grep the codebase for the 'operator' keyword.

I am not a fan of operator overloading. If the SW engineer in question had instead written an equals(<type> rhs) function, I'd have saved myself a lot of headache.


Sounds like operator overloading isn't the real problem here. A function equals(T rhs) -> bool that calls new and modifies member variables still sounds pretty terrible; changing "==" to "equals" doesn't fix the problem.
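
For what it's worth, `const` gets you most of the safety here in either spelling. A small sketch with a hypothetical `Account` type: because `equals` is a `const` member and the operator takes both sides by `const` reference, assigning to a member inside either one is a compile error (unless the member is declared `mutable`). It does not stop anyone from calling `new` in there, so it is a guard rail, not a cure.

```cpp
#include <string>

// Hypothetical value type, used only to illustrate the point.
struct Account {
  std::string owner;
  int balance;

  // Named comparison: const, so it cannot modify members of *this.
  bool equals(const Account& rhs) const {
    return owner == rhs.owner && balance == rhs.balance;
  }
};

// Operator form as a free function taking both sides by const reference.
bool operator==(const Account& lhs, const Account& rhs) {
  return lhs.equals(rhs);
}

bool operator!=(const Account& lhs, const Account& rhs) {
  return !(lhs == rhs);
}
```

Both spellings then do the same thing: `a == b` and `a.equals(b)` compare owner and balance and leave both objects untouched.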


The difference is in the time it takes to track down the aberrant behavior, and the learning opportunities available.

In the named case, it's obvious the other programmer may be doing something nefarious.

In the operator case, you learn much more, like how to grep, really work gdb, etc.


But you're familiar with cin/cout right?

Same thing. Weird at first, but every C++ programmer should be familiar with it by now.

I pronounce << and >> as 'goes to' and 'comes from' or something similar.


Of course. But what I'm not used to is seeing something like this:

   cin >> foo(x);


This is common in many functional languages, and even in the shell. Think of pipes such as input | sed ... | grep ..., or in F#, xs |> List.map (fun x -> x.Value). Basically, each input goes into the function and is piped through to get to the output.
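
You can get a similar left-to-right pipe in C++ by overloading `operator|`. A rough sketch, where `Stage` and `stage` are hypothetical helper names and not part of any standard library:

```cpp
#include <utility>

// Hypothetical wrapper that marks a callable as a pipeline stage.
template <typename F>
struct Stage {
  F func;
};

// Helper so the callable's type can be deduced at the call site.
template <typename F>
Stage<F> stage(F func) {
  return Stage<F>{std::move(func)};
}

// value | stage(f) evaluates f(value), so stages read left to right.
template <typename T, typename F>
auto operator|(T&& value, const Stage<F>& s)
    -> decltype(s.func(std::forward<T>(value))) {
  return s.func(std::forward<T>(value));
}
```

With `auto add_one = stage([](int x) { return x + 1; });` and `auto twice = stage([](int x) { return x * 2; });`, the expression `3 | add_one | twice` evaluates `twice(add_one(3))`, which is 8.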


Why do closures suck? What are they missing?


I’m confused about that too - closures are one of the nicer parts of C++ imo.


I like to pass closures as return values or as function parameters. In my experience this is folly in c++.


Here's an example of doing both:

  $ cat test.cpp
  #include <iostream>
  #include <string>

  template <typename F>
  auto pass_func(F func) {
    return func();
  }

  auto return_func() {
    std::string test = "Hello World!";
    return [=](){ return test; };
  }

  int main() {
    auto func = return_func();
    std::cout << pass_func(func) << std::endl;
  }

  $ clang test.cpp -lstdc++ -std=c++14 -o test
  $ ./test
  Hello World!
What's hard about it?


You didn't pass a function to std::cout, you called it and passed its return value.


The way I interpret the GP, they're talking about passing closures as a function argument. In the example I pass func (a closure) to pass_func. Not sure what std::cout has to do with it?


Sorry, I misread the code.


I'm hoping that this results in a nice, high-level API for https://github.com/fragcolor-xyz/nimtorch as well, which AFAIK has been wrapping the low-level ATen API. I've been keeping my eye on that project, and I'm really excited about it.


Likewise, I hope this leads to a nice wrapper API for Julia.



So cool to see people saying good things about C++. I've never had the chance to take a uni course on computer science, but I've always wanted to do the GUI stuff with C and C++. Maybe I should try again. Thanks OP. :)


Wow, that’s actually really awesome. I really missed a good languages-other-than-Python story in Tensorflow. Now I feel a little inspired to try out ML again...


I’m really excited to use it in C++. When it’s in Python, it becomes much harder to embed into other programs and is susceptible to Python version issues when sharing code.

This reduces barriers to use in addition to improving performance and portability.


I’m also excited that Caffe got merged in because it has nearly perfect GPU-scaling via MPI, whereas tf suffers (or did, when [0] came out) with more GPUs.

[0] https://arxiv.org/pdf/1711.05979


Wow, this is cool. How complete is the API? Could you use it for research? Last I checked, the TensorFlow C++ API was missing all sorts of important stuff for building models and was basically only useful for loading models saved from Python.


Yes! The API feels very much like using PyTorch from Python, and implementing models and working with tensors purely in C++ is very convenient. We're using it for our research platform for StarCraft: Brood War (https://torchcraft.github.io/TorchCraftAI).

[disclosure: I work at FAIR]


It's very complete. We have many kinds of builtin Modules (rnns, conv{1,2,3}d, batchnorm etc.), optimizers, a data loader, a serialization format for checkpointing and similar things you need for training. My hope is that any model you can train in Python you can now also train in C++.


Quite usable, all of TorchCraftAI uses it https://torchcraft.github.io/TorchCraftAI/ :)


Really grateful to the FAIR team for Pytorch. I use deep learning for computational biology. Pytorch lets me focus on the problem rather than nitpicking with the framework (looking at you tensorflow) to make something work.


In case anybody else was wondering, since this isn't in the fine article:

"PyTorch is a Python package that provides two high-level features:

* Tensor computation (like NumPy) with strong GPU acceleration

* Deep neural networks built on a tape-based autograd system

You can reuse your favorite Python packages such as NumPy, SciPy and Cython to extend PyTorch when needed."


ML noob hobbyist here. Would you use PyTorch for models not involving deep neural networks too, or is it just good for that? Say I use linear models (like least squares etc.) or custom algorithms (integer linear programming, optimization, or something else...) but need very fast linear algebra support: is PyTorch a good lib? I'm a C kinda guy so I usually use blas, lapack etc. or numpy+pandas+sklearn in Python. Would PyTorch give a "complete" enough feel, or would I use it only for nn and use other libraries for other things?


PyTorch uses CuBLAS [1] under the hood, among other libraries, so basic linear algebra ops should be fast.

You might also look at CuPy [2], especially if you like NumPy.

[1] https://developer.nvidia.com/cublas

[2] https://cupy.chainer.org/


Now I'm wondering if anybody wrote a DL library on top of CuPy, and if such library could be competitive with PyTorch in terms of performance.


Yes, it is called Chainer and it's quite competitive.



The new JIT is very interesting. Anyone know if this is for inference only or also for training?


training and inference.


Can you actually train a torch::jit::script::Module? I couldn't figure out how to feed its parameter tensors to a torch::optim::OptimizerBase constructor...


I wonder what is "just in time" about that compiler, does anyone know? If you want to compile your python code so that it can run in C++, then you'll have to statically compile, no?


TL;DR

- New JIT feature that lets you run your model without python. It now seems trivial to load a pytorch model in C++

- New distributed computation package. Major redesign.

- C++ frontend

- New torch hub feature to load models from github easily


I love PyTorch, but my experience with JITs embedded in Python (e.g. Numba) has been anything but simple, never mind trivial. I'll really have to try it to believe it.


I’ve had the opposite experience with numba in production. It works almost flawlessly, very easy to reason about the generated code and inspect annotations, easy to debug.


Do you interact with numpy or other compiled numerical packages? This is where it usually breaks for me. The thing is that I use numpy, scipy, keras, tensorflow, etc. in literally every project, making numba not too useful.


Check out github.com/google/jax, it’s NumPy on the GPU with automatic differentiation, JIT and autobatching.


That’s very cool. Numba and Cython work extremely well with virtually no overhead or extra effort on my part, so jax doesn’t seem like it would buy me much for most of my work. But I can imagine a lot of projects where jax would be useful, and I plan to keep current on best practices for it.


I've been using PyTorch, and the PyTorch 1.0 pre-release for a while now. I adore it but don't really want to write C++ backends in production.

Anyone want to start working on Golang bindings for C++ PyTorch?


Curious, can SWIG[1] help automate the process?

[1]: http://www.swig.org/


Is this the version combining Caffe2 and PyTorch into one framework? Is LMDB continuing as the default data load/store method?


According to this [0] blog, it is:

PyTorch 1.0 takes the modular, production-oriented capabilities from Caffe2 and ONNX and combines them with PyTorch's existing flexible, research-focused design to provide a fast, seamless path from research prototyping to production deployment for a broad range of AI projects.

[0] https://developers.facebook.com/blog/post/2018/05/02/announc...


If someone from FAIR is around, what does this mean for Caffe2? Is the new torch C++ API a form of replacement for Caffe?


Agreed. I've always thought, however, that D would be the perfect language for a new Python implementation.


Is there some sort of PyTorch 1.0 migration guide, or does anyone know if there are any breaking changes from 0.4.1 to 1.0?



In the release notes, there's a section called "Breaking Changes". There are no major breaking changes from 0.4.1, and all your code should largely work as-is.


What would be a good book and project to get started with this? Object recognition? Product recommendations?


I would suggest going through the Fast.AI course [1]. It's an excellent course to learn more about DL in general, and some of the torch API. The downside is that the course material was produced when PyTorch was still at 0.3, so some of the API has changed since then.

[1] https://course.fast.ai/


A new version for PyTorch v1 will be released next month, FYI.


There are several good books written or in development, but if I had to pick one right now, I'd point you to this one: https://www.manning.com/books/deep-learning-with-pytorch


Thank you! :)


The best resource I've found is Udacity's set of Jupyter notebooks: https://github.com/udacity/deep-learning-v2-pytorch. Start with intro-to-pytorch.


Hm, the Mac version of LibTorch is suddenly unavailable!? [0] I swear it was available for download until a few days ago...

[0] https://pytorch.org/get-started/locally/


we are fixing this shortly. it was an oversight.

Edit: fixed links should be going live in a few mins, via: https://github.com/pytorch/pytorch.github.io/pull/141


> The JIT is a set of compiler tools for bridging the gap between research in PyTorch and production. It allows for the creation of models that can run without a dependency on the Python interpreter and which can be optimized more aggressively. Using program annotations existing models can be transformed into Torch Script, a subset of Python that PyTorch can run directly.

Isn't python bytecode simple enough that it can be run anywhere?


Python bytecode is a pre-processed form of instructions to a Python interpreter, such as getting the function with a specified name out of a Python library and calling it with certain Python objects as arguments.

The point here is to run PyTorch code without having a Python interpreter, and without having to run slow Python code.


There was a huge problem with converting the weight normalization module (torch.nn.utils.weight_norm), and with forward and pre-forward hooks generally, so I had to rewrite the full model. Hope they've improved it in the stable release.


Alternative implementations of Python don't seem very easy to write, so I'm guessing the answer is no.


Python is not very difficult to implement [1], the hard part is making it fast. That's because every property access involves a lot of magic behind the scenes, like __getattribute__, __getattr__ and the method resolution order. And because Python is dynamically typed, that dispatch logic can't be compiled away but needs to execute every time. (PyPy's JIT can probably speed it up, but still needs deoptimization checks in case types change.)

[1] see https://github.com/nedbat/byterun for a Python implementation


Feels good to see torch-1.0.0 when I do: "python -m pip install torch torchvision --upgrade"



