PyTorch 1.0 is out

htfy96 · on Dec 7, 2018

What surprised me most is the elegance of the C++ API. Compared to its Python equivalent, the C++ version is almost the same if we disregard the "auto" keyword [0]. As mentioned in the docs, they put user-friendliness over micro-optimizations, which also shows how expressive modern C++ can be (at least when they want to prioritize user-friendliness!)

[0]: https://pytorch.org/cppdocs/frontend.html#end-to-end-example

usgroup · on Dec 7, 2018

i picked up c++ for some gpu stuff with the arrayfire api... felt the same. firstly, modern c++ takes no time to learn if you come from java / c/ c# / etc. secondly, things like operator overloading and type inference make for pretty seamless apis. E.g. want to add matrices? auto C = A + B.

A lot of things suck (closures, generator functions, first order functions all suck in c++), but oh my does it all run fast when you get it working; plus wrapping it with Rcpp or Lua is easy as pie.
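
To make the `auto C = A + B` point concrete, here's a minimal sketch of how an API gets that syntax. `Mat2` is a made-up 2x2 type for illustration only; it is not ArrayFire's actual array class, which I'm only assuming overloads `+` in a broadly similar way.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size 2x2 matrix; NOT ArrayFire's real array type.
struct Mat2 {
    std::array<double, 4> v;  // row-major: {a00, a01, a10, a11}
};

// Element-wise sum, so callers can write `auto C = A + B;`.
inline Mat2 operator+(const Mat2& a, const Mat2& b) {
    Mat2 out{};  // value-initialized: all elements start at 0.0
    for (std::size_t i = 0; i < out.v.size(); ++i) {
        out.v[i] = a.v[i] + b.v[i];
    }
    return out;
}
```

Given two `Mat2` values `A` and `B`, `auto C = A + B;` deduces `Mat2` for `C`, and something like `assert(C.v[0] == A.v[0] + B.v[0]);` holds for each element.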

umvi · on Dec 7, 2018

I'm not sure how I feel about operator overloading. It starts out simple enough, but can lead to extremely confusing code.

Take this snippet of an API call[1] that parses ISO strings to chrono::time_point using date[2]:

    std::istringstream in(iso_string);
    in >> date::parse("%FT%TZ", tp);

At first glance my brain cannot comprehend the second statement. Why is there an input stream going into a function? Is that even valid C++? Why not just pass `in` as a function parameter and return time_point as the return value?? When you actually dig into the source, it's a namespace overloaded operator and it's extremely heavily templated to be generic. So now if I think of `>>` as simply a second function call using the result of `date::parse` it makes sense, but... I still don't understand all of the technical reasons behind that decision. I assume they're valid reasons though since Howard Hinnant is the c++ datetime expert.

[1] https://stackoverflow.com/a/33438989/2516916

[2] https://github.com/HowardHinnant/date

nwallin · on Dec 8, 2018

> Why is there an input stream going into a function?

The function is constructing a temporary object and reading into that object. The object includes a reference to a named tp object, and the temporary parser object is parsing from the input stream, which writes the data into the tp object.

> but... I still don't understand all of the technical reasons behind that decision.

Because he wants to support input from an istream, and inputting from an istream necessarily means overloading '>>'. That's just how things are done in C++; if you want to support reading from or writing to an istream/ostream, you overload the '>>' and '<<' operators.

It's a trade off; he could have designed it so that the constructor of tp simply accepts the format string and the istream as parameters. But that has the downside of not looking like idiomatic istream code.

I personally think that the way iostreams are implemented in C++ was a mistake. (because I don't like the way idiomatic istream looks, and I think friend functions are generally smelly.) Unfortunately, it's something like 35 years too late to correct it. (I'm aware that C++ is only 33 years old; iostream.h and cout << "blah" << endl; and cin >> foo; preexist C++) But here's the thing: it's badly designed iostreams that make the linked code confusing, not operator overloading.
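
For anyone who wants to see the mechanism, here's a rough sketch of the pattern described above: a factory function returns a small object holding references, and an overloaded `>>` does the actual reading. `IntPairParser` and `parse_pair` are hypothetical names invented for this example; this is not how the date library is actually implemented, just the same general shape.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical stand-in for the object returned by date::parse: it only
// remembers the expected separator and where the results should be written.
struct IntPairParser {
    char separator;
    int& first;
    int& second;
};

// Hypothetical factory, playing the role of date::parse(format, tp).
inline IntPairParser parse_pair(char separator, int& first, int& second) {
    return IntPairParser{separator, first, second};
}

// Taking the parser by value lets `in >> parse_pair(...)` accept a temporary.
inline std::istream& operator>>(std::istream& in, IntPairParser p) {
    int a = 0;
    int b = 0;
    char sep = '\0';
    if (in >> a >> sep >> b) {
        if (sep == p.separator) {
            // Only write through the references once the whole parse succeeded.
            p.first = a;
            p.second = b;
        } else {
            in.setstate(std::ios::failbit);
        }
    }
    return in;
}
```

With this in place, `in >> parse_pair(':', h, m);` on a `std::istringstream` holding `12:34` sets `h` to 12 and `m` to 34, while a wrong separator sets failbit and leaves both untouched. The statement reads oddly, but it is the same two-step shape as the date snippet: build a temporary that knows the target, then stream into it.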

pjmlp · on Dec 8, 2018

Maybe it's because my background coming into C++ was Turbo Pascal, where I already knew OOP from 5.5 and 6.0 and had a nice experience with Turbo Vision and its stream framework (introduced with Turbo Pascal 6.0), but I never grasped the hate iostreams get. They are type safe and composable in ways that FILE will never be.

Sure, they might be a couple of ms slower than stdio calls, but unless I were writing an HPC trading application communicating over stdio, I hardly see the relevance.

And for the operators, oh well, as long as they are consistent.

Then again, it might just be my biased background as I got introduced to them.

Fede_V · on Dec 8, 2018

Thanks for explaining that design logic a lot more clearly. I figured out what the code did, but understanding the why was very helpful.

irishcoffee · on Dec 8, 2018

I recently encountered code (at work no less) that overloaded the == operator.

a.) It was not a const overload. b.) Inside said overload, it modified FOUR member variables. c.) The 'new' keyword was used twice inside said == overload. d.) I learned when gdb fails, to grep the codebase for the 'operator' keyword.

I am not a fan of operator overloading. If the SW engineer in question had instead written an equals(<type> rhs) function, I'd have saved myself a lot of headache.

logicchains · on Dec 8, 2018

Sounds like operator overloading isn't the real problem here. A function equals(T rhs) -> bool that calls new and modifies member variables still sounds pretty terrible; changing "==" to "equals" doesn't fix the problem.
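
The part of that story the language can actually help with is constness. A minimal sketch, using a hypothetical `Point` type: marking `operator==` as `const` makes the compiler reject writes to members inside it (barring `mutable` or `const_cast`), though it does nothing to stop a stray `new`.

```cpp
#include <cassert>

// Hypothetical value type, for illustration only.
struct Point {
    int x;
    int y;

    // The trailing `const` makes *this read-only inside the body, so an
    // accidental `x = rhs.x;` here would fail to compile.
    bool operator==(const Point& rhs) const {
        return x == rhs.x && y == rhs.y;
    }

    bool operator!=(const Point& rhs) const {
        return !(*this == rhs);
    }
};
```

Because the overload is `const`, it can be called on `const Point` objects, e.g. `assert(a == b);` for two `const Point` values with equal coordinates. The same signature discipline applies equally to a named `equals` member.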

sgrove · on Dec 8, 2018

The difference is in the time it takes to track down the aberrant behavior, and the learning opportunities available.

In the named case, it's obvious the other programmer may be doing something nefarious.

In the operator case, you learn much more, like how to grep, really work gdb, etc.

emmelaich · on Dec 8, 2018

But you're familiar with cin/cout right?

Same thing. Weird at first, but every C++ programmer should be familiar with it by now.

I pronounce << and >> as 'goes to' and 'comes from' or something similar.

umvi · on Dec 8, 2018

Of course. But what I'm not used to is seeing something like this:

   cin >> foo(x);

cercatrova · on Dec 10, 2018

This is common in many functional languages, and even in the shell. Think of pipes such as input | sed ... | grep ..., or in F#, xs |> List.map (fun x -> x.Value). Basically, each input goes into the function and is piped through to get to the output.

shaklee3 · on Dec 8, 2018

Why do closures suck? What are they missing?

grandmczeb · on Dec 8, 2018

I’m confused about that too - closures are one of the nicer parts of C++ imo.

usgroup · on Dec 8, 2018

I like to pass closures as return values or as function parameters. In my experience this is folly in c++.

grandmczeb · on Dec 8, 2018

Here's an example of doing both:

  $ cat test.cpp
  #include <iostream>
  #include <string>

  template <typename F>
  auto pass_func(F func) {
    return func();
  }

  auto return_func() {
    std::string test = "Hello World!";
    return [=](){ return test; };
  }

  int main() {
    auto func = return_func();
    std::cout << pass_func(func) << std::endl;
  }

  $ clang test.cpp -lstdc++ -std=c++14 -o test
  $ ./test
  Hello World!

What's hard about it?

black-tea · on Dec 9, 2018

You didn't pass a function to std::cout, you called it and passed its return value.

grandmczeb · on Dec 9, 2018

The way I interpret the GP, they're talking about passing closures as a function argument. In the example I pass func (a closure) to pass_func. Not sure what std::cout has to do with it?

black-tea · on Dec 9, 2018

Sorry, I misread the code.

perturbation · on Dec 7, 2018

I'm hoping that this results in a nice, high-level API for https://github.com/fragcolor-xyz/nimtorch as well, which AFAIK has been wrapping the low-level ATen API. I've been keeping my eye on that project, and I've been really excited about it.

nerdponx · on Dec 7, 2018

Likewise, I hope this leads to a nice wrapper API for Julia.

byt143 · on Dec 9, 2018

Sure, but Julia doesn't need it: https://julialang.org/blog/2018/12/ml-language-compiler

MarvelousWololo · on Dec 8, 2018

So cool to see people saying good things about C++. I've never had the chance to take a uni course on computer science, but I've always wanted to do GUI stuff with C and C++. Maybe I should try again. Thanks OP. :)

thosakwe · on Dec 8, 2018

Wow, that’s actually really awesome. I really missed a good languages-other-than-Python story in Tensorflow. Now I feel a little inspired to try out ML again...

stochastic_monk · on Dec 8, 2018

I’m really excited to use it in C++. When it’s in Python, it becomes much harder to embed into other programs and is susceptible to Python version issues when sharing code.

This reduces barriers to use in addition to improving performance and portability.

stochastic_monk · on Dec 8, 2018

I’m also excited that Caffe got merged in because it has nearly perfect GPU-scaling via MPI, whereas tf suffers (or did, when [0] came out) with more GPUs.

[0] https://arxiv.org/pdf/1711.05979

modeless · on Dec 7, 2018

Wow, this is cool. How complete is the API? Could you use it for research? Last I checked, the TensorFlow C++ API was missing all sorts of important stuff for building models and was basically only useful for loading models saved from Python.

jgehring · on Dec 8, 2018

Yes! The API feels very much like using PyTorch from Python, and implementing models and working with tensors purely in C++ is very convenient. We're using it for our research platform for StarCraft: Brood War (https://torchcraft.github.io/TorchCraftAI).

[disclosure: I work at FAIR]

goldsborough · on Dec 8, 2018

It's very complete. We have many kinds of builtin Modules (rnns, conv{1,2,3}d, batchnorm etc.), optimizers, a data loader, a serialization format for checkpointing and similar things you need for training. My hope is that any model you can train in Python you can now also train in C++.

snippyhollow · on Dec 8, 2018

Quite usable, all of TorchCraftAI uses it https://torchcraft.github.io/TorchCraftAI/ :)

nafizh · on Dec 7, 2018

Really grateful to the FAIR team for PyTorch. I use deep learning for computational biology. PyTorch lets me focus on the problem rather than wrestling with the framework (looking at you, TensorFlow) to make something work.

wpietri · on Dec 7, 2018

In case anybody else was wondering, since this isn't in the fine article:

"PyTorch is a Python package that provides two high-level features:

* Tensor computation (like NumPy) with strong GPU acceleration

* Deep neural networks built on a tape-based autograd system

You can reuse your favorite Python packages such as NumPy, SciPy and Cython to extend PyTorch when needed."

gnulinux · on Dec 7, 2018

ML noob hobbyist here. Would you use PyTorch for models not involving deep neural networks too, or is it just good for that? Say I use linear models (like least squares) or custom algorithms (integer linear programming, optimization, or something else) but need very fast linear algebra support: is PyTorch a good lib? I'm a C kinda guy, so I usually use BLAS, LAPACK, etc., or numpy+pandas+sklearn in Python. Would PyTorch give a "complete" enough feel, or would I use it only for NNs and use other libraries for other things?

p1esk · on Dec 7, 2018

PyTorch uses cuBLAS [1] under the hood, among other libraries, so basic linear algebra ops should be fast.

You might also look at CuPy [2], especially if you like NumPy.

[1] https://developer.nvidia.com/cublas [2] https://cupy.chainer.org/

amelius · on Dec 8, 2018

Now I'm wondering if anybody wrote a DL library on top of CuPy, and if such library could be competitive with PyTorch in terms of performance.

smhx · on Dec 8, 2018

Yes, it is called Chainer and it's quite competitive.

pesenti · on Dec 7, 2018

Link to blog post: https://code.fb.com/ai-research/pytorch-developer-ecosystem-...

tbenst · on Dec 7, 2018

The new JIT is very interesting. Anyone know if this is for inference only or also for training?

smhx · on Dec 7, 2018

training and inference.

dekimir · on Dec 8, 2018

Can you actually train a torch::jit::script::Module? I couldn't figure out how to feed its parameter tensors to a torch::optim::OptimizerBase constructor...

amelius · on Dec 8, 2018

I wonder what is "just in time" about that compiler, does anyone know? If you want to compile your python code so that it can run in C++, then you'll have to statically compile, no?

l3robot · on Dec 7, 2018

TL;DR

- New JIT feature that lets you run your model without python. It now seems trivial to load a pytorch model in C++

- New distributed computation package. Major redesign.

- C++ frontend

- New torch hub feature to load models from github easily

make3 · on Dec 7, 2018

I love PyTorch, but my experience with JITs embedded in Python (e.g. Numba) has been anything but simple, never mind trivial. I'll really have to try it to believe it.

mlthoughts2018 · on Dec 8, 2018

I’ve had the opposite experience with numba in production. It works almost flawlessly, very easy to reason about the generated code and inspect annotations, easy to debug.

make3 · on Dec 10, 2018

Do you interact with numpy or other compiled numerical packages? This is where it usually breaks for me. The thing is that I use numpy, scipy, keras, tensorflow, etc. in literally every project, making numba not too useful.

alexbw · on Dec 8, 2018

Check out github.com/google/jax, it’s NumPy on the GPU with automatic differentiation, JIT and autobatching.

mlthoughts2018 · on Dec 8, 2018

That’s very cool. Numba and Cython work extremely well with virtually no overhead or extra effort on my part, so jax doesn’t seem like it would buy me much for most of my work. But I can imagine a lot of projects where jax would be useful, and I plan to keep current on best practices for it.

ritoune · on Dec 8, 2018

I've been using PyTorch, and the PyTorch 1.0 pre-release for a while now. I adore it but don't really want to write C++ backends in production.

Anyone want to start working on Golang bindings for C++ PyTorch?

metahost · on Dec 8, 2018

Curious, can SWIG[1] help automate the process?

[1]: http://www.swig.org/

Rafuino · on Dec 7, 2018

Is this the version combining Caffe2 and PyTorch into one framework? Is LMDB continuing as the default data load/store method?

thelastidiot · on Dec 7, 2018

According to this [0] blog, it is:

PyTorch 1.0 takes the modular, production-oriented capabilities from Caffe2 and ONNX and combines them with PyTorch's existing flexible, research-focused design to provide a fast, seamless path from research prototyping to production deployment for a broad range of AI projects.

[0] https://developers.facebook.com/blog/post/2018/05/02/announc...

pilooch · on Dec 8, 2018

If someone from FAIR is around, what does this mean for Caffe2? Is the new torch C++ API a form of replacement for Caffe?

Solar19 · on Dec 8, 2018

Agreed. I've always thought, however, that D would be the perfect language for a new Python implementation.

dspoka · on Dec 7, 2018

Is there some sort of PyTorch 1.0 migration guide, or does anyone know if there are any breaking changes from 0.4.1 to 1.0?

shafte · on Dec 7, 2018

Breaking changes are highlighted here: https://github.com/pytorch/pytorch/releases#breaking-changes

smhx · on Dec 7, 2018

in the release notes, there's a section called "Breaking Changes". There are no major breaking changes from 0.4.1, and all your code should largely work as-is.

openbasic · on Dec 7, 2018

What would be a good book and project to get started with this? Object recognition? Product recommendations?

dangom · on Dec 7, 2018

I would suggest going through the Fast.AI course [1]. It's an excellent course to learn more about DL in general, and some of the torch API. The downside is that the course material was produced when PyTorch was still at 0.3, so some of the API has changed since then.

[1] https://course.fast.ai/

jph00 · on Dec 8, 2018

A new version for PyTorch v1 will be released next month, FYI.

jeffreysmith · on Dec 7, 2018

There are several good books written or in development, but if I had to pick one right now, I'd point you to this one: https://www.manning.com/books/deep-learning-with-pytorch

openbasic · on Dec 7, 2018

Thank you! :)

ipsum2 · on Dec 7, 2018

The best resources I've found are Udacity's Jupyter notebooks: https://github.com/udacity/deep-learning-v2-pytorch. Start with intro-to-pytorch.

dekimir · on Dec 8, 2018

Hm, the Mac version of LibTorch is suddenly unavailable!? [0] I swear it was available for download until a few days ago...

[0] https://pytorch.org/get-started/locally/

smhx · on Dec 8, 2018

we are fixing this shortly. it was an oversight.

Edit: fixed links should be going live in a few mins, via: https://github.com/pytorch/pytorch.github.io/pull/141

amelius · on Dec 7, 2018

> The JIT is a set of compiler tools for bridging the gap between research in PyTorch and production. It allows for the creation of models that can run without a dependency on the Python interpreter and which can be optimized more aggressively. Using program annotations existing models can be transformed into Torch Script, a subset of Python that PyTorch can run directly.

Isn't python bytecode simple enough that it can be run anywhere?

rspeer · on Dec 7, 2018

Python bytecode is a pre-processed form of instructions to a Python interpreter, such as getting the function with a specified name out of a Python library and calling it with certain Python objects as arguments.

The point here is to run PyTorch code without having a Python interpreter, and without having to run slow Python code.

levan_ts · on Dec 7, 2018

There was a huge problem with converting the weight normalization module (torch.nn.utils.weight_norm), and with forward and pre-forward hooks generally, so I had to rewrite the full model. Hope they've improved it in the stable release.

skybrian · on Dec 7, 2018

Alternative implementations of Python don't seem very easy to write, so I'm guessing the answer is no.

yorwba · on Dec 7, 2018

Python is not very difficult to implement [1], the hard part is making it fast. That's because every property access involves a lot of magic behind the scenes, like __getattribute__, __getattr__ and the method resolution order. And because Python is dynamically typed, that dispatch logic can't be compiled away but needs to execute every time. (PyPy's JIT can probably speed it up, but still needs deoptimization checks in case types change.)

[1] see https://github.com/nedbat/byterun for a Python implementation

punnerud · on Dec 8, 2018

Feels good to see torch-1.0.0 when I do: "python -m pip install torch torchvision --upgrade"