Gluon is an attempt by Microsoft and Amazon to regain some influence in AI tools. Keras looked like it was going to become the standard high-level API, but now Theano is dead, CNTK and MXNet are controlled by Google's rivals, and those rivals are ganging up against Google's tools. Francois Chollet is committed to keeping Keras neutral, but he's still a Google engineer, and that probably makes Microsoft and Amazon nervous. This is the equivalent of Microsoft creating C# in response to Java. The company that controls the API has enormous influence on the ecosystem built atop it, just as Google has had with Android, or MSFT with Windows. MSFT and AMZN are carving out their own user base, or trying to, at the price of fragmenting the Python community.
Unlike Keras and TensorFlow, Gluon is a define-by-run deep learning framework, like PyTorch and Chainer.
Network definition, debugging, and flexibility are all much better with dynamic networks (define-by-run).
That's why Facebook seems to use PyTorch for research and Caffe2 for deployment. Gluon/MXNet can do both: define-by-run with the Gluon API and "standard" define-and-run with its Module API.
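For anyone curious, here's a minimal sketch of what define-by-run looks like with the Gluon API (the layer sizes and input shape are made up for illustration; assumes mxnet is installed):

    from mxnet import nd, autograd
    from mxnet.gluon import nn

    # Build a tiny network; nothing is compiled ahead of time.
    net = nn.Sequential()
    net.add(nn.Dense(64, activation='relu'))
    net.add(nn.Dense(10))
    net.initialize()

    x = nd.random.normal(shape=(4, 20))
    with autograd.record():       # the graph is recorded as this Python code runs
        out = net(x)              # intermediate values can be printed/inspected here
        loss = out.sum()
    loss.backward()               # gradients flow back through the recorded graph

The point being that the intermediate results are ordinary values you can poke at in a debugger, not nodes in a pre-compiled graph.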
I think you're both right here. Competition will force the others to innovate. I don't think end users lose by there being multiple interfaces, even if fragmentation is ultimately what happens here.
Data scientists arguably have too much choice. Ten data scientists will have 50 different tools, can't share work or build on one another's experiments, or even remember what the results of an experiment were. Those are some of the reasons why most data science projects fail; that, and integrations. Standardization has real benefits.
Of course standardization has benefits, but how do you choose? Standardization only works if choice is eliminated, so choice is a barrier to achieving standardization.
It often just comes down to project requirements.
E.g., what kind of model is required? How hard would it be to build with tool X?
For example, a big reason why a lot of computer vision research was built on Caffe (and sorta still is, because of momentum) was its pre-existing model zoos.
A big reason why people choose TF (despite its lack of dynamic graphs) is simply the existing community.
Requirements for both papers and industry will continue to evolve. Each framework will have its own trade-offs.
There are trade-offs to choice. In the case of another commenter, "too much choice" means a ton of churn and a lot of friction when it comes to building models.
I think there's always a trade-off between innovation and stability that people should be thinking about here.
Granted, things like shared model formats should help long term, but for now we're going to be dealing with a ton of churn on APIs.
I'm sure another thing like dynamic graphs will come along and we'll need to update the APIs again.
I suspect Keras will respond to this at some point by adding primitives for eager mode and the like.
I know data scientists who need more advanced models and others who prefer the Keras API and just build off-the-shelf models.
Can someone please shed light on why so many ML tools and frameworks are being implemented in Python? What makes Python so special for ML?
Personally, I would love for MS to release or support a .NET-based ML toolkit. There is open source stuff like http://accord-framework.net, but I would assume it isn't as big or complete as a framework backed by a major corporation.
> Python's Buffer Protocol: The #1 Reason Python Is The Fastest Growing Programming Language Today
> The buffer protocol was (and still is) an extremely low-level API for direct manipulation of memory buffers by other libraries. These are buffers created and used by the interpreter to store certain types of data (initially, primarily "array-like" structures where the type and size of data was known ahead of time) in contiguous memory.
> The primary motivation for providing such an API is to eliminate the need to copy data when only reading, clarify ownership semantics of the buffer, and to store the data in contiguous memory (even in the case of multi-dimensional data structures), where read access is extremely fast. Those "other libraries" that would make use of the API would almost certainly be written in C and highly performance sensitive. The new protocol meant that if I create a NumPy array of ints, other libraries can directly access the underlying memory buffer rather than requiring indirection or, worse, copying of that data before it can be used.
(The italic emphasis was copied from the original article.)
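For what it's worth, here's a tiny sketch of what that buys you in practice (assumes NumPy; the zero-copy sharing is the point, not the specific calls):

    import numpy as np

    a = np.arange(10, dtype=np.int64)
    view = memoryview(a)                     # no copy: exposes a's buffer via the protocol
    b = np.frombuffer(view, dtype=np.int64)  # a second array over the same memory

    a[0] = 42
    print(b[0])                              # 42 -- both names see the same buffer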
That doesn't sound like a very satisfying reason to me. Isn't a ByteBuffer in Java pretty much the same (its underlying implementation is a byte[] which can be used directly from C)?
Is there perhaps another factor, such as an existing ecosystem or that it’s widely used in the academic field?
If you use direct allocation, yes. Java has great facilities for direct memory management (look at the number of libraries like Netty and co. that exist). The problem with Java is more the friction of getting at some of the lower-level APIs.
Python just has momentum and a fairly easy-to-use FFI.
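To illustrate the "easy FFI" part, a small sketch using ctypes from the standard library (assumes a system libc can be located; strlen is just a stand-in for any C function):

    import ctypes, ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t
    print(libc.strlen(b"hello"))   # 5 -- calling straight into C, no build step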
> Is there perhaps another factor, such as an existing ecosystem or that it’s widely used in the academic field?
Yes, it's widely used in science in general. Don't underestimate the learning curves of other languages when your audience is scientists and mathematicians. Python is incredibly easy to use, even when using numpy and other scientific tools.
As a long-time Java developer, I found Python a beauty. It brings back the joy of programming and makes data manipulation a breeze. C# is better than Java, but it's still not as elegant/simple/clean as Python for data science.
Python is only fun for tiny projects. Once you reach 120k LOC in a project, refactoring in Python is insanity even with PyCharm, and debugging becomes impossible too.
Data engineer here... How do you get to 120k LOC without splitting up your infrastructure? If anything, that is poor design on your part. Python is beautiful. I've used it at 3 different companies now, 2 of which I encouraged to try it out, and they have nothing but love for it.
Even if you split your infrastructure, have you ever tried refactoring larger projects, with many contributors, while ensuring API contracts are kept?
Without a strict, static type system it becomes quite problematic to ensure new code keeps the API contract, unless you have unit tests for every possible value.
A good type system accelerates your coding speed, compared to writing equivalent unit tests, and it improves your quality, compared to no testing.
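As a rough illustration (hypothetical function; assumes a checker like mypy), the annotation states the contract once instead of needing a unit test per bad input:

    from typing import List

    def average(values: List[float]) -> float:
        # The signature is the API contract.
        return sum(values) / len(values)

    print(average([1.0, 2.0, 3.0]))   # fine
    # average("oops")                 # a type checker such as mypy rejects this before runtime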
> A good type system accelerates your coding speed, compared to writing equivalent unit tests, and it improves your quality, compared to no testing.
Coding speed is the least of my concerns for an ML project, to be honest. And unit tests aren't that useful either, since ML is by and large not deterministic. A lot of what you said is true for web applications, but it doesn't really apply to an ML project.
Once you reach 120k LOC in a machine learning project, you should have split it up into disparate projects (input adapters, interactive applications, transforming results...) many orders of magnitude ago, even in verbose and refactoring-friendly languages like Java.
Can you be specific about what makes this a headache? Your "refactoring" tells me that the code probably wasn't well structured in the first place, and if so, that would make refactoring difficult in any language, particularly dynamically typed ones.
> Your "refactoring" tells me that the code probably wasn't well structured in the first place
Well, considering this is a realistic scenario for fallible humans, it's still decent advice to keep your exploratory projects in Python small to avoid ridiculous tech debt. It's not quite as bad as with Ruby, but it's close.
Many languages make it problematic to keep code actually bug-free and maintainable; Python and especially Ruby are problematic for that, while Java and Kotlin, and even C++ (with a strict style guide), are a lot nicer to work with at scale.
If you want to keep consistent APIs between modules, strict types and checked exceptions are very helpful, while in Python one typo in an attribute name can silently go unnoticed, which is why so many use __slots__ nowadays, plus type annotations and tools like mypy. But if I do that, I might as well use Java or Kotlin and get a better IDE.
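To make the typo point concrete, a small sketch (hypothetical Config class) of what __slots__ buys you:

    class Config:
        __slots__ = ("learning_rate",)    # only this attribute may exist

        def __init__(self, learning_rate):
            self.learning_rate = learning_rate

    cfg = Config(0.01)
    try:
        cfg.learning_rte = 0.1            # typo: without __slots__ this would silently create a new attribute
    except AttributeError as err:
        print(err)                        # 'Config' object has no attribute 'learning_rte'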
Compared to unit tests, strict static types are faster; compared to no testing, static types are safer.
Python in data/ML rarely gets to that scale. It is used for training. At several thousand lines per project at most, it is tractable, and I don't think machine learning models can really be refactored or debugged like a web application.
Python has always had a good selection of scientific/mathematical libraries (numpy, scipy).
In addition, notebook apps like Jupyter fit well with the experimental nature of scientific code. I have a colleague who was attempting to do some stuff in Ruby (to fit with our application stack) who would leave IRB sessions open for weeks at a time. He recently switched to Zeppelin for notebook stuff, and it has been a huge productivity boost for him.
Mostly because Python is a great "glue" language. It isn't performant enough to implement the actual low-level computation, but it is great at running other applications, getting data out of them, and feeding that data into other applications (a.k.a. pipelines).
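A toy sketch of the glue idea using only the standard library (the ls/wc pipeline is just an example; needs Python 3.7+ for capture_output):

    import subprocess

    # Run one program, capture its output, feed it to another.
    listing = subprocess.run(["ls", "-1"], capture_output=True, text=True).stdout
    count = subprocess.run(["wc", "-l"], input=listing, capture_output=True, text=True)
    print(count.stdout.strip())    # number of entries in the current directory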
Sure, but if I have invested a lot in MS technologies, I am hesitant to learn and implement things in a completely new language if I can find something that fits better with the stack I already use.
What relevant stack of Microsoft technology are you hoping to leverage for high performance numerical computation? Surely nothing involving .NET.
Python for Windows and Python extensions can be compiled with Visual Studio; I don't see other ways to be more Microsoft-friendly.
> Sure, but if I have invested a lot in MS technologies, I am hesitant to learn and implement things in a completely new language if I can find something that fits better with the stack I already use.
The rest of the ML world is in that exact situation, but on Python. They aren't going to throw away their familiar tools unless everyone else does too.
Momentum and community uptake. Python with NumPy and Matplotlib had the framework in place as a free alternative to the exorbitantly priced MATLAB, and the community ran with it.
Python had a scientific community long before this whole new fad of data science. Also, financial institutions helped build software such as pandas. The last thing is that the syntax of the language is really friendly and easy to use, with batteries included. Others have mentioned it's a great glue language, since it was originally designed for scripting and system administration, and a bunch of *nix distributions use it for that purpose.
It's a common denominator with few downsides; speed is handled in lower-level code/libs. It's a really good "get shit done" language, the best GSD lang I've used. Scientists and programmers, data engineers and PhDs, all are cool with the syntax. It's open source and has a shitton of supporting libs.
Outside of speed, I've read very few valid criticisms. What other languages are there that are cross-platform, have great library support, and delegate easily to lower-level libs for performance?
(FWIW, any MS-based language is probably excluded from consideration depending on its cross-platform ability. Many data people, like me, won't use an MS-based OS.)
Python has its share of cruft and idiosyncrasies. I find some of the syntax irritating, e.g. boolean logic, argument handling, hidden / private / magic symbols, and those pesky half-open intervals that routinely lead to off-by-one errors.
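For anyone unfamiliar with the half-open complaint, a quick sketch of what's meant:

    xs = [0, 1, 2, 3, 4]
    print(xs[1:3])        # [1, 2] -- the stop index 3 is excluded
    print(len(xs[1:3]))   # 2 == 3 - 1, which is where the off-by-one surprises creep in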
Same here; it kind of defeats the purpose of many libraries, in my opinion, if all of them are released for Python and look exactly the same. Think: front-end frameworks in JavaScript.
An AWS/MSR response to dynamic graph computation paradigms such as Chainer and TensorFlow Fold. There is certainly a distribution advantage to having this be the default backend engine, since so many store their data on S3 and Redshift.
The DyNet paper is still the best source for background on the relative advantages of using "define-by-run" networks:
I feel like new DL/ML frameworks are appearing about as frequently as JavaScript frameworks did during the craze of the past decade. I started out on Torch and recently I've been using Keras, and this looks quite similar to the latter. It's hard to see why one would switch to this if they're already comfortable with another.
Do you think they would still be doing this if TF hadn't become the near de-facto ML software? Seems like more of a response to its rise than an actual research project
Curious to see what HN thinks of this. Are Amazon and Microsoft going against Google and Tensorflow? Will we see Gluon Processing Units on AWS and Azure in the near future?
We love TensorFlow (and have a ton of developers using it on AWS).
Just like databases, we'll support a wide range of engines on AWS; some of our own, like Gluon, alongside others from the community, like PyTorch and TensorFlow. They're all first-class citizens.
We even fund separate (competing!) teams internally to focus on making sure AWS is the best place to run each of these popular engines.
One of four (MXNet and CNTK alongside TF and Theano), and the Amazon Deep Learning AMI forked Keras to default to the MXNet backend before it was really ready - which irked the Keras authors quite a bit.
Not to mention that "gluon" is the name of a particle (the strong force carrier). There seems to be a trend recently of riding science hype by choosing a sciency-sounding name. :/
Recently MSFT partnered with FB to release the Open Neural Network Exchange (ONNX) model format, and now Microsoft is partnering with AWS. It seems Microsoft has learned that Google is far ahead in this game (just like they lost the Internet game to Google), so they're making moves to ensure Google doesn't become another unbeatable winner in AI.
There seems to be a little bit of partnership between Amazon and Microsoft. For example, they recently announced that users will soon be able to use Alexa from Cortana, and Cortana from Alexa. I'm curious to see if this trend continues.
The most important executives at both companies live and work around Seattle, so maybe that has something to do with it. There are some nice golf courses in Bellevue!
This is not the case. Probably more relevant is that there are dedicated AI / Cloud Computing meetups and groups in the Bellevue and Seattle area that employees of all the companies attend. This means that the people that do this research and build these products frequently chat with each other.
I know my counterparts at AWS as a result, and as we are friends, I push for collaboration whenever opportunities arise. At least on the MS side of the house, these sorts of outreach and collaborative projects are a ground-up push.
This is interesting, but with the growing number of ML frameworks and languages, it would be great if new ones shipped with ways to transfer existing models into their format. I would love some extra compatibility with AWS for deploying deep learning models in prod, but since I already have existing models running in production, it's a hard sell for me to re-train and re-implement existing work from scratch.