If you're interested in this field, check out this talk by Bharath Ramsundar from Prof. Pande's lab at Stanford: https://youtube.com/watch?v=sntikyFI8s8. He's also the author of http://deepchem.io - a deep learning library for drug discovery.
Why I am NOT bullish on this solving discovery problems in molecular medicine:
Medchem is notoriously NOT generalizable. A crude example: the reason heroin was developed is that, in the early days of medchem, the reasoning went "acetyl-salicylic acid is awesomer than salicylic acid, therefore acetyl-morphine must be awesomer than morphine." Actually, in many ways it is awesomer (and that's why it's a bad drug).
Consider Gleevec. Even if you knew the structure of Gleevec's target (BCR/ABL), you would not be able to predict Gleevec, because it works by displacing an entire segment of the protein into a conformation that happens to be thermodynamically more stable (but kinetically disfavored). Gleevec is a medchem drug (discovered through combinatorial synthesis), but sadly the insight into this mechanism is only generalizable in the conceptual sense: if you take that molecular fragment and graft it onto another molecule intended for a different target, it probably won't work.
Deep learning depends strongly on generalizable knowledge, and medchem is notoriously not easily generalizable for well-understood reasons.
Some aspects of medchem - like optimizing bulk synthesis reactions, picking synthetic routes, guessing at bioavailability, or stability in formulations - might be amenable to ML, but I am not bullish on discovery. Let's hope I'm wrong.
I'd put even odds on some sort of ML replacement for Lipinski's rule of five. But there are also whole classes of exceptions, like strategically methylated cyclic peptides.
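For what it's worth, the rule of five is trivially computable from a structure, which is part of why it looks like a plausible ML target. A minimal sketch using RDKit (assuming RDKit is installed; the helper function and "at most one violation" policy are mine):

```python
# Lipinski's rule of five, computed with RDKit descriptors.
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def passes_rule_of_five(smiles: str) -> bool:
    """True if the molecule violates at most one of Lipinski's four criteria."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: " + smiles)
    violations = sum([
        Descriptors.MolWt(mol) > 500,       # molecular weight > 500 Da
        Descriptors.MolLogP(mol) > 5,       # calculated logP > 5
        Lipinski.NumHDonors(mol) > 5,       # > 5 hydrogen-bond donors
        Lipinski.NumHAcceptors(mol) > 10,   # > 10 hydrogen-bond acceptors
    ])
    return violations <= 1  # commonly applied as "at most one violation"

print(passes_rule_of_five("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin -> True
```

An ML replacement would presumably learn a softer decision boundary from bioavailability data, and exceptions like methylated cyclic peptides are exactly the cases where these hard thresholds break down.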
> One reason molecular data is so interesting from a machine learning standpoint is that one natural representation of a molecule is as a graph with atoms as nodes and bonds as edges. Models that can leverage inherent symmetries in data will tend to generalize better — part of the success of convolutional neural networks on images is due to their ability to incorporate our prior knowledge about the invariances of image data (e.g. a picture of a dog shifted to the left is still a picture of a dog). Invariance to graph symmetries is a particularly desirable property for machine learning models that operate on graph data, and there has been a lot of interesting research in this area as well
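To make the graph framing concrete, here's a minimal sketch (assuming RDKit and networkx are available; the function names are mine) that turns a SMILES string into an atoms-as-nodes, bonds-as-edges graph and computes a trivially permutation-invariant readout:

```python
# Molecule-as-graph: atoms become nodes, bonds become edges.
import networkx as nx
from rdkit import Chem

def mol_to_graph(smiles: str) -> nx.Graph:
    """Build a networkx graph with element-labelled nodes and bond-order-labelled edges."""
    mol = Chem.MolFromSmiles(smiles)
    g = nx.Graph()
    for atom in mol.GetAtoms():
        g.add_node(atom.GetIdx(), element=atom.GetSymbol(), degree=atom.GetDegree())
    for bond in mol.GetBonds():
        g.add_edge(bond.GetBeginAtomIdx(), bond.GetEndAtomIdx(),
                   order=bond.GetBondTypeAsDouble())
    return g

def readout(g: nx.Graph) -> dict:
    """Sum/count per-node features: the result is the same under any renumbering of atoms."""
    counts = {}
    for _, data in g.nodes(data=True):
        counts[data["element"]] = counts.get(data["element"], 0) + 1
    return counts

g = mol_to_graph("c1ccccc1O")  # phenol: 7 heavy atoms, 7 bonds
print(g.number_of_nodes(), g.number_of_edges(), readout(g))
```

Summing (or otherwise pooling) per-node quantities is the simplest way to get an output that doesn't depend on how the atoms happen to be numbered; graph neural networks build much richer features but preserve that same invariance.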
Google has a lot of ML experts. Lots of fields could potentially benefit from ML, and could also contribute data and concepts back to ML, but those fields don't have ML experts. I was thinking about this only a few days ago. I am so glad Google is looking to contribute directly to every field possible.
Though property prediction is a hard problem, I think there is low-hanging fruit in other fields. For example, anthropology, where only partial skeletons are found but we know there is symmetry there. Software reconstruction is slow and expensive and doesn't exploit that symmetry much.
A joint project between Google and CERN also sounds really cool to me. Or maybe Google could set up a system where researchers with large datasets can approach Google and see if a symbiotic relationship can be formed.
A personal anecdote to support your point: The first company I worked for in Europe was a well-established ML company that had been doing predictive analytics long (10+ years) before the current fad.
Founded by a former professor from CERN, and staffed about 90% from CERN postdocs. I was the only member of my team who was not a co-author on the Higgs boson discovery paper.
So yeah, people at CERN are pretty well aware of what can be done with ML.
I never said CERN doesn't use ML. But I would imagine Google has more ML experts and computational capacity than CERN. Correct me if I'm wrong. There is nothing wrong with collaboration.
Materials science is buzzing in this respect, from the nanoscale to the microstructural level, but ML cannot predict the existence of the novel materials you find, or their stability if they do exist. It is more a targeted, extrapolative search (driven by specific properties we wish to improve for particular applications) than a magic 8-ball.
Can you? People have already tested a large number of sub-exponential algorithms without any luck. If it were something easy to interpolate, you'd expect somebody to have had some amount of success already.
Probably rolling with laughter. You don't need ML to "predict" properties of molecules. Not a single physicist will buy ML-predicted molecular properties.
The main problem is that SAT problems come in many different sizes, and while resampling a picture "makes sense" as a way to make it smaller, resampling a SAT problem does not. Also, the general lack of shape or structure makes things hard.
NNs can help weak SAT solvers get better, but the heuristics in the best ones are (at present) better than anything an NN can produce.
I doubt it will happen anytime soon (at least as part of current SAT solvers). A big barrier to applying ML to the process of SAT solving is that a lot of the time it's simply faster to search with a cheap variable-selection heuristic than to run a much more time-consuming ML method (and neural networks will be quite expensive relative to the kinds of heuristics usually used) just to select variables slightly better.
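To give a sense of scale: the heuristics in question cost a few dictionary/array operations per decision. A toy sketch of a VSIDS-style activity heuristic (my own simplified version, not how production solvers actually implement it):

```python
# Toy VSIDS-style variable selection: bump variables seen in conflicts, decay over time,
# pick the highest-activity unassigned variable at each decision.
class ActivityHeuristic:
    def __init__(self, num_vars: int, decay: float = 0.95):
        self.activity = {v: 0.0 for v in range(1, num_vars + 1)}
        self.decay = decay

    def bump(self, var: int) -> None:
        """Bump variables that appear in a freshly learned conflict clause."""
        self.activity[var] += 1.0

    def decay_all(self) -> None:
        """Periodically decay so recent conflicts dominate."""
        for v in self.activity:
            self.activity[v] *= self.decay

    def pick(self, unassigned: set) -> int:
        """Pick the unassigned variable with the highest activity (real solvers use a heap)."""
        return max(unassigned, key=lambda v: self.activity[v])

h = ActivityHeuristic(num_vars=5)
for v in (2, 4, 2):          # pretend these variables showed up in conflict clauses
    h.bump(v)
print(h.pick({1, 2, 3, 4, 5}))  # -> 2
```

A neural net that needs a forward pass per decision is competing against this loop, which is why the latency argument is so hard to get around.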
Quantum chemistry is a special case, really, since the DFT computation is already very time-consuming - even a relatively expensive ML model is cheap by comparison.
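A minimal sketch of that surrogate idea, assuming RDKit and scikit-learn, with placeholder data standing in for molecules whose target property has already been computed with DFT:

```python
# Fit a cheap regressor on molecules with precomputed DFT results, then predict for new ones.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestRegressor

def fingerprint(smiles: str, n_bits: int = 2048) -> np.ndarray:
    """Morgan fingerprint as a 0/1 feature vector."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=n_bits)
    return np.array(list(fp), dtype=np.int8)

# Placeholder training data: SMILES paired with a DFT-computed property.
train_smiles = ["CCO", "CCC", "CCN", "c1ccccc1"]
train_targets = [-1.0, -0.8, -1.2, -2.5]   # made-up numbers standing in for real DFT output

X = np.stack([fingerprint(s) for s in train_smiles])
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, train_targets)

# Prediction takes milliseconds per molecule, versus hours for another DFT run.
print(model.predict(fingerprint("CCCO").reshape(1, -1)))
```

The numbers here are made up; the point is only that once the expensive DFT results exist, a learned model can answer "what about this similar molecule?" in milliseconds instead of hours.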
The only ones I've seen try to predict satisfiability directly. But usually in SAT you're interested in either a solution or a proof of infeasibility. Prediction can't do the latter, and AFAIK existing non-ML SAT solvers are far better at the former.