AI in physics: are we facing a scientific revolution? (4alltech.com)
196 points by ezrakewa on July 21, 2020 | 131 comments



Some applications in computational physics involve solving a "variational" problem, where you have some parameterized function and try to numerically find the parameters that minimize energy or error. This does not necessarily involve supervised learning from outside data as in this article -- it can be purely an optimization problem.

But neural networks are very good parametric function approximators, generally better than what traditionally gets used in physics (b-splines or whatever). So people have started to design neural networks that are well-suited as function approximators for specific physical systems.

It's fairly straightforward -- it's not an "AI" that has "knowledge" of "physics" -- just using modern techniques and hardware to solve a numerical minimization problem. I think this will probably become pretty widespread. It won't be flashy or exciting though -- it will be boring to anyone but specialists, as the rest of machine learning ought to be.
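For concreteness, here is roughly what that looks like in the simplest case -- a "deep Ritz"-style sketch in PyTorch (purely illustrative, not from the article) that minimizes the energy functional of the 1D Poisson problem -u'' = f on (0,1) with zero boundary values, using a small network as the parametric ansatz:

    import math
    import torch

    torch.manual_seed(0)

    # Small NN as the parametric trial function; u(x) = x*(1-x)*net(x)
    # builds in the boundary conditions u(0) = u(1) = 0.
    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1))

    def u(x):
        return x * (1 - x) * net(x)

    def f(x):                                # source term; exact solution is sin(pi*x)
        return (math.pi ** 2) * torch.sin(math.pi * x)

    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for step in range(5000):
        x = torch.rand(256, 1, requires_grad=True)            # Monte Carlo points in (0,1)
        ux = u(x)
        du = torch.autograd.grad(ux.sum(), x, create_graph=True)[0]
        energy = (0.5 * du ** 2 - f(x) * ux).mean()           # sampled Ritz energy
        opt.zero_grad()
        energy.backward()
        opt.step()

    x_test = torch.linspace(0, 1, 5).reshape(-1, 1)
    print(u(x_test).detach().squeeze())                       # should approach sin(pi*x)

The boundary conditions are built into the trial function, so the whole thing is just unconstrained minimization of a sampled integral -- no outside data involved.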


Yes, I think this is a great use for neural networks since they are effectively high dimensional function approximators, and something like Schrödinger's equation is a PDE where the number of dimensions is the number of observables, so it can get very high dimensional very fast. Classical methods don't necessarily scale that well in high dimensions (curse of dimensionality: cost is exponential in dimensions), but using neural networks does very well. This gives rise to the physics-informed neural network and deep backward stochastic differential equation approaches, which will likely be driving a lot of future HPC applications in a way that blends physical equations with neural network approaches. We recently released a library, NeuralPDE [1], which utilizes a lot of these approaches to solve what were traditionally difficult equations in an automated form. I think the future is bright for scientific machine learning!

[1] https://neuralpde.sciml.ai/dev/


This is fascinating. ELI5: how does this work? (I couldn't find references on the linked site)

Let's say I supply a high-dimensional DAE, f(x', x, z) = 0, x(0) = x₀, where classical methods like quadrature are unwieldy. Does the algorithm generate n samples in the solution space by integrating n times and then fitting an NN? With different initial conditions? Or does it perform quadrature with NNs instead of polynomial basis functions?


A lot of these methods here utilize the universal differential equation framework described here: https://arxiv.org/abs/2001.04385 . Specifically, the last example in this preprint describes how high dimensional parabolic PDEs can be solved using neural networks inside of a specific SDE (derivation in the supplemental). Discrete physics-informed neural networks are also a subset of this methodology.

The other subset of methods, continuous physics-informed neural networks, are described in https://www.sciencedirect.com/science/article/pii/S002199911... .

For a very basic introduction, I wrote some lecture notes on how this is done for a simple ODE with code examples: https://mitmath.github.io/18S096SciML/lecture2/ml
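For anyone who doesn't want to click through: the core of the continuous PINN approach fits in a few lines. This is not the code from those notes, just a stripped-down PyTorch sketch of the same idea for u' = -u, u(0) = 1 -- sample collocation points, differentiate the network with autograd, and minimize the squared residual:

    import torch

    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1))

    def u(t):                                   # trial solution; u(0) = 1 by construction
        return 1.0 + t * net(t)

    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for step in range(3000):
        t = torch.rand(128, 1, requires_grad=True)        # collocation points in (0,1)
        ut = u(t)
        dudt = torch.autograd.grad(ut.sum(), t, create_graph=True)[0]
        loss = ((dudt + ut) ** 2).mean()                  # residual of u' = -u
        opt.zero_grad(); loss.backward(); opt.step()

    t_test = torch.linspace(0, 1, 5).reshape(-1, 1)
    print(u(t_test).detach().squeeze())                   # compare against exp(-t)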


These methods are really interesting for high-dimensional PDE (like HJB), but there's a ton of skepticism about the applicability of NN models for solving the more common PDE that arise in physical sciences and engineering.

The tests are rarely equivalent, in that standard PDE technology can move to new domains, boundary conditions, materials, etc., without new training phases. If one needs to solve many nearby problems, there are many established techniques for leveraging that similarity. There is active research on ML to refine these techniques, but it isn't a silver bullet.

Far more exciting, IMO, is to use known methods for representing (reference-frame invariant and entropy-compatible) constitutive relations while training their form from observations of the PDE, and to do so using multiscale modeling in which a fine-scale simulation (e.g., atomistic or grain-resolving for granular/composite media) is used to train/support multiscale constitutive relations. In this approach, the PDEs are still solved by "standard" methods such as finite element or finite volume, and thus can be designed with desired accuracy and exact conservation/compatibility properties and generalize immediately to new domains/boundary conditions, but the trained constitutive models are better able to represent real materials.

A good overview paper on ML in the context of multiscale modeling: https://arxiv.org/pdf/2006.02619.pdf


Yes, and our recent work https://arxiv.org/abs/2001.04385 gives a fairly general form for how to mix known scientific structural knowledge directly with machine learning. In fact, some of these PDE solvers are just instantiations of specific choices of universal differential equations. I agree that in many cases the "fully uninformed" physics-informed neural network won't work well, but we need to fully optimize a library with all of the training techniques possible in order to prove that, which is what we plan to do. In the end, I think PINNs will be most applicable to (1) non-local PDEs where classical methods have not fared well, so things like fractional differential equations, and (2) very high dimensional PDEs, like 100's of dimensions, but paired with constraints on the architecture to preserve physical quantities and relationships. But of course, something like a fractional differential equation is not an example for the first pages of tutorials since they are quite niche equations to solve!


You've got a lot of broken references (??) in that preprint, BTW.

I think I understand why you're putting in the learned derivative operator, but I think it's rarely desirable. Computing derivatives with compatibility properties is a well-studied domain (e.g., finite element exterior calculus), as is tensor invariance theory (e.g., Zheng 1994, though this subject is sorely in need of a modern software-centric review). When the exact theory is known and readily computable, it's hard to see science/engineering value in "learned" surrogates that merely approximate the symmetries.

More generally, it is disheartening to see trends that would conflate discretization errors with modeling errors, lest it bring back the chaos of early turbulence modeling days that prompted this 1986 Editorial Policy Statement for the Journal of Fluids Engineering. https://jedbrown.org/files/RoacheGhiaWhite-JFEEditorialPolic...


>When the exact theory is known and readily computable, it's hard to see science/engineering value in "learned" surrogates that merely approximate the symmetries.

I completely agree, which is why the approach I am taking is to only utilize surrogates for the pieces which are unknown or do not have an exact theory. I don't think surrogates will be more efficient than methods developed that exploit specific properties of the problem. In fact, I think the recent proof of convergence for PINNs simultaneously demonstrates this might be an issue (there was no upper bound on the proved convergence rate, but the one they could prove was low order).

>More generally, it is disheartening to see trends that would conflate discretization errors with modeling errors, lest it bring back the chaos of early turbulence modeling days that prompted this 1986 Editorial Policy Statement for the Journal of Fluids Engineering. https://jedbrown.org/files/RoacheGhiaWhite-JFEEditorialPolic....

Agree, this is a difficult issue with approaches that augment numerical approaches with data-driven components. There are ways to validate these trained components independent of the training data (i.e. by using other data), but validation will always be more difficult.


With enough coaxing, we can get the optimizer to converge to known methods (high-order, conservative, entropy-stable, ...), and I'm sure this tactic will lead to more papers, though they'll be kind of empty unless we're really discovering good methods that were not previously known.

I presume you meant "verify" in the last sentence.


No, what I am doing is using high order, conservative (universal DAEs), strong-stability preserving, etc. discretizations for the numerics but utilizing neural networks to represent unknown quantities to transform it into a functional inverse problem. In the discussion of the HJB equation, we mention that we solve the equation by writing down an SDE such that the solution to the functional inverse problem gives the PDE's solution, and then utilize adaptive, high order, implicit, etc. SDE integrators on the inverse problem. Essentially the idea is to utilize neural networks in conjunction with all of the classical tricks you can, making the neural network have to perform as small of a job as possible. It does not need to learn good methods if you have already designed the training problem to utilize those kinds of discretizations: you just need a methodology to differentiate through your FEM, FVM, discontinuous Galerkin, implicit ODE solver, Gaussian quadrature, etc. algorithms to augment the full algorithm with neural networks, which is precisely what we are building.

So I completely agree with you that throwing away classical knowledge won't go very far, which is why that's not what we're doing. We utilize neural networks within and on top of classical methods to try and solve problems where they have not traditionally performed well, or utilize them to cover epistemic uncertainty from model misspecification.
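To make the "neural networks inside classical discretizations" point concrete, here is a toy sketch of the pattern (PyTorch with a hand-rolled RK4 loop, nothing like the actual Julia implementation): the right-hand side is a known physical term plus an NN correction, and training backpropagates through the solver itself:

    import torch

    def rk4_step(f, u, t, dt):
        k1 = f(t, u); k2 = f(t + dt/2, u + dt*k1/2)
        k3 = f(t + dt/2, u + dt*k2/2); k4 = f(t + dt, u + dt*k3)
        return u + dt/6 * (k1 + 2*k2 + 2*k3 + k4)

    def solve(f, u0, ts):
        us, u = [u0], u0
        for i in range(len(ts) - 1):
            u = rk4_step(f, u, ts[i], ts[i+1] - ts[i])
            us.append(u)
        return torch.stack(us)

    ts = torch.linspace(0, 2, 41)
    data = solve(lambda t, u: -2.0 * u, torch.tensor([1.0]), ts)    # synthetic "measurements"

    # Model: known term -u plus an unknown correction represented by a small NN
    nn_term = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.Tanh(),
                                  torch.nn.Linear(16, 1))

    def ude_rhs(t, u):
        return -u + nn_term(u)

    opt = torch.optim.Adam(nn_term.parameters(), lr=1e-2)
    for step in range(500):
        pred = solve(ude_rhs, torch.tensor([1.0]), ts)    # gradients flow through every RK4 step
        loss = ((pred - data) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    # After training, nn_term should approximate the missing physics (here, an extra -u).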


This looks really interesting.

I think it would be a good topic for a blog post or teaching paper that shows how to do this for very simple problems "end-to-end" (e.g. the advection equation, diffusion equation, advection-diffusion, Burgers' equation, Poisson equation, etc.).

I see the appeal in showing that these can be used for very complex problems, but what I want to understand is what are the trade-offs for the most basic hyperbolic, parabolic, and elliptic one-dimensional problems. What's the accuracy? What's the order of convergence in practice? Are there tight upper bounds? (does that even matter?), what's the performance, how does the performance scale with the number of degrees of freedom, what does a good training pipeline look like, what's the cost of training, inference, etc.

There are well-understood methods that are optimal for all of the problems above. Knowing that you can apply these NNs to problems without optimal methods is good, but I'd be more convinced that this is not just "NN-all-the-things hype" if I were to understand how these methods fare against problems for which optimal methods are indeed available.


No, it will not work well without the optimal method. But the method is no longer optimal if say a nonlinear term is added to these equations, so you can use the "optimal" method as a starting point and then try to nudge towards something better. Don't throw away any information that you have.


This comment sounds good. I was objecting to approaches like Eq 10 of your paper and much of the Karniadakis approach.


Cool example, thanks!


this is very cool.

I was thinking specifically of this and related approaches https://arxiv.org/abs/1909.08423 where they search for the ground state by iteratively using an MCMC sampler and doing SGD. The innovation is a network architecture that takes classic approaches from physics and judiciously replaces parts with flexible NNs.

I had not even considered how things might work if you actually want to think about time.

Do you know if anybody has been running this NN+DiffEq solver stuff on big HPC systems that also have GPUs? If you know of any papers where they tried this, would be interesting to look at.


I see a Poisson solver in the docs.

Is there a paper comparing the performance of this particular solver against the state of the art ?

(if you are using GPUs, the AmgX library has a finite-difference solver for Poisson in their examples - very far from the state of the art, but a comparison might put performance in perspective)


Almost every time a PDE is solved on a computer, it is a variational problem. Maybe neural networks are indeed good at this but I haven't seen any literature that shows that it is provably better. A reference would be good, especially to this point "But neural networks are very good parametric function approximators, generally better than what traditionally gets used in physics (b-splines or whatever)."


https://arxiv.org/abs/1909.08423 and https://arxiv.org/abs/1909.02487 are some examples I've been looking at recently.


Thanks, not familiar with QM at all, but it seems to me from glancing through one of the papers that the neural network is used to replace a popular way of representing the wave function which itself is an Ansatz. Not very convincing, but of course, as I said not familiar with the background and so I may be overlooking something.


that's exactly it -- they take an existing form for the ansatz (or the general idea of it, at least), and make it more flexible by replacing pieces with neural networks that have many more parameters, while maintaining constraints required by physics. I think this will become very common in the future.


That may be true, but what I was looking for is a more convincing way of showing that a neural network approximates a function better than other functions, such as say a b-spline. For example, you say that the neural network with many more parameters works better, but what if we had a b-spline with many more nodes?


I don't know anything about anything but I'm willing to bet that the end result is very similar. They're "just" using neural networks as rich approximators.


I'm an author of one of the arXiv papers above. One thing to consider is that the approximative power of a given parametric function is not the only criterion. Being able to optimize that function efficiently is just as important. Neural networks excel in this. So the comparison you ask for most likely won't appear, because any other parametric ansatz with tens of thousands (or more) parameters would be impossible to optimize. At least that's the case in quantum Monte Carlo, the domain of our paper. As for "provable", I also don't think that will appear. All the exact theorems about neural networks are way too abstract to be applicable to practical problems.
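A back-of-the-envelope illustration of why tensor-product ansätze blow up in high dimensions while NNs don't (numbers are made up, just to show the scaling):

    # Rough parameter counts (illustrative only): a tensor-product spline basis
    # needs n_knots**d coefficients, while a dense NN grows polynomially in size.
    def spline_params(n_knots, d):
        return n_knots ** d

    def mlp_params(d, width, depth):
        return (d + 1) * width + (depth - 1) * (width + 1) * width + width + 1

    for d in (1, 3, 10, 30):
        print(d, spline_params(10, d), mlp_params(d, 128, 4))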


> Almost every time a PDE is solved on a computer, it is a variational problem.

Not true. In computational fluid dynamics, variational methods are only one category out of many, and they aren't dominant.


It's usually only finite difference methods that are not variational. But finite difference is dominant in academia, not in industry. And that is changing as well with methods such as the discontinuous Galerkin method. The more popular finite volume method in industry can also be seen as a variational problem.

Yes, I exaggerated when I said that, but it's still mostly variational problems.


> The more popular finite volume method in industry can also be seen as a variational problem.

By that standard, you could interpret almost any numerical method for PDEs used in academia or industry as variational (aside from some fringe ones). By "variational" I mean methods which are designed in a variational way from the start, not ones that can merely be interpreted variationally.


Well, it helps to see these connections. For example, realising that the lowest order DG method is finite volume lets one think about how to extend well studied finite volume properties to high order DG methods.


So the idea of surrogate models (for parameter estimation) has been around for some time, where f(x, θ) is some (computationally) simplified model of a complex model/simulation (x = factors, θ = parameters).

f can be any arbitrary choice that works.

Not sure if the choice of f being a NN is necessarily related to AI, where some cognitive function is being replicated. It is a good function approximator though.


Why ought machine learning be boring to anyone but specialists? Does this imply that specialists ought to be born, rather than become specialists out of interest?


There are lots of things that I think are comparable to machine learning in the sense that they combine applied math and heavy computation and are very practically important, like simulating chemical reactions, solving operations research problems, or computational fluid dynamics. You cannot talk about these things at cocktail parties, though, because people will slowly shuffle away from you -- whereas you can talk about deep learning, which is odd.

Basically, I think if somebody wants to work in machine learning then they should be encouraged, and I think it's great that barriers to entry are lower than most fields, but the average person should not feel like they need to care about it, and if they do it might be because they have an inaccurate narrative.


>You cannot talk about these things at cocktail parties, though, because people will slowly shuffle away from you -- whereas you can talk about deep learning, which is odd.

It's really not odd at all. The average person has some familiarity with ML/AI, so you don't have to expend the energy to introduce them to the topic in a way that is understandable and also engaging to them. They already have a baseline, and are likely already aware of some interesting use cases. By contrast, they might not even know what "operations research" is, so you have to be both willing and able to expend the energy to explain the field in a way that is comprehensible and interesting. I'm sure it's possible, but the cross-section of people with the knowledge, the interest, and the social graces to do it is probably small.

To me it seems that a large swath of the science community dislikes buzzwordification and pop science more than they like proliferation of knowledge, based on how negative responses seem to be to things like normie interest in AI here. I would be fascinated to read any peer reviewed studies on the negative impacts of pop science on long term scientific advancement, so that I could understand this bias (and debunk my own bias that more interest in science is better in the long term).


I interpreted this to mean something like 'how phones / car engines / etc work' is not of interest to most of their users as long as they get the job done. If they get 'interesting' it can mean that something isn't working right. Where 'interesting' = 'suddenly noticeable'.


I don't think they mean ML in general is boring, just that this particular application of it isn't particularly flashy.


Maybe I misinterpreted this:

>it will be boring to anyone but specialists, as the rest of machine learning ought to be.


Thank you. More importantly, this is not new. Sheesh, how much of AI is just hype. I am in favor of not using the term AI for such things.


> John McCarthy, the Father of AI, famously said: "As soon as it works, no one calls it AI any more." Leading researcher Rodney Brooks says "Every time we figure out a piece of it, it stops being magical; we say, 'Oh, that's just a computation. '"

https://cacm.acm.org/blogs/blog-cacm/138907-john-mccarthy/fu...


This is an extremely overused, fortune-cookie-like quote. There's a legitimate distinction to be made between intelligence on one hand, and simply computation or calculation on the other. If we start calling every numerical method AI, we're rendering the term meaningless.


None of the existing AI, ML, DL, or RL algorithms is intelligence either.


What’s the distinction?


In the most basic sense, intelligence involves the acquisition of knowledge, which is representation or generalisation at some higher level of abstraction, and the ability to make decisions.

The mere ability to perform computational work is something virtually even the tiniest piece of hardware entails, or even an abacus for that matter.


Intelligence is inductive.

Finding correlations and using a human to filter the interesting ones from the flukes doesn't make the correlation engine an AI; the intelligence is still in the human.


The power that I see in machine learning is the techniques being developed to handle the unavoidable noise in empirical data. I think that poses a large obstacle for traditional techniques although I am not familiar enough to compare.


To me the value is in matching relationships (equations) of curated parameters from empirical data and using simulated recreations of the experiment as the objective. As soon as you can recreate experimental results in a simulation then you’ve made a successful model for that domain. This is an incredibly important and difficult task for fluid dynamics and plasma physics.


Fusion is right around the corner!


Is this AI taking the place of preconditioners as used in iterative solvers?

Would an AI be able to learn how to apply multigrid methods?


We have prototypes in Julia for this. The answer is yes, there are tricks you can use to do this effectively.
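As a toy illustration of what a "learned" preconditioner even means (this is not what the Julia prototypes do, just the simplest possible version of the idea): fit an approximate inverse of the operator by gradient descent and hand it to a textbook preconditioned CG:

    import torch

    torch.manual_seed(0)
    n = 50
    # 1D Laplacian: the standard SPD test matrix for multigrid/preconditioning
    A = 2*torch.eye(n) - torch.diag(torch.ones(n-1), 1) - torch.diag(torch.ones(n-1), -1)

    # "Learn" an approximate inverse P by minimizing ||A P - I||_F^2
    P = torch.eye(n, requires_grad=True)
    opt = torch.optim.Adam([P], lr=1e-2)
    I = torch.eye(n)
    for step in range(2000):
        loss = ((A @ P - I) ** 2).sum()
        opt.zero_grad(); loss.backward(); opt.step()
    M = (0.5 * (P + P.T)).detach()            # symmetrize before using it in CG

    def pcg(A, b, M, tol=1e-8, maxit=500):    # textbook preconditioned conjugate gradients
        x = torch.zeros_like(b); r = b - A @ x; z = M @ r; p = z.clone(); it = 0
        while r.norm() > tol and it < maxit:
            Ap = A @ p
            alpha = (r @ z) / (p @ Ap)
            x = x + alpha * p
            r_new = r - alpha * Ap
            z_new = M @ r_new
            beta = (r_new @ z_new) / (r @ z)
            p = z_new + beta * p
            r, z = r_new, z_new
            it += 1
        return x, it

    b = torch.ones(n)
    print(pcg(A, b, torch.eye(n))[1])   # plain CG iteration count
    print(pcg(A, b, M)[1])              # hopefully fewer with the "learned" preconditioner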


Could you elaborate a bit more on this? I have thought about employing NNs in this way for quite a while but the thing I never wrapped my head around was ensuring how it generalizes to different problems.


This is something I've wondered about (along with potential applications of autograd outside of deep learning). Do you have a recommended starting point for someone who wants to learn more about this?


I'm studying in the intersection of physics and data science, and I think there's a number of places where physics can benefit from ML. From my current point of view though, most of these applications lie more on the experimental/computational sides of physics rather than the theoretical side. One of the current use cases is using ML to aid in the processing and analysis of data obtained from experiments.

I would like to see more truly innovative work done on the theoretical side, but I don't think we'll see "AI" bridge the gap between QFT and GR any time soon. I think in order for something like that to happen we need a new approach, as the current approach of throwing deep learning models at it doesn't feel like the right answer.

On a more general note, the SciML organization [1] has been quite successful in helping incorporate more ML into science.

[1] https://sciml.ai/


I agree that the potential impact of ML on the theoretical side is very exciting. I think there’s a lot of bridging to be done between the most advanced mathematics and the most advanced physics that could lead to new insight, but it’s a hard problem for humans to tackle since we have very few people who are deeply proficient in both—although it is becoming more common. I’m thinking something like GPT-3 trained on literature in both fields could be the kind of thing we want, but like you I still doubt that a DL system is likely to come up with any real insight. I’d like to be proven wrong, though.


GPT-3 is already not too bad with basic physics:

https://www.lesswrong.com/posts/L5JSMZQvkBAx9MD5A/is-gpt-3-c...

And this is without training on the specific task. It's getting scary...


On the theoretical side, ML can be used to find a conceptual pattern in the existing literature. E.g. here's a paragraph describing a novel idea, go read all physics (and beyond) papers and find those that describe similar ideas.


Really cool! What problem are you working on? I live on the experimental side. At least in condensed matter, there are people having fun on the theory side as well.


I'm not working on any specific problem yet, but for my master's thesis I'm hoping to do something related to the use of neural networks in numerical solutions to differential equations. Along the lines of this sort of stuff [2].

[2] https://arxiv.org/pdf/2001.04385.pdf


There were some interesting talks on neural differentiation applied to physics at ICLR. You probably saw: https://arxiv.org/abs/1906.01563

Very fun!


I came across that paper just recently, it was a very good read!


I am actually surprised this is not more mainstream.

20 years ago I wrote my PhD thesis in physics, using genetic algorithms and neural networks to "guess" some basic physical behaviour in particle physics.

It was difficult to find good referees because the application was quite exotic, but I felt that this is something which would be worth investigating. I quit academia afterwards and did not come back - but I am happy to see that this road is back on the radar.


I wrote my diploma thesis 10 years ago and had to do a lot of pen and paper calculations. Actually it was kind of standard stuff (Lagrangians of the Standard Model, calculating parametrized decay widths). At that time I really hoped I could automate the error-prone steps of plugging in and simplifying equations, but I found nothing except for isolated steps. Maybe this is also due to the fact that the most powerful tools for manipulating symbolic expressions are closed source. Not sure how it is now, but as long as these tools are not expressive enough to work "end-to-end" with SM Lagrange densities, I doubt anything innovative could be done by automating that with AI.


That problem of pen-and-paper calculations featuring unintended errors is what I try to address in a project I work on [1]. My approach is to use Sympy (which has a lot of physics support) to validate expressions entered by a human. Not quite the AI focus of this thread, but still a machine augmenting the work of researchers. To your point about the complexity of the math, the Physics Derivation Graph is able to handle simple inference rules, but there's nothing preventing more advanced use.

[1] https://derivationmap.net/
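As a tiny illustration of the kind of check this enables (plain Sympy here, not the Physics Derivation Graph API): verify that a claimed solution actually satisfies the equation it is supposed to solve:

    import sympy as sp

    t, A, w = sp.symbols('t A omega', positive=True)
    x = A * sp.cos(w * t)                    # claimed solution of the harmonic oscillator
    residual = sp.diff(x, t, 2) + w**2 * x   # plug it into x'' + omega**2 * x = 0
    print(sp.simplify(residual))             # prints 0, so the step checks out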


What field or occupation followed for you?


During my time at CERN I discovered system administration and joined a large company to be initially in charge of the IT aspects of the R&D divisions in Europe.

This then extended to greenfield operations, and finally I moved to information security.


There's a ML group at Fermilab just outside Chicago working on ML applications in high energy physics and astrophysics.

https://computing.fnal.gov/machine-learning/

One of the "AI" applications I remember seeing -- potentially applicable outside physics -- involved using CNNs to read a 2D graph (as in graphical plot, not G = (V,E)) in order to visually detect certain patterns/aberration. (probably many physics groups around the world are doing the same)

At first glance this sounds kind of silly and trivial -- one might say, why not just detect those patterns from the data arrays directly? Instead of from a bitmap image of a plot of the data?

Unfortunately some patterns are contextual. A trained human eye can detect them easily, while writing a foolproof mathematical algorithm is difficult: e.g. it has to pick out the pattern, apply a bunch of exclusion rules etc.

(One instance of this, for example, is an old mechanic telling you what's going on under the hood just from listening to the vibrations of a car, while a traditional DSP algorithm might not be able to do it as reliably because it hasn't seen all the patterns and contexts in which those sounds arise.)

This is a domain where neural networks/transfer learning really shines. It can capture "intuition" by learning the surrounding context, rather than relying on handcrafted features.

So Fermilab has an AI algorithm that looks at millions of graphs via a CNN, which replicates the work of thousands of human physicists looking for patterns. We've already seen examples of this in radiology.
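A stripped-down sketch of the plot-reading setup (illustrative only -- the data and "anomaly" here are synthetic, not Fermilab's): rasterize a 1-D series into a crude plot image and train a small CNN to flag the ones that "look wrong":

    import math
    import torch
    import torch.nn as nn

    def rasterize(series, size=32):
        # Draw a 1-D series into a size x size binary image, i.e. a crude "plot"
        img = torch.zeros(size, size)
        s = (series - series.min()) / (series.max() - series.min() + 1e-8)
        cols = torch.linspace(0, len(series) - 1, size).long()
        rows = ((1 - s[cols]) * (size - 1)).long()
        img[rows, torch.arange(size)] = 1.0
        return img

    def make_sample():
        t = torch.linspace(0, 1, 100)
        y = torch.sin(2 * math.pi * t) + 0.05 * torch.randn(100)
        label = torch.randint(0, 2, (1,)).item()          # 1 = inject an anomalous spike
        if label:
            y[torch.randint(20, 80, (1,))] += 2.0
        return rasterize(y).unsqueeze(0), label

    cnn = nn.Sequential(
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(), nn.Linear(16 * 8 * 8, 2))

    opt = torch.optim.Adam(cnn.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for step in range(500):
        batch = [make_sample() for _ in range(32)]
        x = torch.stack([b[0] for b in batch])            # (32, 1, 32, 32) plot images
        y = torch.tensor([b[1] for b in batch])
        loss = loss_fn(cnn(x), y)
        opt.zero_grad(); loss.backward(); opt.step()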


Makes sense. A graph can be represented by a matrix which is what an image is.


Images and matrices are 2D data structures of numbers, but that is where the similarities end. An image is more like a vector, which matrices can be applied to. You would never matrix multiply an image onto another vector. Still, it isn’t uncommon to visualize matrices as images.


Well a matrix is a collection of vectors so... I guess I somewhat agree.. You can certainly apply projections to images, I mean this is what photoshop does.


> Well a matrix is a collection of vectors so

That's like saying "a matrix is a collection of real numbers, so anything you say about one applies to the other".

> You can certainly apply projections to images, I mean this is what photoshop does.

This doesn't seem to refer to anything in the comment you're replying to.


Would you please elaborate on your last point?


In reply to a comment that said nothing about projections, you wrote:

> You can certainly apply projections to images, I mean this is what photoshop does.

What's the relationship of this to anything in the comment you replied to?


"You would never matrix multiply an image onto another vector."


> "You would never matrix multiply an image onto another vector."

That wasn't me. But I can still elaborate: while you can certainly consider a non-color image as a matrix, the operation of multiplying this matrix with a vector is rather meaningless.

While a lot of things can be made into or viewed as matrices, a matrix is typically only meaningful as a representation of a linear map.


Even if you never matrix multiply with an image, it's still useful to have it in matrix form for other things like PCA/SVD.


The typical use of PCA/SVD on images: treat each image as a vector, stack a collection of images into a matrix, and then do PCA/SVD on that matrix to analyze the distribution of the images, normalize, get the eigen-images (principal components), etc.
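Something like this (numpy, with random arrays standing in for a real image set):

    import numpy as np

    images = np.random.rand(100, 32, 32)        # stand-in for 100 real 32x32 images
    X = images.reshape(100, -1)                 # each image flattened into a row vector
    X = X - X.mean(axis=0)                      # center before PCA

    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    eigen_images = Vt.reshape(-1, 32, 32)       # principal components, viewable as images
    coords = U * S                              # each image's coordinates in that basis
    print(eigen_images.shape, coords.shape)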


Yeah, in retrospect that seems like the way to do it. The toy examples I learned from in college did it on a single image split up by row, but I can’t think of a great use case for that besides some naive compression.


An image "is" not a matrix. Yes, the values of each pixel can be considered an entry of a matrix, but that's where the similarity ends. Unless the graph is very dense, doing convolutions on the image representing the graph is a pretty arbitrary thing to do. Graph CNNs exist and make a lot more sense in general.


I think the terminology may have been confusing here. I meant graphs in the sense of graphical plots, not graph theory graphs.


Ah, my bad.


>using CNNs to read a 2D graph

Why don't they use CNNs on the data series itself?


I believe in effect that is what they do in execution.

The 2D plot is more for training -- a human physicist picks out visual artifacts of interest to bootstrap the training. Humans of course can see blips and weird curves better on a visualization than in a pure data series.

For instance, a human can say if this little tail off a contour bends this way, it's right; if it bends in a different way, it's wrong. Or if a contour is "prickly" or "blobby".

Whether something "looks right" or "wrong" is really hard to mathematically reduce to a parsimonious description, especially when there's variance in the samples -- after all, there could be multiple subtle descriptions of "looks right" or "looks wrong" -- but a CNN is perfect for generalizing based on labeled samples. Similar to a radiologist looking at a scan and toggling isTumor = (True, False).


> If AI is like Columbus, computing power is Santa Maria

Does that mean when AI finally arrives, it slaughters all of us?


A lot of the death was due to the introduction of new diseases, so maybe AI is to blame for Covid-19?

Also, in the simile I think humanity is supposed to be the Old World, so I'm really wondering who we're supposed to find and enslave ...


Everybody thinks they're the center of the universe. Sometimes they're right, for a time. When they decide they were wrong, they tear down the old statues, if only to make the new overlords feel welcome...


I am not working in AI so I only know what I read here or on other sites etc. There seems to be a lot of buzz for AI and ML. But where actually are these techs succeeding currently? I feel like there is supposed to be a revolution going on everywhere but anywhere I look, it's just plans and press releases...


One interesting facet of the hype cycle is that it never focuses on what IS being done, because it's boring. Plus, tech improvements like this tend to be pretty operational in nature, so you wouldn't notice them without looking.

Nonetheless, "AI" applications are pervasive:

Auto - improved robotics, adaptive cruise control
Finance - High Frequency Trading, Credit Risk Modeling (i.e. your Credit Score)
Health Care - Health insurance risk estimation, Predictive staffing
Government - Predictive policing, recidivism risk, benefits decisions
Retail - improved customer targeting, inventory management
etc etc etc - name an industry, I'll give you 3 examples.

The issue isn't that it's not there; the issue is that it's BORING. And nobody gives a press release saying "we saved 0.4% of COGS from improved inventory demand forecasts," even if that represents $10M, because nobody cares.

But boring doesn't mean it's not a bazillion dollar opportunity for a lot of companies.


Siri, Google Assistant, speech detection, speech generation, textual photo library search, similar data augmentations for web search, Google Translate, recommendation algorithms, phone cameras, server cooling optimization, phone touch screens touch detection, video game upscaling, noise reduction in web calls, file prefetching, Google Maps, OCR, etc.

AI has already won, most people just don't realize it.


Won what? Is there a competition? Humans still have jobs. Humans are still politicians, judges, CEOs, generals. They even still play chess!

All those successful forms of AI are narrow, not the AGI of science fiction (like Data, Skynet, HAL) or Ray Kurzweil predictions. AI is a tool humans use to extend human capabilities. It always has been. Maybe someday it will be something more.


Won a place in the software stack, alongside traditional software approaches, much like the GPU won, and became a second pillar of computing.


Yes, thank you. There has already been a revolution over the past 5 years or so and many things that had been too audacious for science fiction became every day products. I think the ML revolution hit me personally about 5 years ago as I was able to get perfect speech recognition from my phone on a loud, crowded subway platform as a train was pulling in. I would have never thought that possible. I would have been skeptical if star trek had shown it.


I know there is this moving target where once a given piece of (allegedly) AI becomes mainstream, people claim "it's not AI".

That said, what definition of AI are you using? It seems to me you're stretching it a bit...


From my position this narrowing of the term AI to refer only to ‘real intelligence’ has always seemed like little more than an attempt to control the narrative against an astonishingly successful trend of connectionist architectures doing incredible things. Nobody complained when Pac-Man's ghosts got called AI, but now it's political.

All of what I mentioned are neural networks.


You got a point there with the neural networks. Though I never considered the ghosts from Pac-Man to have an AI. And why are you framing this as a "control the narrative"/political argument?


Predicting quantum mechanics energies of molecules using neural networks actually works, and can be used to speed up geometry optimization during drug discovery.

See:

https://arxiv.org/abs/1912.05079

https://chemrxiv.org/articles/Extending_the_Applicability_of...


well, this is not predicting "quantum mechanics energies", it's just parametrizing the molecular bond interaction potential with a neural network instead of an analytic function (such as e.g. a Lennard-Jones potential).

It's nice, but not really quantum-mechanics level (which is maybe HF, DFT or coupled cluster), which takes a lot more cycles (but also allows one to optimize geometries without knowing whether a bond exists).


These neural network models do not need to know whether a bond exists - in fact, they have no concept of bond topology. They are designed to be a drop-in replacement for DFT in terms of energies and forces. The only inputs are XYZ coordinates and chemical element labels for the nuclei (and, in the near future, net charge of the system).
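For anyone curious what that looks like structurally, here is a heavily simplified sketch (PyTorch; the descriptor here is just sorted neighbour distances, which is not what ANI or similar models actually use): a per-atom network maps element type plus local environment to an atomic energy, the total energy is the sum, and forces come out of autograd:

    import torch
    import torch.nn as nn

    ELEMENTS = {'H': 0, 'C': 1, 'N': 2, 'O': 3}

    class ToyEnergyModel(nn.Module):
        # Sum of per-atom energies predicted from a crude local descriptor.
        # Real models use carefully designed symmetry functions, not sorted raw distances.
        def __init__(self, k=4):
            super().__init__()
            self.k = k
            self.embed = nn.Embedding(len(ELEMENTS), 8)
            self.mlp = nn.Sequential(nn.Linear(8 + k, 64), nn.Tanh(),
                                     nn.Linear(64, 64), nn.Tanh(),
                                     nn.Linear(64, 1))

        def forward(self, coords, species):
            # coords: (n_atoms, 3) XYZ, species: (n_atoms,) integer element labels
            diff = coords.unsqueeze(0) - coords.unsqueeze(1)
            d = (diff.pow(2).sum(-1) + 1e-12).sqrt()         # pairwise distances
            d = d + torch.eye(len(coords)) * 1e6             # push self-distances out of the way
            neigh = torch.sort(d, dim=1).values[:, :self.k]  # k nearest-neighbour distances
            feats = torch.cat([self.embed(species), neigh], dim=1)
            return self.mlp(feats).sum()                     # total energy = sum of atomic energies

    model = ToyEnergyModel()
    coords = torch.randn(5, 3, requires_grad=True)           # e.g. a 5-atom cluster
    species = torch.tensor([1, 0, 0, 0, 0])                  # C, H, H, H, H
    energy = model(coords, species)
    forces = -torch.autograd.grad(energy, coords)[0]         # forces also fall out of autograd
    print(energy.item(), forces.shape)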


with the non-bonded interactions parametrized with dimers it might work... sometimes (might be good enough for a lot of things though)


In physics it just recently became mainstream to try experimenting with/incorporating ML into thesis projects. Most stuff I've seen it used for is signal processing related. An example might be particle track reconstruction in a time projection chamber with ML instead of a Hough transform. I think it's inevitable that these methods will grow in application, but the two biggest problems right now in my opinion are reproducibility and quantification of uncertainties. It's much easier to believe someone's stated uncertainties when you can see the analytic functions they were propagated through. There are ways to kind of work around this, but in my mind those two points are the main things holding back ML from broader applications in science. The article talks about ML tools closer to proof assistants / tools for experimentally driven mathematics. That's less of a problem in that domain, since the ML model only needs to make an interesting conjecture which can then be examined the traditional way.


I agree with you on UQ. For example, I have seen a couple of talks in my field of neutron scattering where people are using denoising autoencoders to remove artifacts and fit data. It's also clear that no one has any idea how this affects the uncertainties on parameters for models that are fit on the denoised data, much less what happens if the models are not appropriate for the data.

I think reproducibility can be tackled--at least some journals (shameless plug--I'm a lowly associate editor at Science Advances) are strongly encouraging people to include data/code with publications. I have reviewed papers in Nature Comput. Materials where people have included data/Jupyter notebooks (not perfect, but a very good start). It would be great if funding agencies started adding more teeth to requirements on data sharing. However, many more groups are putting their code on Github.


It becomes easier to take the progress seriously and understand it when you drop the "intelligence"-style labels which misleads people into thinking something is there that isn't.

Machine "learning" isn't ideal either, but is at least a bit more limited in the scope of what it conveys.

Once you leave the hype baggage behind, it's easier to see the significant progress that these tools - in concert with increased power and data resources - have made in many different areas over the last few years, some of them listed elsewhere in the answers to your question.


I think this is down to the loudest, most ambitious projects ("AGI! Fully autonomous vehicles!") getting a lot of press. The reality is, production ML is basically everywhere already:

- Basically every piece of software that makes recommendations (Netflix, Google, Facebook, YouTube, Instagram, TikTok, etc.) uses machine learning.

- Anything that makes time series forecasts (Uber/Maps ETA prediction, Walmart's 2 hour delivery, etc.) uses machine learning.

- All the most popular speech-to-text assistants (Alexa, Google Assistant, Siri) use machine learning.

- Smartphone cameras use machine learning to enhance picture quality.

- A lot of very highly-used security monitoring solutions (Stripe's fraud detection, CloudFlare's bot detection, etc.) rely on machine learning.

- A surprising number of physical commerce-type situations rely on machine learning (autonomous filling stations, for example, are pretty common in the trucking industry).

- A lot of smart image manipulation tools (Instagram/SnapChat filters, etc.) rely on deep learning.

- Email clients, particularly Gmail, use machine learning for spam filtering and for things like Smart Compose.

- Some infrastructure products use machine learning, as in the case of EC2's predictive autoscaling.

And those are just hyper-scale examples. There's a ton earlier-stage-but-still-in-production projects doing awesome things with ML:

- Wildlife Protection Solutions legitimately doubled their detection rate of poachers in nature preserves with ML.

- PostEra, Benevolent AI, and a bunch of other ML-based medicine platforms (medicinal chemistry, drug discovery, etc.) have already had exciting results.

- There are a bunch of startups building industry-specific APIs out of models, like Glisten.ai, that are already profitable.

- A number of computer vision products have been brought to market in the healthcare space—Ezra.ai screens full-body MRIs for cancers, SkinVision detects melanomas.

- ML-powered chatbots are a pretty huge market. Olivia (a financial assistant) has something like 500k users. AdmitHub has successfully lowered summer melt (the attrition of college-intending students between spring and fall) at a bunch of colleges. Rasa is an entire platform that helps startups build NLP-powered bots.

Sorry that went a bit long, but basically, the production ML space is incredibly deep, and spans most industries/company sizes. Unfortunately, press coverage of ML tends to treat it as if it's this mystic, sci-fi future technology, and as a result, this "Show me AGI or it's snake oil" mindset naturally emerges.


Any type of predictive analysis of high dimensional data (medical imaging, surveillance, remote sensing, machine translation, speech recognition/synthesis, music information retrieval). Other important work is being done on causality, AI ethics/safety, and explainability (XAI), but with little industry impact yet.


Speech recognition, the Google search engine, and autonomous driving are just some of the areas where we are seeing major advances thanks to ML methods.

The expert-system approach to search and speech recognition never worked well.

Digital assistants predicting that they need to remind you about an upcoming flight, going to work, etc. are other examples.


Well... for starters, with machine learning we can automatically make anyone naked by just applying a few algorithms and an ML model to a picture of a clothed person. Depending on who you are talking to, that's quite a breakthrough for some people.


I always assumed most audio agents (e.g. Siri) use some form of AI and/or ML. And that Google search results probably has some somewhere in their pipeline. But don't know for sure.


Just being jaded: all the money is in shareholders' and investors' pockets. Because once you have a disruptive AI-based startup, you have become enlightened and are now on course to change the future of the human race with your amazing AI. (AI? I meant a series of if statements and linear regression.)


AI means you have managed to throw a shitton of processing power at a problem and P-hack the shit out of your results so it shows you made a significant improvement.


> But where actually are these techs succeeding currently?

Marketing.


IMO not a revolution, but I can see a solid evolution. My reading of the work on embedding ML into physical models so far is that the best strategy is to take it as far as possible with the standard physics approach of abstraction and reduction, and once you exhaust that, apply ML to solve the remaining (often crucial) complex behavior.


There are a lot of cool advances in AI and physics. In my particular field of condensed matter physics, a number come to mind. One is trying to automatically extract synthesis recipes from the literature. Imagine that you want to see how people have synthesized a given solid state compound. Then searching through the literature can be painful. A great collaboration from MIT/Berkeley did this using NLP. I don't know what blood oaths they signed, but they were able to obtain a huge corpus of articles. But, how to know if an article contains a synthesis recipe? They set up their internal version of Mechanical Turk and had their students label a number of articles. Then they had to find the recipes, represent them as a DAG, etc. They have now incorporated the result with the Materials project (https://materialsproject.org/apps/synthesis/#).

There are groups that are using graph neural networks to understand statistical mechanics and microscopy. There are also a number of groups working on trying to automate synthesis (most of it is Gaussian process based, a handful of us are trying reinforcement learning--it's painful). On the theory side, there is work speeding up simulation efforts (ex. DFT functionals) as well as determining if models and experiment agree (Eun Ah Kim rocks!).

Outside of my field, there has been a push with Lagrangian/Hamiltonian NNs that is really cool in that you get interpretability for "free" when you encode physics into the structure of the network. Back to my field, Patrick Riley (Google) has played with this in the context of encoding symmetries in a material into the structure of NNs.

There are of course challenges. In some fields, there is a huge amount of data--in others, we have relatively small data, but rich models. There are questions on what are the correct representations to use. Not to mention the usual issues of trust/interpretability. There's also a question of talent given opportunities in industry.


GPT3 + replication crisis = huge volume of scientific papers produced, but nobody can know if they're accurate or not.

Landmark to watch for will be when the first GPT-generated paper gets a citation in a human-authored paper without the human realising.


> For this they use so called neural graph networks (GNN). These neural networks rely on graphs instead of layers arranged one after the other.

This statement shows the author has little idea about GNNs. GNNs have layers, and each layer operates on the graph. In order to implement the graph structure, GNNs use the adjacency matrix to propagate information along the edges. But there are multiple layers in a GNN; without multiple layers they would not be able to do multi-hop inference.
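A minimal sketch of what one such layer does (numpy, the standard degree-normalized GCN propagation rule; two stacked layers give two-hop information):

    import numpy as np

    def gcn_layer(H, A, W):
        # One layer: propagate node features along edges using the self-loop
        # augmented, degree-normalized adjacency matrix, then apply weights + ReLU.
        A_hat = A + np.eye(A.shape[0])
        D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
        return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0)

    A = np.array([[0, 1, 0, 0],                 # toy graph with 4 nodes
                  [1, 0, 1, 1],
                  [0, 1, 0, 1],
                  [0, 1, 1, 0]], dtype=float)
    H0 = np.random.rand(4, 3)                   # 3 input features per node
    H1 = gcn_layer(H0, A, np.random.rand(3, 8)) # layer 1: each node sees its 1-hop neighbours
    H2 = gcn_layer(H1, A, np.random.rand(8, 8)) # layer 2: information has now travelled 2 hops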


Columbus is probably not the best character to use for analogies...

"If AI is like Columbus, computing power is Santa Maria"

and intractable physics problems are like... indigenous people?


As someone who has worked on ADAS software and saw a simple un-optimized ML object detector beat a custom hardware solution at both speed and accuracy, I can honestly say machine learning is amazing.

Just in this domain alone, excluding the 100 other applications of ML, and the fact that we haven't even begun optimization in earnest, I certainly believe ML will change the direction of computing. It already has: look at where investment and research dollars have gone. (not to say that trends don't happen, but when I saw the performance results I thought: sh*t, this is big.)

Add to this the rise of the qubit, and the next 50 years are going to be even crazier than the last 50.

Yes, I am a proselytizer of the school of James Gleick. "Faster" was a prophecy[1].

[1] https://www.amazon.com/Faster-Acceleration-Just-About-Everyt...


I've often thought that maybe the reason we can't get a quantized theory of gravity is that it's too complicated for human brains rather than we need a bigger accelerator. You might be able to get somewhere with a brute force type approach of almost randomly coming up with equations for a theory and then trying to see if they make any sense and predict anything interesting. I suspect a breakthrough may be like AlphaGo's move 37 where it leaves the humans saying wow what happened there? https://www.huffpost.com/entry/move-37-or-how-ai-can-change-...


Shameless plug--The American Physical Society has a topical group on Data Science. Since our annual meeting was cancelled due to Covid, we've been running a free series of webinars on data science and physics: https://www.youtube.com/channel/UCfPG-nSsgnFeWuzgPcbKlCw/vid...

If anyone is interested, we have one on data science in industry coming up: https://attendee.gotowebinar.com/register/604483936035643777...


> If you want to read a linear function from the data in a two-dimensional coordinate system in math lessons, you can do it in five minutes - or quickly watch a video on YouTube. The situation is different for more complex tasks: Physicists, for example, have been trying to combine quantum theory and relativity theory for almost a hundred years. And if this succeeds, it could take generations to clarify the effects, says physicist Lee Smolin.

What on Earth does that paragraph mean? Parts of the article read to me like they were generated automatically, but other parts don't.


The author has no idea what he's writing about, calling it a "graphene" network, and several awkward phrasings about dark matter etc. Read the papers instead.


Re: "scientific progress could be bound by Moore's law and increase so much."

Moore's law appears to be slumping lately.

Re: "This coincides with our previous experience in physics, says Cranmer: "The language of simple symbolic models describes the universe correctly."

As an approximation, yes, but that doesn't mean a "true" formula has necessarily been found.


> Moore's law appears to be slumping lately.

Not so. https://docs.google.com/spreadsheets/d/1NNOqbJfcISFyMd0EsSrh...


I think it's more of an engineering revolution. The opaqueness of (at least current) machine learning means we won't really enhance our understanding of the universe, just our ability to predict it.

Some people would argue that these things are one, I think otherwise.


Not sure if this is really science. Physical formulas are derived from known physical laws in order to understand the origin of the phenomenon. If theorists are allowed to make up arbitrary formulas, of course they can fit the data with less error.


The article confuses me. I was doing symbolic genetic algorithms to derive formulas back in the mid-90s so that's not new. But this seems to suggest a combined genetic algorithm/NN approach is being used. Curious to see the underlying paper.


I'm also interested to see how this is different from generalized additive models (GAMs - not GANs). It seems to be the same principle except with a genetic mutation and selection aspect.


What is the purpose of the neural network and how does that help generate the symbolic regression using genetic algorithms?

Are they somehow using the parameters of the ANN to seed the genetic algorithms (and structure)?


The site is throwing a security error for me: PR_CONNECT_RESET_ERROR. Anybody else have the same issue? Or is the site just being hugged to death?


Do you use (willingly or not) any proxy, including an antivirus? The problem might be there.


I'm OK (on mobile, android and chrome).


I think this is pretty significant. I would have guessed this to be among the very last things to be automated.


I was expecting AI to become an indispensable part of science well before it was able to turn natural language descriptions into functional code, but: https://mobile.twitter.com/sharifshameem/status/128410376521...


TLDR: Using neural networks to model physical systems as black boxes and then, later, using symbolic regression (a genetic algorithm that finds a formula fitting a function) on the model to make it explainable and improve its generalization capacities.

The system managed to reinvent Newton's second law and find a formula to predict the density of dark matter.

(note that symbolic regression is often said to improve explainability but that, left unchecked, it tends to produce huge unwieldy formulas full of magical constants)
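For intuition, symbolic regression in its most bare-bones form looks something like this (a toy random search over tiny expression trees; nothing like the paper's actual pipeline, which uses a proper genetic algorithm and the trained network's outputs as the target):

    import random

    random.seed(0)
    # Data from a hidden "law": F = m * a; the search should rediscover the formula.
    data = [(m, a, m * a) for m in (1.0, 2.0, 3.0) for a in (0.5, 1.5, 2.5)]

    OPS = {'+': lambda x, y: x + y, '-': lambda x, y: x - y, '*': lambda x, y: x * y}

    def random_expr(depth=2):
        # Random expression tree over the variables m, a and a couple of constants
        if depth == 0 or random.random() < 0.3:
            return random.choice(['m', 'a', '1.0', '2.0'])
        return (random.choice(list(OPS)), random_expr(depth - 1), random_expr(depth - 1))

    def evaluate(expr, m, a):
        if expr == 'm': return m
        if expr == 'a': return a
        if isinstance(expr, str): return float(expr)
        op, left, right = expr
        return OPS[op](evaluate(left, m, a), evaluate(right, m, a))

    def mse(expr):
        return sum((evaluate(expr, m, a) - y) ** 2 for m, a, y in data) / len(data)

    best = min((random_expr() for _ in range(20000)), key=mse)
    print(best, mse(best))     # with luck: ('*', 'm', 'a') with error ~0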


No, we have merely found a new and slightly better way of interpolating between (slow and properly calculated) known data points.


I work in this space (intersection of science and ML) and I can say with high certainty that Betteridge's Law[0] is likely accurate.

But then again, pretty much any article that uses AI instead of ML is hogwash too. Are we crediting someone with this one?

[0] https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...


Curve fitting is not science, no matter how deep the net goes. It's great for calculation and for obtaining numerical models of what we can already measure, but every correlation would require a human to verify it and a theory to be synthesized after the fact, especially if there's a margin of error or confidence, since generating infinite correlations would only result in finding models that are not there.

This shows the effect of endlessly dissecting and searching data without a theory pretty well: https://xkcd.com/882/



