> I really don't see any reason why this could not have been done 10 or even 20 years ago.
The advances in tooling, infrastructure, and accessibility of ML in the last 3 years alone have made the difference. That seems obvious.
Maybe your point is that the underlying techniques haven't changed, and thus it would have been possible to have made this discovery decades ago. But isn't that true of even the greatest inventions? Much of what's created or discovered is a function of the environment and conditions surrounding it.
In other words, it's not surprising to see a halo effect in other sectors as a result of tech investment in ML.
I agree that it is exactly this. New tooling has made machine learning easier to use. As a result, people with deep domain knowledge but less machine learning expertise are starting to apply ML to the problems they understand best.
One of the biggest roadblocks to this happening more today is that people don't know how to perform feature engineering to prepare raw data for existing machine learning algorithms. If we could automate this step, it would be a lot easier for subject matter experts to use ML.
For example, I work on an open source python library called featuretools (https://github.com/featuretools/featuretools/) that aims to automate feature engineering for relational datasets. We've seen a lot of non-ML people use it to build their first machine learning models. We also have demos for people interested in trying it themselves: https://www.featuretools.com/demos.
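To give a flavor, here's roughly what using it looks like with the bundled mock-customer demo data (parameter names have shifted between versions, so treat this as a sketch rather than exact API):

    # minimal sketch of Deep Feature Synthesis on the demo dataset
    import featuretools as ft

    # relational demo data: customers, sessions, transactions tables
    es = ft.demo.load_mock_customer(return_entityset=True)

    # automatically builds aggregation/transform features for each customer
    # from the child tables (newer releases call this target_dataframe_name)
    feature_matrix, feature_defs = ft.dfs(entityset=es,
                                          target_entity="customers")

    print(feature_matrix.head())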
I expect to see a lot more work in the automated feature engineering space going forward.
Yes, I think so. Featuretools is actually the core of my company's commercial product.
Performance is a tricky thing to answer. If you care about machine learning performance such as AUC, RMSE, or F1, then I think the answer would be 80%-90% of what hand-coded feature engineering gets you. If you care about how quickly you can build a first solution, then I think the automation is 5-10x better.
Yeah, the grandparent is hung up on the theory vs. application delay.
By the same logic, nothing in modern CMOS logic or its production process requires physics or chemistry of a vintage later than the 1940s to explain, so why did it take us three quarters of a century to get where we are? Because it's hard. Knowing how it works and figuring out how to do it are two different things.
> Lots of feature engineering based on domain expertise
Exactly. This is what is required to make machine learning work well.
For most people, the issue with machine learning isn’t that it doesn’t work but that it’s hard to use.
I suspect that if we gave domain experts who often don’t know how to code more power to do feature engineering, then we’d see a lot more applied machine learning research like this.
Ultimately, yes, more power means time (i.e. money) to pursue a target freely while messing with feature engineering. Brute force a la a full DL stack is not there yet for two reasons: on one side, the search space for novel materials is immense; on the other, novel materials found through ML methods must be stable somewhere in their phase diagram, synthesizable so they can be manufactured properly, and cheap enough to be worth deploying. The 10x acceleration of the process (from 20-30 years down to 2-3 years) is actually in the search of that space, thanks to ML methods working through several thousand experiments as in the linked article, not in the engineering readiness protocol that takes a candidate material from lab confirmation to real application.

Outsiders can help as well by building their own pipeline after collecting niche-specific datasets from journal papers, conference contributions, and meeting minutes. I, for example, am interested in novel alloys and steels for Gen IV nuclear and am now creating my own dataset for a first shot, having already got a benchmark from a known, validated, and successfully deployed material.
>I suspect that if we gave domain experts who often don’t know how to code more power to do feature engineering, then we’d see a lot more applied machine learning research like this.
With all the talk about highly paid AI whiz kids recently, I wonder whether it is not much more promising to try to bring basic ML techniques to a really wide range of day-to-day businesses, given how many small businesses are still completely left out.
I liked this example very much. A small family business of a handful of people used standard ML to automate their process of classifying cucumbers.
Just imagine how many people we could free from manual labour to seek higher education if even a fraction of family businesses had a use case like this, and every one of those farmers or small shop owners who is bogged down by repetitive classification tasks could free up the time of a family member or two. That must be tens of millions of people, if not more, across the whole planet.
I'm sure that there is a treasure trove of ready-to-apply knowledge spread out over many sciences.
Example: the release candidate of the newest version of GIMP added a "new" type of smart blurring, symmetric nearest neighbor, which is surprisingly effective. I looked it up: it is a super simple algorithm, and the original paper is from 1987, yet the only mention of it I found outside of the GIMP page describing it was a wiki for "subsurface science", so a specialisation within geology I guess.
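For the curious, the algorithm really is that simple: for each pixel, look at every pair of symmetrically opposite neighbors in a window, keep whichever of each pair is closer in value to the center pixel, and average the kept values. A rough numpy sketch (grayscale only, naive loops; variants differ on whether the center pixel joins the average):

    import numpy as np

    def snn_filter(img, radius=1):
        # Symmetric nearest neighbour smoothing on a 2D grayscale array.
        # Each symmetric pair of neighbours contributes whichever member is
        # closer in value to the centre pixel; output is the mean of the
        # centre and the selected values.
        img = img.astype(float)
        padded = np.pad(img, radius, mode="edge")
        out = np.empty_like(img)
        # one offset per symmetric pair (the mirror offset is used in the loop)
        offsets = [(dy, dx)
                   for dy in range(-radius, radius + 1)
                   for dx in range(-radius, radius + 1)
                   if (dy, dx) > (0, 0)]
        h, w = img.shape
        for y in range(h):
            for x in range(w):
                c = img[y, x]
                picks = [c]
                for dy, dx in offsets:
                    a = padded[y + radius + dy, x + radius + dx]
                    b = padded[y + radius - dy, x + radius - dx]
                    picks.append(a if abs(a - c) <= abs(b - c) else b)
                out[y, x] = np.mean(picks)
        return out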
That's not odd. German Wikipedia is one of the largest and about even in quality with the English one, so you'll just as frequently find an article that's only in English as one that's only in German.
I meant that the paper is originally by English-speaking authors, so one would expect it to be better known in English-speaking scientific circles.
I agree with the sentiment (nothing new methodologically) but have a thought: these methods came from computer science and operations research (maybe). The popularity of ML and data science is taking place in the same 20 years in which every non-beta science is becoming more quantified. It takes a new generation of researchers to combine the old with the new. ML's popularity and ease of entry (in a broad sense, with tools and information easily available) is only helping it spread.
Sorry, that might be too local: Beta = natural science, alpha = humanities, gamma = social science.
So my point is that even the humanities and social sciences are becoming more empirical (at least in subfields, and the retort that a lot of statistics was founded in the humanities is well taken), and they are using the tools that are popular and widely known.
> I really don't see any reason why this could not have been done 10 or even 20 years ago.
"They started with a trove of materials data dating back more than 50 years, including the results of 6,000 experiments that searched for metallic glass. The team combed through the data with advanced machine learning algorithms developed by Wolverton and Logan Ward, a graduate student in Wolverton’s laboratory who served as co-first author of the paper."
As a researcher in this field I'll just add that in many cases, automating the mat sci workflows (the sample prep and the characterization) is a massive leap in and of itself, even without adding machine learning. The benefit of machine learning in many of these projects is to pick the automated runs optimally (choose the right neighborhood of composition space), which probably adds a 10-100x speedup on top of the 100-1000x speedup gained from just not making and characterizing samples manually. It's truly a synergistic combination of advancements in both fields and has great potential for accelerated discovery. /shill
The paper is the first scientific result associated with a DOE-funded pilot project where SLAC is working with a Silicon Valley AI company, Citrine Informatics, to transform the way new materials are discovered and make the tools for doing that available to scientists everywhere.
This field is ebullient! Bad joke, I know, but these are glorious times for materials scientists, especially when not grant-constrained and free to go deep into domain applications. One recent state-of-the-art lit review is here: https://www.nature.com/articles/s41524-017-0056-5 , already outdated here and there.
Transparency is a function of wavelength/photon energy as well as the underlying material. Glass (the silica version) is opaque to portions of the UV spectrum.
Generally you'll get absorption (opacity) when the photon energy exceeds an energy gap, allowing valence band electrons to be bumped into the conduction band and leaving a corresponding hole in the valence band. In the case of metals, there is effectively no gap at some points in k-space, so there is absorption throughout the spectrum.
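Back of the envelope, the cutoff wavelength is just hc over the gap; plugging in a commonly quoted ~9 eV gap for fused silica (my ballpark assumption, not a figure from the thread) puts the absorption edge deep in the UV:

    # photon energy E (eV) and wavelength (nm) are related by
    # wavelength = h*c / E, with h*c ~= 1239.84 eV*nm
    HC_EV_NM = 1239.84

    def cutoff_wavelength_nm(band_gap_ev):
        # photons below this wavelength can bridge the gap and get absorbed
        return HC_EV_NM / band_gap_ev

    print(cutoff_wavelength_nm(9.0))   # fused silica, ~9 eV gap -> ~138 nm (deep UV)
    print(cutoff_wavelength_nm(1.1))   # silicon, ~1.1 eV gap -> ~1130 nm (opaque to visible)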
I imagine a clever arrangement of atoms where non-metallic regions alternate with metallic ones could do the trick, as long as the transparent regions line up enough.
But anything this orderly doesn't look like a glass anymore.
For aluminum, you purify it, oxidize it, and then fuse it into a single crystal of corundum.
A hybrid alumina-silica glass can be created. The hybrid glass has a higher elastic modulus than other silica glass systems, and sheets of it can bend further before fractures form. This glass can be further toughened by creating it with high sodium content, then later exchanging the sodium ions with potassium ions. This type of glass is probably in your pocket right now.
Doped alumina-yttria hybrids are also useful in lasers.
This was the first thing I thought of when I read the title. Some ML nerd went "Hmm, I bet I could use ML to figure out how to make the mystical Star Trek alloy."
I read about efforts to discover compounds using random methods back in the 90s, and have been trying to research it lately, to see if the "shake and bake" method is still a thing. Can anyone point me to relevant research? I was surprised by estimates given about the number of possible compounds, so many that there would not be enough time in the universe to make and test them all, even limiting the primary elements to a dozen or so. I guess there are a multitude of ways that the same atoms can fit together. I've tried to find research on computer simulations. Apparently, only rough predictions can be made. My searches have been pretty fruitless, though, and I'd welcome help.
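Not a pointer to the literature, but the size estimates are easy to sanity-check: even a crude count of composition space alone blows up quickly, before you ever count the many ways the same atoms can pack together. A toy tally, assuming a dozen candidate elements and compositions on a 1% grid (the numbers are my own illustration, not the estimates you read):

    from math import comb

    elements = 12         # candidate elements (assumption for illustration)
    max_components = 4    # consider binary up to quaternary compounds
    grid = 100            # composition resolved in 1% steps

    total = 0
    for k in range(2, max_components + 1):
        element_choices = comb(elements, k)   # which k elements to combine
        compositions = comb(grid - 1, k - 1)  # ways to split 100% into k nonzero parts
        total += element_choices * compositions

    print(f"{total:,} candidate compositions")
    # tens of millions already, and each composition can still adopt many
    # different structures (or none that is stable or synthesizable)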
There's an old Kaggle competition[1] "Predicting Transparent Conductors" which had a similar objective.
There's a decent discussion of the 5th place entry[2]. Judging by a very quick read it looks like performance for methods like this could improve dramatically with larger amounts of data.
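That "more data would help" hunch is the kind of thing a learning curve makes cheap to check: train on growing subsets and see whether held-out error is still falling when the data runs out. A generic sklearn sketch, with synthetic data standing in for the real descriptors (nothing here is from the competition entries):

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import learning_curve

    # synthetic stand-in for a composition/structure -> property dataset
    X, y = make_regression(n_samples=2000, n_features=50, noise=5.0, random_state=0)

    sizes, train_scores, val_scores = learning_curve(
        RandomForestRegressor(n_estimators=100, random_state=0),
        X, y,
        train_sizes=np.linspace(0.1, 1.0, 5),
        cv=5,
        scoring="neg_mean_absolute_error",
    )

    for n, s in zip(sizes, val_scores.mean(axis=1)):
        print(f"{n:5d} samples -> validation MAE {-s:.2f}")
    # if the curve is still improving at the largest size, more data should help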
> The amorphous material’s atoms are arranged every which way, much like the atoms of the glass in a window. Its glassy nature makes it stronger and lighter than today’s best steel.
Why would this arrangement of atoms be stronger than a crystal lattice of steel?
From the actual paper:
"""
For example, the absence of deformation pathways based on gliding dislocations leads to exceptional yield strength and wear resistance
"""
Yield strength means roughly "how much stress a material can take before it starts to deform permanently or, if it's brittle, fractures"
In crystalline metals, a crack that forms anywhere can propagate through the lattice quickly and lead to bulk fracture (see the recent Southwest engine failure). In an amorphous material, the deformation caused by a local crack can be "absorbed" by the surrounding atoms because they're able to reposition more easily.
It does; check out its properties compared to other materials. The process of turning silica into glass makes it much stronger than other materials made of silica.
Strength, hardness, and flexibility all mean different things when engineers talk about them.
Glass is a brittle material, which means it fractures without deforming much (changing dimensions, i.e. stretching) when you apply an external force. The opposite of brittle is ductile: a ductile material like steel yields and begins to deform permanently once you apply a load above the material's yield threshold. Depending on the material's application, ductility or brittleness can be a desired property.
My only credentials are a lifetime of Wikipedia, but I think it's because breaks in metals propagate along crystal boundaries. You can harden metal by cooling it quickly, which results in small crystals, or by compressing or stretching it (e.g. with a hammer), which breaks larger crystals into smaller ones. Maybe a completely amorphous structure is like the smallest crystals ever.
I work kind of adjacent to two of the supervisors on this project at NIST, and the workflow for these types of projects usually goes:
(1) Build a predictive model of chemical composition -> superconductivity from past experiments or databases of simulations
(2) Build and program automated sample prep (this is actually the hardest part, not the machine learning)
(3) Build and program automated structural characterization and superconductivity measurement
The difficulty is finding a system whose design space can be explored and measured with automated tools; otherwise the machine learning isn't used effectively. As others have noted, the models are decades old in some cases. What's driving this is researchers who know how to automate traditional mat sci workflows and know enough about machine learning to pick the automated runs optimally.
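For the curious, "pick the automated runs optimally" usually means some flavor of active learning: fit a model on what has been measured so far, score the untested candidates by predicted value and uncertainty, and send the most promising batch to the automated line. A stripped-down sketch of that loop, with toy data and random-forest spread standing in for a proper uncertainty estimate (this is not the authors' actual code):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)

    # toy stand-ins: featurized candidate compositions and a hidden property
    candidates = rng.random((5000, 20))

    def run_experiment(x):
        # pretend automated synthesis + measurement of one candidate
        return np.sin(5 * x[0]) + x[1] ** 2 + 0.05 * rng.standard_normal()

    measured_idx = list(rng.choice(len(candidates), 50, replace=False))
    labels = {i: run_experiment(candidates[i]) for i in measured_idx}

    for round_ in range(5):
        X = candidates[measured_idx]
        y = np.array([labels[i] for i in measured_idx])
        model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

        untested = np.setdiff1d(np.arange(len(candidates)), measured_idx)
        per_tree = np.stack([t.predict(candidates[untested]) for t in model.estimators_])
        score = per_tree.mean(axis=0) + per_tree.std(axis=0)   # exploit + explore

        batch = untested[np.argsort(score)[-20:]]              # next automated runs
        for i in batch:
            labels[i] = run_experiment(candidates[i])
        measured_idx.extend(batch.tolist())
        print(f"round {round_}: best measured value so far = {max(labels.values()):.3f}")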
Every time I see "artificial intelligence" in a headline I mentally replace it with "a sleep deprived computer science grad student". The result is usually much more accurate.
(Lots of feature engineering based on domain expertise. This is not end-to-end DL.)
Do a smaller set of new experiments to explore a small subset of the solution space.
Retrain the model with these new experiments.
Perform another smaller set of experiments, this time over a more varied sample of the solution space.
Overall, a 10x improvement in predicting the glass property of an untested sample (although the entire process is biased toward positive samples).
Conclusion: classical ML still rocks.
I really don't see any reason why this could not have been done 10 or even 20 years ago.