Thanks, this was pretty funny. I also didn't realize how many different Pokémon there are today, definitely more than in the first Pokémon game on that clunky grey Gameboy I had :)
CRISPR is widely used, and there are even approved therapies based on it; you can actually buy TVs that use quantum dots; and click chemistry has lots of applications (bioconjugation etc.). But I don't think we have seen that kind of impact from AlphaFold yet.
There are a lot of pharma companies and drug design startups actively trying to apply these methods, but I think the jury is still out on the impact they will ultimately have.
AlphaFold is excellent engineering, but I struggle to call this a breakthrough in science. Take T cell receptor (TCR) proteins, which are produced pseudo-randomly by somatic recombination, yielding enormous diversity. AlphaFold's predictions for those are not useful. A breakthrough in folding would have produced rules that are universal. What was produced instead is a really good regressor in the region of protein space where some known training examples are close by.
If I were the Nobel Committee, I would have waited a bit to see how this ages. Also, in terms of giving credit, I think those who invented the pairwise and multiple alignment dynamic programming algorithms deserved some recognition. AlphaFold built on top of those. They are the cornerstone of the entire field of biological sequence analysis. Interestingly, ESM was trained on raw sequences, not on multiple alignments. And while it performed worse, it generalizes better to unseen proteins like TCRs.
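For anyone who hasn't seen the pairwise dynamic programming mentioned above, here is a minimal sketch of global alignment scoring in the Needleman-Wunsch style. The scoring values (match +1, mismatch -1, linear gap -1) are my assumptions for illustration; real tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of strings a and b (linear gap cost)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Aligning a[:i] against an empty prefix of b costs i gaps.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[-1]


print(needleman_wunsch_score("GATTACA", "GCATGCU"))  # 0
```

This only returns the score and keeps two rows in memory; recovering the alignment itself needs the full table plus a traceback.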
The value in BLAST wasn't in its (very fast) alignment implementation but in the scoring function, which produced calibrated E-values that could be used directly to decide whether matches were significant or not. As a postdoc I did an extremely careful comparison of E-values to true, known similarities, and the E-values were spot on. Apparently, NIH ran a ton of evolution simulations to calibrate those parameters.
For the curious, BLAST is very much like pairwise alignment, but it uses an index to speed things up by skipping regions that would score poorly.
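To make the index idea concrete, here is a toy seed-and-extend sketch. It is not how BLAST is actually implemented (real BLAST uses neighborhood words, a two-hit heuristic and gapped extension); the function names, the exact k-mer seeds and the simple drop-off rule are all my own simplifications.

```python
from collections import defaultdict


def build_kmer_index(subject, k):
    """Map every k-mer in the subject to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    return index


def extend_ungapped(query, subject, qpos, spos, k,
                    match=1, mismatch=-1, x_drop=2):
    """Extend an exact seed in both directions without gaps.

    Extension stops once the running score falls x_drop below the best
    score seen. Returns (best_score, query_start, query_end_exclusive).
    """
    best = score = k * match
    q_start, q_end = qpos, qpos + k
    # Extend to the right.
    i, j = qpos + k, spos + k
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        i += 1
        j += 1
        if score > best:
            best, q_end = score, i
        elif best - score >= x_drop:
            break
    # Extend to the left, starting from the best right-extended segment.
    score = best
    i, j = qpos - 1, spos - 1
    while i >= 0 and j >= 0:
        score += match if query[i] == subject[j] else mismatch
        if score > best:
            best, q_start = score, i
        elif best - score >= x_drop:
            break
        i -= 1
        j -= 1
    return best, q_start, q_end


def seed_and_extend(query, subject, k=3):
    """Return ungapped hits as (score, q_start, q_end, s_start), best first."""
    index = build_kmer_index(subject, k)
    hits = set()
    for qpos in range(len(query) - k + 1):
        # Only positions sharing an exact k-mer are ever examined.
        for spos in index.get(query[qpos:qpos + k], []):
            score, q_start, q_end = extend_ungapped(query, subject, qpos, spos, k)
            hits.add((score, q_start, q_end, spos - (qpos - q_start)))
    return sorted(hits, reverse=True)


hits = seed_and_extend("ACGTACGT", "TTTACGTACGTTTT")
print(hits[0])  # (8, 0, 8, 3)
```

The point is that pairs of positions with no shared k-mer are never touched, which is where the index saves work compared with filling a full dynamic programming table.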
BLAST estimates are derived from extreme value theory and large deviations, which is a very elegant area of probability and statistics.
That's the key part, I think: being able to estimate how unique each alignment is without having to simulate the null distribution, as was done before with FASTA.
The index also helps, but the speedup comes mostly from not having to simulate the null distribution.
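For reference, the extreme-value (Karlin-Altschul) statistics boil down to E = K * m * n * exp(-lambda * S) for a raw score S, query length m and database length n. A minimal sketch follows; the lambda and K values in the example are illustrative placeholders, not calibrated parameters, and real BLAST also applies effective-length corrections that this ignores.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalized score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance alignments scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def p_value(e):
    """Probability of at least one such chance alignment: 1 - exp(-E)."""
    return -math.expm1(-e)


# Placeholder parameters, chosen only for illustration.
LAM, K = 0.3, 0.1
e = e_value(raw_score=60, query_len=250, db_len=1_000_000, lam=LAM, k=K)
print(e, p_value(e), bit_score(60, LAM, K))
```

Because the formula is closed-form, each hit's significance costs a few arithmetic operations instead of a batch of shuffled-sequence alignments.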
Great achievement, although I think it's interesting that this Nobel prize was awarded so early, with "the greatest benefit to mankind" still outstanding. Are there already any clinically approved drugs based on AI out there that I might have missed?
In comparison, the one for lithium batteries was awarded in 2019, over 30 years after the original research, when probably more than half of the world's population already used them on a daily basis.
Arguably, awarding early is more in line with the intention expressed in Nobel's will: "to those who, during the preceding year, have conferred the greatest benefit to humankind". It seems to have drifted into "who did something decades ago that we're now confident enough in the global significance of to award a prize". I suspect that if the work the prize recognized really had to have been carried out in the preceding year, the recipients would be rather different.
Given that drugs take around 10 years to get to market, and that some time is needed for industrial adoption as well, it's not very reasonable to expect clinically approved drugs for a few years yet.
This is really sad. A new recipe for feeding honeybees to make tastier honey could get to market in perhaps a month or two. All the chemical reactions happening in the bees' guts and all the chemicals in the resulting honey are unknown, yet within a matter of weeks it's being eaten.
Yet if we find a new way of combining chemicals to cure cancer, it takes a decade before most can benefit.
I feel like we don't balance our risks vs rewards well.
I think the idea is that we're, as a species, much more comfortable with the idea that 15 years down the line 50% of treated colonies collapse in a way directly attributable to the treatment than we are with the idea that 15 years down the line 50% of treated humans die in a way directly attributable to the treatment.
Now, if the human alternative to treatment is to die anyway, then I think that balance shifts. I do think we should be somewhat liberal with experimental treatments for patients in dire need, but you also have to understand that experimental treatments can just be really expensive, which limits either the people who can afford them or, if they're given for free, the amount the researcher can make/perform/provide.
10 years is a very long time. I've had close family members die of cancer, and any opportunity for treatment (read: hope) is good in my opinion. But I wouldn't say there's no reason that it takes so long.
I was a bit surprised when they initially announced this, there was also some discussion here around them making the shirts themselves, where people were quite skeptical:
Quite funny. I haven't used Google search in a while, is this just some artifact of Gemini + RAG taking search results from the internet at face value?
Great article with very nice examples, really tempts me to play around with some of these techniques myself!
I saw an exhibition of Refik Anadol's Nature Dreams last year in Copenhagen. I can only recommend seeing one of his video installations if there is one near you; they are quite mesmerizing:
https://refikanadol.com/works/
> AlphaFold 3 will be available as a non-commercial usage only server at https://www.alphafoldserver.com, with restrictions on allowed ligands and covalent modifications. Pseudocode describing the algorithms is available in the Supplementary Information. Code is not provided.
How easy/hard would it be for the scientific community to come up with an "OpenFold" model which is pretty much AF3 but fully open source and without restrictions?
I can imagine training will be expensive, but I don't think it will be at a GPT-4 level of expensive.
If you need to submit to their server, I don't know who would use it for commercial reasons anyway. Most biotech startups and pharma companies are very careful about entering sequences into online tools like this.
The DeepMind team was essentially forced to publish and release an earlier iteration of AlphaFold after the Rosetta team effectively duplicated their work and published a paper about it in Science. Meanwhile, the Rosetta team just published a similar work about co-folding ligands and proteins in Science a few weeks ago. These are hardly the only teams working in this space - I would expect progress to be very fast in the next few years.
How much has changed. I talked with David Baker at CASP around 2003 and he said that at the time, while Rosetta was the best modeller, every time they updated its models with newly determined structures, its predictions got worse :)
It's kind of amazing in retrospect that it was possible to (occasionally) produce very good predictions 20 years ago with at least an order of magnitude smaller training set. I'm very curious whether DeepMind has tried trimming the inputs back to an earlier cutoff point and re-training their models - assuming the same computing technologies were available, how well would their methods have worked a decade or two ago? Was there an inflection point somewhere?
Cool idea! A back button would be nice to correct answers. It also sometimes suggests the same choice for both options that I wanted to decide between :)
I tried google.com, and it gave 403/429 errors from three locations (Nuremberg, Ashburn, Helsinki), which seems surprising. Maybe an issue with Hetzner?