I'd add that the "omics" train has really left the station. It seems almost every week I hear about some new species that has been sequenced, including crazy polyploid species with huge genomes. For that matter, it seems that after 20-30 years they are finally figuring out what the regulatory function of "junk DNA" is. It seems like molecular biology is in a golden age now.
The sheer amount of data being generated seems overwhelming. For example, these researchers created a new family tree of just the grass species (very important agriculturally and industrially, of course):
> "The research team generated transcriptomes — DNA sequences of all of the genes expressed by an organism — for 342 grass species and whole-genome sequences for seven additional species."
This does allow analysis of very complex but desirable traits like drought tolerance, which involve a great many genes, but sorting through these huge volumes of data to ascertain which gene variants are the most important is challenging at best.
Everyone always calls it "next word prediction" but that's also a simplification.
If you go back to the original Transformer paper, the goal was to translate a document from one language to another. In prompt-driven systems the model is only given "past" tokens (as it's generating new tokens in real time), but in that original paper the encoder sees the whole source sentence, so the model can use both backward and forward context to determine the translation.
Just saying the architecture of how a model is trained and how it outputs tokens is less limited than you think.
True, and the quality of the genome assemblies is getting far better thanks to sequencing technologies that can easily generate data for 200,000 consecutive base pairs.
But just to circle back on junk DNA: most of it is non-functional baggage, the result of our constant battles with viruses and replication errors.
Perhaps at least some of that "non-functional" DNA is so only after the completion of our nanotechnological self-assembly process, which is surely somewhat complex.
Meh, a lot of the mega-size genomes have been well known. I can think of an amoeba with 200x more genome than a human and a gorillion polyploid plants. Some of this has been known for the last 20-30 years, some further back. Anyone can look at some chromosomes under a microscope, no sequencing needed, and say whether something is polyploid or not. It's a whole field, called cytogenetics.
Now as far as assemblies go, what is interesting there is that we never actually sequenced the whole human genome. What we did was sequence a lot of the human genome, "except for the hard parts, seems impossible". Well, in the past few years the T2T project has successfully sequenced the whole dang thing. Just because something is "hard" or "more repetitive" don't mean ain't nothing interdasting in dere.
Sure, but much of what has been discovered is that there aren't "genes for X". The human genome has only ~20k genes, 90%+ of which we share with a mouse.
This really sours the whole edifice of 20th C. 'genetic statistics' (heritability, etc.) -- and brings to light the eugenicist origins of frequentist statistics itself, and the vast amount of pseudoscience it's given rise to.
Yet popsci and many research areas have yet to catch up.
Children are still taught there are such things as dominant traits, and genes for eye colour, etc. Yet we've no idea of the full genetics of eye colour.
I would say this "golden age" is more a discovery of how everything we have previously believed is BS.
I don’t understand your arguments at all. Why does the homology of genes between apes and rodents bother you? They are part of a single “super-primate” clade—the euarchontoglires:
https://en.wikipedia.org/wiki/Euarchontoglires
This degree of overlap is entirely expected and accepted, and has been for many decades before full genome sequencing gave us hard numbers.
Can you clarify your second paragraph? Heritability is not a controversial topic but if you are saying the estimates are often abused, then I definitely will agree.
There are clearly dominant traits (and recessive traits). Huntington’s disease is the canonical example. If a student does not understand Mendelian genetics first they will not be able to understand complex quantitative genetics.
Your last statement is extreme. Do you also think that all of Newtonian physics is BS and should not be taught to kids?
As far as mice and people: there aren't enough genes for there to be "genes for traits" given the difference of traits between mice and people.
I also take heritability to be quite controversial, since what is the point of the covariance of genes with traits across populations? In almost all cases this isn't informative of anything.
I also don't think we teach children that mass can be arbitrarily large, nor velocities arbitrarily high, etc. -- i.e., the false parts of Newtonian mechanics aren't taught; and the remainder serves as a useful toolset for model construction.
The idea that we have genes for traits is simply false.
Ah thanks. This is a set of comments I get. Appreciated.
The headlines that read “Discovery of Gene X for Autism, Intelligence, Prostate Cancer, even Religiosity” are definitely crude and wrong click-bait.
Regarding heritability: Almost all of us in genetics use heritability as an operational measure of how difficult it will be to uncover the often large set of DNA variants that modulate trait variation. Since I am now trying my best to map DNA variants that modulate lifespan, I need to know how hard I am going to need to work and what sample size I will need.
Turns out we will need about 25000 mice to do a good job. And our result will depend strongly on the quirky environment in which we raise our mice; almost equally quirky as human environments, but not as noisy and plagued by as much disease and warfare!
Almost all of us in the field of quantitative genetics know that heritability is strongly dependent on environmental context. There is no heritability for nicotine addiction in a world with no nicotine. This message gets lost in translation, which is why we seem to keep fighting pointless nature-nurture battles.
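To make the environment dependence concrete, here is a minimal sketch using the textbook broad-sense definition, H^2 = V_G / (V_G + V_E), and assuming no gene-environment interaction or covariance. The function name and the variance numbers are made up for illustration; they are not from any study mentioned in this thread.

```python
def broad_sense_heritability(var_genetic: float, var_environment: float) -> float:
    """H^2 = V_G / (V_G + V_E), ignoring G x E interaction and covariance."""
    total = var_genetic + var_environment
    if total == 0:
        # A trait that does not vary at all has no defined heritability.
        raise ValueError("trait does not vary; heritability is undefined")
    return var_genetic / total

# Same genetic variance, two hypothetical environments (illustrative numbers).
controlled_lab = broad_sense_heritability(var_genetic=1.0, var_environment=0.25)
noisy_world = broad_sense_heritability(var_genetic=1.0, var_environment=4.0)
print(f"controlled: {controlled_lab:.2f}, noisy: {noisy_world:.2f}")
```

The genetics are identical in both calls; only the environmental variance changes, and the ratio moves with it.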
> I also take heritability to be quite controversial, since what is the point of the covariance of genes with traits across populations? In almost all cases this isn't informative of anything
If there's no simple linear gene-trait relationship, then how inheritable a trait is will not be tracked by heritability which is a statistic of covariance across a population.
This problem becomes compounded in the extreme if there's any environmental modulation of the trait-gene relationship.
Consider that a Scottish accent is nearly 100% heritable: it entirely covaries with genes (localized in Scotland), but is 0% inheritable. Suppose eye colour is 100% inheritable, but a product of 100 gene interactions, which have substantial geographical localization. You could easily find a case where heritability of eye colour was 0%, but inheritability 100%.
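The accent example can be sketched as a toy simulation. Everything here is an assumption for illustration: two regions, one marker allele whose frequency differs between them (0.9 vs 0.1), and an accent that is set by region only. The pooled marker-trait r^2 is a naive association measure, not a formal heritability estimate.

```python
import random

def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def sample_genotype(rng, allele_freq):
    # Number of copies (0, 1 or 2) of the marker allele.
    return sum(rng.random() < allele_freq for _ in range(2))

rng = random.Random(0)
n_per_region = 2000
genotypes, accents = [], []
for region_has_accent, allele_freq in ((1, 0.9), (0, 0.1)):
    for _ in range(n_per_region):
        genotypes.append(sample_genotype(rng, allele_freq))
        accents.append(region_has_accent)  # set by region only, never by genotype

pooled_r2 = pearson_r(genotypes, accents) ** 2
accent_values_within_region = set(accents[:n_per_region])
print(f"pooled r^2 = {pooled_r2:.2f}")
print(f"accent values inside one region: {accent_values_within_region}")
```

Pooled across regions the marker tracks the accent strongly, yet inside either region the accent does not vary at all, so the marker explains nothing there.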
Heritability is a dumb metric for charlatans anyway, even in a simplified trait-gene world -- but in the actual world we live in, >99% of its uses constitute pseudoscience.
It was originally invented for working in agriculture, where populations were under genetic control by experimenters -- here the covariance is actually causally induced by experiment, forcing a much simpler and deterministic relationship between genes and traits. In basically all other cases it's meaningless.
Right, thanks. But, at least for the last half century, statistical analyses in quantitative genetics have included genetic and environmental effects in their models, i.e. explicitly trying to find evidence for genetic influences after taking into account environmental effects. (I'm not saying it's easy to do so).
There are definitely things like dominant and recessive traits; it's just that only a limited number of genes actually demonstrate this behavior in a non-ambiguous way.
Eye color and hair color have been studied pretty intensively and they definitely have models which are more complex than "if you have value Y at position 37 of gene X, you have blue eyes". The way it's normally stated is that we can explain 50% of the variation in eye color using a limited set of SNPs.
What we don't have is a comprehensive model of how complex phenotypes arise from genotypes. This is partly because of historical oversimplification, and partly because the underlying causality of phenotypes is remarkably sophisticated due to a combination of many factors, overlaid with enormous amounts of noise and confounding factors.
> Eye color and hair color have been studied pretty intensively and they definitely have models which are more complex than "if you have value Y at position 37 of gene X, you have blue eyes". The way it's normally stated is that we can explain 50% of the variation in eye color using a limited set of SNPs.
That's all very nice, but can you predict eye color from a genetic test with reasonable confidence?
That's a very simple question and the answer is apparently a resounding no.
The second study demonstrates that more than 50% of variance in eye pigmentation can be explained by simple additive effects.
“We identify 124 independent associations arising from 61 discrete genomic regions, including 50 previously unidentified. We find evidence for genes involved in melanin pigmentation, but we also find associations with genes involved in iris morphology and structure. Further analyses in 1636 Asian participants from two populations suggest that iris pigmentation variation in Asians is genetically similar to Europeans, albeit with smaller effect sizes. Our findings collectively explain 53.2% (95% confidence interval, 45.4 to 61.0%) of eye color variation using common single-nucleotide polymorphisms.”
I think the answer is, "yes, we can predict eye color from a genetic test" but the prediction will be probabilistic and frequently wrong. For example: https://www.ancestry.com/c/traits-learning-hub/eye-color and the references suggest such a test exists.
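For a feel of what "explains about 53% of the variation with additive effects" means, here is a rough simulation sketch. It uses entirely simulated genotypes and made-up effect sizes (the SNP count, sample size, and frequencies are hypothetical); only the 53.2% target is taken from the abstract quoted above. It is not a reanalysis of that study.

```python
import random

def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)
n_people, n_snps = 3000, 50   # hypothetical sizes
target_share = 0.532          # share of variance attributed to the additive score

effects = [rng.gauss(0, 1) for _ in range(n_snps)]      # made-up effect sizes
freqs = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]  # made-up allele frequencies

def additive_score():
    # Sum of (effect size x allele count) over all SNPs; no interactions.
    return sum(
        beta * sum(rng.random() < p for _ in range(2))
        for beta, p in zip(effects, freqs)
    )

scores = [additive_score() for _ in range(n_people)]
mean_score = sum(scores) / n_people
var_score = sum((s - mean_score) ** 2 for s in scores) / n_people

# Add non-genetic noise sized so the score carries ~53% of total variance.
noise_sd = (var_score * (1 - target_share) / target_share) ** 0.5
traits = [s + rng.gauss(0, noise_sd) for s in scores]

r2 = pearson_r(scores, traits) ** 2
print(f"variance explained by the additive score: {r2:.2f}")
```

Even with the additive score known exactly, nearly half of the trait variance is left over, which is why an individual prediction from such a score stays probabilistic.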
> Children are still taught there are such things as dominant traits, and genes for eye colour, etc. Yet we've no idea of the full genetics of eye colour.
Adding FUD doesn't help. There are perfectly knowable things to teach kids —
- Blue-eyed parents have blue-eyed kids.
- Brown-eyed parents have brown-eyed kids, if they are homozygous.
- Sometimes, two brown-eyed parents have recessive alleles! But it's rare!
- A blue-eyed parent and a brown-eyed parent will have ~50% blue-eyed children.
You don't have to 100% characterize all hazel and green shades to capture most of the state of knowledge. These are 99.9% true. You're just trying to cast doubt on an increasingly well-understood field, akin to people trying to pick apart climate change research.
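The rules in that list correspond to a one-gene Punnett square. A minimal sketch, assuming the simplified classroom model with a single locus where brown ("B") is dominant over blue ("b"); the function name is made up, and this deliberately ignores the multi-gene picture discussed elsewhere in the thread.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def offspring_phenotypes(parent1: str, parent2: str) -> dict:
    """Phenotype odds for a one-gene cross; 'B' (brown) is dominant over 'b' (blue)."""
    counts = Counter(
        "brown" if "B" in (a, b) else "blue"
        for a, b in product(parent1, parent2)  # one allele from each parent
    )
    total = sum(counts.values())
    return {phenotype: Fraction(n, total) for phenotype, n in counts.items()}

print(offspring_phenotypes("bb", "bb"))  # two blue-eyed parents
print(offspring_phenotypes("Bb", "bb"))  # heterozygous brown x blue
print(offspring_phenotypes("BB", "bb"))  # homozygous brown x blue
print(offspring_phenotypes("Bb", "Bb"))  # two heterozygous brown-eyed parents
```

Note that in this model the ~50% figure for a brown x blue cross only holds when the brown-eyed parent is heterozygous; a homozygous brown parent gives no blue-eyed children.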
Painting yourself as the sort of person who thinks frequentist statistics is flawed because of unsavory associations with its founders in the early 20th century is not a way to make anything else you write be taken seriously.
Well, what explains the climate? The mineral composition of the earth? (etc.)
Sure, if that were different, so would be the climate. But the climate is severely under-determined by that composition -- if you drive it by a different sun, a different meteor strike, etc. it would be radically different.
Take the same genome and biochemically intervene on the conception, pregnancy, development, etc. of an animal. My claim would be that for a wide class of such interventions the very same genes will do radically different things, producing quite different kinds of animal.
Likewise, after birth, different ecologies will make significant differences/etc.
Every aspect that makes humans human (or mice mice) was produced by evolution, and therefore must be based on genetics. That's because evolution only acts on information encoded in genes (no, epigenetic information doesn't count).
There is a side channel for information: the human (or the mouse) itself. A genome is, among other things, a recipe for making X, assuming that you already have X. But if you don't have X and don't know what it looks like, it's not clear that the genome contains enough information to make X.
Right, some information is encoded in the egg. You can't take human DNA and put it in a mouse egg. And then you have the womb, which itself directs growth in the first stages.
DNA is the machine code, but you need a compatible computer to run it.
There is some information in the cytoplasm, but how much? Human nuclear DNA encodes something like 8 billion bits of information; mitochondrial DNA adds a bit more (although there are many copies of mitochondrial DNA in a cell). The specific sequences of all the proteins and such in the cell are a consequence of DNA sequences; they're not a separate (or transmissible) information pool.
That human DNA produces a human when in a human cell doesn't mean the rest of the cell is carrying significant amounts of information, just that that's the environment in which the DNA has evolved to operate.
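A back-of-the-envelope version of that estimate, as a sketch: it assumes 2 bits per base (four possible bases), a haploid nuclear genome of roughly 3.1 billion base pairs (an approximation), and the 16,569 bp human mitochondrial reference sequence. It is a raw upper bound that ignores repetition and compressibility, and it lands in the same order of magnitude as the figure quoted above.

```python
BITS_PER_BASE = 2                   # four possible bases -> log2(4) = 2 bits
HAPLOID_NUCLEAR_BP = 3_100_000_000  # approximate haploid human genome length (assumption)
MITOCHONDRIAL_BP = 16_569           # human mitochondrial reference sequence length

nuclear_bits = HAPLOID_NUCLEAR_BP * BITS_PER_BASE
mito_bits = MITOCHONDRIAL_BP * BITS_PER_BASE

print(f"nuclear: {nuclear_bits / 1e9:.1f} billion bits ({nuclear_bits / 8 / 1e6:.0f} MB)")
print(f"mitochondrial: {mito_bits} bits ({mito_bits / 8 / 1e3:.1f} kB)")
```

On these assumptions the mitochondrial contribution is a few kilobytes next to several hundred megabytes of nuclear sequence.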
A piece of clay comes from a mountain and is pressed into a mould, then thrown into a river and melts, it lands at the end on a rock, squished. Why is it that shape?
To say, "because of the chemical composition of the mountain" is, at the very least, foolish.
Organic lifeforms are self-modifying clays that are pressed into environmental moulds. Genes play both a far more complex, and far simpler role, in this than the mid-20th C. eugenicist biological science supposed. A pseudoscience that still predominates in how people think about genes, and in many downstream research areas (eg., https://slatestarcodex.com/2019/05/07/5-httlpr-a-pointed-rev... )
The problem is none of that is information in the sense of something that can be acted on by natural selection. If it isn't copied from generation to generation, it can't be molded over generations by differential survival of variants.
Part of the molbio story in the last year or so is that people are getting some insight into the regulatory function of what they used to call "Junk DNA".
which I think is usually ahead of its time, but the genetics unit seems backwards in that it doesn't say a word about molbio and has the same experiments where you observe the same few phenotypes that are sorta-kinda described by Mendelian inheritance, not telling you that those are the only ones, that we can't trust Mendel's lab notebook with the peas, etc.
"Personal Genomics" in the sense of 23 and me has been a wash. For one thing the SNP approach is limited in what it can do, but even if you had a real sequence teasing out the personal variation in terms of Genes x Environment can only go so far. What's interesting to me about genomics is the things that are the same and can be understood to a great deal mechanistically such as all of us Eukaryotes sharing a common "operating system" in terms of the machinery of protein synthesis, cell division and such. Hundreds of genes are known that affect diseases such as asthma, diabetes and schizophrenia and even if a polygenic risk score is possible a doctor is going to give a person with a high risk score the same advice as a person with a low risk score and give the same tests (fasting glucose, A1C) except maybe he ought to be a little more emphatic to the high risk person and the high risk person should take it more seriously.
(For that matter my experience with animals leads me to think animals are much more the same as us than different even when it comes to things like intelligence and communication. My belief was confirmed when my stray cat, "Bob B" seemed to understand that his path to a better life went through me when after 2 1/2 months of stonewalling me he got quite articulate in terms of using his "voice" and gestures such as pointing at the window and door to indicate he wants to go out, pointing at the TV and having an expression that seemed like disapproval, etc. I think these behaviors are basically intrinsic to mammals and birds if not some reptiles. You can certainly learn to communicate with animals better but a lot of it is instinctual)
> I would say this "golden age" is more a discovery of how everything we have previously believed is BS.
Whoa. I think it would be more accurate to say that previous beliefs are only a subset of a bigger picture, and are at times context-dependent within that bigger picture.
Honest, no-agenda question: if you go to East Africa where Homo sapiens originated, there is (reportedly) much more genetic diversity than in the rest of the world. I haven't been there, myself.
So forgetting scientific studies for the moment: if you just walk around in a city, is that apparent to you? Do you think, "Wow, there sure are a lot of different types of people here?"
I grew up in the US but have traveled through East Africa as well as most of the rest of the world, I don't think there's any major human population center I haven't been to.
My answer to your question would be a qualified no.
You'll see more phenotypical diversity walking around New York than Nairobi thanks to globalization.
With that said, the qualification I'd put on that is that if I consciously think about what region of the world I've seen the most variation in body plan/proportions and facial geometry, I'd probably pick East Africa. You do encounter pretty much the full spectrum of shapes human beings come in.
I will also say that I remember being surprised by how much variation in body shape I saw in China. Maybe this is pure western ignorance, but I had an idea in my mind about what the average Chinese person looked like and I found myself having to adjust my expectations in ways I have not had to do anywhere else.
The last thought I'd leave you with is that you absolutely should make it a priority to visit East Africa. Kenya in particular is where I always tell people to go but Uganda, Rwanda, Ethiopia are all good options. It far exceeded my expectations.
Thanks. I've been to Sweden and Italy, and they definitely do not follow stereotypes. Swedes, especially, are not all or even mostly blond & blue-eyed.
There actually are electric fields in nature that can get pretty strong sometimes. My undergrad school was New Mexico Tech which has a strong atmospheric physics program so one of our senior lab projects was