TL;DR for biologists: synonymous codon variation is heavily constrained by the need to maintain transcription factor binding motifs.
While I found the article very interesting as a new postgraduate student, it seems like such a straightforward deduction that I find it difficult to believe no one has ever put it into words until now.
After talking to someone who knows quite a bit about this stuff, apparently most people in the genomics/bioinformatics community kinda already knew about this; the researchers who wrote the paper were just the first ones to officially categorize it.
My first thought on seeing the headline was, "What? Only 2?" Off the top of my head I can think of:
1. Amino acid coding
2. Transcription factor binding and transcriptional regulation
3. Post-transcriptional regulation and RNA degradation
4. Intron/Exon Splice sites
5. Chromosome structure and methylation
6. Origins of replication
All of these have been shown to be controlled at least in part by the DNA's sequence. This story seems to be about a (very interesting) new wrinkle in (2) above.
I agree. I believe I recall my biologist friend saying at one point that there are up to three such "layers" of genetic code, depending on where you start reading the sequence. Am I understanding that correctly?
Well, yeah, if what he's talking about are nucleotide bases, it takes three of them to make a codon. So if you have a sequence ABCABC, then ABC might be a codon (which codes for an amino acid, like a "bead" in a necklace that makes up a protein), BCA might be a codon, and CAB might be a codon. Those are the three forward reading frames. Then at the same time you could read the other strand backwards, which gives three more. So really there are six potential codons that a single base pair might be involved in. But that's probably not what this paper is talking about.
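To make the reading-frame idea concrete, here is a rough Python sketch that splits a short DNA string into codons for all six frames (three offsets on the given strand, three on its reverse complement). The function names and the example sequence are made up for illustration; nothing here comes from the paper.

```python
# Map each base to its complement (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frame_codons(seq):
    """Split seq into codons for each of the six reading frames.

    Keys are "+1", "+2", "+3" for the given strand and
    "-1", "-2", "-3" for the reverse complement.
    Trailing bases that do not fill a whole codon are dropped.
    """
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            frames[f"{strand}{offset + 1}"] = codons
    return frames


# Hypothetical example sequence, chosen only to keep the output short.
frames = six_frame_codons("ATGGCCTAA")
for name, codons in frames.items():
    print(name, codons)
```

Each base ends up in one codon per frame, which is where the "six potential codons" figure comes from (bases near the ends of a short sequence fall into fewer, because incomplete codons are dropped).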
You're correct that this is not what the paper is talking about. However, there is (at least) a third layer beyond both the well-studied protein-coding layer and the one described in the submitted article: epigenetics, which studies how gene expression is altered by histone modification and DNA methylation, basically different ways of switching a gene on or off or changing the amount of protein made from it.
My understanding is that GMO crops have snippets taken from other species' DNA. Those sequences would be internally consistent, and only the patch between the native and foreign sequence would be affected.
Actually I didn't know exonic TF binding was even significant enough to put synonymous codon variation under constraint. I've always assumed it was some sort of metabolic constraint, as in more efficient production of certain tRNAs putting a constraint on codon usage.
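For anyone wondering what "constraint on synonymous codons" looks like mechanically, here is a toy Python sketch. It enumerates every synonymous rewrite of a tiny coding sequence using the standard genetic code and checks which ones still contain a given binding-motif string. The motif and the nine-base sequence are assumptions picked for the example, not data from the paper, and real TF binding is scored with position weight matrices rather than exact string matches.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_variants(cds):
    """Return every DNA sequence encoding the same protein as cds."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    choices = [
        [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        for codon in codons
    ]
    return ["".join(variant) for variant in product(*choices)]


# Hypothetical inputs: a 3-codon sequence (Met-Thr-His) and a
# 7-base motif string that happens to overlap its codons.
MOTIF = "TGACTCA"
variants = synonymous_variants("ATGACTCAT")
kept = [v for v in variants if MOTIF in v]
print(f"{len(kept)} of {len(variants)} synonymous variants keep the motif")
```

In this toy case most of the "silent" rewrites destroy the motif, which is the intuition behind the claim: if the motif matters, the synonymous positions are not free to drift.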