
I sincerely doubt that anything is truly random; there has to be some type of cosmic drummer behind the scenes biasing certain events. Case in point is the conflict between molecular Darwinism and the numbers associated with the human genome. Approximately one billion nucleotide bases, with four possible bases in each location, gives a combinatorial explosion of 4^1,000,000,000: a number waaaay larger than the 10^120 universal complexity limit, the ~10^80 stable elementary particles in the universe, and the generally accepted age of the universe (~10^40). The monkeys-typing-Hamlet thing is computationally intractable.



The main idea of "molecular Darwinism" is that the initial life form had a very short DNA [1]. As species evolved, that short DNA evolved and got longer [2].

* For example, some genes are repeated: a bad copy may duplicate a gene and the DNA gets longer. Viruses may cause some duplications too.

* Some genes are almost repeated; each copy has a slightly different function, so each one has a variant that is better for its function. The idea is that a copying error made two copies of the original gene and then each copy slowly evolved in a different direction.

* Some parts of the DNA are repetitions of the same short pattern many, many times. IIRC these appear near the centers and the ends of the chromosomes, and they are useful for structural reasons, not to encode information. The DNA can extend the ends easily because they are just repetitions of the pattern.

* Some parts are just junk DNA that is not useful, but there is no mechanism to detect that it is junk and can be eliminated, so it is copied from individual to individual, and from species to species, with random changes. (Some of this junk may turn out to be useful.)

So the idea is that the initial length was not 1000000000, but that the length increased with time.
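
A toy illustration of that growth (the mutation rates and segment sizes below are invented for the sketch, not taken from biology):

    import random

    random.seed(1)
    genome = "ACGT" * 25                     # a short ancestral genome of 100 bases

    for generation in range(200_000):
        if random.random() < 0.001:          # rare copy error: duplicate a random segment in place
            i = random.randrange(len(genome))
            j = min(len(genome), i + random.randint(1, 50))
            genome = genome[:j] + genome[i:j] + genome[j:]
        if random.random() < 0.0005 and len(genome) > 100:   # even rarer deletion, so the length can shrink too [2]
            i = random.randrange(len(genome))
            j = min(len(genome), i + random.randint(1, 50))
            genome = genome[:i] + genome[j:]

    print(len(genome))                       # typically well over a thousand bases: far longer than the original 100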

Your calculation does not model the theory of "molecular Darwinism". Your calculation is about the probability that, if a "human" miraculously appeared out of thin air with a completely random genome, it would get the correct one [3].

[1] Or perhaps RNA, or perhaps a few independent strands of RNA that cooperate. The initial steps are far from settled.

[2] It's not strictly increasing; the length may increase and decrease many times.

[3] Each person has a different genome, so there is no single perfect genome. The correct calculation is not 1/4^1,000,000,000 but some-number/4^1,000,000,000. It's difficult to calculate the number of different genomes that are good enough to be human, but it's much, much smaller than 4^1,000,000,000. So let's ignore this part.


Again, irrespective of how much genome information was there initially and what it eventually became, you are still talking about a final optimization problem of 4^1,000,000,000. Even one tenth of the human genome is an unfathomably large space to randomly iterate through, given the generally accepted statistics cited above. The math behind stochastic molecular Darwinism doesn't work out at all.


I don't see where you are getting the idea that humans had to be pulled out of a hat of all possible genetic sequences.

They, like, evolved, right? As the GP says, there was a short sequence that worked, a little got built on, a little more...

There was never any time that any creature was generated by random choice.


Got math?


What in the world are you talking about?

This thread of discussion is about the computationally intractable nature of 4^1,000,000,000.

Got math? Maybe post a proof?


Tell me why evolution would require all of those combinations to be tried?

Edit: Microsoft Windows 10 is about 9 GB. It would be impossible to try all 256^9,000,000,000 possible 9 GB byte sequences. Yet Windows exists, and most of us believe it's contained in those 9 GB.
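
(To put a size on that number, using logarithms since it is far too large to write out:)

    import math

    digits = 9_000_000_000 * math.log10(256)   # decimal digits in 256**9,000,000,000
    print(f"{digits:.3e}")                     # about 2.17e10 digits, versus 121 digits for 10**120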


So per your logic the Windows 10 operating system was created by random iteration of x86 opcodes over a lengthy period of time? Huh?


Exactly the opposite. Just because there are so many possibilities doesn't mean that all of them have to be tried or make sense.

You wouldn't code that way and nature doesn't either.


If you just want to talk about how computationally tractable it is, the math is trivial. Optimize one base pair at a time. Now it's an O(4) problem repeated over a billion generations, most of which are bacteria where a generation is measured in minutes.
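
A minimal sketch of that arithmetic, with a toy "fitness" that simply counts matches against an arbitrary target (so it is only about the scaling, not about biology):

    import random

    random.seed(0)
    BASES = "ACGT"
    L = 1000                                  # toy length; the point is the 4*L scaling, not the biology
    target = [random.choice(BASES) for _ in range(L)]

    def fitness(genome):                      # stand-in fitness: how many positions match the target
        return sum(a == b for a, b in zip(genome, target))

    genome = [random.choice(BASES) for _ in range(L)]
    evaluations = 0
    for pos in range(L):                      # improve one position at a time
        for base in BASES:                    # only 4 candidates per position
            candidate = genome[:pos] + [base] + genome[pos + 1:]
            evaluations += 1
            if fitness(candidate) >= fitness(genome):
                genome = candidate

    print(evaluations, fitness(genome) == L)  # ~4*L evaluations reach the optimum, versus 4**L blind guesses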

In practice the changes happening in each generation are all sorts of different rearrangements, but that's different from proving the basic and obvious fact that when you have multiple steps you don't have to spontaneously create the entire solution at once.

Bogosort will never ever sort a deck of cards. Yet it takes mere minutes to sort a deck of cards with only the most basic of greater/less comparisons. Even if your comparisons are randomized, and only give you the right answer 60% of the time, you can still end up with a sorted-enough deck quite rapidly.

(Why sorted-enough? Remember that reaching 'human' doesn't require any exact setup of genes, every single person has a different genome. It just has to get into a certain range.)
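
For what it's worth, the noisy-comparison claim is easy to sanity-check with a toy script (the 60% figure and the step count below are just illustrative):

    import random

    random.seed(0)
    deck = list(range(52))
    random.shuffle(deck)

    def noisy_less(a, b):
        # a comparator that reports the true order only 60% of the time
        return (a < b) if random.random() < 0.6 else not (a < b)

    for _ in range(50_000):                  # keep comparing random adjacent pairs and "fixing" them
        i = random.randrange(51)
        if not noisy_less(deck[i], deck[i + 1]):
            deck[i], deck[i + 1] = deck[i + 1], deck[i]

    correctly_ordered = sum(deck[i] < deck[j] for i in range(52) for j in range(i + 1, 52))
    print(correctly_ordered / 1326)          # typically around 0.9 or better (1.0 = sorted, ~0.5 = random shuffle)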


There's no random iteration, it's more like stochastic gradient descent with noise. Your number isn't correct even if only because of codon degeneracy.


Haha, so biological neuronal processes utilize a method of gradient descent? Perhaps you should submit your findings to the Nobel Prize Committee :)

Again, this thread is about the computationally intractable nature of 4^1,000,000,000.

Got math? A proof maybe to support your statements?


>>> you are still talking about a final optimization problem of 4^1,000,000,000.

There is no final optimization step that analyzes the 4^1,000,000,000 possibilities. We are not the best possible human-like creature with 1,000,000,000 base pairs.

> method of gradient descent

Do you know the method of gradient descent? Nice. It is easier to explain the problem if you know it. In the method of gradient descent you don't analyze all the possible configurations, and there is no guarantee that it finds the absolute minimum. It usually finds a local minimum and you get trapped there.

For this method you need to calculate the derivatives, analytically or numerically. Looking at the derivatives at an initial point, you select the direction to move for the next iteration.

An alternative method is to pick a few (10? 100?) random points near your initial point, calculate the function at each of them, and select the one with the minimum value for the next iteration. It's not as efficient as the method of gradient descent, but just by chance about half of the random points should have a smaller value (unless you are too close to the minimum, or the function does something strange), so this randomized method should also find the "nearest" local minimum.
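
A minimal sketch of that derivative-free variant, on an arbitrary smooth toy function (the function, step size, and sample count are made up for illustration):

    import random

    random.seed(0)

    def f(x, y):                         # toy smooth function with its minimum at (3, -2)
        return (x - 3) ** 2 + (y + 2) ** 2

    x, y = 10.0, 10.0                    # arbitrary starting point
    for _ in range(2000):
        # instead of computing derivatives, evaluate a handful of random nearby points
        candidates = [(x + random.uniform(-0.1, 0.1), y + random.uniform(-0.1, 0.1))
                      for _ in range(10)]
        x, y = min(candidates + [(x, y)], key=lambda p: f(*p))   # move to the lowest point seen

    print(round(x, 2), round(y, 2))      # ends up near (3, -2) without ever touching a derivative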

The problem with DNA is that it is a discrete problem, and the function is weird: a small change can be fatal or irrelevant. So there is no smooth function where you can apply the method of gradient descent, but you can still try picking random points and selecting one with a smaller value.

There is no simulation that picks the random points and calculates the fitness function. The real process happens in the offspring: the copies of the DNA have mutations, and some mutations kill the individual, some do nothing, and some increase the chance to survive and reproduce.


Would you please not be a jerk on HN, no matter how right you are or how wrong or ignorant someone else is? You've done that repeatedly in this thread, and we ban accounts that carry on like this here.

If you know more than others, it would be great if you'd share some of what you know so the rest of us can learn something. If you don't want to do that or don't have time, that's cool too, but in that case please don't post. Putting others down helps no one.

https://news.ycombinator.com/newsguidelines.html


>biological neuronal processes

Who is talking about neurons? Beneficial random mutations propagate and negative ones don't, on average. In this way, the genetic code that survives mutates along the fitness gradient provided by the environment. The first self-propagating structure was tiny.

It's not literally the gradient descent algorithm as used in ML, because individual changes are random rather than chosen according to the extrapolated gradient, but the end result is the same.

>computationally intractable nature of 4^1,000,000,000

which is a completely wrong number, even if only because of codon degeneracy. Human DNA encodes only 20 amino acids plus a stop signal, using 64 different codons, so different sequences encode the same amino acid.
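
For reference, that 64-to-21 mapping is the standard genetic code, and the degeneracy is easy to count directly (the 64-character string below is meant to encode the standard codon table in TCAG order; treat it as illustrative data rather than an authoritative reference):

    from collections import Counter

    bases = "TCAG"
    # standard genetic code, one letter per codon, in TCAG order (TTT, TTC, TTA, TTG, TCT, ...)
    amino_acids = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
    codons = [a + b + c for a in bases for b in bases for c in bases]
    code = dict(zip(codons, amino_acids))

    degeneracy = Counter(code.values())
    print(len(codons), "codons ->", len(degeneracy), "distinct symbols")   # 64 codons -> 21 symbols
    print(degeneracy["L"], degeneracy["M"], degeneracy["*"])               # leucine: 6 codons, methionine: 1, stop: 3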


You're of course free to "doubt that anything is truly random" and to suspect that "there has to be some type of cosmic drummer," but I feel compelled to point out that your "case in point" completely fails to support your opinion.

Your example insinuates that (a) all of the human genome is required to correctly model the human phenotype, i.e. each bit is significant, and, more importantly, (b) the human genome came into existence as-is without a history of stepwise expansion and refinement.

I can't know whether you're a creationist, but I will point out that your attempted argument is on (e.g.) #8 on Scientific American's list of "Answers to Creationist Nonsense" (https://www.scientificamerican.com/article/15-answers-to-cre...). Amusingly, SciAm's rebuttal even explains how the "monkeys typing Hamlet" fails as an analogy to the human genome.


I mean, while you might have 4^1,000,000,000 possible genomes, only a vanishingly small fraction of those have ever existed.


Your argument is essentially the anthropic principle, the entire "we are here because we are here" thing. Even the Second Law of Thermodynamics counters stochastic evolutionary strategies; the math behind non deterministic molecular Darwinism is simply not possible given the youth of the universe.


You’re definitely reading something I didn’t write.

All I said is that this is a large state space, which has been largely unexplored. The reason it’s largely unexplored is because most of the state space is useless, inert garbage. The amount of time it takes to create a genome this large is proportional to the size of the genome, not the size of the state space. That’s how evolution by natural selection works. If you hypothesized a world without evolution, where things appeared completely by chance arrangement of molecules, that’s when the size of the state space becomes important.

So I would say that your argument is not an argument against molecular evolution, it is an argument against something else.


> the math behind non deterministic molecular Darwinism is simply not possible given the youth of the universe.

Doesn't it also depend on the size of the universe? We don't have any idea how big it really is. It could be infinite, in which case it's not only likely but inevitable.


What does the Second Law of Thermodynamics have to do with evolution? If you think Earth is an isolated system, just go out on a cloudless day and see for yourself why that is not the case.


The only drummer needed is environmental pressure. This is well understood and computationally ordinary.


Last time I argued with someone who didn't believe in evolution, I went home and wrote something which implemented it. It took me half an hour and worked the first time. We've been using simulated evolution as one of several ways to train AI for quite a long time now.
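
For anyone curious what a "half an hour" version might look like, here is a toy genetic algorithm against a made-up fitness function (not anyone's actual code, just the usual select/crossover/mutate loop):

    import random

    random.seed(0)
    TARGET = [random.randint(0, 1) for _ in range(64)]        # made-up problem: match a hidden bitstring

    def fitness(individual):
        return sum(a == b for a, b in zip(individual, TARGET))

    population = [[random.randint(0, 1) for _ in range(64)] for _ in range(50)]

    for generation in range(200):
        population.sort(key=fitness, reverse=True)
        parents = population[:10]                             # selection: keep the fittest 20%
        children = [parents[0][:]]                            # elitism: carry the best individual over unchanged
        while len(children) < 50:
            a, b = random.sample(parents, 2)
            cut = random.randrange(64)
            child = a[:cut] + b[cut:]                         # single-point crossover
            if random.random() < 0.8:                         # mutation: occasionally flip one bit
                child[random.randrange(64)] ^= 1
            children.append(child)
        population = children

    best = max(population, key=fitness)
    print(fitness(best), "/", len(TARGET))                    # typically perfect or near-perfect within 200 generations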



