Hacker News
What learning algorithms can predict that our physics theories might not (firstestprinciple.com)
87 points by ad510 on July 10, 2016 | 31 comments



I wasn't expecting this article to be about Sleeping Beauty problems and Solomonoff induction. There are people trying to apply real machine learning algorithms to real physics problems (e.g. [1][2]). This article is not about that.

It should go without saying that Solomonoff induction, while theoretically interesting, is totally useless for practical applications. Brute-forcing the space of all programs is ridiculously, mind-bogglingly, universe-crushingly expensive. (Actually, even if the process were magically tractable, there would still be limitations and dangers [3].)

1: http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.108...

2: http://arxiv.org/abs/1404.1333

3: http://lesswrong.com/lw/jg1/solomonoff_cartesianism/


> It should go without saying that Solomonoff induction is totally useless for practical applications, if interesting theoretically.

Indeed, just like Turing machines and the lambda calculus. Still an important theoretical step though!


Could Solomonoff induction guess sequences of prime-numbers?


You can think of Solomonoff induction as a weighted set of experts. At each step, each expert makes a prediction, and the overall prediction is the one that gets the highest total weight. Then the true data comes in, and the weight of each expert is increased or decreased based on whether that expert was right or wrong. And here's the kicker, the experts are all possible prediction programs, initially weighted by 2^-(program length). So yeah, in the long run Solomonoff induction will be at least as good as you at predicting any particular sequence, including prime numbers etc., because your own prediction algorithm is somewhere in the Solomonoff mixture. That also explains why approximating Solomonoff induction takes a huge amount of computation time. It's mostly a theoretical idea.
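The weighted-experts picture can be sketched in a few lines of Python. The expert family and "program lengths" here are made-up stand-ins (the real mixture ranges over all programs), but the update rule is the same: each expert's weight gets multiplied by the probability it assigned to the observed bit.

```python
# Toy sketch of Solomonoff induction as a weighted mixture of experts.
# The experts and their description lengths below are hypothetical;
# the real mixture runs over every possible prediction program.

def always(bit):
    """Expert that always predicts the same bit with high confidence."""
    return lambda history: 0.9 if bit == 1 else 0.1  # returns P(next bit = 1)

def alternate(history):
    """Expert that predicts the opposite of the last bit seen."""
    if not history:
        return 0.5
    return 0.1 if history[-1] == 1 else 0.9

experts = {"always0": always(0), "always1": always(1), "alternate": alternate}
# Prior weight 2^-(program length); the lengths here are made up.
weights = {"always0": 2.0**-1, "always1": 2.0**-1, "alternate": 2.0**-2}

def predict(history):
    """Mixture probability that the next bit is 1."""
    total = sum(weights.values())
    return sum(w * experts[name](history) for name, w in weights.items()) / total

def update(history, bit):
    """Multiply each expert's weight by the probability it gave the outcome."""
    for name, f in experts.items():
        p1 = f(history)
        weights[name] *= p1 if bit == 1 else 1 - p1

history = []
for bit in [0, 1, 0, 1, 0, 1]:       # an alternating sequence
    update(history, bit)
    history.append(bit)

best = max(weights, key=weights.get)  # the surviving dominant expert
```

After the alternating data, the "alternate" expert dominates the mixture, which is the long-run behavior the comment describes: whatever program predicts the sequence well ends up carrying the weight.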


More specifically, every incorrect expert is removed at each step, and the remaining experts have their probabilities uniformly rescaled to sum to 1.


That relies on noiseless, unbiased data, right? What if an expert gets ruled out by accident?


Being able to average over every possible computer program is so powerful that it doesn't really matter.


A program's output could be infinite, in which case it never halts. Without Chaitin's constant or the Busy Beaver values, brute forcing is not feasible in a countably computable universe. It is still interesting to talk about oracles, i.e. somehow getting hold of Chaitin's constant and thereby easily solving the halting problem and being able to use the induction.


These are easily fixed by dovetailing through the programs and using an increasing-over-time cutoff. Not fixed in the "made practical" sense, but in the "made non-contradictory" sense.
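Dovetailing can be sketched concretely. The "programs" below are a fake stand-in (program i halts after i steps iff i is even; odd i never halts), but the scheduling is the real trick: at stage n, give each of the first n programs a budget of n steps, so no non-halting program ever blocks the enumeration.

```python
# Sketch of dovetailing with an increasing cutoff. run_for models a
# hypothetical program pool: even program i "halts" after i steps,
# odd i never halts.

def run_for(i, budget):
    """Run stand-in program i for up to `budget` steps; return its output
    if it halted within the budget, else None."""
    if i % 2 == 0 and i <= budget:
        return f"output-{i}"
    return None  # still running (or never going to halt)

def dovetail(stages):
    """Stage n: give each of the first n programs a budget of n steps."""
    found = {}
    for n in range(1, stages + 1):
        for i in range(1, n + 1):
            if i not in found:
                out = run_for(i, n)
                if out is not None:
                    found[i] = out
    return found

found = dovetail(10)
# Every halting program is eventually discovered at some finite stage;
# the non-halting ones simply never contribute an output.
```

This is the "made non-contradictory" sense: the enumeration is well-defined and exhaustive in the limit, even though it is hopeless as a practical algorithm.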


You can't use cutoffs. Timeouts deliver no guarantees within the universe of complexity.


Solomonoff Induction starts by using Kolmogorov complexity to calculate its prior distribution, which already requires Chaitin's constant to be known.

So yeah.


Solomonoff induction goes from program to output to prediction weighted by program's length. It never goes from type of prediction to shortest equivalent program. It doesn't need to compute Kolmogorov complexities.

(Well, more specifically, approximations to it with time cutoffs don't have to compute Kolmogorov complexities. Raw Solomonoff induction already trivially runs into the halting problem.)
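The program-to-prediction direction can be made concrete. In this sketch the "programs" are all bitstrings up to some length under a toy semantics (a program outputs its own bits cycled forever), which is an invented stand-in for a real universal machine; the point is that we only ever run programs forward and weight survivors by 2^-length, and no Kolmogorov complexity is ever computed.

```python
# Sketch: Solomonoff's prior goes from program to output to prediction.
# Toy semantics (assumed for illustration): a program's output is its own
# bits repeated cyclically.
from itertools import product

def run(prog, n):
    """Output of stand-in program `prog`: its bits cycled to length n."""
    return [prog[i % len(prog)] for i in range(n)]

def posterior_next_bit(observed, max_len=8):
    """P(next bit = 1 | observed), mixing over all programs up to max_len
    whose output is consistent with the data so far."""
    num = den = 0.0
    for length in range(1, max_len + 1):
        for prog in product([0, 1], repeat=length):
            out = run(prog, len(observed) + 1)
            if out[:len(observed)] == observed:   # program survives the data
                w = 2.0 ** -length                # prior weight by length
                den += w
                if out[len(observed)] == 1:
                    num += w
    return num / den

p = posterior_next_bit([1, 0, 1, 0, 1, 0])
```

After seeing 101010, the short program (1, 0) dominates the surviving mixture, so the predicted probability of a 1 is high, without ever asking "what is the shortest program for this sequence?"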


This is a very interesting article. Scott Aaronson touches on some of these ideas related to quantum cloning and the concept of "you" in his blog posts.

I've always thought this kind of concept might define the limits of standard science, as currently practiced. Science requires reproducibility. But by whom? Well, other scientists of course. If every scientist tries your experiment and gets the same result, then you have a validated scientific theory.

But suppose that you manage to set up an experiment where the perception of which measurement resulted depends upon who is perceiving the result (I can think of a few ways that this scenario might arise if we could ever figure out a way to generate macroscopic, human-scale, superpositions [which is unlikely, I'll add]). That would really throw a wrench in things. In that case, you would have to have each scientist convincingly prove to each other scientist that they all see something different, in which case perhaps the result could still be universally accepted. But there may be a limit on how much consensus we can ultimately get.


Hum... That looks odd because it is. Things do not work this way.

Actually, the reproducibility applies to the hypothesis (conclusion), not to the experiment. It's only that, since the observer is irrelevant, people simplify and call it "reproducing the experiment".

If your experiment depends on the observer, and you get to know that, that dependence will go into the conclusion, and the people reproducing your experiment will expect to see the result your hypothesis says they would, not the same one you got. If you have a correct predictive model, everybody will conclude it's correct.


> But suppose that you manage to set up an experiment where the perception of which measurement resulted depends upon who is perceiving the result

One way that I think you could prove it to others is to have a way to predict the measurement depending on who's perceiving it that is >=60% effective. That would constitute some form of proof to others, at least.


> What if instead of connecting just 2 people’s brains together, we connected everyone’s brains like that to the internet? Would that mean that every human being on Earth would feel like they are just one small body part of a single, greater being?

I don't think that joining everyone's brains together would make everyone feel like a small body part of a single, greater being, because our brain architecture really wouldn't support that. You might be able to handle some shared "input" with another person or a small group of people (like these conjoined twins who share a thalamus[0]), but you're going to run into bandwidth issues pretty quickly given that there are only ~1-2 million nerve fibers in each optic nerve[1]--if you're trying to split that 7 billion ways, you're going to have a difficult time getting coherent information through, let alone processing it.

The second major limitation to joining brains together is the speed of light--once we're able to open up communication between brains to allow "communicating via thoughts", we'll be communicating at the speed of our thoughts, which is much faster than physical speech. Connecting your brain with the brain of someone on the other side of the world might be a pretty disappointing experience because they wouldn't be nearly as responsive as someone physically nearby. Uploading brains and running them at higher "clock speeds" than biological hardware permits would make this limitation more significant, because you might subjectively experience a communication time lag that would feel like hours, days, or longer when connected to someone far away. In other words, there would be a limiting radius in physical reality for effective brain-connecting communication that varied depending on the speed of your subjective experience.

Those limitations aside, sign me up! Brain-AI merging and brain-brain communication are going to be the bee's knees.

[0]: https://en.wikipedia.org/wiki/Krista_and_Tatiana_Hogan#Progr...

[1]: https://en.wikipedia.org/wiki/Optic_nerve#Structure


Question about this paragraph:

> Is it possible to define a process that Solomonoff induction cannot predict? The short answer is yes, but the kinds of computers needed to simulate these processes don’t exist in the real world, and it’s unlikely that we’ll ever be able to build them.

Doesn't "pick a truly random number" qualify? And we can build those today.


When I wrote "predict" there, I was referring to whether the predicted probabilities approach the true underlying probabilities, and Solomonoff induction can "predict" a true random number generator in that sense because its predicted probabilities will approach those of the random number generator. [1] However, if you tried to use it to predict a halting oracle, the halting oracle would be deterministic but Solomonoff induction would never be able to predict it with complete confidence, and this is what I was referring to in that paragraph.

But you're right that random numbers are inherently unpredictable; maybe I should add another footnote explaining what I meant there. (Edit: I added a clarification to the paragraph you quoted.)

[1] http://twistedoakstudios.com/blog/Post5623_solomonoffs-mad-s... in the "Thinking with Programs: Random Data" section


One of the principles of Turing machines is that they are deterministic. In that vein, there exists no programmable RNG except for pseudo-random ones. One has to ask oneself if piping a transform of the digits of a transcendental real number is a violation of this rule -- thus what is random, really? Is random the lack of ability to find a correlation or program to reproduce it -- or is it something more like Kolmogorov complexity? These are tough and inscrutable questions. Shannon, Turing, Curry, Church, Post, and others explored them deeply. Information theory gets extremely existential and esoteric. Is randomness a monad of our universe? Or is physical and natural chaos just extremely leathery when it comes to extracting the generating program? Our lives depend on it. But either way, we'll carry on. Nature, you goddamn enigma.


If you feed biased random bits into Solomonoff induction, the shortest surviving programs at any given point would tend to be things like arithmetic encoders that match the bias and specify just a bit more output. (Assuming you're not using pseudo-random numbers, of course.) So it will at least start to predict probabilities with the correct bias.

If you're feeding in unbiased independent random bits, any and every process is already an optimal predictor.
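The biased-bits case is easy to demonstrate with a cheap stand-in for the mixture: Laplace's rule of succession, which behaves like the bias-matching survivors described above. The bias value and sample size here are arbitrary choices for illustration.

```python
# Sketch: on biased random bits, a simple sequential predictor converges
# to the true bias, which is the best any predictor can do on i.i.d. data.
import random

random.seed(0)           # fixed seed so the run is reproducible
bias = 0.8               # assumed true P(bit = 1), chosen for illustration
bits = [1 if random.random() < bias else 0 for _ in range(10000)]

ones = 0
for n, bit in enumerate(bits):
    # Laplace's rule of succession: predicted P(next bit = 1)
    p_next_is_1 = (ones + 1) / (n + 2)
    ones += bit

estimate = (ones + 1) / (len(bits) + 2)   # final predicted probability
```

The final estimate lands close to 0.8. For unbiased bits the same code converges to 0.5, where, as the comment says, every predictor is already optimal.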


Did someone have to be on drugs to write this?


Well, I must say that I'm pleasantly surprised with the comments so far. I was actually expecting that this would be quickly shot down as either unoriginal or fundamentally flawed. But instead it seems no one has caught on to what this post is actually claiming, so I suppose I should now be very blunt about it.

At the beginning of the blog post, it claims that it explains 2 things:

1. where exactly you might be able to use learning algorithms where you can't just use existing physics theories instead

2. a hands-on guide to applying learning algorithms in these situations

This is physicist code for "this blog post claims that it solves a major unsolved problem in physics." Let me explain.

Currently, we have the standard model and general relativity, which have been experimentally verified to extreme precision but are fundamentally incompatible with each other. So people have proposed theories of everything such as string theory, loop quantum gravity, and information/digital physics (which I'm obviously a fan of) to resolve these incompatibilities.

One of the biggest problems in fundamental physics right now is that the standard model and general relativity have been verified to such precision that it's hard to think of a practical experiment to show how they are wrong. The conventional wisdom is that this is only possible if we do things like measure the Planck scale or what happens inside a black hole, which are completely impractical on human timescales.

What this post proposes is that you actually don't need to measure the Planck scale or what happens in a black hole in order to test the proposed theories of everything, and instead you can do it with a sufficiently powerful computer simulation and a sufficiently good brain-computer interface. If our technology keeps improving exponentially, this may be possible in the next several decades.

So yeah, I told a bit of a white lie when I framed this post as a summary of recent research in information physics. I can back up almost everything in the post with the sources I linked to, but the part about the 0 or 1 experiment and predicting its outcome using Solomonoff induction is actually original research on my part, and I suspect it would be a very big deal if this works the way I think it does.

So here are the possible outcomes for this blog post:

1. The problem in physics I just described is actually already solved.

2. The blog post is fundamentally flawed, and/or it actually doesn't solve the problem that I'm claiming it solves.

3. The blog post actually does solve a major unsolved problem in physics, and this is a huge deal.

This is why I am so surprised at the comments I'm getting so far, since this proposal for experimentally testing theories of everything seems to be passing the internet commenter test. So if no one on HN finds anything seriously wrong with the blog post, can we get people like Scott Aaronson, John Baez, Juergen Schmidhuber, Stephen Hawking, or people of that caliber to look at it so we can get a more definitive answer to whether this actually solves an unsolved problem in physics?

Also, kudos to Xcelerate's comment, which is the closest to the point I was trying to get at with the blog post.


It is not clear to me that you realize that Solomonoff induction is a mathematical argument, not a practical algorithm. To run it at the level of generality necessary to discover the laws of physics is computationally infeasible. In fact, it's one of those cases where calling it "computationally infeasible" is an inadvertent understatement of the problem, because English doesn't have gradations for this level of difficulty. Merely a "singularity" doesn't help this problem; you need more computation than our physics appears to allow.


Yes, I know that Solomonoff induction is completely impractical for real life machine learning. My point was that if you can survive in the simulation to the point where you see either the 0 or the 1, we don't have any way even in theory (let alone in practice) to guess the probabilities of seeing a 0 or 1, unless you use some sort of learning algorithm. You can use any learning algorithm for this; it doesn't have to be Solomonoff induction.


But your argument seems to fundamentally rest on Solomonoff induction. Put any real algorithm in there, and now you need to ensure that 1. the biases of the algorithm encompass a hypothesis that matches the data and 2. the algorithm will be able to arrive at that hypothesis given a real data stream, and, ideally, a real amount of computation. Both of these are hard questions, in the strongest sense of the term.

And once you open that door, well, all you've really done is restate the fact that learning how the universe works seems to be really difficult.


OK, I see what you're saying now. In that case, can you think of a better way of predicting whether you see a 0 or 1 in that situation?


If I had an answer to that question, I probably wouldn't be putting it on HN. :) I'd be firing it at the market and making boodles of moolah.


I would note that "computation" is work done over time. Causality.

There may exist an alternate form of causality that isn't time bound, which may be exposed here over short periods of time. I would hesitate to judge it "computationally infeasible" until we know more. :)


I think it's important to distinguish arguments that hypothesize that our understanding of physics is fundamentally, deeply flawed, from arguments that are based on our current understanding of physics. I can't prove that our understanding of physics isn't deeply flawed and there isn't some source of infinite computation somehow available to us; for instance, one proposed explanation of the Fermi Paradox is that all civilizations escape to a physics/computation regime more congenial to civilization before colonizing the galaxy. But it's still important to know when we're engaging in flights of fancy vs. speculating based on what we know.

And given that the topic in question is plumbing the depths of physics in the first place, this is perhaps a notch more important than it might otherwise be. How would we discover that physics has an infinite/acausal computation mechanism if we first must use Solomonoff induction to discover that, when we can only afford to use Solomonoff induction to discover that if we harness that computation?


Well, there are things we know and things we will know. If we take the hypothetical "all-knowing I", we assume it has zero security and all knowledge (wisdom). With individuals, we have high security (you can't know what I'm thinking) and low wisdom. So, knowledge plays a part in all this, as is evident from the results of causality. There's a sutra that deals with this concept as well.

I have a hypothesis that reality is backed by a blockchain data structure, which is why it's robust and fairly immutable. One might create a simple reality based on a blockchain data structure and then attempt to model causality/matrix rotations with that structure in such a way that the behavior of "gravity" noted in a gyroscope can be observed to not occur, given the nature of the scientific method. i.e. model rotations in a blockchain without generating gravity/precession and you've disproved my hypothesis.

The correlation with this test and reality would be allowing brief access to a "search" across all knowledge (which could be optimized behind the scenes) and then allow that knowledge to exist until the block is closed, at which point you are left with whatever gets closed in the block and the resulting forces that have to occur to rationalize the rotation. Rinse and repeat.

Probably doing a horrible job of explaining it. First time I've really written it down.


"So if we hooked up two people’s brains together in year 2050, would they feel like a single person?...We won’t know until we actually try it. If the answer is no then the rest of this blog post is completely irrelevant..." My guess is that if this were done to people who, up to that point, had lived independent lives (i.e. not born conjoined at the brain), they would each feel like individuals who had suffered some sort of major stroke, rather than that they had become a single person.

With regard to the cloning phase of your argument, is this not effectively the same as what the many worlds interpretation of quantum mechanics says happens all the time? FWIW, while I don't feel as if those other versions of me are me (assuming many-worlds is true), I realize that I am in no position to assert that I am the 'real' me, and in fact that it is beside the point to ask which one is.



