Brains speed up perception by guessing what's next (2019) (quantamagazine.org)
180 points by yamrzou on March 31, 2023 | 92 comments



This might be tangentially related - I'm not sure. I've recently started playing jazz piano, after years of classical piano.

Now, I'm supposed to be a reasonably good sight reader, or at least that's what I thought. The improv part of jazz I'm not good at, but I get it. I've also tried playing from transcriptions (Bill Evans in particular), and have found I'm extremely slow at reading the music.

After thinking about it, I suspect it's because the music makes no sense to me so I can't guess what's coming (which is what I do with classical music). So while I anticipate with classical music and just confirm after the fact/right when the note actually shows up, with Bill Evans I have to think hard and actually confirm everything.

My ears help but not much, since 'confirming' the sound is much harder with jazz than with classical music (it's unexpected, so again, there's some kind of delay between 'I hear the note' and 'ok that's right', while there's essentially no delay with classical music, where my brain is kind of 'primed for acceptance').


I'm a drummer with loads of improv experience, including a fair amount of jazz. What's interesting is that over time, what sounded like (mostly) random noise with no pattern has progressively turned into sensible flows from moment to moment -- it's definitely a predictive process (both brain-wise and nervous-system-wise!). It's manifested in surprising ways, like having a forward predictive understanding of which hand should be where depending on what I want to play.

Making large shifts during improv can be mentally taxing, very much the same feeling as trying to remember the name of something or someone. But just like with names, once you get that first mental hint (like a vowel), predicting what comes next just falls into place.


I'd love to get your feedback on a project I've been working on to 'sonify' DNA sequences. When played fairly quickly it sounds a lot like jazz to me, but I don't have a high musical intelligence so I'm not confident in how apt that description is. If you feel like having a listen to this:

https://www.youtube.com/watch?v=NSG9006YAeA&t=34s

And sharing your opinion on whether it bears similarities to the 'logic' of jazz, that would be very interesting to me!


Interesting project - you might be intrigued by algorithmic music composition.

The link you shared sounds atonal to me, so you may be thinking of free jazz. Here's a YouTube video that discusses it: https://www.youtube.com/watch?v=0WhXtkMyPHU


Thanks! I'll check that out and see if it gives any insights.


A few ideas. DNA's codons encode 20 amino acids, so those could be 20 chords on piano. A sliding window of N codons can count the distinct codons it contains and use that count as a pitch for the current codon's chord.
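A rough Python sketch of the sliding-window idea (the window size and the count-to-pitch mapping are arbitrary choices here, just for illustration):

    def codons(seq):
        # Split a sequence into successive 3-letter codons.
        return [seq[i:i+3] for i in range(0, len(seq) - 2, 3)]

    def window_pitches(seq, n=8):
        # For each codon, count the distinct codons in a window of up to
        # n+1 codons centred on it; that count becomes the pitch index
        # for the codon's chord.
        cs = codons(seq)
        return [len(set(cs[max(0, i - n // 2): i + n // 2 + 1]))
                for i in range(len(cs))]

    print(window_pitches("AUGGCUUCGAUGGCUAAAUAG"))  # [3, 4, 5, 5, 5, 5, 5]

Repetitive stretches then get a low, stable pitch, while high-entropy stretches climb, so the music tracks local sequence diversity.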


I agree that codons are probably the best basic unit to work off, rather than base pairs or other units of organization. I'd say we agree on step one; I'd be interested in your feedback on how I approached steps two and three.

Where I disagree with a lot of existing approaches is that given the decision to assign sounds on the basis of codons, most people assign them arbitrarily. I think that gets stuck in the postmodern dilemma, where everything is subjective so there's no basis to compare one system to another. So I wanted an approach that had logic in the loop somewhere, even if it had enough subjectivity that someone else following the same logic would come to a different conclusion.

What I hit on is convergence of emotional associations. Individual neurotransmitters are associated with particular emotions, and also with particular amino acids; therefore, one can assign emotions to a subset of codons. One can also assign emotions to musical elements, hence there's a basis for assigning sounds to codons in a non-arbitrary way, with emotional association acting as the cipher between the two languages.

That's what I consider 'step two' of the process. My thinking is, someone could come up with a better logic (not using emotion as the common known element) or apply the same logic using different musical elements (assigning different tones given the same emotional targets). Unfortunately, most people I asked from a music theory perspective didn't have much interest in suggesting different approaches here, so I had to work off my own subjective sense. If you have ideas about which chords should be assigned to which codons, I'd be very interested in hearing your thoughts (this goes for anyone who's into music and logic puzzles).

Here's how I actually assigned them, so people get an idea: I started with a pentatonic scale (G, C, A, B, and D) and four major neurotransmitters (GABA, serotonin, dopamine, acetylcholine), and assigned tones to the neurotransmitters based on which ones 'sounded' like the emotions those neurotransmitters produce. So I started from a chart like this:

  GABA           G minor   GAG, GAA             337 Hz
  Serotonin      C minor   UGG                  370 Hz
  Dopamine       A major   UAC, UAU             440 Hz
  Acetylcholine  B major   UCU, UCC, UCA, UCG   494 Hz
                 D major   AGU, AGC             587 Hz

(Acetylcholine gets two chords because it has two precursor amino acids rather than one.)
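In code, the chart boils down to a lookup table. Here's a minimal sketch, illustrative only (the chart's single representative frequencies stand in for full chords, and codons outside the chart are simply skipped):

    CODON_TONES = {
        # GABA -> G minor
        "GAG": 337, "GAA": 337,
        # Serotonin -> C minor
        "UGG": 370,
        # Dopamine -> A major
        "UAC": 440, "UAU": 440,
        # Acetylcholine -> B major / D major
        "UCU": 494, "UCC": 494, "UCA": 494, "UCG": 494,
        "AGU": 587, "AGC": 587,
    }

    def sonify(rna):
        # Map an RNA sequence to a list of tone frequencies in Hz,
        # skipping codons with no assigned neurotransmitter.
        cs = [rna[i:i+3] for i in range(0, len(rna) - 2, 3)]
        return [CODON_TONES[c] for c in cs if c in CODON_TONES]

    print(sonify("GAGUGGUACUCUAGU"))  # [337, 370, 440, 494, 587]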

But even if one accepts the logic in step two, there are probably still better ways to assign sounds. I'd be interested in any suggestions from folks with more music theory than me.

I don't understand your second suggestion - pitch of a particular chord would be determined by how much variation there is in pitch around it? Could you give an example of how it would work?


Once, to gloss over many details, I had a signal lamp on my fire escape and started quietly yelling Morse messages into the void. At first I used a reference sheet, looking up each character as I needed it. Then something clicked in my head and I no longer needed the reference. I accidentally learned Morse code. I've seen the same thing happen in real time with other people. There seems to be something like a phase change: once you've learned some threshold fraction of the information, the rest just falls into place.


I get this. With anything from the core classical repertoire (the obvious major composers), a good listener almost already knows the piece before learning it. The same goes for jazz standards, but there is much more flexibility required in actually rendering them.

Part of the article seems to be confirming a neurobiological basis for something quite common-sensical to me.

From how you describe your issue I suspect there are things from the more recent classical repertoire that would stump you beyond the level of Bill Evans transcriptions. Bill Evans is still at least in a relatively "accessible/listenable" harmonic idiom.

Have you tried reading through any atonal/serial works of the Second Viennese School? Or 20th-century classical composers like Ligeti or Elliott Carter, or contemporary pieces like Unsuk Chin's piano etudes or Birtwistle's "Clocks"? I'm a good sight reader too, but that's where I really hit a wall. This is music that provokes a strong negative reaction in many people; in the language of the article, their sensory expectations are defied. I think that to really love this music (as I really do), you have to have grappled with and deeply internalized things about compositional structure, so that what is a theme, a countersubject, a commentary, a development, an ornamentation, a point of rhythmic or textural elaboration, etc. are all still readily evident to the ear, and thus satisfy the ear, even if it's all occurring in an alien harmonic context.


Thanks a lot for the suggestions, I'll definitely listen to them / try to play them.

I'm fine with 'difficult' music, and I agree with you that Bill Evans is accessible/listenable - what bothers me, I think, is that I can't predict the actual notes (quartal voicings, 11ths, etc. everywhere).

Edit: I really like Ligeti.


Hope it leads to somewhere interesting for you! Unsuk Chin is in the school of Ligeti, perhaps try this magical little thing: https://www.youtube.com/watch?v=jP31oUljA_g

Yeah, jazz is definitely its own language, not just an extension of classical harmony as some would have it. Classical harmony provides a model under which jazz is highly unlikely. I'm still figuring out what I can say in words about the difference. Something about freedom vs. structure in relation to the scope of expression. But you have to get used to hearing those non-triad notes as fundamental, not just as "passing", and develop a feeling for which idiomatic progressions are implied by that.


The 2:00 timestamp in this Veritasium video on expertise covers exactly that phenomenon. We acquire expertise by recognizing the patterns in what we do, but once you take the patterns away, a lot of that "expertise" disappears too: https://youtu.be/5eW6Eagr9XA

Jazz piano vs classical is a perfect example of this: same instrument, two paradigms with totally different idioms.



I can relate to this so much. Bill Evans especially is really hard for me to play too!


Jazz is deliberately anti-conformist: knowing what's expected to come next and not doing that. This requires a setup of expectations, in order to dash them.


Over time, with more exposure and practice, jazz should make more 'sense' to your fingers and ears?


Yes, you're right, and it's slowly happening. But it's a very slow process, and very surprising to me: I expected to be able to read fully written jazz easily, and, well, I can't, because reading fluently seems to be as much about predicting what's next as about knowing the alphabet and syllables!


I studied something similar in my MSc on the neurobiology of language. It's well known that reading circuits in the brain make predictions about which words come next - a sentence like "The jam on the motorway made traffic slow" doesn't elicit much activity, whereas "The jam on the motorway was sticky" does, because "motorway" makes it much more likely that "jam" was about slow traffic rather than the condiment. (There are perhaps some interesting parallels to LLMs there...) But we were looking in the visual cortex, way back before the brain is even perceiving words, and we found pre-activation of circuits even there, depending on whether people were presented with real words (e.g. sheep) or non-words (e.g. brast).
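To make the LLM parallel concrete, here's a toy surprisal calculation (probabilities invented purely for illustration): the continuation the context predicts carries almost no information, while the unexpected one carries a lot, which is roughly what the brain-activity asymmetry looks like.

    import math

    # Hypothetical P(intended sense of "jam" | context word).
    p_sense = {
        "motorway": {"traffic": 0.95, "condiment": 0.05},
        "toast":    {"traffic": 0.02, "condiment": 0.98},
    }

    def surprisal(context, sense):
        # Surprisal in bits: high when the continuation defies prediction.
        return -math.log2(p_sense[context][sense])

    print(surprisal("motorway", "traffic"))    # ~0.07 bits: predicted, easy
    print(surprisal("motorway", "condiment"))  # ~4.32 bits: defies prediction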

I think there's a lot to be said for conceptualising the brain as a statistical machine - it makes a lot of sense in evolutionary terms. There's a great video introduction to that idea with Karl Friston from UCL here: https://www.youtube.com/watch?v=NIu_dJGyIQI


I've also noticed this while learning Chinese -- I think quite a bit of the initial difficulty with learning a second language is just training your brain what to expect next.

Going into a 7-11 to get a coffee in Taiwan, you know approximately what questions they're going to ask: size, hot/cold, do you want a bag (to hang on your scooter), and do you want a receipt.

Similar with the supermarket, a restaurant, when getting train tickets, etc.

You're just building up what questions to expect next for the scenario -- if the conversation deviates from that (even with words or questions you might know) your brain has a hard time initially adapting to that.


> if the conversation deviates from that (even with words or questions you might know) your brain has a hard time initially adapting to that.

This even happens in your native language. I'm sure all of us have said something like "Good, and you?" in response to an expected "How's it going?" or similar greeting, but the other person started with something different and it became a non-sequitur. I'd never given it much thought, but I wonder if part of the feeling of social awkwardness here comes from the pattern mismatch rather than the outcome itself?


The incredible awkwardness of expressing the wrong token in the exchange. Like after having paid someone, they say "thank you" and you automatically say "you're welcome", but that's the wrong thing to say.

A friend once answered the phone with "Hi, I'd like to speak to John in sales please" and even though (or because) I recognised his voice I was paralyzed for a moment. "But I rang you". I almost had to lie down.


"Here's your popcorn, enjoy the movie."

"Thanks, you too!" .... <dammit!>


I was recently wondering if this changes with age: when talking with older people, I sometimes think they prefer hearing what they want to hear - or maybe rather, what they predict I would say? And it can be challenging to get through to them with novel concepts.


If the concept of the Bayesian brain - https://en.wikipedia.org/wiki/Bayesian_approaches_to_brain_f... - is roughly right, it's plausible that as you get older you might build up a pretty unassailable statistical model of the world which would need very significant evidence to the contrary to change in a short timeframe.

But I think there are probably other, more human, effects dominating, like the one you suggest - only listening to evidence that backs up your existing points of view, and that happens from a pretty young age! That would actually go against the idea of the Bayesian brain, which should accept new evidence and update the statistical model appropriately. Much as I find the general approach of the Bayesian brain useful, in many domains humans aren't really particularly optimal in statistical terms...


I don't think that's necessarily proof against the Bayesian brain. It seems reasonable that the brain is also using its statistical models to assess the relevance of new evidence. So it's not just "new evidence, I need to update" but more like "new evidence, how likely is this true? I'll update according to the magnitude of the likelihood."
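A toy version of that in Python (numbers invented): the update is weighted by how diagnostic the evidence is judged to be, and a near-certain prior barely moves even on good evidence, which from the outside can look like stubbornness.

    def bayes_update(prior, p_obs_if_true, p_obs_if_false):
        # Posterior probability of a hypothesis after one observation.
        p_true = prior * p_obs_if_true
        p_false = (1 - prior) * p_obs_if_false
        return p_true / (p_true + p_false)

    # A weakly held belief moves a lot on diagnostic evidence...
    print(bayes_update(0.50, 0.9, 0.2))    # ~0.82
    # ...a near-certain one barely moves on the same evidence...
    print(bayes_update(0.99, 0.9, 0.2))    # ~0.998
    # ...and evidence judged unreliable (likelihoods nearly equal) moves little.
    print(bayes_update(0.50, 0.55, 0.45))  # 0.55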


Anecdotally, when we get older our ability to assess new evidence weakens, or the weights get rusted into place.

Being friends with an immortal vampire would be a chore. "Blood tasted so much better in Sumeria", "There's nothing wrong with clay tablets for a diary, it never crashes or needs an update."


Here is an interesting tidbit I learned: the median response time in a human conversation is 0 milliseconds. In other words, more than half the time you begin responding to your conversation partner before they finish speaking. And in most cases you respond nearly instantly after they finish speaking.

Anyone who has worked on voice assistants knows that computers take much longer than that to determine that someone has finished speaking. And anyone who has studied brain function knows that it takes hundreds of milliseconds for the brain to process information like spoken words.

So basically that just proves that the brain must be predicting what the other person is going to say.


I don’t actually disagree, but ‘must’ seems too strong here. It could mean that they’ve already reached a threshold of comprehension earlier in the current speech, or that they’ve latched on to a pertinent point and don’t care to hear the rest.


My wife hates when I do exactly this. It's extremely hard to change, because it's so subconscious.


I partly blame growing up on IRC for the behaviour - my brain seems to treat all communication as asynchronous. And while I often find myself apologising for it, I also try to reassure people it's entirely because they said something I found interesting and was responding to it. It is a struggle though!


> It could mean that they’ve already reached a threshold of comprehension earlier in the current speech

I feel like that is a distinction without a difference. If they already comprehend, then haven't they basically predicted the rest?

> or that they’ve latched on to a pertinent point and don’t care to hear the rest.

This I suppose is a little different, in that they are ignoring the rest, so yeah you're right there, it may not be predictive in that case.


That's how our visual system works as well.

The visual cortex is actually providing very little information to our forward predictive models, but the model is giving a lot of information back to the visual cortex.


Apparently predicting is the most efficient way of running a brain, as opposed to reacting. Your brain is a predicting machine.

Lisa Feldman Barrett, a top neuroscientist and writer, goes over this topic in a podcast episode with Lex Fridman.

This link jumps to the chapter on the predicting brain, but the whole episode is worth a listen - mind-blowing stuff:

https://youtu.be/NbdRIVCBqNI?t=1443


Almost like a, say, fancy autocomplete.


Indeed! Maybe LLMs work in a way more similar to our brains than we think.


It could be. Maybe "understanding" is not much more than predicting what comes next with reasonable accuracy. Maybe LLMs understand better than we think.


Think back to caveman days, before actual spoken languages. You pointed and said "oooh oooh"... still communication, but you couldn't wonder about the stars in the sky because you didn't even have words for what those were, or a clue about much, because information from prior generations wasn't passed down as easily as it is today.

Entire generations' worth of information is now instantly at our fingertips whenever we want it, even with creative spins thanks to GPT models.

Language is the glue that holds up consciousness/sentience; it is intelligence. It's very hard to be intelligent without language, or without the ability to understand language.

A lot of learning deficits come from one's inability to understand the other person (a teacher, etc.), or to focus on them long enough to understand what they're saying. I think language itself could turn most things into AI, or just I; perhaps anything that can communicate and build language could be intelligent, regardless of whether it's silicon or carbon.

There's a lot of 'understanding' we glean from other senses, from what we see, etc., but we only know a tree is a tree because we've been told that (reinforcement learning), and because language exists to tell us that over and over again. So vision, audio, etc. are all still language-based.


This idea is both fascinating and terrifying in equal parts


This particular stochastic parrot has been making this argument for some time. At some point there must be a basis for consciousness.


I think that part will emerge from a (sufficiently complex) loop: projecting a model of self onto inputs, recognizing it, identifying focus, predicting one's own behavior, and then adjusting focus based on the current goal. Rinse, repeat. I assume brains do nothing else.


That could be quite... insightful. There needs to be a way to bookmark and annotate comments so you can come back in 5 years and see if they turned out to be right or not.


Thanks :) But I don't want to take credit for the idea, my intuition stems from "I Am a Strange Loop", by Douglas R. Hofstadter, so you could buy that book as a literal bookmark :)


No way, that would mean some people are just parroting back what they hear from others.


Anticipation and prediction are so important for getting a leg up in the real world. Real-world physical processes take time. You can't just wait for food to enter the mouth before revving up the digestive system.

What is Health? by Peter Sterling is a lovely book that goes into this. Peter Sterling is a neuroscientist and the "co-inventor" of the term Allostasis (from Wikipedia: "Allostasis is the efficient regulation required to prepare the body to satisfy its needs before they arise by budgeting those needed resources such as oxygen, insulin etc., as opposed to homeostasis, in which the goal is a steady state." )

https://www.goodreads.com/book/show/44512576-what-is-health


>> Anticipation and prediction are so important in getting a leg-up in the real world.

As a kid sometimes I would look around the room and pick an object to touch or pick up. Then I'd close my eyes and see if I could do it. This might involve taking several steps or even turning at the end before reaching out. There is some feedback from the feet but it's mostly open loop without vision. I could visualize my position along the way - a form of prediction.

This skill is still useful when I need to turn off a light before crossing a room or opening a door :-)


Spatial awareness and the ability to internally visualize actors in that space along with predicting their vectors is a very useful skill.


Kind of like this? [1]

I know this is a silly thing to compare but haven't we all had huge issues with security vulnerabilities involving processors guessing what's next since like 2018? [2]

I wonder if there are general parallels between any information-processing system and the idea that you can make it go faster by guessing what's next. I haven't read through the whole neuroscience article yet, but that's definitely the first thing that came to mind.

[1] https://en.wikipedia.org/wiki/Branch_predictor

[2] https://en.wikipedia.org/wiki/Spectre_(security_vulnerabilit...
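For the curious, the CPU version of "guessing what's next" can be as simple as a two-bit saturating counter per branch. A toy sketch (real predictors are far more elaborate):

    class TwoBitPredictor:
        # States 0-1 predict "not taken", states 2-3 predict "taken";
        # each actual outcome nudges the counter one step.
        def __init__(self):
            self.state = 2  # start weakly "taken"

        def predict(self):
            return self.state >= 2

        def update(self, taken):
            self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

    p = TwoBitPredictor()
    for outcome in [True, True, True, False, True, True]:  # a loop-like branch
        print(p.predict() == outcome)  # mispredicts only the rare loop exit
        p.update(outcome)

The hysteresis is the point: a single surprising outcome doesn't flip the prediction, which feels vaguely brain-like.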


Don't we all base our actions on guesstimating what the outcome will be?


My point is that this exact guessing of what might come next is universally exploitable in systems that do it.


There is something similar in Daniel Kahneman's book Thinking, Fast and Slow.

In my own words: 1) The sender asks "how many X are there?" (assuming the recipient also knows that the answer is a large quantity). The recipient's mind indirectly loads and caches the concept of a large quantity. 2) The sender changes the "discussion point" to something different that also involves a quantity; however, because the impression of the first quantity lingers, the recipient's perception may now be skewed.

Link to one such example: Studying Machine Intelligence with Been Kim - #571

https://www.youtube.com/watch?v=C8SuqeH_Mg0&t=755s


Yeah it screams established psychological marketing/behavioral economics techniques to me


Without prediction, we wouldn't have goalies.

A puck can pretty easily travel something like 30-40 m/s, and shots are regularly taken 4-5 meters from the net.

Goalies have, what, 1/10th of a second to track and evaluate the shot and make a qualified save. If that were pure evaluation and reaction, we would be far too slow.

I also think most people who've played goalie in any sport know that you get a _sense_ for where the shot is going. High, low, left, right etc so it's not just guesswork and statistics either.


Without prediction, we wouldn’t be able to _walk_!


There is a lot built into the biomechanics of your legs too. The relevant term is “passive dynamics”.


And a lot of unconscious thought for the biomechanics of the body.

https://radiolab.org/episodes/91524-where-am-i

The section on Ian Waterman.

Another article about him - https://mcneilllab.uchicago.edu/pdfs/IW_lost_body.pdf

> Mr. Ian Waterman, sometimes referred to as ‘IW’, suffered at age 19 a sudden, total deafferentation of his body from the neck down—the near total loss of all the touch, proprioception, and limb spatial position senses that tell you, without looking, where your body is and what it is doing. The loss followed a never-diagnosed fever that is believed to have set off an auto-immune reaction. The immediate behavioral effect was immobility, even though IW’s motor system was unaffected and there was no paralysis. The problem was not lack of movement per se but lack of control.

https://topdocumentaryfilms.com/the-man-who-lost-his-body/


Fascinating stuff, thanks for sharing!


>you get a _sense_ for where the shot is going. High, low, left, right etc so it's not just guesswork and statistics either.

It is guesswork and statistics. Just because it happens at the level of neurons rather than as a conscious series of mathematical steps does not change what it is. It may manifest as a "feeling", but "feeling" is just a way for the conscious mind to have a referent object for information that comes from subconscious thought processes.


I've always thought that the feeling of déjà vu is the prediction "algorithm" completing ahead of the real-world input, so it feels like you already experienced that input.


And editing the history if it guessed wrong. You were frozen in fear, stunned, unable to contemplate; the hardware abstraction is fine, it's the pilot's failure.

https://en.wikipedia.org/wiki/Saccadic_masking

https://en.wikipedia.org/wiki/Change_blindness

https://en.wikipedia.org/wiki/Blindsight


Fascinating! How much of this world I perceive is real, I wonder!


Discussed at the time:

Brains Speed Up Perception by Guessing What’s Next - https://news.ycombinator.com/item?id=19858375 - May 2019 (86 comments)


Related:

Our brain is a prediction machine that is always active (https://www.mpi.nl/news/our-brain-prediction-machine-always-...)

https://news.ycombinator.com/item?id=32395840 (270 points | 7 months ago | 205 comments)


From 2014, two articles (same study, one from a science journalism source, the other from the university's publication)

Your Brain Is a ‘Prediction Machine’: It Predicts What the Other Person Is Going to Say https://www.learning-mind.com/your-brain-is-a-prediction-mac...

You Took the Words Right Out of My Brain - https://www.nyu.edu/about/news-publications/news/2014/april/...


In the last few decades, there has been an increased interest in the role of prediction in language comprehension. The idea that people predict (i.e., context-based pre-activation of upcoming linguistic input) was deemed controversial at first. However, present-day theories of language comprehension have embraced linguistic prediction as the main reason why language processing tends to be so effortless, accurate, and efficient.

https://www.psycholinguistics.com/gerry_altmann/research/pap...

https://www.tandfonline.com/doi/pdf/10.1080/23273798.2020.18...

https://onlinelibrary.wiley.com/doi/10.1111/j.1551-6709.2009...

https://www.earth.com/news/our-brains-are-constantly-working...

People say ChatGPT is just a next-word prediction machine. But these articles say human brains are doing the same, at some level.


This is sort of like pipelining in computer processors. Since humans have quite a lot of latency, we make up for it by speculating about the near future based on our experiences. This may also explain why people act irrationally in stressful situations or when avoiding dangers and risks.


And why training for those situations - from disaster relief and surgery to the military and professional fighting - is the way it is.


This seems to fit nicely with Jeff Hawkins' "Memory-prediction framework" theory of brain function.

https://en.wikipedia.org/wiki/Memory-prediction_framework


I think this helps explain the "you can't tell people anything" problem (https://news.ycombinator.com/item?id=35282293), in that "most people aren't very good at visualizing hypotheticals, at imagining what something they haven't experienced might be like, or even what something they have experienced might be like if it were somewhat different." The new information isn't aligned with their prediction. I think this is leveraged in propaganda to move a population to believe something untrue by taking an existing context and amplifying it. Literally reinforcing *prejudice*.


Makes sense. Those who predict well are going to avoid that tiger and live another day.


I wonder if this is related to prejudices. A survival instinct.


If you've ever tried machine learning, it's easy to understand what a prejudice is: inference based on training on biased data. Or sometimes the data doesn't even need to be biased: your stereotyping may be factually true on average. It's just that we shouldn't make assumptions based on averages, especially when they involve people.


Well... I mean, if I'm going through an area known for violence and robberies and someone is following me, my prejudices might keep me alive. That's an extreme example, of course, but I think humans use their instincts every day. Calling them prejudices makes it sound negative, but I think it's just part of our survival instincts.

And just like anything else, it can be used negatively or positively. As long as we don't use our prejudices with premeditation to harm others, I think it's a survival tool. But when we use them to defame groups of people who are nowhere near us, then it can become negative.


Yes.


Ask anybody who has had psychosis - that's their prediction machine going crazy.


I think it depends. Part of it is definitely your "social predictor" turning on without being around others. But I often wonder if part of it is just thoughts - they're things made up of cells (neurons), so they are organisms in their own right (and some theories suggest they compete with one another for your attention).

Most people are desensitized to this; some psychosis is that desensitization breaking down - perceiving your thoughts as beings independent of yourself.


For me it's most noticeable when driving, especially on the Autobahn at high speed. You're constantly guessing what other drivers will do, and can react way faster that way.


The effect is also much subtler than that: you see faster what you expect to see, and hear faster what you expect to hear.


I wonder if this brain-prediction thing partially explains 'priming', e.g. with the brainstorm/green needle audio: https://www.youtube.com/watch?v=qXxV2C1ri2k


It sounded like the brain has a prior on what it perceives, i.e. something like Bayesian information processing. This seems not to be the case, at least according to this preprint: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6516078/


Now I'm worried some enterprising security researcher will find a Spectre vulnerability in my thought process



Tons of vulnerabilities already discovered https://en.wikipedia.org/wiki/Cognitive_bias


Also optical illusions.


They already have that; it's called a cult, or some might say religion (I would, but that'll offend others). I used to be religious (Mormon); now I'm agnostic but still somewhat spiritual. I don't believe in a God, just that the universe could itself be one big consciousness, or even part of a bigger organism. And I'm still inspired by people, my kids, the universe, technological accomplishments... The same feelings I got from religion, I now get from watching a meteor shower.


I think people who have played sports can attest to how much easier (and less physically demanding) it is to play defense in sports like basketball/soccer when you can see where the play is developing.


Autoregressive sequence modeling is all you need


NVIDIA DLSS for real life.


Too bad. Not a single mention of CTLoop in the whole paper.


Branch prediction?


All perception is already anticipation of the future; this is how the brain compensates for input lag.


Sounds pretty autoregressive to me.


Just like ChatGPT right? :)



