
Something I learned only recently: weather forecasts don't actually use the sensor data directly. Instead, a physically consistent model is first fitted to all the available sensor data, and the forecast is then made from the values that model produces. Doing it this way has two benefits: physically implausible sensor readings are given less weight, and the model can be sampled at regular intervals, whereas the sensors are scattered all over the place (and often moving, e.g. on aircraft, which contribute crucial data).

Of course, a higher density of sensors would lead to a better fit of the model to the real world, but there would still be no guarantee that the model reflects the measured values exactly. I found that pretty interesting.
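To make it concrete, here's a toy sketch of the idea (the sinusoid model and the Huber loss are just stand-ins for a real atmospheric model and assimilation scheme):

    import numpy as np
    from scipy.optimize import least_squares

    rng = np.random.default_rng(0)
    t_obs = np.sort(rng.uniform(0, 24, 40))          # sensors report at irregular times
    truth = 15 + 5 * np.sin(2 * np.pi * t_obs / 24)  # 'real' temperature over a day
    y_obs = truth + rng.normal(0, 0.5, t_obs.size)
    y_obs[10] = 80.0                                 # one physically implausible reading

    def residuals(p, t, y):
        # model: daily mean + sinusoidal cycle with amplitude and phase
        return (p[0] + p[1] * np.sin(2 * np.pi * t / 24 + p[2])) - y

    # robust loss -> the implausible reading barely influences the fit
    fit = least_squares(residuals, x0=[10.0, 1.0, 0.0], args=(t_obs, y_obs), loss="huber")

    t_grid = np.linspace(0, 24, 25)  # sample the fitted *model* on a regular grid
    analysis = fit.x[0] + fit.x[1] * np.sin(2 * np.pi * t_grid / 24 + fit.x[2])

The fitted curve won't pass through any reading exactly, and the bogus one barely moves it, which is the whole point.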

(And it's kind of funny to think about our own consciousness in this way, which seems to work somewhat similarly: we don't experience the actual 'sensor values', but instead we experience the output of a model our brain fits to those inputs.)


Maybe it's a new PR move to seem more approachable, by buying high and selling low like the rest of us.


Maybe it's just me (or maybe I've invested too much money into headphones), but I actually liked the Opus sound better at 6 kbps. The MLow samples had these... harsh and unnatural artifacts, whereas the Opus sound (though it sounded like it came from a tin-can-and-string telephone and lacked all top end) was at least 'smooth'. But I'm pretty sure that's because they're demonstrating the very edge of what their codec can do here; at higher bitrates the choice would probably be a lot clearer.


I think you _can_ make an LLM 'have' curiosity, for all practical intents and purposes.

I'm thinking, for example, of the 'have the LLM research a topic' task: the 'come up with suitable search terms, search the web, summarize the results, then think of potential next questions' cycle that Perplexity implements. I'm pretty sure the results would vary noticeably between an LLM that was trained to be 'curious' (i.e., to follow more unusual trains of thought) and one that wasn't, and the differences would probably compound the more freedom you give the LLM, for example by running more iterations of the 'formulate questions, search, summarize' loop sketched below.
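Roughly this shape of loop, as a hedged sketch (`llm` and `search_web` are hypothetical stand-ins, not any real API):

    # Sketch of the 'formulate questions, search, summarize' cycle.

    def llm(prompt: str) -> str:
        raise NotImplementedError("stand-in for whatever model you use")

    def search_web(query: str) -> str:
        raise NotImplementedError("stand-in for a search backend")

    def research(topic: str, iterations: int = 3) -> str:
        notes = ""
        question = f"What is known about {topic}?"
        for _ in range(iterations):
            results = search_web(llm(f"Turn this into a search query: {question}"))
            notes += "\n" + llm(f"Summarize these results with respect to {topic}:\n{results}")
            # a model tuned to be 'curious' would diverge from a vanilla one mostly here
            question = llm(
                f"Given these notes, what is the most unusual but promising "
                f"follow-up question about {topic}?\n{notes}"
            )
        return llm(f"Write a final summary of {topic} from these notes:\n{notes}")

The compounding happens through the `question` variable: a slightly more adventurous question at each iteration steers all the searches that follow.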


The problem is: how can "follow more unusual trains of thought" apply to a language model? Sure, it can selectively attend to certain parts of the input and generate based on that, but what is the internal signal for "unusual"? Any selective focus is also going to feel like Groundhog Day, since the model's weights are fixed: what was "unusual" today will still be "unusual" after the model has been exposed to it for the 1000th time!


That's a good point.

Thinking about this a bit: it might actually be too late to start guiding an LLM towards curiosity at the fine-tuning stage, since 'exploring unusual trains of thought' is precisely what the LLM _isn't_ learning during training, where it sees (basically by definition) a ton of 'usual' trains of thought. Maybe you'd have to explicitly model 'surprise' during training, to get the LLM to better fit precisely those examples that don't match its already-learned model (which would require the network to reserve some capacity for creativity/curiosity that it otherwise might not, because it isn't needed to model _most_ of what it sees). But then you enter 'if you open your mind too much, your brain might fall out' territory and could end up accidentally training QAnonGPT, and that you definitely don't want...
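As a rough sketch of what 'modeling surprise during training' could look like (plain PyTorch; the batch-softmax weighting is just one arbitrary choice, not an established recipe):

    import torch
    import torch.nn.functional as F

    def surprise_weighted_loss(logits, targets, temperature=1.0):
        # logits: (batch, seq, vocab), targets: (batch, seq)
        # per-example mean NLL = how 'surprised' the model is by each example
        per_token = F.cross_entropy(logits.transpose(1, 2), targets, reduction="none")
        per_example = per_token.mean(dim=1)
        # softmax over the batch: surprising examples get more gradient
        weights = torch.softmax(per_example.detach() / temperature, dim=0)
        return (weights * per_example).sum()

(The `detach()` matters: the weights should only select which examples to push on, not become a gradient path themselves. And of course this is exactly the knob that, turned too far, optimizes for the QAnonGPT failure mode.)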

So maybe this way of 'hoping the LLM builds up enough creative intelligence during training, which can then be guided during fine-tuning' is the best we can do at the moment.


I don't necessarily disagree with the rating of 10 here (I don't know anything about the actual impact of this vulnerability), but please note that CVSS really isn't a perfect system, and it is quite easy to reach ridiculously high CVSS scores with even minor vulnerabilities if you are 'maybe a bit too literal' in its interpretation.

The official CVSS 3.1 example score for a stored XSS is 9.0.
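The base-score arithmetic itself is public in the spec, so it's easy to see how quickly the numbers climb. A sketch of the CVSS v3.1 calculation (constants and formulas from the FIRST spec; the vector at the end is just an illustration, not the XSS example above):

    # CVSS v3.1 base score
    AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
    AC = {"L": 0.77, "H": 0.44}
    PR_U = {"N": 0.85, "L": 0.62, "H": 0.27}  # scope unchanged
    PR_C = {"N": 0.85, "L": 0.68, "H": 0.5}   # scope changed
    UI = {"N": 0.85, "R": 0.62}
    CIA = {"H": 0.56, "L": 0.22, "N": 0.0}

    def roundup(x):  # spec's Roundup: smallest one-decimal value >= x
        i = round(x * 100000)
        return i / 100000 if i % 10000 == 0 else (i // 10000 + 1) / 10

    def base_score(av, ac, pr, ui, scope, c, i, a):
        iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
        if scope == "C":
            impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
            expl = 8.22 * AV[av] * AC[ac] * PR_C[pr] * UI[ui]
        else:
            impact = 6.42 * iss
            expl = 8.22 * AV[av] * AC[ac] * PR_U[pr] * UI[ui]
        if impact <= 0:
            return 0.0
        total = impact + expl
        return roundup(min((1.08 * total if scope == "C" else total), 10))

    # e.g. network-reachable, low-complexity, scope-changed, full impact:
    print(base_score("N", "L", "N", "N", "C", "H", "H", "H"))  # 10.0

Note how 'Scope: Changed' alone adds a 1.08 multiplier and switches to the more generous impact formula; being 'a bit too literal' about scope is one of the easiest ways to inflate a score.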


'The little flap that could'...

Watching the stream and hearing the excitement of the whole team in the background honestly made me tear up a little. Congratulations!


A bit more exciting than my standard day at the office! Production outage is about as spicy as it gets…


"No dude, the bribe you offered was too much so the LLM got spooked, you need to stay in a realistic range. We've fine-tuned a local model on realistic bribe amounts sourced via Mechanical Turk to get a good starting point and then used RLMF to dial in the optimal amount by measuring task performance relative to bribe."


RLMF: Reinforcement Learning, Mother Fucker!


Great, now I have that whistling stuck in my head again.

Thanks for the reminder though, been a while since I've thought of The Wire :)


Oh, indeed.


I don't know if you know this already, but what greatly helps me when I've stood up too fast and am on the verge of 'seeing stars' is tensing my abdominal muscles, which essentially helps push blood back towards the brain.

I figured that if it's good enough for fighter pilots pulling 8g maneuvers, it should be good enough for me when I accidentally get out of bed too quickly. And it turns out it works!


That's always the problem with these pesky computers.

They do exactly what you tell them to.

