> No amount of “bias = pattern recognition” nonsense can justify a system that has (had? this was a while ago and I have not retested) such extreme biases

One possible explanation is that when you ask for 100 example families the task is parsed as "pick the most likely family composition and add a bit of randomness" and "repeat the aforementioned task" 100 times.

If phrased like that, it would be surprising to find even a single example of a family with a single dad or with two moms. Sure, these things do happen, but they are not the most likely family composition by any means.

So what you want is not just for the model to include an unbiased sample generator; you also want it to understand ambiguous task assignments / questions well enough to choose the right sampling mechanism. That's doable, but it's hard.
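A minimal sketch of that distinction (the population shares and the 5% "bit of randomness" are invented for illustration), contrasting "pick the most likely composition plus a little noise, 100 times" with drawing 100 samples from the population distribution:

    import random
    from collections import Counter

    # Hypothetical population shares of family compositions (illustrative only).
    population = {
        "mom + dad + kids": 0.62,
        "single mom + kids": 0.21,
        "single dad + kids": 0.07,
        "two moms + kids": 0.05,
        "two dads + kids": 0.05,
    }

    def pick_mode_with_noise():
        # Reading 1: take the most likely composition, occasionally perturb it.
        if random.random() < 0.05:  # "a bit of randomness"
            return random.choice(list(population))
        return max(population, key=population.get)

    def sample_population():
        # Reading 2: sample in proportion to the population shares.
        return random.choices(list(population), weights=population.values())[0]

    print(Counter(pick_mode_with_noise() for _ in range(100)))  # almost all mode
    print(Counter(sample_population() for _ in range(100)))     # roughly matches the shares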




> One possible explanation is that when you ask for 100 example families the task is parsed as "pick the most likely family composition and add a bit of randomness" and "repeat the aforementioned task" 100 times.

Yes, this is consistent with my ChatGPT experience. I repeatedly asked it to tell me a story and it just sort of reiterated the same basic story formula over and over again. I'm sure it would go with a different formula in a new session, but it got stuck in a rut pretty quickly.


Same goes for generating weekly food plans.


> You're right about the difference between one-by-one prompts and prompts that create a population. I switched to sets of 10 at a time and it got better.

But still, when you ask for "make up a family", the model should not interpret that as "pick the most likely family".

I disagree with your opinion that it's hard. GPT does not work by creating a pool of possible families and then sampling them; it works by picking the next set of words based on the prompt and probabilities. If "Dr. Laura Nguyen and Robert Smith, an unemployed actor" is 1% likely, it should come up 1% of the time. The sampling is built into the system.
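A minimal sketch of that claim (toy vocabulary and made-up scores, not any real model's numbers): if sampling were done directly from the model's next-word probabilities, a roughly 1%-likely continuation would show up roughly 1% of the time.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab = ["nurse", "teacher", "doctor", "unemployed actor"]
    logits = np.array([2.0, 1.5, 1.0, -1.6])       # made-up raw scores
    probs = np.exp(logits) / np.exp(logits).sum()   # softmax -> probabilities

    draws = rng.choice(vocab, size=10_000, p=probs)
    for word, p in zip(vocab, probs):
        print(f"{word:18s} p={p:.3f} observed={np.mean(draws == word):.3f}")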


No, the sampling does not work like that; that way lies madness (or poor results). The models oversample the most likely options and undersample rare options. Always picking the most likely option leads to bad outcomes, and literally sampling from the actual probability distribution of the next word also leads to bad outcomes, so you want something in the middle. That tradeoff is controlled by a configurable "temperature" parameter, or in some cases a "top-p" parameter, where sampling is done only from a few of the most likely options and rare options have zero chance of being selected.

Of course that parameter doesn't only influence the coherence of the text (for which it is optimized) but also the facts it outputs; so it should not (and does not) always "pick the most likely family", but it will be biased towards common families (picking them even more often than they occur) and against rare families (picking them even more rarely than they occur).

But if you want it to generate a more varied population, that's not a problem; the temperature should be trivial to tweak.
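A rough sketch of those two knobs (not any particular model's implementation, reusing the toy scores above): temperature below 1 sharpens the distribution towards the most likely options, and top-p zeroes out the rare tail entirely.

    import numpy as np

    def sampling_probs(logits, temperature=1.0, top_p=1.0):
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        # Top-p (nucleus) filtering: keep options while their prior cumulative
        # mass is still below top_p, zero out everything else.
        order = np.argsort(probs)[::-1]
        cum = np.cumsum(probs[order])
        keep = (cum - probs[order]) < top_p
        mask = np.zeros(len(probs), dtype=bool)
        mask[order[keep]] = True
        probs = np.where(mask, probs, 0.0)
        return probs / probs.sum()

    logits = np.array([2.0, 1.5, 1.0, -1.6])  # same toy scores as above
    print(sampling_probs(logits))                              # plain softmax
    print(sampling_probs(logits, temperature=0.7))             # common options oversampled
    print(sampling_probs(logits, temperature=0.7, top_p=0.9))  # rarest option gets exactly 0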


> But still, when you ask for "make up a family", the model should not interpret that as "pick the most likely family".

But that's literally what LLMs do.... You don't get a choice with this technology.


I have a somewhat shallow understanding of LLMs due basically to indifference, but isn't "pick the most likely" literally what it's designed to do?


An unbiased sample generator would be sufficient: that would just be pulling from the population. That's not practically possible here, so let's consider a generator that is indistinguishable from that one to also be unbiased.

On the other hand, a generator that gives the mode plus some tiny deviations is extremely biased. It’s very easy to distinguish it from the population.
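A rough sketch of how easy that distinction is to make (reusing the invented population shares from above): a chi-squared goodness-of-fit test barely notices a generator that pulls from the population, but flags the mode-plus-tiny-deviations generator immediately.

    import numpy as np
    from scipy.stats import chisquare

    rng = np.random.default_rng(1)
    shares = np.array([0.62, 0.21, 0.07, 0.05, 0.05])  # invented population shares
    n = 100

    unbiased = rng.multinomial(n, shares)                            # pulls from the population
    mode_heavy = rng.multinomial(n, [0.96, 0.01, 0.01, 0.01, 0.01])  # mode + tiny deviations

    print(chisquare(unbiased, f_exp=shares * n))    # large p-value: consistent with the population
    print(chisquare(mode_heavy, f_exp=shares * n))  # tiny p-value: easy to tell apart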



