
Am I the only one who's more shocked by the LLMs affirming "The distributions appear roughly normal for both genders, as shown in the visualization", "Both distributions appear approximately normal, though with some right skew" and such than by any gorilla issue?

Both from a bit of thought and from looking at the graphs, "roughly normal" sounds to me like wishful thinking, a way to stay within the reassuring bounds of normal distributions. And I believe things get dangerous once you start relying on that assumption for statistical tests and claims.

My quick reasoning: the distributions don't look close to normal on the graphs. Step counts are bounded on one side and almost unbounded on the other (they can't go below 0, but can reach a very high number on a single day). There are days / people with close to 0 steps, and others that might be distributed roughly normally around some value. Weight and height might each be normally distributed in a population, but they're correlated, and BMI is one divided by the square of the other. I can't compute the resulting distribution, but I doubt it would be close to normal.
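One way to sanity-check this kind of hunch without doing the math is a quick simulation. Below is a minimal sketch, not anything from the article: the height/weight means, standard deviations and the correlation `rho` are made-up placeholder values, the exponential draw is only a crude stand-in for a zero-bounded step count, and `skewness` and `simulate_bmi` are hypothetical helper names. Sample skewness is near 0 for a symmetric distribution such as the normal, so a value clearly away from 0 is a hint that "roughly normal" deserves a second look.

```python
import math
import random
import statistics


def skewness(xs):
    """Sample skewness: the mean of the cubed z-scores (about 0 if symmetric)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def simulate_bmi(n, rho=0.5, seed=0):
    """Draw n BMI values from correlated normal height and weight.

    All parameters are illustrative assumptions:
    height ~ N(1.70 m, 0.10), weight ~ N(70 kg, 12), correlation rho.
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        height = 1.70 + 0.10 * z1
        # Mix z1 into the weight so that corr(height, weight) == rho.
        weight = 70 + 12 * (rho * z1 + math.sqrt(1 - rho**2) * z2)
        out.append(weight / height**2)
    return out


if __name__ == "__main__":
    rng = random.Random(1)
    normal = [rng.gauss(0, 1) for _ in range(20000)]
    # Crude stand-in for daily steps: non-negative with a long right tail.
    steps_like = [rng.expovariate(1 / 6000) for _ in range(20000)]
    print("normal     skewness:", round(skewness(normal), 3))
    print("steps-like skewness:", round(skewness(steps_like), 3))
    print("BMI        skewness:", round(skewness(simulate_bmi(20000)), 3))
```

The BMI number depends entirely on the assumed parameters, so it is worth varying `rho` and the spreads before reading anything into it; with real data, a histogram or Q-Q plot would be the more direct check.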

OK, the LLMs were told to assume both traits were normally distributed, but having them affirm that the data look mostly normal is scary to me.

Am I being too picky? In real analyses, is assuming such distributions are "mostly normal" fine for all practical purposes?






Honestly, this was the meta-gorilla in the data for me! I was so busy focusing on the LLM’s EDA that I didn’t really interrogate some of the other data analysis practices.

In general, I’ve steered clear of current LLMs for data analysis/description because they seem so highly influenced by choice of prompt and wording. They tend to simply affirm any language I use to describe the data initially.

To be fair, I’ve attended conferences and lab meetings where humans will refer to any vaguely concave curved distribution as “mostly normal” :P



