GPT can't "see" the results of the scatterplot (unless prompted with an image); it only sees the code it wrote. If a human had the same constraints, I doubt they'd spot the gorilla either.
Take a screenshot of the scatterplot and feed it into multimodal GPT and it does a fine job of identifying it.
EDIT:
Sorry, as a few people pointed out, I missed the part where the author did feed a PNG into GPT. I kind of jumped to conclusions when it worked fine for me. I still maintain that the article's conclusion ("Your AI Can't See Gorillas") is overly broad, given that I had no trouble getting it to see one.
But I wonder why the author had trouble? My suspicion is that the AI got stuck on summary statistics because the previous messages in the chat were all about summary statistics.
I sent the plot to ChatGPT 4o. Here is the conversation:
what do you see
ChatGPT said:
This is a scatter plot with the variables "steps" on the x-axis and "bmi" on the y-axis. The data points are colored by "gender" (red for female and blue for male). Interestingly, the arrangement of the points appears to form a drawing resembling a cartoonish figure or character, likely added for artistic or humorous effect.
If you'd like, I can help analyze or discuss the data further!
You said:
What does the cartoonish figure look like?
ChatGPT said:
The cartoonish figure in the plot resembles a stylized gorilla or monkey-like character. It appears to be waving with one hand raised, while the other arm is resting downward. The face is expressive, with distinct eyes, a nose, and a slightly frowning mouth. The overall pose gives it a somewhat playful or cheeky vibe.
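For anyone who wants to reproduce this outside the ChatGPT UI (which is what I actually used), here's a minimal sketch of sending the plot to GPT-4o with the OpenAI Python SDK. The filename and prompt are just placeholders; it assumes OPENAI_API_KEY is set in the environment.

```python
# Minimal sketch: send a scatterplot PNG to GPT-4o and ask what it sees.
# "scatterplot.png" is a placeholder filename, not the author's actual file.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("scatterplot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "what do you see"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```

Note that this starts a fresh conversation with the image as the very first message, so there's no earlier summary-statistics context to bias the answer.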
Hm, interesting. The way I tried it was by pasting an image into Claude directly as the start of the conversation, plus a simple prompt ("What do you see here?"). It got the specific figure wrong (it thought it was Baby Yoda, lol), but it did recognize that the scatterplot contained a drawn image.
I wonder if the author got different results because they had been talking a lot about a data set before showing the image, which possibly predisposed the AI to think that it was a normal data set. In any case, I think that "Your AI Can't See Gorillas" isn't really a valid conclusion.
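If anyone wants to try the same thing against Claude programmatically rather than by pasting into the UI, a rough sketch with the Anthropic Python SDK might look like this (filename and model alias are assumptions, and it assumes ANTHROPIC_API_KEY is set):

```python
# Rough sketch: send the same PNG to Claude as the first message of a fresh conversation.
# "scatterplot.png" and the model alias are placeholders.
import base64
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("scatterplot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=500,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_b64,
                    },
                },
                {"type": "text", "text": "What do you see here?"},
            ],
        }
    ],
)
print(message.content[0].text)
```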
Please read TFA. The conclusion of the article isn't nearly so simplistic; they're just suggesting that you have to be aware of the natural strengths and weaknesses of LLMs, even multimodal ones, particularly around visual pattern recognition vs. quantitative pattern recognition.
And yes, the idea that the initial context can sometimes predispose the LLM to consider things more narrowly than a user might otherwise want is definitely well known.
The title of the article is "Your AI Can't See Gorillas". That seems demonstrably false.
The article says:
> Furthermore, their data analysis capabilities seem to focus much more on quantitative metrics and summary statistics, and less on the visual structure of the data
Again, this seems false - or, at best, misleading. I had no problem getting the AI to focus on the visual structure of the data without any tricks. A fairer statement would be "If you ask an AI a bunch of questions about summary statistics and then show it a scatterplot containing an image, it might continue to focus on summary statistics." But that's not what the concluding paragraph states, and it's not what the title states, either.
What you refer to as the article’s conclusion is in fact the article’s title. The article’s conclusion (under “Thoughts” at the end) may be well summarized by its first sentence: “As the idea of using LLMs/agents to perform different scientific and technical tasks becomes more mainstream, it will be important to understand their strengths and weaknesses.”
The conclusion is quite reasonable and the article was IMO well written. It shares details of an experiment and then provides a thoughtful analysis. I don’t believe the analysis is overly broad.