Hacker News
OpenAI – Deep Research (openai.com)
17 points by irs 6 days ago | 6 comments





Looks amazing, but the limitations section seems concerning given the nature of the tool:

“Limitations Deep research unlocks significant new capabilities, but it’s still early and has limitations. It can sometimes hallucinate facts in responses or make incorrect inferences, though at a notably lower rate than existing ChatGPT models, according to internal evaluations. It may struggle with distinguishing authoritative information from rumors, and currently shows weakness in confidence calibration, often failing to convey uncertainty accurately. At launch, there may be minor formatting errors in reports and citations, and tasks may take longer to kick off. We expect all these issues to quickly improve with more usage and time.”

Given that the point of the agent is to replace hours of manual research and factual distillation/summarizing, the risk of hallucinations and confident falsehoods in the output sort of undermines the whole shebang. If you want rock-solid data but can't afford to spend the time getting it, saving time while increasing the risk of bad data seems like a terrible trade-off.


Interesting!

Some ideas that came to mind while reading this comment:

I wonder if this will make paywalls more common?

One moat could be to gatekeep information and specifically highlight "this data cannot be found in your LLM-powered tools".

If everyone is using LLMs to solve a problem, access to information becomes a competitive asymmetry, since you can presumably solve the problem better with better data.

Someone could make a "trusted dataset" and sell it as a plug-in that you give to your LLM, so it can access higher-quality data than the generally available internet (plus all the pirated content that OpenAI allegedly trained their stuff on).
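The parent's plug-in idea could be sketched roughly like this: a tool-callable lookup over a curated dataset that carries provenance metadata, so the model can prefer verified claims. Everything here — the `Fact` type, the dataset, the `lookup` function — is hypothetical, invented purely to illustrate the shape of such a plug-in.

```python
# Hypothetical sketch of a "trusted dataset" plug-in an LLM could call as a tool.
# The dataset, field names, and matching logic are all invented for illustration.
from dataclasses import dataclass

@dataclass
class Fact:
    claim: str
    source: str    # provenance: where this claim was checked
    verified: bool

# A tiny stand-in for a curated, higher-quality-than-the-open-web dataset.
TRUSTED_FACTS = [
    Fact("Widget v2 shipped in 2023", "internal release log", True),
    Fact("Widget v3 rumored for 2025", "forum post", False),
]

def lookup(query: str, verified_only: bool = True) -> list[Fact]:
    """Return facts whose claim shares words with the query.

    With verified_only=True, rumor-grade entries are filtered out,
    which is the whole selling point of a trusted dataset.
    """
    terms = set(query.lower().split())
    hits = [f for f in TRUSTED_FACTS if terms & set(f.claim.lower().split())]
    return [f for f in hits if f.verified] if verified_only else hits
```

In practice the matching would be embedding-based retrieval rather than keyword overlap, but the design choice is the same: the vendor's value is in the curation and provenance fields, not the retrieval code.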


While this seems to be a legitimately strong technical achievement pushing the SotA, I can't help but feel that it's cheapened by the use of the word "Deep" and the way in which their "Deep research" button seems to be a UI copy of Deepseek's "DeepThink" button.

Am I right that at least from a marketing standpoint, it appears that OpenAI are now on the defensive?


I think you are right. Having somebody release a service with claims of 10x efficiency rather knocks a hole in the messaging about your super-large capex being needed to deliver things. (I initially thought the contrary, but it's been pointed out to me that you probably cannot deliver a 10x financial outcome on that investment.)

I harp on a lot on HN about the value of adjectives, and words as simile. I look at "deep" and think its value is close to "nice": it doesn't actually mean much more than "please like me".

I'd almost say the belief that this is deep is a... "hallucination".

I think if you treat "deep" as a noun, it's "my company" flagging: this is OpenAI research. Its depth is not what defines it as "deep".


Apparently, even Google used a similar term for Gemini back in December. https://techcrunch.com/2024/12/11/gemini-can-now-research-de...

I think they definitely are. To customers, there's close to zero perceptible benefit when model X scores significantly higher on some benchmark they've never heard of (or even know what it's about). So they're desperately trying to differentiate in some way.




