
That's what I find so funny. Again, UX innovation over LLMs is what makes ChatGPT so hot right now, like Hansel, but the product is tragically flawed, as are all LLMs at the moment.



I believe that’s because people are using it wrong. Asking for facts is its weakness. Aiding creativity and, more narrowly, productivity (by way of common-sense reasoning) is its greatest strength.


The product Microsoft is showing off is a fact-finding engine. Just look at the demo, they have built the AI model into the search experience and the demo shows it used exclusively to provide (supposedly) factual information [0]. It's not the users' fault that companies are building the wrong product.

[0]: https://www.youtube.com/watch?v=FLsr_sUVgrA


“Give me a meal plan for next week for my family of four that has vegetarian options and caters to those who don’t like nuts”

“Create a summary of this itinerary in an email that I can send to my family”

“65-inch TV” -> refine with “Which is the best for gaming?”

Seems like more than a fact-finding engine.


I don’t exclude Microsoft from the audience of “using it wrong” here.

It’s also very tempting, when you get coherent text out, to believe it. Hopefully the underlying tech will get better and/or people will understand its weaknesses… except the inability to ascertain clear misinformation gives me pause.


I think there's also potential value in being able to give feedback on results. If I try to search for something on Google right now and it doesn't give me what I want, my only options are to try a different query or give up. This puts the onus on me to learn how to "ask" properly. On the other hand, using something like ChatGPT and asking it a question gives me the option to tell it "no, you got this part wrong, try again". This isn't necessarily useful for all queries, but some queries might have answers that you can verify easily.

Over the weekend, I was shopping for laptops and tried searching "laptops AMD GPU at least 2560x1440 resolution at least 16 GB RAM", and of course Google gave all sorts of results that didn't fit those criteria. I could use quotes around "16 GB RAM", but then some useful results might get excluded (e.g. a table with "RAM" or even "Memory" in one column and "16 GB" in another, or a laptop with a higher resolution like 4K), and I'd still get many incorrect results (e.g. an Amazon page for a laptop with 1920x1080 resolution and then a different laptop in "similar options" with 2560x1440 resolution but an Nvidia GPU). I decided to try using ChatGPT to list me some laptops with those criteria; it immediately listed five correct models. I asked for five more, and it gave one correct option and four incorrect ones, but when I pointed out the mistakes and asked for 10 more results that did fit my criteria, it was able to correctly do this. Because I can easily verify externally if a given laptop fits my criteria or not, I'm not at risk of acting on false information. The only limitation is that ChatGPT currently won't search the internet and has data limited to 2021 and earlier. If it had access to current data, I think there would be a lot of places that it would be useful, especially given that it wouldn't necessarily replace existing search engines, but complement them.


I would argue this would be better done by Google or someone else specializing in faceted search over structured data. GPT may smooth over results that are coded as near misses (e.g. USB vs. USB3), but as you said, it gave you nearly half incorrect results. There are also ways, with Toolformer, that it could call the right APIs and maybe interpret the data, but as-is, LLMs aren’t the right tech for fetching data like this.
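For contrast, the laptop query from the parent comment maps cleanly onto faceted filtering over structured data, where every returned result is guaranteed to match the criteria. A minimal sketch, with a hypothetical catalog and field names (`gpu`, `res`, `ram_gb` are assumptions for illustration):

```python
# Hypothetical laptop catalog; field names and values are made up for illustration.
laptops = [
    {"model": "A14", "gpu": "AMD",    "res": (2560, 1600), "ram_gb": 16},
    {"model": "B15", "gpu": "Nvidia", "res": (2560, 1440), "ram_gb": 16},
    {"model": "C13", "gpu": "AMD",    "res": (1920, 1080), "ram_gb": 32},
    {"model": "D16", "gpu": "AMD",    "res": (3840, 2160), "ram_gb": 32},
]

def facet_filter(items, gpu=None, min_res=None, min_ram_gb=None):
    """Return items matching every supplied facet; omitted facets match everything."""
    out = []
    for it in items:
        if gpu is not None and it["gpu"] != gpu:
            continue
        if min_res is not None and (it["res"][0] < min_res[0] or it["res"][1] < min_res[1]):
            continue
        if min_ram_gb is not None and it["ram_gb"] < min_ram_gb:
            continue
        out.append(it)
    return out

# "AMD GPU, at least 2560x1440, at least 16 GB RAM" expressed as structured facets:
matches = facet_filter(laptops, gpu="AMD", min_res=(2560, 1440), min_ram_gb=16)
print([m["model"] for m in matches])  # only exact matches; no near misses slip through
```

Unlike an LLM generating candidate models from memory, this kind of query cannot return a 1920x1080 panel or an Nvidia GPU by accident; the trade-off is that someone has to maintain the structured catalog and map natural-language queries onto the facets.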


Most of OP's dissatisfaction with shopping for laptops on Google stems from query-understanding failures. Google needs to understand that the challenge is for them to evolve their UX so that users can intuitively tune their searches in a natural manner.

A pure LLM approach is going to quickly lose its novelty, as reprompting is frustrating and it is difficult to maintain a long-lived conversation context that actually 'learns' how to interact with the user (as a person would).


Maybe, but I think the point of all this discussion is that Google _hasn't_ done something like this. It's not an unreasonable take that their lack of progress on this front is exactly why solutions like this are noticeable improvements in the first place. Sure, Bing AI isn't better than Google with ChatGPT, but the fact that it's a discussion at all is a sign of how far Google has fallen; if we're setting the bar at the same place for both Microsoft and Google for search products, then Google has already lost its lead, and that's a story on its own.


Agreed, for areas where sub-90% correctness is acceptable. But when a BA is making business docs, do they want creativity in querying factual data for a report they are creating? How tempting is it to not just stick to "creative" tasks?


Have you used it?



