
People constantly recommend Bard to me, but it returns false and/or misleading info almost every time



I had the opposite experience. I was trying to figure out what a weird lightbulb I had to replace was, and it had no writing on it. I uploaded the description and picture to GPT-4, and it just got it clearly wrong over and over. Tried Bard, got it right on the first try, with links to the product. I was extremely impressed.


I had a similar experience with Google lens recently. I've gotten used to Yandex image search being better than Google's for many searches, but I needed to figure out what model of faucet I had on my sink, and Google nailed it. My hunch is that because of all the work they did on Google shopping gathering labeled product images and things like that, they have an excellent internal data set when it comes to things like your light bulb and my sink.


I use Google Lens at least once a week to find where I can buy a certain jacket, shoes, etc. that I see. It is one of the only 'AI' products whose results I can say I trust.


The vast knowledge trove of Google can't be overstated, even if the model is sometimes less competent at certain tasks than OpenAI's GPT models.


If there's one thing that's becoming clear in the open source LLM world, it's that the dataset really is the 'secret sauce' for LLMs. There are endless combinations of various datasets plus foundation model plus training approach, and by far the key determinant of end model performance seems to be the dataset used.


> it's that the dataset really is the 'secret sauce'

alwayshasbeen.jpg

There have been articles about how "data is the new oil" for nearly two decades now, with the first reference I could find being from British mathematician Clive Humby in 2006 [0]. The fact that it rings even more true in the age of LLMs is just another transformation of the fundamental data underneath.

[0] https://en.wikipedia.org/wiki/Clive_Humby#cite_ref-10



> There have been articles about how "data is the new oil" for a couple of decades now, with the first reference I could find being from British mathematician Clive Humby in 2006

I am specifically referring to the phrase I quoted, not some more abstract sentiment.


The best answer to this is https://www.youtube.com/watch?v=ab6GyR_5N6c :)


Wasn't there a comment on HN just today saying Google had an institutional reluctance to use certain datasets like Libgen? I honestly don't think Google used everything they had to train their LLM.

https://news.ycombinator.com/item?id=38194107


This is almost certainly just delegating to Google Lens, which indeed works great.

Bard's probably just a middle man here.


I would think that ChatGPT is also delegating to some other subsystem.


Right, a glance at the new Assistants API docs, which seem to mirror ChatGPT in functionality, suggests that the "Assistant" determines which tool to use, or which model (code or chat) to use, to generate a response message. The API is limited in that it can't use Vision or generate images, but I imagine those are just "Tools" the assistant has access to as well.
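
For what it's worth, this is roughly what the flow looks like with the beta Python SDK. This is a minimal sketch based on my reading of the docs, so the model name and exact parameters may differ:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # The assistant is configured with the tools it is allowed to delegate to.
    assistant = client.beta.assistants.create(
        name="Helper",
        instructions="Answer questions, running code when it helps.",
        tools=[{"type": "code_interpreter"}],  # retrieval/function tools are added the same way
        model="gpt-4-1106-preview",
    )

    # Conversations live in threads; on each run the assistant decides which tool to invoke.
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content="Plot sin(x) from 0 to 2*pi."
    )
    run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)

You hand it a list of tools and it picks when to call them, so Vision and image generation being internal "Tools" in ChatGPT would fit that pattern.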


with a poorer dataset than google


Unless wired to Bing, probably not?


I mean, was a GPT even involved, really? If you gave Lens alone a shot, I'm sure it also would've picked it up and given you a link to some page featuring it.


Ah, that's interesting, I didn't know Bard had multimodal inputs.


I asked Bard the question "Who is Elon Musk?"

The response: "I'm a text-based AI, and that is outside of my capabilities."


That's interesting. Did you have anything else in the thread, or was it just the first question?

For me it returns a seemingly accurate answer [1], albeit missing his involvement with Twitter/X. But LLMs are intrinsically stochastic, so YMMV.

[1] https://g.co/bard/share/378c65b56aea


It was the first question in the thread, and I've been testing queries along these lines on it for a while. Interestingly, it started writing a response initially and then replaced it with that. It used to refuse to answer who Donald Trump is as well, but it seems that one has been fixed.

Another interesting line of inquiry (potentially revealing some biases) is to ask it whether someone is a supervillain. For certain people it will rule it out entirely, and for others it will tend to entertain the possibility by outlining reasons why they might be a supervillain, and adding something like "it is impossible to say definitively whether he is a supervillain" at the end.


That's because these GPTs are trained to complete text in human language, but unfortunately the training data set includes human language + human culture.

I really think they need to train on the wider dataset first, then fine-tune on a machine-specific dataset, so that the model can reference data sources rather than have them baked in (rough sketch of what I mean at the end of this comment).

A lot of the general-purposeness, but also the way it sometimes says weird things and makes oddly specific references, is pretty much down to this, I reckon: it's trained on globs of human data from people in all walks of life, with every kind of opinion there is, so it doesn't really result in a clean model.
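
To sketch what I mean by "reference data sources rather than have them baked in": something in the spirit of retrieval-augmented generation, where you look the facts up at answer time and only ask the model to phrase the response. Toy code, and the retrieve() ranking is a naive keyword-overlap stand-in rather than a real search index:

    # Toy sketch: fetch relevant passages at answer time instead of relying on
    # whatever the model memorised during training.
    def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
        words = set(query.lower().split())
        # Rank documents by naive keyword overlap (stand-in for a real index).
        ranked = sorted(corpus, key=lambda doc: -len(words & set(doc.lower().split())))
        return ranked[:k]

    def build_prompt(question: str, corpus: list[str]) -> str:
        context = "\n".join(retrieve(question, corpus))
        return (
            "Answer using only the sources below; say 'not found' otherwise.\n\n"
            f"Sources:\n{context}\n\nQuestion: {question}"
        )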


Sure, but it's been known for a long time that models can have these types of weaknesses & biases. They can be tested for & ironed out.

If you ask the same questions to ChatGPT you tend to get much more refined answers.


True, but I think the learning methods are similar enough to how we learn, for the most part, and the theory that people are products of their environments really does hold true (although humans can constantly adjust and overcome biases, etc., if they are willing to).

Ironing out is definitely the part where they're tweaking the model after the fact, but I wonder if we don't still need to separate language from culture.

It could help, really, since we want a model that can speak a language and then apply a local culture on top. There have already been all sorts of issues arising from the current way of doing it; the Internet is very American/English-centric, and therefore most models are the same.


So Elon is the engine behind Bard?


What makes you think ChatGPT isn't also returning false and/or misleading info? Maybe you just haven't noticed...

Personally, I struggle with anything even slightly technical from all of the current LLMs. You really have to know enough about the topic to detect BS when you see it... which is a significant problem for those using them as a learning tool.


This is my problem with chatgpt and why I won't use it; I've seen it confidently return incorrect information enough times that I just cannot trust it.


The version with search will give you links to the references it bases its answers on.


You still have to go read the references and comprehend the material to determine if the GPT answer was correct or not.

I don't know the name for the effect, but it's similar to when you listen/watch the news. When the news is about a topic you know an awful lot about, it's plainly obvious how wrong they are. Yet... when you know little about the topic, you just trust what you hear even though they're as likely to be wrong about that topic as well.

The problem is people (myself included) try to use GPT as a guided research/learning tool, but it's filled with constant BS. When you don't know much about the topic, you're not going to understand what is BS and what is not.


In my particular case, the fact that it returns bullshit is kind of useful.

Obviously they need to fix that for realistic usage, but I use it as a studying technique. Usually when I ask it to give me some detailed information about stuff that I know a bit about, it will get some details about it wrong. Then I will argue with it until it admits that it was mistaken.

Why is this useful? Because it gets "just close enough to right" that it can be an excellent study technique. It forces me to think about why it's wrong, how to explain why it's wrong, and how to utilize research papers to get a better understanding.


> You still have to go read the references and comprehend the material

Like...it always has been?


How many will actually do that when presented with convincing, accurate-sounding information?

There's the problem... and it defeats the entire purpose of using a tool like GPT.


The Gell-Mann Amnesia Effect


I get information from unreliable sources all the time. In fact all my sources are unreliable.

I just ignore how confident ChatGPT sounds.


True, it often returns solutions that may work but are illogical, or solutions that use tutorial-style code and fall apart once you tinker with them a bit.


GPT-4 has the lowest hallucination rate:

> OpenAI’s technologies had the lowest rate, around 3 percent. Systems from Meta, which owns Facebook and Instagram, hovered around 5 percent. The Claude 2 system offered by Anthropic, an OpenAI rival also based in San Francisco, topped 8 percent. A Google system, Palm chat, had the highest rate at 27 percent.

https://www.nytimes.com/2023/11/06/technology/chatbots-hallu...


I would wonder how they're measuring this, and I'd suspect it varies a lot depending on field.


I just skimmed through the article; it seems the numbers are quoted from a company called Vectara. It would be interesting to see how they arrive at this estimate.


Sounds like Bard is catching up with ChatGPT in features then.


I noticed that too, it just doesn't seem that good


Bard is weird. It started embedding these weird AI generated images inline alongside text responses, making it very hard to read because of fragmentation. Does anyone know how to turn it off?



