Looks a bit like snake oil to me. A lot of companies are now spinning up simple demos with opaque backends, making huge claims that they’ve solved X hard problem for/with AI, then saying “trust us” and “join our waitlist” without hard details or facts to show for it. If you could detect hallucinations/biases etc. that easily, don’t you think OpenAI would’ve worked on something like this?
> don’t you think OpenAI would’ve worked on something like this?
Along this line of thought: was it a massive oversight for them to not train the model to say "math detected, let me pass that to a solver" instead of trying to guess what token should come next in a math problem?
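Even something as crude as this kind of routing layer would get partway there (a toy sketch; the regex and the llm_complete hook are invented, obviously nothing like what OpenAI actually ships):

    # Toy sketch of "math detected, pass it to a solver"; everything here is hypothetical.
    import re

    def answer(prompt, llm_complete):
        # Naive detection: the prompt is nothing but an arithmetic expression.
        if re.fullmatch(r"[0-9\s.+\-*/()]+", prompt.strip()):
            try:
                # Stand-in "solver"; a real system would call a proper CAS or calculator.
                return str(eval(prompt, {"__builtins__": {}}, {}))
            except Exception:
                pass  # fall back to the LLM if the expression doesn't parse
        return llm_complete(prompt)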
Why is that a killer feature? Humans are quite good at asking different people different questions. If I need to do a simple math problem I'll just prefix "calculate" and pop it into Google, whereas if I want an intro to a named thing I'll prefix "wikipedia". That's not hard.
GPT is quite useful, but not because it solves the problem of "I don't know whether the question I have is answerable by a calculator"
I think part of the problem is that it's technically correct to say "my product does X" even if it does X extremely poorly. I'm not sure if this can be changed because any line for "does-X vs does-not-adequately-do-X" is going to necessarily be subjective.
So personally I think the problem is that people see "this product does X" and interpret that to mean that it does X well. I don't think it's necessarily bad that we're seeing an explosion of AI tools that are a bit underwhelming if people understood it as such -- we're on, after all, a site with a heavy startup focus and saying "your product doesn't do everything that I want" is a bit antithetical to that.
But yeah specifically for this one there are arguments that "X is not even possible, especially not with this approach" so it's a bit more egregious.
This isn't new; it's just more obvious with this tech. Every sales team at nearly every company has been performing this dance for like hundreds of years.
It's good to have third parties (apart from OpenAI) that assess the quality of OpenAI results. That's how audits work: they have to be independent...
Also, third parties are essential to compare the results from ChatGPT with the results of other LLMs. These are important checks to assess the robustness of OpenAI results!
I can't help but notice your accounts only activity before this post was praising another giskard.ai submission a few months ago. Anything you'd like to disclose?
He didn't say it's not important. He is just pointing out that black-box third party verification is not worth much when you can't independently verify the verifiers.
The OP's point is that it’s likely impossible to do what is claimed here in general. Imagine the LLM states something like Fermat’s Last Theorem. To verify it, you’d have to either 1) have a proof assistant powerful enough to construct a proof, or 2) use a second ML model to guess truthfulness. The former is technically challenging and the latter is just another model, with its own biases and factual inconsistencies.
And for a large swath of things, how can it possibly work? It’s not possible to say whether or not it is hallucinating code for almost all code and APIs, for instance. And I see similar issues with many fields outside pure facts, with privacy issues as well.
It would appear that this is not automated monitoring but more like a second stage of human reinforcement learning or perhaps a classifier. It seems that you create input/output examples and the LLM responses are examined by a secondary system (which I’m guessing is probably NOT an LLM, otherwise it would be vulnerable to attacks) and perhaps force regenerates the LLM response if it doesn’t meet the classification threshold.
At least, that sounds more believable to me than someone claiming they’ve fixed the inherent flaws in LLMs.
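Concretely, I'd guess something shaped roughly like this (pure speculation about their setup; every name here is invented):

    # Pure speculation about the architecture described above; all names are invented.

    def guarded_reply(prompt, llm_generate, classify, threshold=0.8, max_retries=3):
        """Keep regenerating the LLM response until a secondary classifier scores it above a threshold."""
        response = llm_generate(prompt)
        for _ in range(max_retries):
            if classify(prompt, response) >= threshold:  # e.g. a factuality/safety score
                return response
            response = llm_generate(prompt)  # force a regeneration and re-check
        return None  # give up and flag for human review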
We are a team of engineers & researchers working on AI Alignment & Safety. We're investigating multiple methods, including metamorphic testing, human feedback, benchmarks with external data sources, and LLM explainability methods.
Currently, fact checking works on straight facts. It does a Google Search and uses LLMs to shorten it. Once it has the short version, it compares the short results with the answer provided by ChatGPT itself. Premium tiers would get better fact-checking sources than just Google. We're investigating various data sources and comparison methods.
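In pseudocode, that flow is roughly the following (a simplified sketch; the helper names are illustrative, not our actual API):

    # Simplified sketch of the flow described above; helper names are illustrative only.

    def fact_check(question, chatgpt_answer, web_search, llm_summarize, llm_compare):
        snippets = web_search(question)                # e.g. top Google results for the question
        reference = llm_summarize(question, snippets)  # condense them into a short reference answer
        return llm_compare(chatgpt_answer, reference)  # True if the two answers agree, False otherwise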
Note that fact checking / hallucinations is just one of the types of safety issues we'd like to tackle. Many of these are still open questions in the research community, so we're looking to build and develop the right methods for the right problems. We also think it's super important to have independent third-party evaluations to make sure these models are safe.
This is a new tool we're building in the open, and we're interested in your feedback to prioritize!
> Currently, fact checking works on straight facts.
Wow, you guys have a database of all the facts?
> It does a Google Search and uses LLMs to shorten it.
Oh...
...actually, this is an empirical fact checker. I wouldn't call it "fact-based", as that's an epistemologically absurd statement, but "empirical fact checking" sounds good and presents an idea that is very close to how humans verify information in the first place - by checking multiple sources and looking for correlation.
For what it's worth, I think your approach makes sense. Good luck.
> Currently, fact checking works on straight facts. It does a Google Search and uses LLMs to shorten it.
So your fact-checking LLM is also vulnerable to injection and unethical prompting then when it ingests website text. And a Google search is far, far away from fact checking, particularly for the subtle errors that GPT-4 is prone to making.
Someone I know thought that LLaMA was unbiased because they 'read' the paper and clearly didn't know what anything meant. A great example of "a little knowledge is a dangerous thing".
I've thought about it for four seconds now and I still agree with him. Maybe if you shared an example instead of an unsubstantiated put-down it would help.
It doesn't. Mean anything, I mean. Language isn't well defined, so its accuracy is also undefined and a non-uniform deviation from that accuracy is super undefined.
This is like saying, "I've developed a new compass for a deep space probe to help it find North!"
Our society is actively declaring that falsehoods are truth, and should be celebrated. We're hallucinating ourselves. All this software does is make sure LLMs hallucinate with us.
Before long, we could end up with left-leaning and right-leaning AIs autonomously fighting the 'culture war' over social media, much more advanced than simple bots spamming copy+paste comments.
Combined with ever-improving ways to fake video and voices, things could get even uglier than they've been over the last few years.
Well I think that its hallucinations are a good demonstration that none of the post-truth subjectivist philosophies were ever things that many people took very seriously. ChatGPT is the real thing as it relates to extreme relativism: it really cannot tell the difference between true and false and doesn't really care, either. By contrast, the apostles of post-truth were really only trolling for a response... no one really lives by its principles because it's so impractical and disastrous. To really believe in post-truth, you must perceive no difference between raisin bran with cyanide and raisin bran.
It's impossible to answer this without getting political. Instead, let's just say every previous generation has been critically wrong about some things. Statistically, we're unlikely to be the outlier.
Some of the ugliest episodes in human history were caused by people who believed their political positions were not political positions, but unarguable statements of the True and Good.
Eeehhh? I'm not sure truth* exists, but there are things that we accept as true and things that are so fundamental that it doesn't occur to us to question them - these things are inherently political. Just to be clear - I'm not using that word to refer to the specific species of polarized discourse that we got in the States; I'm talking about the nature of power and the human condition.
Curious what you consider to be true, though? I'm coming at it from the perspective that even in physics, where we can isolate so nicely, we still aren't divining any truths, just making models with increasing explanatory power.
Personally, I've been reaching more towards 'shared values' than 'truth'. This is likely the pedant in me, but truth doesn't feel tractable, whereas shared values feels like it has less baggage?
Does shared values here just mean definitions? Such as the number of carbon atoms in a mole, 5+9 in base 10, the average number of protons in a carbon atom is a specific value, and leptons exist?
No, shared values is referring to the moral/emotional stuff, I find it more useful when trying to bridge the gap in a pretty politically charged environment to reconnect on simple things like wanting other people to be happy and healthy.
Are those true things? Good candidates, I like 'leptons exist'. Do you mind if we just gently ignore the math one? Feels like inviting the whole 'is math invented or discovered' thing.
1) carbon atoms in a mol - a mol is a counting number so it seems tautological to declare this one a truth
2) pass :)
3) this seems like a good candidate, but it also seems to reduce truth to just the things we measure, and only to the extent that we can be accurate (I'm also assuming you meant neutrons; protons are fixed by species). Purely hypothetically, there could be a whole heap of unusually heavy or light carbon out there that would disprove one or another of our theories. To put it another way: is the average number of apples that a tree grows in a year 'true'? It'll change year after year, after all. I'm fine with a definition of truth that implies error bars and best efforts, but I feel it falls short of the colloquial definition.
4) I think the pure observation that a thing somewhere exists is probably the closest to true, the rebuttals against that would all be self consuming anyway. The specific claim that leptons exist seems a little more fraught though - we could conceivably come to another conclusion if that better fit the facts.
So, can we call these things true if our concept is potentially incomplete or incorrect?
I’m a little confused by the statement “truth isn’t political”. This kind of goes against what I understand politics to be, which is the negotiation of a broader societal trend, which doesn’t itself have to do with whether or not the societal trend has a factual basis. The truth may be that cigarettes cause cancer, but the politics are obviously that acknowledging this would encourage society to implement top-down policies to limit cigarette use. In this way, “cigarettes are carcinogenic” is a truth with significant political weight, which is what I understand a political truth to mean. Is my understanding different from yours?
If I'm understanding the parent comment correctly: a fact may have political implications but it doesn't depend on politics. In other words reality is independent of our interpretation of it (i.e. philosophical realism). The rub of course being that coming to know facts about most things is a highly social process filtered through interpretation and biases. Everything can be political if it needs to be decided upon by a group.
EDIT: I have avoided using "truth" here because it's a more general term than "fact" which has the connotation of being in reference to something concrete.
Generally when people talk like this nowadays they mean trans people, or the LGBT community in general. Sometimes Jews, though those types don't say that part out loud on HN too often.
There were lots of good examples during COVID. Remember that you don't need to wear masks, because washing your hands is enough (and we need to save the masks for doctors, but we're afraid to say that, because it will cause a run on masks). Remember that staying 6 feet apart is a magic distance over which COVID cannot cross (or maybe 1 meter, truth depends on the country you live in). Remember that you can't eat inside a restaurant, but you can eat outside, and it's okay for the restaurant to build partial walls around their outdoor spaces to make them more comfortable. Remember that COVID definitely could not have come from a lab leak, and it's racist to even suggest it might have happened. Never mind that the scientist who started the anti-lab-leak open letter was himself heavily funded for GoF research, and he refused to sign his own letter for political reasons.
I don't claim to know the answers to all of the questions (and I certainly don't know where COVID came from), but clearly there are plenty of cases where dubious statements were strongly enshrined as "True" in a way that required major online players to suppress alternative beliefs as "False".
A big difficulty is the conflation of fact with judgement. 'Vaccines work', 'masks don't work', 'a lab leak is impossible', etc. are judgements, not facts. They are not even hypotheses, in that there is no clear criteria by which they can be falsified. Hence fact-checking presents obvious problems, as in practice it will be judgement-checking.
These are the kinds of things I can see taking off, for better or worse. I know Adobe's product is worse than Midjourney's, for example, but once the hype meets reality, companies are going to want to be safe when they start using AI formally.
This is exciting to see, as I am concerned about hallucinations, biases, privacy, licensing, and other such issues. I imagine the results are minimal at the moment, but perhaps soon they will be useful.
What does it even mean to detect hallucinations? The AI doesn't say things that are trivially false. While using GPT-4 I have observed that it lies about simple things I didn't expect it to, while it does very well on complex things.
TL;DR: It lies about fact-based information that is mentioned in very, very few places on the internet and not repeated much. Short of having a human with the context, how do you even detect that?
Example: Ask it to describe a "Will and Grace" episode with some guest appearance. It will always make up everything, including the episode number and the plot, and the plot seems very believable. If you have not watched the show and can't find a summary online, it is hard to say that it is a lie.
There are many ways to detect hallucinations. Basically, either you have the ground truth answers in an external database, in which case you compare against the ground truths; or you don’t have the ground truth, in which case you need to do metamorphic testing. See this article on it: https://www.giskard.ai/knowledge/how-to-test-ml-models-4-met...
But GPT-4 doesn't hallucinate on things which are popular enough to be replicated enough times on the web as knowledge. It hallucinates on things which are much less likely to be repeated many times. That rules out an external database with true answers, unless the external database is supposed to contain all info queryable in all ways, in which case the database is just a better version of GPT-X.
The metamorphic testing approach is interesting and might work.
I've been playing with GPT4 summarization of hard knowledge that has an external database with true answers that GPT knows about, and it's still hallucinating regularly.
Metamorphic testing seems to try to map an output of a model to a ground truth, which I guess is great if you have a database of all the known truths in the universe.
Not exactly: metamorphic testing does not need an oracle. That’s actually the reason for its popularity in ML testing. It works by perturbing the input in a way that will produce a predictable variation of the output (or possibly no variation).
Take for example a credit scoring model: you can reasonably expect that if you increase the liquidity, the credit score should not decrease. In general it is relatively easy to come up with a set of assumptions on the effect of perturbation, which allows evaluating the robustness of a model without knowing the exact ground truth.
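A minimal sketch of such a test, with a hypothetical model and hypothetical feature names:

    # Sketch of a metamorphic test for the credit-scoring example; model and feature names are hypothetical.

    def test_liquidity_monotonicity(model, applicants, bump=10_000):
        """Metamorphic relation: increasing liquidity should never decrease the predicted score."""
        failures = []
        for applicant in applicants:
            perturbed = {**applicant, "liquidity": applicant["liquidity"] + bump}
            if model.predict(perturbed) < model.predict(applicant):
                failures.append(applicant)
        return failures  # empty list means the relation holds on this sample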
That is beside the point. My point is that detecting hallucinations seems like a very very hard problem.
The utility of it is there, and it has nothing to do with making up episodes instead of quoting the actual one. You can ask it to write new episodes with specific settings and specific constraints, for instance. Hallucination is not the value add; nobody is excited because it hallucinates. People are excited despite it, because the value it adds elsewhere is too great.
But deliberately requesting and receiving content generation is altogether different from requesting a factual answer and receiving plausible-seeming nonsense. Or at least, it's different to the person asking; it's the same thing as far as the model is concerned.