
I think all of these can be summarized into three items:

1. Search engine - Once upon a time, words like "teach" or "learn" were slapped onto Google. One genuinely great thing about LLMs here is that they save time. The internet these days is unbelievably crappy and choppy; it often takes more time to click through to the first Google result and read it than to simply ask an LLM and wait for its slowish answer.

2. Pattern matching and analysis - LLMs are probably the most advanced technology for recognizing well-known facts and patterns in text, but they make quite a few errors, especially with numbers. I believe that a properly fine-tuned small LLM would easily beat gigantic models for this purpose.
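To make the "pattern matching with unreliable numbers" point concrete, here is a toy sketch (my own illustration, not from the comment) of the kind of narrow extraction task meant here: pulling dollar amounts out of free text. For a task this constrained, a deterministic baseline or a small fine-tuned model competes well with a giant general model, and it never mistranscribes a digit:

```python
import re

def extract_amounts(text):
    """Toy baseline: pull dollar amounts from free text with a regex.
    Illustrates a narrow extraction task where a specialized tool
    (or a small fine-tuned model) is reliable on numbers."""
    pattern = r"\$\s?(\d{1,3}(?:,\d{3})*(?:\.\d+)?)"
    # Strip thousands separators so the values are machine-usable.
    return [m.replace(",", "") for m in re.findall(pattern, text)]

print(extract_amounts("Invoice total: $1,234.50, shipping $12."))
# -> ['1234.50', '12']
```

Obviously a regex is not an LLM; the point is only that narrow, well-specified extraction is where small specialized systems shine.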

3. Interleaving knowledge - this is the biggest punch that LLMs pack, and also the main source of the over-hype (which does still exist). They can produce something valuable by synthesizing multiple facts, like writing complex answers and programs. But this is also where hallucination happens most frequently, so it's critical to review the output carefully.




On number 3:

The problem is that AI is being sold to multiple industries as the cure for their data woes.

I work in education, and every piece of software now has AI insights added. Multiple companies are selling their version as hallucination free.

The problem is that the data sets they evaluate are so large and complicated for a college that there is literally no way for humans to verify the insights.

It's actually kind of scary. Choices are being made about the futures of real people based on trust in new software.


My experience is that LLMs can't actually do 3 at all. The intersection of knowledge has to already be in the training data; they hallucinate if the intersection is original. That is exactly what one should expect, though, given the architecture.


Super interested in hearing more about why you think this:

> I believe that a properly fine-tuned small LLM would easily beat gigantic models for this purpose.

I've long felt that vertical search engines should be able to beat the pants off Google. I even built one (years ago) to search for manufacturing suppliers that was, IMO, superior to Google's. But the only way I could get traffic or monetize was as middleware to clean up Google, in a sense.
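The core of a vertical search engine like the one described can be sketched in a few lines (my own toy illustration; the corpus, names, and AND-query semantics are all made up): a hand-curated, domain-specific corpus plus an inverted index, which is exactly the kind of narrow, high-precision retrieval a general engine struggles to match:

```python
from collections import defaultdict

# Hypothetical supplier corpus: doc id -> free-text description.
DOCS = {
    "acme": "cnc machining aluminum prototypes rapid turnaround",
    "boltco": "fasteners bolts stainless steel bulk supplier",
    "castify": "aluminum die casting high volume supplier",
}

def build_index(docs):
    """Build an inverted index: token -> set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[token].add(doc_id)
    return index

def search(index, query):
    """AND semantics: return docs containing every query term."""
    sets = [index.get(tok, set()) for tok in query.split()]
    return sorted(set.intersection(*sets)) if sets else []

INDEX = build_index(DOCS)
print(search(INDEX, "aluminum supplier"))
# -> ['castify']
```

A real vertical engine adds ranking, synonyms, and structured fields (certifications, lead times), but the moat is the curated corpus, not the index.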



