Good old-fashioned AI remains viable in spite of the rise of LLMs (techcrunch.com)
82 points by webmaven 10 months ago | 44 comments



As with most software development, modern AI work is all about knowing your tools and when it's appropriate to use them. If you have tabular data, even good LLMs will have trouble beating a gradient-boosted tree algorithm, but if you're working with anything involving text data, you'll save yourself a lot of hassle and likely get better results if you go directly to pretrained-LLM-generated text embeddings.
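
Something like this, as a rough sketch of the two paths (assuming scikit-learn and sentence-transformers are installed; the toy data and the "all-MiniLM-L6-v2" model are just placeholders):

    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sentence_transformers import SentenceTransformer

    # Tabular data: a gradient-boosted tree model is a strong default.
    X_tab = [[23, 1.0], [31, 0.2], [45, 0.9], [52, 0.1]]   # toy numeric features
    y_tab = [0, 1, 0, 1]
    tab_clf = GradientBoostingClassifier().fit(X_tab, y_tab)

    # Text data: pretrained embeddings plus any simple classifier on top.
    texts = ["refund my order", "love this product", "item never arrived", "works great"]
    labels = [1, 0, 1, 0]
    embedder = SentenceTransformer("all-MiniLM-L6-v2")     # example embedding model
    emb = embedder.encode(texts)                           # (n_docs, dim) array
    txt_clf = LogisticRegression(max_iter=1000).fit(emb, labels)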


"As with most software development, modern AI work is all about knowing your tools and when it's appropriate to use them." 100% agree. Ditto with just using the easiest to access models as initial proof-of-concept/dev/etc. to get started.

(I do agree with the overall sentiment of the TC article, although as noted by others below, there's some mashing of terminology in the article. E.g., I, too, associate GOFAI with symbolic AI and planning.)

There's another dimension, too, not mentioned in the article: Even with general-purpose LLMs, for production applications you still need labeled data to produce uncertainty estimates. (There's a sense in which any well-defined and tested production application is a 'single-task' setting, in its own way.) One of the reasons on-device/edge AI has gotten so interesting, in my opinion, is that we now know how to derive reliable uncertainty estimates with neural models (more or less independent of scale). As long as prediction uncertainty is sufficiently low, there's no particular reason to go to a larger model. That can lead to non-trivial cost/resource savings, as well as the other benefits of keeping things on-device.


Can you link to any methods for deriving reliable uncertainty estimates? Sounds useful.


In my biased opinion, I'd recommend using Reexpress to get reliable uncertainty quantification from LLMs. The video Tutorial 2 on our website demonstrates how to derive uncertainty estimates from the Mistral 7b model, but the basic principle is the same for any LLM: Simply include the output logits when uploading a document via the JSON lines format: https://re.express/
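
For a rough idea of what "include the output logits" looks like in practice (purely illustrative: a small Hugging Face model stands in for Mistral 7b here, and the JSON field names are hypothetical, not Reexpress's actual schema; see the linked tutorial for the real format):

    import json
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "distilgpt2"  # stand-in; Tutorial 2 uses Mistral 7b
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    doc = "The shipment arrived two weeks late."
    inputs = tok(doc, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits at the final position

    # Hypothetical field names, for illustration only.
    with open("upload.jsonl", "w") as f:
        f.write(json.dumps({"document": doc, "logits": logits.tolist()}) + "\n")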


Well, LLMs don't consistently beat purpose-trained BERT-type models just yet. For example, in [1] the authors show RoBERTa (fine-tuned for each individual task separately, of course) going basically toe-to-toe with GPT-4 on quite a few fairly conventional NLP tasks, while open-source LLaMA 2 models get severely outclassed.

[1] https://arxiv.org/abs/2308.10092


You'll get far more than 80% of the way just by running a pretrained text-embedding model.

There's always room for optimization (e.g. finetuning on your own data) but it's an incredible baseline.


I've found the best LLMs still fall a bit short when it comes to being truly best in class; SOTA on most useful tasks is still dominated by smaller models with focused approaches and innovative loss mechanisms.


Yeah, but you can talk to an LLM. I think the most compelling use for LLMs is “tool use.”

Imagine if everyone could write simple automation scripts in literal English. We’re there.


Yeah, embeddings are where it's at for text. Most problems are really about clustering or classification, when you like, get down to it, man. Semantic similarity is one hell of a drug.
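
A small sketch of that framing, assuming sentence-transformers and scikit-learn (the model name and toy texts are just examples):

    from sentence_transformers import SentenceTransformer, util
    from sklearn.cluster import KMeans

    texts = [
        "my package never arrived",
        "where is my order",
        "how do I reset my password",
        "I can't log into my account",
    ]
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode(texts, normalize_embeddings=True)

    clusters = KMeans(n_clusters=2, n_init=10).fit_predict(emb)  # unsupervised grouping
    sim = util.cos_sim(emb[0], emb[1])                           # semantic similarity
    print(clusters, float(sim))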


Using fasttext as a first pass is probably the best idea for production work.
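
Something along these lines, as a rough sketch (the tiny training file is only illustrative):

    import fasttext

    # fastText expects one "__label__<name> <text>" example per line.
    with open("train.txt", "w") as f:
        f.write("__label__billing I was charged twice for my subscription\n")
        f.write("__label__billing please refund the duplicate payment\n")
        f.write("__label__shipping my package still has not arrived\n")
        f.write("__label__shipping the tracking number shows no movement\n")

    model = fasttext.train_supervised(input="train.txt", epoch=25, wordNgrams=2)
    print(model.predict("where is my parcel", k=1))  # e.g. (('__label__shipping',), array([...]))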


fasttext/word2vec/training-your-own-word-embeddings-from-scratch requires very optimistic assumptions about your data that rarely hold in practice, such as not having idiosyncratic syntax from user input. I've learnt this the hard way.

Transformer-based LLMs with a BPE tokenizer compensate for this well.


One of my former coworkers at Rad AI always used to joke that most problems people try to solve with AI could be solved better with simple linear regression, and I really do think that input was one of the reasons why that company is doing so well right now. When you're trying to build a product you need to look at the best tools to solve the problems you're working on. Sometimes that truly is an LLM, and other times it isn't. Throwing a technology into things just because it's the cool thing to do at the moment is almost always a mistake though.


That makes sense, but we often overlook the marketing side of attracting both investors and customers. If you need runway, it's "AI".


Following fashions can bring you a bit of short term cash, but it won’t build a long-term sustainable business.


Ah yes. Customer service. Who needs LLM? Just use a simple linear regression.


That's just about the silliest interpretation of my comment that could have been made.


There are still tons of domains where GOFAI is very useful, like planning and scheduling, and also in cases where you don't have access to lots of training data or pretrained models.


We’ve been seeing this in vision too. I’m really excited about multimodal LLMs but it seems like they’re going to be most useful for solving new types of (qualitative) problems vs taking over (quantitative) problems where small, fast CNNs have excelled.

Though they can also play well together[1] given _creating_ quantitative models is a qualitative problem.

[1] https://github.com/autodistill/autodistill


> We’ve been seeing this in vision too.

It's always amazing to me that __researchers__ still ask if there are uses for GANs, thinking diffusion killed them. I have a hard time taking seriously anyone's claim to be an expert in image synthesis when they think that, but it seems top companies hire them. I see the same thing with ResNets, and I just don't get it. There's a tendency to not just railroad, but to actively build it.


LLMs popularized zero-shot learning, or “prompt engineering,” which is drastically easier to use and more effective than labeling data.

You can also retrofit “prompt engineering” onto good old-fashioned ML like text classifiers. I wrote a library to do just that here: https://github.com/lamini-ai/llm-classifier
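
As a generic illustration of the zero-shot idea (this is just the off-the-shelf Hugging Face zero-shot pipeline, not the library above), the candidate labels effectively stand in for a labeled training set:

    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
    result = classifier(
        "My card was charged twice for the same order.",
        candidate_labels=["billing", "shipping", "technical support"],
    )
    print(result["labels"][0], result["scores"][0])  # top label and its score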

IMO, it’s only a matter of time before this takes over all of what used to be called “deep learning”.

Supervised learning, which required Herculean efforts to label big datasets, was important for proving that deep learning worked, but we have been engineering it to be easier and more effective ever since, and there is no going back.


Thanks for sharing. This looks interesting.


That is not what people typically mean by "GOFAI," which is a term more commonly understood to refer to classical symbolic AI as practiced by the MIT AI lab starting in the late 60s. From the title I thought they were claiming that the OLD old fashioned AI was still viable--they are not. They're still talking about trained neural network models; they're distinguishing between foundational models and single-task models.


This. Symbolic AI (AKA GOFAI) died around 1985. But task-oriented AI will never die: there will always be a need for intelligent agents with cutting-edge expertise in niche skill areas, because it costs too much to embed all the available info on Earth into every foundational AI model, much less keep it up to date.

As the old saw says, the only perfect model of the universe is the universe itself.


It didn't die per se, I would say; it mainly got subsumed into more general programming and problem-solving patterns (search, heuristics, A*, that sort of thing). What probably "died" is the dream that those would be sufficient to achieve general intelligence.


"Symbolic" AI is very much used wherever you have theorem provers e.g. z3, no?


Yes, and closely related are various forms of constraint solvers, things that LLMs are still fantastically poor at.
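
For example, a tiny z3 sketch: a few arithmetic constraints that a solver dispatches instantly but an LLM will happily get wrong.

    from z3 import Ints, Solver, sat

    x, y = Ints("x y")
    s = Solver()
    s.add(x > 0, y > 0, x + y == 20, x * 3 == y + 4)
    if s.check() == sat:
        print(s.model())  # e.g. [x = 6, y = 14]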


SMT Solvers are used extensively in certain parts of the industry, especially when working in security-mindful fields.


Nothing that existed only before 1985, that is, before the invention of open source, was alive in the first place. The code for the classic programs is nowhere to be found. And if it is found, it's written in some dead language like Interlisp.


> OLD old fashioned AI was still viable--they are not

False. Theorem provers exist and are widely used, sometimes even in deep learning.

https://arxiv.org/pdf/2304.10558.pdf


You misread the statement.

> I thought they were claiming that the OLD old fashioned AI was still viable--they are not.

Electroly is saying something about the authors' claim, not about the viability of GOFAI.


You are right. My bad. Unfortunately, can't edit the comment now :(


Good old linear classifiers still solve many tasks close to optimally. Probabilistic AI is not just neural networks.
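
A quick sketch of that point: a plain linear classifier over TF-IDF features, no neural network anywhere, is still a very respectable baseline (toy data for illustration).

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    texts = ["great product, works perfectly", "terrible, broke after a day",
             "love it, highly recommend", "waste of money, do not buy"]
    labels = [1, 0, 1, 0]

    clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
    clf.fit(texts, labels)
    print(clf.predict(["absolutely love this"]))  # likely [1]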


Generative symbolic AI occurs all the time in data integration, when you have to generate new identifiers (e.g. create new targets for foreign keys that don't exist in any source), recursively merge identified entities together along their attributes, and so on. But now that this process is mathematically well-understood it is no longer called "AI" but rather "logic programming".


LLMs themselves are made up of "good old fashioned" components. The public conversation about generative AI seems to assume that it's doing something fundamentally different from what came before. Of course, LLMs are still just deep NNs trained through backpropagation to "classify" strings of text into one of many categories corresponding to each possible token. The only thing they do differently is to compute auto-correlation coefficients in each layer and zero some of them out. And of course they take advantage of massive amounts of training data by self-supervising.

LLMs are only a slight variation on the good old fashioned models that already existed at the time they showed up.
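
One way to read the "auto-correlation coefficients" remark is as causal self-attention: pairwise similarity scores between token vectors, with the scores for future positions masked out. A stripped-down sketch (real attention uses learned query/key/value projections; this just reuses the raw vectors):

    import numpy as np

    def causal_self_attention(x):                      # x: (seq_len, dim) token vectors
        d = x.shape[-1]
        scores = x @ x.T / np.sqrt(d)                  # pairwise similarity scores
        mask = np.triu(np.ones_like(scores), k=1)      # positions after the current token
        scores = np.where(mask == 1, -np.inf, scores)  # mask out (zero) future positions
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True) # softmax over allowed positions
        return weights @ x                             # weighted mixture of token vectors

    print(causal_self_attention(np.random.randn(5, 8)).shape)  # (5, 8)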


What they bring that is deeply surprising, in my mind, is the realization that such a wide array of general capabilities could emerge from such a simple objective function (predicting the next word). Before LLMs, neural networks accomplishing "narrow" tasks were understood to be powerful, but not in such a general way.


Predicting the next word as an objective is straightforward, not simple. Think about what it would mean to predict internet-scale text accurately: to, for instance, know the results of scientific papers before they are uttered.

There's a difference between an objective and what you need to do to fulfill it.


My understanding is that at some point the model needs to gain an understanding of how the world works in order to know the right set of words that come next and keep reducing perplexity. I don't think this can scale to AGI, but I have nothing to back that up.


"Good old-fashioned AI" (GOFAI) is a term of art that refers to symbolic AI, in contrast to deep NNs.

Maybe we need to rename it if deep learning is already "old fashioned"!

(EDIT: to be clear, I have no idea what TechCrunch thinks GOFAI means.)


The AI Treadmill:

Looking backward: GOFAI (Good Old Fashioned AI) becomes BOFAI (Bad Old Fashioned AI), then later becomes ROFAI (Retro Old Fashioned AI) then eventually AOFAI (Antique Old Fashioned AI)...

Looking forward: AI (Artificial Intelligence) progresses to AGI (Artificial General Intelligence) progresses to AGIC (Artificial General Intelligence Consciousness) progresses to MAGIC (Miraculous Artificial General Intelligence Consciousness)...


Yeah, that's what I was trying to indicate with the quotations. But thanks for adding the note.


One of the most common design patterns for using LLMs, Retrieval Augmented Generation (RAG), is precisely a "good old fashioned AI" problem to solve.

The quality of your application depends on the quality of your data, how you organize it, how you understand results you get from real-world usage, which model is used to compute semantic embeddings, how you handle tricky retrieval problems (e.g., "show me bananas" vs. "show me not bananas"), and so on.
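
A rough sketch of the retrieval half, assuming sentence-transformers (the corpus and model name are placeholders):

    from sentence_transformers import SentenceTransformer, util

    corpus = [
        "Bananas are rich in potassium.",
        "Apples keep the doctor away.",
        "Our return policy lasts 30 days.",
    ]
    model = SentenceTransformer("all-MiniLM-L6-v2")
    corpus_emb = model.encode(corpus, normalize_embeddings=True)

    query_emb = model.encode("show me bananas", normalize_embeddings=True)
    hits = util.semantic_search(query_emb, corpus_emb, top_k=2)[0]
    for hit in hits:
        print(corpus[hit["corpus_id"]], hit["score"])

    # A query like "show me not bananas" will still rank the banana document
    # highly -- exactly the kind of tricky retrieval problem mentioned above.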

All this means to me is that the possibilities around what can be built are increasing, as is the need for people who understand both the "old fashioned" AI world and the new one that we're stepping into.


Also: good old-fashioned statistics remains viable in spite of AI.


Article title is misleading. It is not about "Good old-fashioned AI" (GOFAI) at all. That's a term of art referring to symbolic AI like planning and explicit inference. All the stuff covered in Russell and Norvig's classic "AI: A Modern Approach."


If anything, LLMs just heighten their abilities. I can generate synthetic datasets for training BERT models and the like, of a size and quality that would’ve taken ages before. Similarly for knowledge distillation from large models to small. It’s never been a more exciting time for ML.



