Thanks for mentioning my books! I wrote my first Common Lisp book about the same time Peter wrote his book, and I met him at the first Lisp Users meetup in San Diego. His short Python programs and notebooks are also fantastic. Off topic, but: I wonder if the new Python++ language Mojo, which sits on top of MLIR, will end up as the top AI language. Maybe?
This might be my favorite programming book of all time.
It's the book that got me hooked on Common Lisp (from Scheme) too. It focuses less on the elegance of the language definition and more on writing elegant programs.
(English isn't as elegantly defined as Lojban, but that didn't hold Tolkien back.)
Lojban is elegant from a logical/mathematical point of view, I guess, but I don't think it is from a human point of view. IIRC, in "In the Land of Invented Languages" the author pointed out many issues with the language: even its biggest fans can't really speak it fluently... so what's the point? Esperanto is probably a better bar here, and I just started reading The Hobbit in it, lol. I will say that Lojban is super cool, though. The attitudinals are particularly interesting.
Me too! I came looking for this sort of meta-comment. I've re-read the paper book a few times now.
What makes it my favorite is how clear Norvig's writing is. It's easy to follow (both when reading it in English, and when following its execution if you're a programmer), and it introduces important ideas so effortlessly that, years later, it will give you a chuckle.
Anyone interested in clearly communicating about technical topics, and with a knowledge of Lisp's nature and some idea of what programming in 1991 looked like, might be tickled to read Chapter 1; even its first few paragraphs are refreshing.
I think everyone would be better off surveying the AI classics before diving head first into ML. It's the nuclear option for problem solving; and sometimes you can get away with simpler approaches that are easier to implement, reason about and debug.
Currently, statistical/data-driven approaches work best, and that's what you will be expected to use whether you are building your own products, or working for an employer. Most people don't care about the GOFAI approaches anymore, seeing them as outmoded in all respects.
However, if you are curious and want to understand more of the history of approaches we have tried, and learn some really interesting algorithms along the way, I think studying the old school problems and their solutions can be both intellectually stimulating, and potentially increase your depth of understanding. After all, it's only once you've tried to solve a problem and failed miserably that you start to appreciate the depth of its complexity.
That depth of appreciation is sorely lacking in today's new cohorts, who are basically blinded by the incredibly convincing outputs of our cream-of-the-crop LLMs.
I'm a huge fan of classical AI, and adore PAIP, but this isn't really true if your goal is anything other than a deep understanding of AI in the most general sense.
While it would be great if everyone interested in the topic were well versed in the fundamentals, the truth is that if you want to do anything, from building something cool over the weekend to getting an actual job doing AI work, you're much better off starting not only with ML, but specifically with current SotA neural networks.
If you really want to get started in AI, I highly recommend building even a trivial implementation of Stable Diffusion on your own. Not just because it's cool, but because at its heart it is an excellent demonstration of how current differentiable programming works. Diffusion models involve chaining together three separate models into a system that learns to solve a complex task. Once you understand this deeply, you can solve a very broad range of tricky problems and are really approaching what we think of when we think of AI.
Differentiable programming is really the current pathway to any sort of AI solution to a problem.
I say this as the token "have you tried logistic regression?" guy in my org.
The downside of differentiable programming is the absolutely massive amount of training data and time required: several orders of magnitude more than boosted decision trees or even SVMs. If your function's domain is fairly well understood, save yourself a few weeks and a few thousand dollars.
You can implement SVMs, gradient-boosted decision trees, and almost all classical models using the techniques of differentiable programming, and it will have zero impact on the amount of data required.
Massive neural nets do require a lot of data and are often not the best solution, but differentiable programming in general does not have higher data requirements than manually computing your derivatives or using OLS. You can still approach classical ML from the perspective of differentiable programming (and likely end up with a better sense of how your models work in the end).
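To make that concrete, here is a minimal sketch of my own (not from anyone's post): fitting ordinary least squares by gradient descent in Common Lisp. It's the same classical model, just optimized the way differentiable-programming frameworks do it, with hand-computed gradients standing in for autodiff and no extra data required.

    ;; Fit y = w*x + b by gradient descent on mean squared error.
    (defun fit-ols (xs ys &key (lr 0.01) (steps 2000))
      (let ((w 0.0) (b 0.0) (n (length xs)))
        (dotimes (i steps (values w b))
          (let ((dw 0.0) (db 0.0))
            ;; Accumulate the gradient of the squared error by hand.
            (loop for x in xs
                  for y in ys
                  for err = (- (+ (* w x) b) y)
                  do (incf dw (* 2 err x))
                     (incf db (* 2 err)))
            ;; Step against the averaged gradient.
            (decf w (* lr (/ dw n)))
            (decf b (* lr (/ db n)))))))

    ;; (fit-ols '(1.0 2.0 3.0 4.0) '(2.0 4.0 6.0 8.0))
    ;; => values close to w = 2.0, b = 0.0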
> I think everyone would be better off surveying the AI classics before diving head first into ML
This is untrue. ML algorithms have nothing to do with GOFAI algorithms. If there is something one "would be better off surveying before diving head first into ML", it would be mathematical analysis, statistics, and probability.
There's nothing intrinsically nuclear about ML and it includes a bunch of simple approaches too.
The advice to spend your limited time and attention on outdated approaches seems counterproductive. The things in this book aren't just old - they ended up being a dead end in research. So if it's 2023 and you have 20 hours to learn something new, you can do much better than this book.
This attitude is not just ignorant; it's dangerous.
I'm seeing rampant use of ML now for problems we already know how to solve in much simpler ways: linear control theory, Bayesian statistics, Kalman filters, etc. "Oh hey, no need to study those old, dry topics. Just throw a bunch of training data at this GPU-bound black box and it will probably work."
That's right, it will probably work. Until it doesn't. And then you won't be able to debug it. More important: You won't be able to predict when the system will fail, because it's a black box. And if it's controlling a high-consequence system, when it fails people could die.
The moral is that if your problem falls into one of the already known, easy-to-solve domains, you should use the old techniques. It will probably need a millionth of the CPU resources of an ML approach, and you'll be able to characterize its failure regimes in advance.
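As a hedged illustration of the point (my sketch, not from the thread): a scalar Kalman filter for tracking a noisy, slowly drifting signal is a handful of lines, runs in constant time per step, and its failure modes (model mismatch, mis-set noise covariances) are understood up front.

    ;; One predict/update step of a scalar Kalman filter for a
    ;; random-walk state. X = estimate, P = estimate variance,
    ;; Z = new measurement, Q = process noise, R = measurement noise.
    (defun kalman-step (x p z &key (q 1d-4) (r 1d-2))
      (let* ((p-pred (+ p q))               ; predict: variance grows
             (k (/ p-pred (+ p-pred r)))    ; Kalman gain
             (x-new (+ x (* k (- z x))))    ; correct toward measurement
             (p-new (* (- 1 k) p-pred)))    ; variance shrinks after update
        (values x-new p-new)))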
> I'm seeing rampant use of ML now for problems we already know how to solve in much simpler ways: linear control theory, bayesian statistics, Kalman filters, etc.
How many of these techniques are in the book in the original post?
I'm not saying that we should throw ML at everything; I'm saying that Norvig's book isn't useful in 2023.
Norvig's book is extremely useful for a whole different reason. It remains one of the best advanced books on general software design. Don't get distracted by the title.
Those are all in his other textbook -- presumably he'd agree that AIMA is much more up to date than PAIP. I do think PAIP is still great for programmers wanting to improve their craft.
Neural networks were known as being "always second best" for many decades not that long ago...
It's very possible (highly likely, even) that elements of GOFAI end up being implemented in some of the upcoming RL/GNN combo-based architectures. I highly doubt that the transformer will be the end-all-be-all for generating representations.
At the very least, many of those 'in the know' around these GNNs realize that sheaf NNs are much more expressive and can yield far better general results if improved properly for long-range dependencies, perhaps with a Performer- or Longformer-like addition.
Ultimately, some of the best researchers in the field (Veličković is one of the best, and heavily focused on dynamic programming, for example) are not just focused on the transformer or any of the hype around it right now. In order to improve, you have to look elsewhere, and understanding old methods is typically a great source of that inspiration.
I don't deny that AI from the Lisp days is out of favour today, but I recall neural networks sat in a niche corner of maths, mostly dormant, for quite a while too...
I'm not an expert in either, but I am confident that progress is nonlinear. Are there any ideas from the Lisp days that you think are definitely bad (or even possibly good)?
Try using neural networks on a computer from 30 years ago, or even 10 years ago, and you'll quickly realize why they were not feasible for almost any problem when this book was written.
I was studying evolutionary computation back in the 80s. Yes, you did not have the large ANNs you have today, but you also broke down your problem into manageable, computable bits (no pun intended). Working through Koza's book on genetic programming in the 90s, or Mark Watson's book on ANNs, chaos theory, and a bunch of other stuff, was very illuminating. I particularly liked the artificial life stuff out of the Santa Fe Institute; I still have a VHS from the 2nd proceedings.

I could have ridden the ML/DL wave to a high-salary job, but I took another path building real-world objects and machines, and I am very happy I did. I remember when WordPerfect was popular and people spoke of careers in word processing. I always saw it as an evolution of the cube farm, and of course, it is. Now I have friends who work in industry with ML/DL, and they joke that it's the same but with a lot more things to track and keep running.

I like GOFAI for the simple reason that sometimes 90% is good enough, and 125% is a waste of time, water in the blender. When robotics was taking off I saw the two schools of thought: Rodney Brooks's subsumption architecture, making it simple to create emergent behavior, and Mark Tilden's analog circuits and discrete components. I found Tilden's work more interesting; it spoke to me about continuous, biological-like processes. I am now researching neurocomputing and following the developments in designing neurocomputing hardware. I was so happy to have received free tickets to watch Terminator 2 back in the day, but now I feel like I have a front-row seat to it!
Do you remember the names of topics from Santa Fe's "artificial life stuff"? Was it agent-based modeling?
They have a bunch of courses on https://www.complexityexplorer.org, but it's mostly Game Theory, Chaos, Complexity...
Plenty of game programming has ingested that "agents with actions" kind of programming direction; I believe procedural code with some understanding of state machines is all that "artificial life" content was... minus some details.
It's "out of favor" because it completely failed as a research program. Let's not equivocate about this; it's nice to understand heuristic search, and there was a time when things like compilation were poorly understood enough to seem like AI. But as a path towards machines that succeed at cognitive tasks, these approaches are like climbing taller and taller trees in the hopes of getting to the moon.
You should really know that there's more to ML than neural networks. Those are the simpler approaches I was referring to (like linear models, ensemble models, etc.). They are
People seem to say Norvig's "other" book is still relevant. I assume that's "Artificial Intelligence: a Modern Approach".
My version is from when I went to school 20 years ago. I assume it's been greatly updated over the past couple of decades. I wonder if it's worth taking a spin through the new edition.
> The 4th edition is from 2019. My understanding is that the world of AI has changed about a quadrillion times since those days, right ;) ?
The world isn't moving that fast. Transformers and LLMs are built on neural networks and lots of data and fast computers. You could jump straight to that point, but even the course you've pointed to starts off with more foundational ANN topics before getting to transformers. Much of which is at least in the TOC for the current edition of AIMA. Ought to be complementary texts.
Also, only fools ignore history, "classical" AI and topics also covered in the book are still applicable. ANNs aren't going to solve all the world's problems. Other techniques that fall under the category of "AI" are still applicable and very effective for a large number of real-world problems (and much more efficient than LLMs).
It's always quite ironic to see hipsters today praise GOFAI systems and belittle the deep learning 'hype', given that they were massively overhyped at the time and delivered next to nothing outside of some niches.
Even funnier to see how someone is always quick to explain that 'NNs are not real AI' when GOFAI was literally all about parsing, basic logic, and search trees.
I've been keeping my eye on so-called "GOFAI" for a long time, but with recent advances in ANN methods (DL, LLMs), does it even make sense to further pursue the former?
Personally, those "old" methods in the 80s make a lot more sense to me than recent statistical methods.
The company I work for makes a killing on applying a 1980s-era ML technique to a really tough modern business problem -- the resulting product is probably the best in the world at what it's doing.
Old techniques have several things going for them, with one of the more important ones for us being explainability. A random person off the street could hypothetically, with an hour or two of training, diagnose problems just by looking at the structure of the model. That's very helpful for adapting to market needs.
Generality is another big plus. Since the model encodes intuitive ideas there's a lot of room for using it in innovative ways.
Older techniques also tend to produce better results with less data, because big data wasn't as much of a thing back then.
Unless you have a crazy amount of resources, I think it's far better to be bleeding edge in as few things as possible. Solving a new business problem? Perhaps don't spend too much time on also solving all the childhood diseases of a new technology.
Cost is also important. LLMs are expensive to run.
I think that going forward, we'll see a mix of "normal" programming, LLMs, and simpler machine learning techniques all combined together, because of economic reasons.
Hm. I'm not sure about cost. I haven't done any sort of analysis, but I would guess with 75% certainty that the techniques we use need more watts per byte of data than modern techniques. Modern techniques are very efficient in terms of their operations -- it's just that they achieve performance by doing so many operations!
What 1980s-era ML technique are you using, and why is it better suited for your application than something modern? Is it "just" the ability to explain solutions, or is there something else, like guarantees for XYZ?
The core part of it is doing dimensionality reduction with shallow neural networks, but we get essentially all user-facing functionality by controlling the training methods, hooking into the neural network, and querying it in all sorts of ways.
We switched from more modern techniques primarily because they needed too much data to work well, but the other things I mentioned are the benefits we noted along the way. I don't know if that answers your question.
This book is a good intro to Lisp itself, which is worth knowing in any case, and symbolic AI is probably one of the best domains for it!
As for GOFAI in the age of DL/LLMs: yes, you should know it, for a couple of reasons. One is that a lot of these techniques aren't really considered "AI" anymore; they're just regular CS algorithms everybody should know: graph search, backtracking, optimization, parsing, etc. The other is that a lot of newer DL/LLM work is actually going back to these old problems, but bringing all the new techniques to deal with the limitations of the classical algorithms.
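For a taste of that algorithmic core, here's a sketch in the spirit of PAIP's generic search tools, reconstructed from memory, so treat the details as my own rather than the book's exact code:

    ;; Generic tree search: GOAL-P tests states, SUCCESSORS expands them,
    ;; and COMBINER decides how new states join the frontier.
    (defun tree-search (states goal-p successors combiner)
      (cond ((null states) nil)
            ((funcall goal-p (first states)) (first states))
            (t (tree-search (funcall combiner
                                     (funcall successors (first states))
                                     (rest states))
                            goal-p successors combiner))))

    ;; Breadth-first search falls out by appending new states
    ;; behind the current frontier:
    ;; (tree-search '(1)
    ;;              (lambda (x) (= x 12))
    ;;              (lambda (x) (list (* 2 x) (1+ (* 2 x))))
    ;;              (lambda (new old) (append old new)))
    ;; => 12

Swapping the combiner turns the same ten lines into depth-first or best-first search, which is exactly the kind of design the book teaches.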
> Personally, those "old" methods in the 80s make a lot more sense to me than recent statistical methods.
Same for me.
> I've been keeping my eye on the so called "GOFAI" for a long time but with recent advances in ANN methods (DL, LLM), does it even make sense to further pursue the former?
I think it still matters. Plenty of examples in the tech industry where "old" tech/paradigms became the new "hype". They say it's all a cycle.
Think of this as a "learn to program better" book, even if you don't care about GOFAI. It has a lot of very nice code, and commentary without blather. I've said this about as long as I knew of it, so this isn't just revisionism.
It depends on what you were expecting from GOFAI. The future state is likely to be a combination of approaches where each makes sense rather than a single one. We're still in the honeymoon phase with deep learning and disillusionment with its ultimate limitations is likely still to come.
With all the modern and trendy music around, should we still listen to old and classical music? Should we forget the history? I think not. My favourite genres are jazz and classical.
Sorry if you don't get the metaphor, but that's how I see it.
If it's for work or for building something useful, the book is a waste of time. If it's for your personal interest, you can read and learn anything you want.
Personally, I avoid books like this one (similar to how I avoid very esoteric languages) because I want to spend my time on things that are interesting and useful, instead of only interesting.
Does anyone have any other free Common Lisp book recommendations?
I decided to give CL a try after reading about REPL-driven development, especially CL’s interactive condition/debugging experience. I’m almost done going through Practical Common Lisp. It’s been a fun experience so far!
Every once in a while I find even the language standard insufficiently clear (e.g. how to change the exponent character when printing double precision floats), then I dig again into Common Lisp the Language (2nd ed.) by Guy Steele available at https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/clm.html .
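For that particular example, one relevant knob is *read-default-float-format*: per the standard's float-printing rules, the printer drops a float's exponent marker only when the float's type matches that variable. A quick sketch:

    ;; By default *read-default-float-format* is SINGLE-FLOAT,
    ;; so a double-float prints with its exponent marker:
    (prin1 1.5d0)                       ; prints 1.5d0

    ;; Telling the printer that doubles are the default format
    ;; drops the "d0":
    (let ((*read-default-float-format* 'double-float))
      (prin1 1.5d0))                    ; prints 1.5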
Everyone says SICP is this profound thing but I just couldn't get into it or LISP at all. Am I really missing out on anything after a good CS education and practical experience?
I love SICP and am one of those fanboys who say everyone should read it.
Though, really, a lot of what it covers is much more common knowledge now than it was when I first read it. Higher-order functions and functions-as-first-class-citizens (ch1) are ubiquitous, and most programmers I know are comfortable with them (which wasn't the case in the early 2000s, at least in my circle). Lists, maps, pairs, and symbolic structures are covered in ch2, but most people are comfortable thinking in such terms now. Ch3 covers handling state, and I think there are good ideas in that chapter that still haven't been broadly disseminated.
But where the book really shines is the last 2 chapters - I haven't seen much of those ideas (virtual register machine and writing a compiler for it) covered elsewhere. It's still a great way to get exposed to some fundamentals of computing from a pure software environment. But I think you could be quite a capable programmer without ever doing that.
>Am I really missing out on anything after a good CS education
I don't think you can have a good software education, in the sense of having a complete one, without studying LISP. It's a foundational paradigm of programming. It's kind of like studying physics and skipping Maxwell's equations, paraphrasing Alan Kay.
It's a good way to nail down a lot of the fundamentals, on top of which a good CS education can be built.
So if you already have a solid CS education it's not really necessary, except maybe for the enjoyment of reading a really well thought through pedagogical work. Which can sometimes help coalesce concepts and the connections between them.
You could give Concrete Abstractions a try; I consider it to be the intermediate step between The Little Schemer / The Seasoned Schemer and SICP. As for whether you're really missing out on anything, that depends on how good your good CS education was.
Just read it like a novel. Ignore the exercises. What struck me was that it took 8 or 9 chapters until they even presented a loop, and it was none the worse for doing that. It actually helped my imperative lizard brain evolve.
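(To illustrate with a sketch of my own, in CL syntax rather than the book's Scheme: where an imperative brain reaches for a loop, SICP-style code just recurses.)

    ;; Sum 1..N with no loop construct at all.
    (defun sum-to (n &optional (acc 0))
      (if (zerop n)
          acc
          (sum-to (1- n) (+ acc n))))

    ;; (sum-to 100) => 5050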
Yes, it's a book that uses Lisp, but it's not a Lisp book. It's about programming techniques, I felt.
Of all the books that are usually recommended but that nobody actually reads, this is the one that I actually read and liked.
I bought this book on paper some time ago, put it aside (I bought half a dozen at the same time), and when I finally got to read it, I found out that I had bought the wrong version. Too late to return it...
A question: is any of the currently hot software written also in Lisp? I mean the LLMs, SD, etc.
As someone learning Common Lisp for fun and planning to use it in the web, I’m a little disturbed by the “we manually force garbage collection periodically” part of that article. I haven’t fully digested the commentary, so maybe I’m more concerned than necessary…
Looks like they wanted to configure a much larger heap than they needed, just in case, but then to have the system pretend it doesn't exist and do GC cycles at a much lower threshold. I'm surprised SBCL doesn't have parameters that can be tuned that way; if not, that could be upstreamed.
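For what it's worth, SBCL does expose at least one knob along these lines; whether it covers the article's exact use case I'm not sure, so treat this as a hedged sketch:

    ;; Raise the consing threshold so collections run less often
    ;; (the default is much smaller):
    (setf (sb-ext:bytes-consed-between-gcs) (* 512 1024 1024))

    ;; And a full collection can still be forced manually,
    ;; as the article describes:
    (sb-ext:gc :full t)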
Going longer between garbage collection cycles could be worse in terms of caching and paging. Marking the live objects allocated in that generation is about the same, since that doesn't grow, but there is more garbage to visit and sweep. Sweeping a smaller memory area more frequently is going to be faster than sweeping a larger area less frequently, from the point of view of caches and the TLB.
(Regarding my remark about live objects: with longer collection cycles, they could be spread over a wider VM footprint, even if their quantity doesn't grow. The image has a kind of "GC working set", and longer intervals mean a larger working set, since the data being transformed cycles through more memory locations.)
I would say so, especially since you plan to use it for web dev (welcome!). Shinmera has released a game in CL that depends heavily on CLOS (the object system), and he says the GC is barely an issue.
I recognize your user name. Great work on the Lisp Journey site! Several times I’ve had a question and then found the exact same thought expressed on your site (along with an answer).
That's nice to hear, thanks for the feedback. I've been puzzled many times when starting out (or taking a not-so crowded path), so I'm glad the ones after me are having a better time.
Now it's your turn to build cool things and share along the way ;)
If you are interested in machine learning, check out Gabor Melis's library: https://github.com/melisgl/mgl. It's not an area I'm super familiar with, so I can't speak to its feature set, but I believe he used it to win a machine learning competition some years ago.
I don't think anyone's written a transformer or diffusion model with it, could be a fun challenge.
https://markwatson.com/#books
Mark Watson on HN:
https://news.ycombinator.com/user?id=mark_l_watson
Edit: I should have said that you can also pay for his books:
https://leanpub.com/u/markwatson