After learning PGMs, I find that I've almost completely eschewed first-order logic for my own personal everyday reasoning. Arguments based on logical formulations require that the propositions are not leaky abstractions, and for most problem domains (i.e. not physics), there are going to be so many exceptions that I find very few cases where I can rely on first-order logic. The softness of PGMs, and ideas like "explaining away" [1], come in quite handy. And after learning some of Pearl's (and others') formulation of causality as graphical models, I understand much better why counterfactual reasoning is so error-prone.
Further, PGMs have the advantage over deep networks in that they are highly explainable, and you can go back and look at the chain of reasoning. For some problem domains, this part is more important than prediction accuracy.
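To make "explaining away" concrete, here is a minimal sketch using the usual rain/sprinkler/wet-grass toy network. The probabilities are made up purely for illustration, and inference is done by brute-force enumeration rather than with any PGM library.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_RAIN = 0.2
P_SPRINKLER = 0.3
# P(wet | rain, sprinkler)
P_WET = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.9,
    (False, False): 0.01,
}

def joint(rain, sprinkler, wet):
    """Joint probability of one full assignment, read off the graph."""
    p = (P_RAIN if rain else 1 - P_RAIN) * (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER)
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1 - p_wet)

def prob_sprinkler(**evidence):
    """P(sprinkler=True | evidence), summing over all consistent worlds."""
    num = den = 0.0
    for rain, sprinkler, wet in product([True, False], repeat=3):
        world = {"rain": rain, "sprinkler": sprinkler, "wet": wet}
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(rain, sprinkler, wet)
        den += p
        if sprinkler:
            num += p
    return num / den

prior = prob_sprinkler()
given_wet = prob_sprinkler(wet=True)
given_wet_and_rain = prob_sprinkler(wet=True, rain=True)
print(prior, given_wet, given_wet_and_rain)
```

Seeing wet grass raises the belief that the sprinkler was on; additionally learning that it rained pulls that belief back down, because the rain already accounts for the wet grass.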
The power of logic is that a few well chosen domain specific clauses can reduce the problem dimensionality dramatically.
If you are building a robot, even if the mechanics are not really Newtonian, modeling the system mechanically can get a model much closer to the underlying manifold, reduce training set size and improve generalizability. So I don't think the old way of doing things should be thrown out. The old methods got pretty near the right answers, and we should use newer methods just to fill in the gap between theory and practice.
E.g. pre-train a DBN using an analytical model and later adjust it on real data.
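A rough sketch of that pre-train-then-adjust idea, with a plain linear model swapped in for the DBN to keep it short. The "analytical model" (an ideal spring with stiffness 2.0) and the three "real" measurements are invented for illustration.

```python
def mse(w, b, data):
    """Mean squared error of y = w*x + b on a list of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def fit(w, b, data, lr=0.1, epochs=20):
    """Full-batch gradient descent on mean squared error for y = w*x + b."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

# Synthetic samples from the hypothetical analytical model: force = 2.0 * stretch.
analytic = [(i / 10, 2.0 * i / 10) for i in range(11)]
# A handful of made-up "real" measurements: slightly stiffer, with an offset.
real = [(0.0, 0.1), (0.5, 1.2), (1.0, 2.3)]

# Pre-train on plentiful analytical data, then adjust briefly on scarce real data.
w0, b0 = fit(0.0, 0.0, analytic, epochs=2000)
tuned = fit(w0, b0, real, epochs=20)

# Baseline: the same small training budget on real data, starting from nothing.
scratch = fit(0.0, 0.0, real, epochs=20)

print(mse(*tuned, real), mse(*scratch, real))
```

With the same small budget on real data, the model that started from the analytical solution ends up with a much lower error than the one trained from scratch, which is the "fill in the gap between theory and practice" point in miniature.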
I learned mostly by doing, through undergraduate research, and then in graduate school. This was far more effective than lectures or books, though those are very helpful for getting started. I had one class that covered an overview of lots of the technology, and was lucky enough to have access to preprints of Koller & Friedman's book as a reference to fill in any gaps. I also read Judea Pearl's book on Bayesian Networks fairly early on.
As for everyday reasoning, it made me somewhat more skeptical of long chains of the "A --> B, !B, therefore !A" type. It's easy enough to model this type of logic as a special case of PGMs. And the causal stuff is extremely useful for making me skeptical of arguments of the sort "If we did X, then Y would happen," and for judging how and when correlation does indicate causation. I don't have any pat examples, though; it's just something that infuses my thinking, much as learning about biological evolution does.
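One way to see the "special case" point is to write a chain of implications as a chain of conditional probabilities. In the sketch below, `rule_strength` is P(next | current) and `leak` is P(next | not current); both names and all numbers are assumptions made for illustration. With `rule_strength = 1.0` the chain behaves like strict modus tollens, and anything softer weakens the conclusion as the chain gets longer.

```python
def p_last_given_first(first_true, n_links, rule_strength, leak):
    """P(last proposition is true | truth value of the first), propagated link by link."""
    p = 1.0 if first_true else 0.0
    for _ in range(n_links):
        p = p * rule_strength + (1 - p) * leak
    return p

def p_first_given_last_false(prior, n_links, rule_strength, leak):
    """P(first is true | last is false), via Bayes' rule."""
    like_true = 1 - p_last_given_first(True, n_links, rule_strength, leak)
    like_false = 1 - p_last_given_first(False, n_links, rule_strength, leak)
    num = prior * like_true
    return num / (num + (1 - prior) * like_false)

# Strict rules: observing !Z refutes A outright, however long the chain is.
hard = p_first_given_last_false(prior=0.5, n_links=5, rule_strength=1.0, leak=0.2)

# Rules that hold 90% of the time: the refutation fades with chain length.
soft_1 = p_first_given_last_false(prior=0.5, n_links=1, rule_strength=0.9, leak=0.2)
soft_5 = p_first_given_last_false(prior=0.5, n_links=5, rule_strength=0.9, leak=0.2)

print(hard, soft_1, soft_5)
```

In the soft case, the posterior on A drifts back toward its prior as links are added, which is the quantitative version of distrusting long chains.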
It's definitely the hardest one I've taken there. Most of the difficulty comes from the density of the lectures. She moves fast and takes it for granted that you're piecing everything together as you go. You're probably not, but at least you can go back and watch it again if necessary!
Hinton's Neural Network class was very challenging for me too, mostly because many of the concepts were unfamiliar to me. But again, I could re-watch whatever I needed to in order to get it.
Logic-based AI is definitely taking a backseat to data-driven methods in the current environment, but "dead" is a gross exaggeration. There is a large class of problems for which heuristic search in logic domains is the most performant technique, and a significant class of problems where SAT solvers are feasible solutions. Many of them are real-world problems rather than academic ones. Also, many techniques have evolved or are emerging for systems of logic that can handle uncertainty. I've done work on hybrid systems combining rule-based systems with data-based systems (the typical process takes a rule-based system as a starting point and evolves it towards a pure data system as the data sets get large enough). However, starting with a rule-based system is actually a good approach for most start-ups when you don't have enough data to get performant models.
I am starting to get a (what I guess you could call mostly logic-based) AI system off the ground. The first thing I needed to do was create a serious environment that could host it. The necessary attributes of this environment are these:
1) It is fully web-based, and thus it is technically "on" the web, and accessible by everyone
2) It does not deal with any "modern web appy"-type meta-frameworks, and thus it is not, so-to-speak, "of" [what most of today's web developers would call] the web, and it therefore has no dependency issues to hold back its development
3) It is essentially a working Unix-like development environment, complete with a standard(ish) shell.
I have worked with Prolog a bit so FOL is somewhat familiar (I wouldn't call myself an expert by any means). FOL is quite an amazing tool to reduce the problem space in well-defined environments. I enjoy board games, and rule-based FOL AIs are pretty well suited to that domain. Modelling non-trivial domains as a set of rules is pretty tough though (+Gödel applies). Creating game-like structures for everyday stuff is one of my remaining AI research interests (the idea being that expert knowledge can somehow be modelled as AIs that compete in the game and thus be made comparable).
The "Inductive Logic Programming" chapter in "Prolog Programming for AI" (best intro Prolog book imo) is very interesting and has led to a couple of entries in my todo list :)
Non-standard logics are also very fascinating.
I love "AI A Modern Approach" but the chapter on PGMs wasn't the best in my opinion. I think the dentist example just bothered me/it wasn't all that obvious how useful they really are. Thankfully the book is amazing and they provide plenty of references to move on :)
That being said I think PGMs are immensely powerful and my gut says this approach is the one that I like the best.
I do not think logic-based learning is dead. It just smells a bit funny.
In the vein of the papers "From machine learning to machine reasoning" and "Text understanding from scratch" I expect a "First-order logic understanding from scratch" to follow naturally.
Anyone interested in Logic and Probability should take the time to read through (at least) chapters 1 & 2 of Jaynes' Probability: the Logic of Science [0]. Jaynes is the arch-Bayesian and in these chapters mathematically develops what is essentially an alternate-universe model of probability which, in his view, arrives as the natural extension of Aristotelian logic. There's no "coin flipping" in these chapters, and when he finally derives the method for calculating probabilities, the fact that his model matches coin-flipping models is written off almost as a happy accident. If you're familiar with Bayesian analysis but have not read Jaynes, it is very likely that you aren't familiar with quite how (delightfully) extreme his views are.
Jaynes' fundamental metaphor through the book is building a "reasoning robot" so anyone interested in the intersection of logic, probability and AI will get many interesting insights from this book.
You should really look into the emerging field of probabilistic programming. Avi Pfeffer has a nice book out on it, Practical Probabilistic Programming (or at least, you can get PDFs by pre-ordering). It basically expands the PGM way of reasoning to Turing-complete domains, and "hides" the problem of coding custom inference algorithms by making them parts of the language runtime.
My personal prediction is that once we get good at learning whole probabilistic programs from data rather than just inferring free numerical parameters from data, this is going to become the dominant mode of machine reasoning.
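For a feel of what that means, here is a toy sketch in plain Python rather than in a real probabilistic programming language. The model is an ordinary program with an unbounded loop, so it is not a fixed-size graph, and `infer` stands in for the language runtime with the simplest possible algorithm, rejection sampling. The function names, the coin biases and the query are all invented for illustration.

```python
import random

def model(rng):
    """A generative program: pick a coin, then flip it until the first tails."""
    bias = rng.choice([0.3, 0.5, 0.8])  # latent variable: which coin we hold
    heads = 0
    while rng.random() < bias:  # unbounded loop, so no fixed-size graph
        heads += 1
    return {"bias": bias, "heads": heads}

def infer(model, condition, query, n_samples=20000, seed=0):
    """Generic rejection sampler: run the program, keep only runs matching the condition."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_samples):
        trace = model(rng)
        if condition(trace):
            value = query(trace)
            counts[value] = counts.get(value, 0) + 1
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

# Which coin do we probably hold, given that we saw at least three heads in a row?
posterior = infer(model, lambda t: t["heads"] >= 3, lambda t: t["bias"])
print(posterior)
```

The model author never writes inference code here; swapping in a smarter sampler would only change `infer`. Real systems use far more efficient algorithms than rejection sampling, which wastes every run that fails the condition.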
Don't under-estimate the power of single layer neural networks--classifiers. They're much cheaper to train effectively and avoid over-fitting. Also, I've had good results using multiple classifiers that essentially cast votes and adding on hand-crafted heuristics to look through the top vote getters.
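A minimal sketch of the voting-plus-heuristics arrangement. The three "classifiers" below are hand-written stand-ins, not trained single-layer networks, and the invoice rule is a made-up example of a hand-crafted heuristic.

```python
from collections import Counter

def vote(classifiers, x, top_k=2):
    """Return the top_k labels by vote count as (label, votes) pairs."""
    ballots = Counter(clf(x) for clf in classifiers)
    return ballots.most_common(top_k)

def decide(classifiers, x, heuristic, top_k=2):
    """Take the first top vote getter the heuristic accepts, else the plurality winner."""
    candidates = vote(classifiers, x, top_k)
    for label, _ in candidates:
        if heuristic(x, label):
            return label
    return candidates[0][0]

# Stand-in classifiers (hypothetical); in practice each would be a trained model.
classifiers = [
    lambda text: "spam" if "free" in text else "ham",
    lambda text: "spam" if "!!!" in text else "ham",
    lambda text: "spam" if text.isupper() else "ham",
]

def heuristic(text, label):
    """Hand-crafted rule (made up): anything citing an invoice number is not spam."""
    return not (label == "spam" and "invoice #" in text)

plain = decide(classifiers, "free money!!!", heuristic)
overridden = decide(classifiers, "free shipping on invoice #123!!!", heuristic)
print(plain, overridden)
```

In both messages "spam" wins the vote two to one, but in the second the heuristic rejects that label and the runner-up is returned instead.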
The hard part is defining "why". Machine learning methods can produce a model which fits the data very well, and you can easily prove that it fits the data, but understanding "why" is much harder.
There is a tool called Eureqa which was specifically designed to produce understandable models, in the form of mathematical equations. A biologist used it on some data from an experiment of his, and it produced a very simple equation that fit the data perfectly. But he couldn't publish it because he couldn't understand or explain why the equation worked or what it meant.
With a PGM you just look at the graph to see how each node is weighted.
That is one of the advantages of PGMs: they tell you why they think something. Combining this with domain experts is a killer advantage of PGMs. For the soundbite: PGMs help the domain expert figure out where to go next.
An interesting thing about probability is that it's essentially a "softened" version of logical reasoning, so maybe it's more fair to say that logic-based AI was generalized:
[1] http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html#explainaway