Probabilistic models. Recent research often focuses on Bayesian models.
Probabilistic models have never really gone away. This presentation by LeCun actually suggests embedding neural networks inside various types of probabilistic models, such as factor graphs and conditional random fields. This is, for example, how speech recognition works: the output of a neural network is fed into a probabilistic model (a hidden Markov model).
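To make that hybrid setup concrete, here is a minimal sketch (not LeCun's formulation; the number of states, the network, and the transition/prior numbers are all made up). A stand-in network emits per-frame posteriors, which are divided by the state prior to get scaled likelihoods and then decoded with Viterbi over a tiny HMM:

```python
import numpy as np

# Hypothetical setup: a tiny "acoustic model" network emits per-frame
# posteriors over 3 HMM states; the HMM supplies transitions and a prior.
rng = np.random.default_rng(0)
n_frames, n_feats, n_states = 20, 13, 3

# Stand-in for a trained network: one linear layer + softmax.
W = rng.normal(size=(n_feats, n_states))

def nn_posteriors(frames):
    logits = frames @ W
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# HMM pieces (made-up numbers): mostly self-looping transitions, uniform prior.
trans = np.array([[0.9, 0.1, 0.0],
                  [0.0, 0.9, 0.1],
                  [0.1, 0.0, 0.9]])
prior = np.full(n_states, 1.0 / n_states)

def viterbi(post):
    # "Hybrid" trick: divide posteriors by the state prior to get scaled likelihoods.
    loglik = np.log(post / prior + 1e-12)
    logtrans = np.log(trans + 1e-12)
    delta = np.zeros_like(loglik)
    back = np.zeros(loglik.shape, dtype=int)
    delta[0] = np.log(prior) + loglik[0]
    for t in range(1, len(loglik)):
        scores = delta[t - 1][:, None] + logtrans
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + loglik[t]
    # Backtrack the most likely state sequence.
    path = [int(delta[-1].argmax())]
    for t in range(len(loglik) - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

frames = rng.normal(size=(n_frames, n_feats))
print(viterbi(nn_posteriors(frames)))
```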
However, combining learned features with other systems is a very powerful approach, and I would say training SVMs on top of the learned features of a neural network is common. I personally am more interested in approaches like Deep Fried Convnets (http://arxiv.org/abs/1412.7149) that make kernel methods part of the neural network itself.
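A quick sketch of what "SVM on top of learned features" looks like in practice; extract_features here is a made-up stand-in for the penultimate layer of a trained convnet, and the data is random, so this only shows the plumbing:

```python
import numpy as np
from sklearn.svm import SVC

# Stand-in for a trained convnet's penultimate layer: any function mapping
# raw inputs to a fixed-length feature vector would slot in here.
rng = np.random.default_rng(0)
proj = rng.normal(size=(784, 64))

def extract_features(x):                # hypothetical feature extractor
    return np.maximum(x @ proj, 0)      # ReLU of a fixed random projection

# Toy "images" and labels, purely for illustration.
X_raw = rng.normal(size=(200, 784))
y = (X_raw[:, 0] > 0).astype(int)

# Train a kernel SVM on the (here: fake) learned features.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(extract_features(X_raw), y)
print(clf.score(extract_features(X_raw), y))
```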
Not to nitpick, but I just want people to realize there are recursive nets, which rely on a parser to build the tree structure (these are the nets trained with backpropagation through structure). Then there are recurrent nets (LSTMs, multimodal models), which rely on backpropagation through time.
From talking to some users of recursive nets, it sounds like they will be renaming them tree RNNs, which should help clear up the confusion a bit.
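To illustrate the distinction in code, here is a toy sketch (made-up weights, a hand-written binary parse, and no training): the tree RNN composes children bottom-up over a parse structure, while the recurrent net folds left to right over the sequence.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
W_rec = rng.normal(size=(2 * d, d)) * 0.1   # recursive (tree) composition weights
W_seq = rng.normal(size=(2 * d, d)) * 0.1   # recurrent (sequence) weights

def tree_rnn(node):
    # Recursive net: compose children bottom-up over a parse tree
    # (trained with backpropagation through structure).
    if isinstance(node, np.ndarray):         # leaf = word vector
        return node
    left, right = (tree_rnn(child) for child in node)
    return np.tanh(np.concatenate([left, right]) @ W_rec)

def recurrent_rnn(words):
    # Recurrent net: fold over the sequence left to right
    # (trained with backpropagation through time).
    h = np.zeros(d)
    for w in words:
        h = np.tanh(np.concatenate([h, w]) @ W_seq)
    return h

words = [rng.normal(size=d) for _ in range(4)]
parse = ((words[0], words[1]), (words[2], words[3]))  # a made-up binary parse
print(tree_rnn(parse))        # vector for the root of the tree
print(recurrent_rnn(words))   # final hidden state of the sequence
```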
I know that Andrew Ng and colleagues say that they don't use HMMs. I haven't spoken with them (I haven't seen them at speech conferences), so I do not know whether they actually believe this themselves.
I believe the best comparison between "CTC" (which is billed as recurrent neural networks without the HMMs) and the traditional approach is by people at Google: Sak et al., "Learning acoustic frame labeling for speech recognition with recurrent neural networks", ICASSP 2015. (I can't find a PDF online.)
In the case of vision, there were a lot of things going on, but support vector machines, sliding window search, descriptors with invariances to different transforms like SIFT and HOG, spatial pyramids, deformable parts models, and mixtures of Gaussians are all hot topics that you'll regularly see in papers from the early 2000s. There was a lot of work going on to improve the runtime of these techniques, since training SVM experts and evaluating anything over a sliding window search space is very expensive. You'll also see a lot of work on approaches rooted in graph theory, like conditional random fields, different flavors of Markov models, min-cut/max-flow based algorithms, and other types of probabilistic graphical models. Most of this stuff is still in widespread use; AI is a big field.
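If it helps, here is a toy sketch of the classic HOG descriptor + linear SVM + sliding-window pipeline from that era; the "images" are random noise, and the window size, stride, and C value are arbitrary choices, so this only shows the shape of the approach:

```python
import numpy as np
from skimage.feature import hog          # HOG descriptor (Dalal & Triggs style)
from sklearn.svm import LinearSVC

# Toy data standing in for cropped positive/negative training windows.
rng = np.random.default_rng(0)
pos = rng.normal(1.0, 1.0, size=(50, 64, 64))   # "object" windows
neg = rng.normal(0.0, 1.0, size=(50, 64, 64))   # background windows
windows = np.concatenate([pos, neg])
labels = np.array([1] * 50 + [0] * 50)

def describe(win):
    return hog(win, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

clf = LinearSVC(C=0.01)                  # C chosen arbitrarily for the sketch
clf.fit([describe(w) for w in windows], labels)

# Sliding-window search: score every 64x64 window of a larger image.
image = rng.normal(size=(128, 128))
step = 16
scores = []
for r in range(0, image.shape[0] - 64 + 1, step):
    for c in range(0, image.shape[1] - 64 + 1, step):
        score = clf.decision_function([describe(image[r:r + 64, c:c + 64])])[0]
        scores.append(((r, c), score))
print(max(scores, key=lambda s: s[1]))   # best-scoring window location
```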
I really like the book "Pattern Recognition and Machine Learning" by Christopher Bishop. It's packed full of all the latest-and-greatest algorithms.
WRT your question, an interesting "feature" of that book is that it was published just before deep neural networks started taking off, so there's no mention of DNNs in it. You can see what the world was like right before they took off.
I know symbolic AI was big in the '60s and '80s, but I'm not sure about the recent past.