Totally agree with your point on HOG + SVM. I think it has been made obsolete by convolutional neural networks.
I wrote a real-time human detection library [1] for a robotics project that used HOG + a simple neural net for classification. While it worked okay, I wasn't happy with the precision (around 90%), so I decided to try a simple convnet from Torch, doing the classification on depth images instead of HOG descriptors. The Torch version was slightly slower on a CPU, but both precision and recall jumped up drastically.
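For reference, a minimal sketch of how the two metrics are computed from detection counts. The function name and the counts are made up for illustration; they are not numbers from the project above.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts. Returns 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of detections that are real humans
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of real humans that were detected
    return precision, recall


# Hypothetical counts: 90 correct detections, 10 false alarms, 30 missed humans.
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f}")
```

A detector can hit 90% precision while still missing a lot of people, which is why it helps to look at both numbers.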
[1]: https://github.com/seemk/FastHumanDetection