
Stay away, in my opinion. I spent a year supporting an SVM in a production machine learning application, and it made me wish the ML research community hadn't been so in love with them for so long.

They're the perfect blend of theoretically elegant and practically impractical. Training scales as O(n^3), serialized models are heavyweight, prediction is slow. They're like Gaussian Processes, except warped and without any principled way of choosing the kernel function. Applying them to structured data (mix of categorical & continuous features, missing values) is difficult. The hyperparameters are non-intuitive and tuning them is a black art.

GBMs/Random Forests are a better default choice, and far more performant. Even simpler than that, linear models & generalized linear models are my go-to most of the time. And if you genuinely need the extra predictiveness, deep learning seems like better bang for your buck right now. Fast.ai is a good resource if that's interesting to you.




100% agree. What's the use case for SVMs?

Linear models are simpler. GBMs are more powerful, more flexible, and faster.

Every ML course I took had 3 weeks of problem sets on VC dimension and convex quadratic optimization in Lagrangian dual-space, while decision tree ensembles were lucky to get a mention. Meanwhile GBMs continue to win almost all the competitions where neural nets don't dominate.

I suspect my professors just preferred the nice theoretical motivation and fancy math.


SVMs are, by default, linear models. The decision boundary in the SVM problem is linear, and since it's the max-margin boundary we may enjoy nice generalization properties (as you probably know).

You probably also know that decision tree boundaries are non-linear and piecewise. It's not so straightforward to find splits on continuous features.

I.e. if the data is linearly separable, then why not? Even using hinge loss with NNs is not uncommon.
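
To make the "SVMs are linear models" point concrete, here's a rough sketch (sklearn and the toy dataset are my choices, not anything from the thread): a linear SVM is just a linear decision function w·x + b fit with hinge loss, which you can also train with plain SGD.

    # Sketch: a linear SVM is a linear decision function trained with hinge loss;
    # SGDClassifier(loss="hinge") fits exactly that.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    clf = SGDClassifier(loss="hinge", alpha=1e-4, random_state=0).fit(X, y)
    print(clf.coef_.shape, clf.intercept_)  # the whole model: a weight vector + bias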

You probably see GBMs winning a lot of competitions compared to SVMs because a lot of competitions have a lot of data and non-linear decision boundaries. Some problems don't have those characteristics.


The kernel function choice is simple: are you in a high-dimensional space? If so, choose a linear kernel. Otherwise, choose the most non-linear one you can (usually a Gaussian/RBF kernel). I suppose quadratic and the other kernels are useful if what you're modeling actually looks like that, but in practice that's rare.
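
Something like this, as a sketch (the dimensionality threshold is an arbitrary illustration, not a rule from the comment):

    # Pick the kernel per the heuristic above: linear if high-dimensional, RBF otherwise.
    from sklearn.svm import SVC

    def make_svm(n_features, high_dim_threshold=1000):
        kernel = "linear" if n_features >= high_dim_threshold else "rbf"
        return SVC(kernel=kernel)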

Prediction is not that slow with linear SVMs, especially not compared to something like k-NN. The main hyperparameters that matter are the "C" value and maybe class weights if you have recall or precision requirements. The C value is something that should be grid-searched, but you might as well be grid-searching everything that matters on every ML algorithm, and in this regard SVMs are fast to iterate over (because the C value is all that matters).
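
For example, a C grid search in sklearn might look like this (the grid values and toy data are illustrative assumptions):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=500, n_features=50, random_state=0)
    grid = GridSearchCV(
        LinearSVC(class_weight="balanced", max_iter=10000),  # class weights, per above
        param_grid={"C": [0.01, 0.1, 1, 10, 100]},           # the one knob that matters
        cv=5,
    )
    grid.fit(X, y)
    print(grid.best_params_)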

Applying categorical and continuous features is not difficult if you choose to do it in anything more sophisticated than sklearn. Also, pd.get_dummies() exists (though it may lead to that slow prediction you're concerned about).
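
A minimal sketch of the pd.get_dummies() route on mixed features (the columns here are made up):

    import pandas as pd

    df = pd.DataFrame({
        "color": ["red", "blue", "red", None],  # categorical, with a missing value
        "size":  [1.2, 3.4, 2.2, 0.7],          # continuous
    })
    X = pd.get_dummies(df, columns=["color"], dummy_na=True)
    # One indicator column per category (plus a NaN flag); wide one-hot matrices
    # are part of why kernel prediction can get slow.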

You're most likely right about GBMs or Random Forests, though they can have all sorts of issues with parallelism if you're not on the right kind of system. You talk about linear models, but SVMs usually use linear kernels anyway and are a generalization of linear models (including lasso and ridge regression models).


Agreed -- linear SVMs in text processing applications are the one area where they're a natural fit. All their attributes complement the domain, and linear SVMs also have desirable performance characteristics.

But at that point, they also have a lot in common with linear models. Those also seem practical in that domain (though I have less experience here, tbh), and performant when using SGD + feature hashing, e.g. vowpal wabbit.
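
A rough sklearn analogue of that SGD + feature hashing setup (vowpal wabbit itself is a separate tool; the texts and labels below are toy placeholders):

    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.linear_model import SGDClassifier

    texts = ["cheap pills now", "meeting at noon", "win a free prize"]
    labels = [1, 0, 1]  # toy spam labels

    X = HashingVectorizer(n_features=2**18).transform(texts)  # hashed sparse features
    clf = SGDClassifier(loss="hinge").fit(X, labels)          # linear model via SGD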

My beef with non-linear kernels and structured data is a longer discussion, but I find kernel methods for structured data (which is usually high-dimensional but low-rank -- lots of shared structure between features, and between the missingness of features) to be highly problematic.


> Prediction is not that slow with linear SVMs especially not compared to something like K-NN.

Provided your structural dimensionality is below about 10 (i.e. roughly 10 dominant eigenvalues for your features), k-NN can be O(log N) for prediction via a well-designed k-d tree.
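
In sklearn terms, that's just forcing the k-d tree index (whether it pays off depends on the intrinsic dimensionality point above; the data below is synthetic):

    from sklearn.datasets import make_classification
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=10000, n_features=8, random_state=0)
    knn = KNeighborsClassifier(n_neighbors=5, algorithm="kd_tree").fit(X, y)
    print(knn.predict(X[:3]))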

k-NN is also really simple to understand and to design features for. It also never really tends to throw up surprises, which is the kind of thing you want in production. Most importantly, the failures tend to 'make sense' to humans, so you stay out of the uncanny valley.


I'd agree on the training time, but your serialized model should be small on disk since only the support vectors are needed for inference. At least in my experience, that has been true.
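
A quick sketch of checking that (synthetic data; the counts depend entirely on your data and C):

    import pickle
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    clf = SVC(kernel="rbf", C=1.0).fit(X, y)
    print(clf.support_vectors_.shape)  # only these are kept for inference
    print(len(pickle.dumps(clf)))      # serialized size tracks the support vector count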


So you're saying to stay away from SVMs, rather than to stay away from this particular tutorial?


Sorry, I should've been clearer! Beginner to ML? Stay away from SVMs.

This tutorial looks good, and well written.


Thanks for saying that. Means a lot.




