So? Machine learning is not some buzzword fad. Regression is part of ML for sure.
Some people seem to think ML means a fancy deep-learning GPT-3 transformer running on a TPU farm. Actually, ML is a discipline that has existed for several decades and has substantial theoretical results too, including VC theory.
It is also not the same as statistics. They are adjacent but different fields.
Yeah, but no. It definitely is a buzzword that randomly gets attached to every method of data analysis these days. If you disagree, then please give me a definition that is distinct from statistics yet still makes regression an ML technique.
BTW, that doesn't mean there isn't also very substantial work done under the umbrella term.
Regression analysis (least squares) was first introduced by Legendre in the early 19th century. I never would have dreamed of calling it Machine Learning 10 years ago, when the term ML was still used somewhat more sparingly and carried a bit more meaning...
But "I will use ML to analyse the data" gets more funding than "I will run a regression on the data".
I won't type up a chapter about this now, but ML is about learning from data: it is concerned with training/validation/test splits, generalization error, etc.
Statisticians care more about "modeling": actually estimating parameters that stand in for something real, checking assumptions and hypotheses. The cultures are very different. What makes total sense to one may strike the other as lunacy (there's more convergence now, though, as people realize the complementary nature of the two approaches). The terminology is different too: in stats, logistic regression is called regression, while in ML it would have been called classification (but the name is now stuck). ML also tends to be more Bayesian than frequentist, as opposed to classical stats.
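To make the terminology point concrete, here's a minimal sketch (toy data, plain gradient descent, arbitrary learning rate) of why both names fit: the model is a regression on the log-odds, but the typical ML use thresholds the resulting probability into a class label:

```python
import numpy as np

# Toy 1-D data: class 1 tends to have larger x (illustrative only).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
y = np.concatenate([np.zeros(100), np.ones(100)])

# Fit w, b by gradient descent on the log-loss. This *is* regression --
# a linear model for the log-odds -- which is why stats calls it that.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))   # predicted P(y=1 | x)
    w -= 0.1 * np.mean((p - y) * x)          # gradient of mean log-loss
    b -= 0.1 * np.mean(p - y)

# The ML view: threshold the probability to get a classifier.
pred = (1.0 / (1.0 + np.exp(-(w * x + b))) > 0.5).astype(float)
accuracy = np.mean(pred == y)
```

Same fitted model either way; only the framing (continuous probability vs. discrete label) differs.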
I could write more but having taken ML courses before the big hype times, I can assure you ML doesn't just mean hyped up startup fluff vaporware.
Open up Kevin Murphy's ML book and compare the TOC with a Statistics textbook. There is overlap but it's certainly different in its goals and mentality.
It seems like some bitter bickering from the side of stats people that they didn't manage to hype up their field as much. Yeah they did have many of the same tools at their disposal but didn't use them in this way.
I am a physicist by training. I have neither an applied math/stats nor an ML/comp-sci background, but I am increasingly using these methods. So I am far from an expert, but I also am not coming at this from some start-up perspective. ML-the-buzzword is now penetrating all parts of STEM research and research funding. Some of the potential applications are exciting, too.
My conceptualization of the field is simply that ML is the design of universal function approximators to be used as part of statistical modeling/analysis. The key insight seems to be that a complex set of adapted network architectures, together with stochastic gradient descent, is unreasonably effective, and that the effectiveness is a very non-linear function of model size. As far as I can tell, not much is known about why this is the case. But there really isn't anything done with these models that isn't statistical inference.
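As a minimal sketch of that "function approximator + SGD" conceptualization (toy data, numpy only, hypothetical layer size and learning rate), here's a one-hidden-layer network trained by stochastic gradient descent to approximate an unknown function from samples:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, (256, 1))
y = np.sin(x)                       # the "unknown" function we want to approximate

# One hidden layer of tanh units: a small universal function approximator.
W1 = rng.normal(0, 1, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.1, (32, 1)); b2 = np.zeros(1)

lr = 0.05
for step in range(10000):
    idx = rng.integers(0, 256, 32)          # minibatch -> the "stochastic" in SGD
    xb, yb = x[idx], y[idx]
    h = np.tanh(xb @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - yb                          # dLoss/dpred for squared loss
    # Backpropagate and take one SGD step.
    gW2 = h.T @ err / len(xb); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)
    gW1 = xb.T @ dh / len(xb); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

mse = np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - y) ** 2)
```

Nothing here is more than curve fitting, which is the point: the interesting part is that the same recipe keeps working as the architecture and data scale up.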
You're setting up a strawman of statistics. You're describing a small part of a large field. Stats has been more than what you're describing for longer than ML has been a phrase.
The only real useful definition these days is that stats is learning from data that happens in the stats department and ML is learning from data that happens in the CS department.
I'm fine with that. I would also hope for more collaboration across departments, but the incentives in academia are unfortunately not set up the right way. People become protective of "their turf".
I think it's good if people are aware of the origins of the methods they use. Regression wasn't invented under the umbrella of ML, but is analyzed from a particular angle in ML more than in stats.
The generally accepted definition in the community is Tom Mitchell's:
> A computer program is said to learn from experience E with respect to some task T and performance measure P, if its performance at task T, as measured by P, improves with experience E.
Statistical estimation methods are one way to achieve this, but not the only way, especially when an exact function can be learned. For example, a typical layer-2 switch is a learning device: you don't program in the mapping from connected device MAC addresses to switch ports; the switch learns it by receiving a frame from a MAC on a port and recording the mapping. That is a very simple form of non-statistical machine learning.
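The switch example can be sketched in a few lines (hypothetical class and method names, not any real switch firmware):

```python
# A layer-2 switch "learns" which port each MAC address lives on:
# no statistics involved, just recording an exact mapping from observed frames.
class LearningSwitch:
    def __init__(self):
        self.mac_to_port = {}           # the accumulated "experience"

    def receive(self, src_mac, in_port):
        # Learning step: remember where this source address was last seen.
        self.mac_to_port[src_mac] = in_port

    def forward_port(self, dst_mac):
        # Known destination -> its learned port; unknown -> flood all ports.
        return self.mac_to_port.get(dst_mac, "FLOOD")

sw = LearningSwitch()
sw.receive("aa:aa", in_port=1)      # frame from aa:aa arrives on port 1
sw.receive("bb:bb", in_port=2)
```

In Mitchell's terms: task T is forwarding, performance P is how often frames avoid flooding, and experience E is the stream of observed frames.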
I'm not really sure how you can start here and then say regression is not a form of machine learning. "Regression" is a pretty broad class of techniques that just means using labeled examples to estimate a function with a continuous response, typically contrasted with "classification," where the response is discrete. The method you use to do the function approximation may or may not be statistical; a genetic algorithm is not, for instance. I'm not sure least squares, which is what Legendre invented, should really be considered statistical either: its original purpose was approximating the solution to an overdetermined system of equations, and mathematical statistics wasn't even formalized until later (though it certainly became a preferred method there). It didn't start being called "regression" until Galton published his paper, roughly 80 years later, showing that children of exceptionally tall or short parents tended to regress to the mean. But you're performing regression analysis whether you use the normal equations, LU decomposition, QR decomposition, a genetic algorithm, gradient descent, stochastic gradient descent, or stochastic gradient descent with dropout. It doesn't matter. As long as you're doing function approximation with a continuous response, it's still regression. Whether it can also be considered "machine" learning just depends on whether you're doing it by hand or via machine.
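To illustrate the "method doesn't matter" point, here's a sketch (toy data) where the normal equations, a QR decomposition, and plain gradient descent all recover the same least-squares coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(0, 10, 50)])  # intercept + slope
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(0, 0.5, 50)

# Normal equations: what Legendre's least squares amounts to.
beta_ne = np.linalg.solve(X.T @ X, X.T @ y)

# QR decomposition: solve R @ beta = Q^T y.
Q, R = np.linalg.qr(X)
beta_qr = np.linalg.solve(R, Q.T @ y)

# Gradient descent on the same squared loss.
beta_gd = np.zeros(2)
for _ in range(20000):
    beta_gd -= 0.01 * (X.T @ (X @ beta_gd - y)) / len(y)
```

Three solvers, one regression: the fitted function is the same object regardless of how you got there.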
Though sure, typically people tend to imagine the more exotic and newer techniques that scale to very large data sets and reduce overfitting and deal with noise automatically and involve hyperparameters, i.e. not least squares.
I agree ML is a buzzword, and that a sentence like that would sound silly, but I can elaborate. First, though, regression does not need a separate ML definition to be distinct from statistics... fields of mathematics overlap all the time.
Machine learning is the study of computer algorithms that improve automatically through experience and by the use of data.
This definition would include something as simple as linear regression.
Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data.
The purpose of a neural network is exactly the same as that of a line of best fit: you are just approximating an unknown function from input/output data. The only difference is that an NN can better approximate a nonlinear function.
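A minimal sketch of that equivalence (toy data, hypothetical learning rate): a single linear "neuron" trained by gradient descent converges to exactly the classical line of best fit.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 5, 100)
y = 3.0 * x + 1.0 + rng.normal(0, 0.3, 100)

# A "network" with one linear neuron, trained by gradient descent
# on squared error...
w, b = 0.0, 0.0
for _ in range(10000):
    err = w * x + b - y
    w -= 0.01 * np.mean(err * x)
    b -= 0.01 * np.mean(err)

# ...recovers the same coefficients as the closed-form line of best fit.
slope, intercept = np.polyfit(x, y, 1)
```

Add a hidden layer with a nonlinearity and the same training loop can bend the line; the purpose never changes, only the expressiveness.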
> ML is a discipline that has existed for several decades
If you're going to say regression is part of ML, you can't say ML has only existed for decades. If you define it with that expansive scope, it's existed for centuries.
ML uses regression in a particular way, with particular aims, and has developed prior approaches like OLS, ridge regression, lasso, kriging, etc. into a particular theoretical framework, analyzing them from the angle of learning, i.e. with a heavy focus on generalization from training data to test data.
The truth is, these are tools that many communities have used in parallel and these communities are in constant flux regarding what promising directions they find and exploit and when they make a big splash in one place, others take notice and incorporate those ideas etc. There are no rigidly predefined disciplines and fields over long timescales.
When electrical engineers and signal processing communities worked on similar things they called it "pattern recognition".
Machine learning as a field started out with more ambition than mere regression and classification (and indeed covers a lot more ground), but it turns out that supervised learning has had the most practical success. But it's a research program and community that goes beyond that.
Similarly, there are parallels and similar equations between control theory and reinforcement learning. And indeed some controllers can be expressed as reinforcement learning agents. But the aims of the two communities are not the same.
Maybe people would be happier if "statistical learning" (which is also used) were used more instead of "machine learning"? But, as another comment points out, ML as a paradigm does not necessarily require learning of a statistical kind.
Labels wax and wane in popularity; it doesn't mean the same thing is being repackaged, rather that the aims and focus change depending on what we find most productive and fruitful.
For example many of these things were also called "soft computing" a few years ago, but that term is rarely seen nowadays.
This sounds like in the end you agree that a lot of ML is applied stats?
The problem in my mind is not that ML is using a lot of stats (obviously), it's that foundational mathematical concepts get labelled as ML techniques. This is why the title of the post is so annoying. This totally obscures the structure of the field. E.g. I wouldn't call linear algebra a quantum mechanics technique. I would say that QM uses (and spurred the development of) a lot of LinAlg.
Well, if ML people didn't use the word "regression" and named their use of it differently that would also upset stats people.
The point is, when you listen to an ML person introduce regression in a lecture, it will look and feel different from when a stats person does it. ML-type regression is part of ML; the stats treatment doesn't cut it for ML purposes. They care about different aspects: stats fleshes out material that's not very relevant for ML and ignores parts that matter more for ML.