Using machine learning to predict basketball scores (sigopt.com)
51 points by prasoon2211 on Jan 1, 2016 | 21 comments



The well-known bettors will place bets knowing others will emulate them, moving the odds in favor of the real bet they have yet to place. I wouldn't be surprised if some of the big Vegas books are in on it, since they make their money long term on the vig and don't have to worry about short-term losses (thus enjoying any increase in bets those schemes generate).

There is an older 60 Minutes episode on Billy Walters that is a good watch for anyone interested in people trying to beat sports books.


Vegas always gets their cut, as you point out. You need to reliably win more than about 52.4% of the time (110/210) to beat the bet-$110-to-win-$100 edge.
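For anyone who wants to check the arithmetic: the break-even rate for that pricing is risk / (risk + win) = 110/210, or roughly 52.4%. A minimal sketch, assuming flat bets at the same price every time; the function name is just illustrative and is not from the post or the repo.

```python
def break_even_win_rate(risk: float, win: float) -> float:
    """Fraction of bets that must win for a flat bettor to break even.

    Expected profit per bet is p * win - (1 - p) * risk.
    Setting that to zero and solving for p gives risk / (risk + win).
    """
    return risk / (risk + win)


if __name__ == "__main__":
    rate = break_even_win_rate(risk=110, win=100)
    print(f"Break-even win rate at 110-to-win-100: {rate:.2%}")
```

Anything above that rate is profit before variance; anything below it loses to the vig over time.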

In general, betting schemes tend to stop working in the long term, but can have large gains in the short term [1]. We created this example to encourage people to think about model tuning in the applications where they are domain experts and how it could benefit them (and hopefully the world in return).

[1]: http://www.wsj.com/articles/a-fantasy-sports-wizards-winning...


Kind of, sort of. Vegas books are set up to make consistent profit while mitigating as much risk as possible. Which means that, although it's not quite like the zero-sum markets that exist in securities, you still more or less need the scales to balance out.

What you can capitalize on is human emotion: e.g., the Dallas Cowboys are notorious for being bet on heavily, even when statistically they're not going to win, because of sentiment or what-not. Let's say statistically they're underdogs by 10.5 pts. Let's not talk about covering the spread or parlays or anything complicated; just pretend the market is limited to solely 'bet to win'. If it were the Bengals as the underdog at the same 10.5 against the Raiders (just teams where the fan base doesn't skew the book), you might see -170 to win 100, and conversely if the Raiders win, it might be +140 to win 100. (I haven't bet on sports in years, so I'm not sure how the actual numbers work out.) Either way, because so many people will be betting on the Cowboys as underdogs even though they're predicted to lose by more than 10.5, the line will only be at -120 due to the volume of bets being taken. Vegas has to move the line to cover the contingency that the Cowboys actually do win, since paying out 170 on those upsets could bankrupt them.

Being successful at fantasy sports is probably a lot easier than being successful at trading crude oil, because John and his buddies will all throw down 40 bucks each for the Cowboys to win even though the odds are heavily skewed against them. And unlike in Vegas, you can consistently be profitable and not get black-listed. (Counting cards should be legal if you are not colluding with anyone, but I digress).

Interestingly enough, 'betting schemes' might not work for the average player, but there's a reason why the same 10 players make it to the final table of the World Series of Poker. In fact, that's how those daily betting fantasy poker gaming sites stay in business. It doesn't violate the unlawful gaming act because games like poker[1] and fantasy football clearly have consistent winners and losers. Even though those attributes often can't be quantified via any sort of metric, that's enough for judges to rule that certain forms of gambling are games of skill.

[1] http://www.npr.org/2012/08/22/159833145/judge-rules-poker-is... I've seen the same ruling at the highest circuit of tons of states, and I'm guessing it must be federally recognized as the fantasy daily line-up games are making literally billions off it now.


Post author and co-founder of SigOpt (YC W15) here. Thanks for all the great questions and comments, I'll be around all day answering questions. Feel free to ask anything about the post or what we do at SigOpt (or how).

All of the code can be found here: https://github.com/sigopt/sigopt-examples

More about how SigOpt works here: https://sigopt.com/research

Other discussions here: https://www.reddit.com/r/MachineLearning/comments/3yy0vp/usi...

and here: https://news.ycombinator.com/item?id=10819170


Vegas odds aren't even the most accurate odds to beat. I.e., you can already beat Vegas odds by looking at the odds at Pinnacle, 5 Dimes, and bet 188 and treating them as the true odds. It's called arbitrage betting. If you have a model that can truly beat the market, then it is the books I mentioned that you want to beat.


Arbitrage betting is using differences in money line payoffs to place two or more bets that cannot lose regardless of the result of the sporting event. These opportunities do exist; however, sports books absolutely hate it, and some of the offshore ones will either ban you from playing or in some cases simply refuse to pay off your bet.
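To make that definition concrete, here is a rough sketch of the check for a two-outcome event: convert each money line to decimal odds, add up the implied probabilities, and if the total is under 1 there is a stake split that pays the same whichever side wins. The function names and the example lines are made up for illustration, and this ignores betting limits, fees, and the risk of a book voiding the bet.

```python
def american_to_decimal(odds: int) -> float:
    """Convert American money line odds to decimal odds (total return per unit staked)."""
    if odds > 0:
        return 1 + odds / 100
    return 1 + 100 / abs(odds)


def two_way_arbitrage(odds_a: int, odds_b: int, bankroll: float = 100.0):
    """Return (stake_a, stake_b, guaranteed_profit), or None if there is no arbitrage.

    odds_a is the best available price on one side and odds_b the best price
    on the other side, typically taken from two different books.
    """
    dec_a = american_to_decimal(odds_a)
    dec_b = american_to_decimal(odds_b)
    implied_total = 1 / dec_a + 1 / dec_b
    if implied_total >= 1:
        return None  # the combined prices still include a margin for the books
    # Stake each side in proportion to its implied probability so both payouts match.
    stake_a = bankroll * (1 / dec_a) / implied_total
    stake_b = bankroll * (1 / dec_b) / implied_total
    profit = bankroll / implied_total - bankroll
    return stake_a, stake_b, profit


if __name__ == "__main__":
    print(two_way_arbitrage(-110, -110))  # standard pricing on both sides: no arbitrage
    print(two_way_arbitrage(+120, -105))  # hypothetical mismatch between two books
```

With standard -110 pricing on both sides the implied probabilities sum to more than 1, so the function returns None; only a mismatch between books pushes the sum below 1.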


You can do it with single bets too. If bookie X offers you odds x, bookie Y offers you odds y, x > y, and you are sure bookie Y is the sharper oddsmaker of the two, then betting at odds x is effectively a free bet that will generate profit in the long run.
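A small sketch of that single-bet idea: treat the sharper book's two-way line as the "true" odds by stripping out its margin, then compute the expected value of taking the better price at the other book. The proportional normalization used here is only one of several ways to remove the vig, and the names and example lines are assumptions for illustration.

```python
def american_to_decimal(odds: int) -> float:
    """Convert American money line odds to decimal odds (total return per unit staked)."""
    if odds > 0:
        return 1 + odds / 100
    return 1 + 100 / abs(odds)


def no_vig_probability(odds_side: int, odds_other: int) -> float:
    """Estimate the true probability of one side by normalizing a two-way line.

    Assumes the book's margin is spread proportionally across both sides.
    """
    p_side = 1 / american_to_decimal(odds_side)
    p_other = 1 / american_to_decimal(odds_other)
    return p_side / (p_side + p_other)


def expected_value(true_prob: float, offered_odds: int, stake: float = 100.0) -> float:
    """Expected profit of one bet at offered_odds, given an assumed true win probability."""
    dec = american_to_decimal(offered_odds)
    return true_prob * stake * (dec - 1) - (1 - true_prob) * stake


if __name__ == "__main__":
    # Hypothetical lines: sharp book Y has the side at -120 / +110,
    # while book X still offers the same side at +100.
    p = no_vig_probability(-120, +110)
    print(f"Estimated true probability: {p:.3f}")
    print(f"Expected profit per $100 bet at +100: {expected_value(p, +100):.2f}")
```

The bet only has positive expectation if book Y's line really is closer to the truth, which is the big assumption in the comment above.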


In college I did extensive research on this across Bodog, 5dimes, and a third vendor considered 'reliable'. I'd scrape the sites hourly, because lines move like any market will, and I did this for all 17 weeks. I don't think there was one instance where there was an arbitrage opportunity (my algorithm would go 4 game-pairs deep, so it wasn't an exhaustive analysis, but it was rigorous enough). Even if there's a large disparity between bookies, the fact that the industry accepts -115/100 as fair means a bookie would have to misprice a line hugely for you to arbitrage successfully.
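To put a number on "huge mispricing": reading -115/100 as risking 115 to win 100 on either side (my interpretation, not something spelled out above), this sketch computes the book's built-in margin and the price a second book would have to offer on the opposite side before an arbitrage even breaks even. Function names are illustrative.

```python
def american_to_decimal(odds: int) -> float:
    """Convert American money line odds to decimal odds (total return per unit staked)."""
    if odds > 0:
        return 1 + odds / 100
    return 1 + 100 / abs(odds)


def decimal_to_american(dec: float) -> float:
    """Convert decimal odds back to an American money line."""
    if dec >= 2:
        return (dec - 1) * 100
    return -100 / (dec - 1)


def overround(odds_a: int, odds_b: int) -> float:
    """Book margin on a two-way line: sum of implied probabilities minus 1."""
    return 1 / american_to_decimal(odds_a) + 1 / american_to_decimal(odds_b) - 1


def min_opposing_decimal_odds(odds: int) -> float:
    """Decimal price on the other side at which a two-way arbitrage exactly breaks even."""
    implied = 1 / american_to_decimal(odds)
    return 1 / (1 - implied)


if __name__ == "__main__":
    print(f"Margin at -115 on both sides: {overround(-115, -115):.2%}")
    needed = min_opposing_decimal_odds(-115)
    print(f"Other side must beat decimal {needed:.2f} "
          f"(about {decimal_to_american(needed):+.0f})")
```

So if one book has a side at -115, another book has to be offering better than +115 on the opposite side, when the going rate for it is also -115. That gap is why hourly scrapes rarely turn anything up.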


I'd wager it is harder for sports such as baseball and hockey. But opportunities do happen, at least in football. IME they are ephemeral, though, so you have to be very quick to catch them. Updating once per hour isn't nearly fast enough. For example, if news gets out that a player is injured and will miss a game, then the odds will shift, but some bookies will be slower to react than others. That can give you a window to arbitrage for 20 minutes or so. You can do it a few times, then the bookies ban you and void all your bets.


Great points. Stay tuned for a post sometime soon where we do something similar with Wall Street, where you can build hedging and adversarial trades into the model itself.


Baseball feels like a next step up. About 2,400 games a year. More games, more data points... more everything.


More very old, very dangerous sharks swimming in the same pool. Betting to win long term is not something most people are mathematically or emotionally equipped to do and a toy demonstration showing a handful of results means very little.


Great points. This was just a cool demonstration of the underlying technology hoping to inspire people to use it on their models in domains other than prop betting. Potentially it could help those sharks already swimming in these pools, but our goal is to help every expert in every field make better models.


Great idea! You could reuse most of the code [1] to give this a shot. In future posts we may extend this to other sports as well.

[1]: https://github.com/sigopt/sigopt-examples


It's not clear from the article what the input data is. I had a look in the repo, it wasn't clear from that either. Is it the normal fare you'd expect?

I've heard of prop betting firms using stuff like "distance travelled by the away team" and other logical things like that.


We give a high level overview in footnote #4 of the post, but more detail can be found in the code [1]. We tried to pick a relatively small set of features (and kept those picks constant) in order to isolate the model tuning gains from pure feature selection. You can fork the code and try any other interesting features you think would make it better (there is a ton of data we didn't have the chance to look at).

[1]: https://github.com/sigopt/sigopt-examples/blob/master/sigopt...


I love this, but is anyone else a bit put off by someone appending PhD in their name _on a blog post?_

I'm not salty or anything, and it wouldn't have influenced me into reading the article or not. It just puts a bad taste in my mouth.


There is a TED-talk on this: https://www.youtube.com/watch?v=66ko_cWSHBU (The Math Behind Basketball's Wildest Moves)


If you can reliably beat Vegas, why not put real money into it and make lots of money?


They say it would be illegal and unethical for them: https://www.reddit.com/r/MachineLearning/comments/3yy0vp/usi...


We're also really passionate about building optimization platforms vs betting schemes. We believe in the long term we can have a better positive impact on the world by helping every expert cut the trial and error out of their workflow (like the time consuming and expensive step of parameter tuning machine learning models).



