Consider doing this "Moneyball" experiment using Football Manager [1] games. Those games are made for soccer/football management simulation. And your "Moneyball" kind of playthrough is called LLM (Lower League Management) in the FM community.
This is the ultimate simulation game. This is the one game that taught me so much about simulations and my current job and interest in simulators.
The number of ways I could affect a match was insane. It was great to understand the game and second guess how to beat the opponent (which meant studying the opponent's form, tactics, key players, injuries besides your own).
I created Excel sheets of the players I scouted and averaged their key attributes (like Pace, Dribbling, Heading, Finishing). These averages drove my player purchase decisions.
With FM2007, I think I took Hull City from being a scrappy team to a team of Real Madrid's caliber within 5 years and then maintained that till season 2018. I could've gone further but memory leaks caused the game to go slow.
I spent years of my childhood playing this. Could I have learned something more productive in that same time period? Yes.
Do I regret studying this game inside out? No way.
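For anyone who wants to try the spreadsheet approach mentioned above without a spreadsheet, here is a minimal sketch in Python. The attribute names are the ones listed in the comment; the players and their numbers are made up for illustration, and a plain unweighted mean is only an assumption about how the averages were taken.

```python
from statistics import mean

# Attribute names come from the comment above; the players and
# their values are hypothetical, purely for illustration.
KEY_ATTRIBUTES = ("pace", "dribbling", "heading", "finishing")

scouted = [
    {"name": "Player A", "pace": 16, "dribbling": 14, "heading": 9, "finishing": 15},
    {"name": "Player B", "pace": 12, "dribbling": 11, "heading": 15, "finishing": 13},
    {"name": "Player C", "pace": 18, "dribbling": 17, "heading": 8, "finishing": 12},
]


def key_attribute_average(player, attributes=KEY_ATTRIBUTES):
    """Plain (unweighted) mean of the chosen attributes for one player."""
    return mean(player[a] for a in attributes)


def rank_by_average(players, attributes=KEY_ATTRIBUTES):
    """Return the players sorted best-first by their key-attribute average."""
    return sorted(
        players,
        key=lambda p: key_attribute_average(p, attributes),
        reverse=True,
    )


for p in rank_by_average(scouted):
    print(p["name"], key_attribute_average(p))
```

Passing a different `attributes` tuple (say, only heading and finishing for a target man) changes the shortlist without touching the data.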
I can totally relate to the impact of FM on my life. I'm CTO of Workshape.io and some of our UI elements were inspired by the interface that was present in some of the older versions of Football Manager and Championship Manager. The concept of the radial plot for comparing players' skills was an idea we applied to our matching service.
To echo the parent poster's comment: it would be very cool to extend this blog into a series and see an analysis like this applied to the most recent version of Football Manager.
In my teens, I used to play the original 1980s Football Manager [0] on my BBC Micro for hours, in fact days on end... and I can't stand football :)
I know it's not the same game, and was never as sophisticated as modern FM, but damn it was addictive, even for a non-footy fan. I might have to take a look at modern era FM.
I think this is a good concept, but like all games, it just succumbs to exploiting imbalances. In Football Manager, it's just finding all the Uber Wunderkids who can easily be bought for less than the lowest-rated players in their prime. Then you just invest all of your money in coaching/facilities, which can also be bought for significantly less than adult players, and in a few years, once the players hit 18-19 or so, you have a world-class team that only continues to improve as they reach 25-26. Give it 5-6 years and your team is like a team of Ronaldos and Messis at every position.
Presumably the unrealistic part is how well it is possible to predict who will excel to a world-class level.
You take base athleticism and technical ability and you can clearly produce national league quality players very predictably and consistently. One would assume it is more difficult to judge future world-class potential and even more so the likelihood that potential will be lived up to.
Some of them sort of do or are trying to do this already.
Arsenal actually bought StatDNA, a football data analytics company, a couple of years ago. So time will tell how they will use this to their advantage.
In real life, "finding all the Uber Wunderkids who can easily be bought" is actually extremely hard, as there are too many external factors that could determine if a wunderkid becomes a Ronaldo, as demonstrated by several talented athletes that achieved nothing.
Plus, a lot of those factors are hard for an algorithm to "handle", e.g. luck in finding/identifying a wunderkid, getting him motivated and committed, him having enough maturity to work on improving and the psychological resilience to withstand what comes with the job, etc etc...
What are the odds any football team would produce a single Ronaldo or Messi from a youth prospect? What are the odds they would produce a team full of them at every position?
Or "Out Of The Park Baseball," which is a management game that is about to release its new version, which includes over 100 years of minor league rosters. Now that's Moneyball-friendly data.
Came here to check if someone mentioned Football Manager already.
FIFA is basically child's play compared to the depth of Football Manager, where your transfers actually mean something and are in fact hard to pull off - even if you have the budget, often there's no way of attracting recognizable names or any talent at all without a certain level of reputation. I'd consider it a true challenge.
Weird aside: even though it's fictional, I'm pleased to see young Ryan Gauld (who went to my school) in his Moneyball'd team. It's rare for Scottish players to play on the continent - he's currently making his way through the ranks at Sporting Lisbon. It feels like our brightest prospects either settle down at Celtic/Rangers (big fish in a small pond, less so nowadays since Rangers' relegation), languish in mid-table Premiership teams or switch nationality and play for Ireland!
Sporting Lisbon have a good school for young players, but it's also very competitive. Probably next season he will be playing for some small/medium Portuguese club to gain more experience.
In case anyone is wondering why the goal was to "win the Champions League with Accrington Stanley" in particular, it's because they've been a byword for being complete nobodies since this milk advert came out in 1989:
The kids are fans of Liverpool. Liverpool's star player drinks milk and says that if you don't drink it you'll only be good enough to play for Accrington Stanley. The kids then want to drink milk because they want to play for a better team.
Does anyone have a recommendation on an online course or tutorial that walks you through doing this type of applied analysis in R? I prefer learning using actual problems, where the complexity of analysis increases gradually.
The Analytics Edge[1] on edX might be what you're looking for. In one of the lessons, they do some rudimentary recreations of the analysis described in Moneyball.
Another course is Sabermetrics 101 https://courses.edx.org/courses/course-v1%3ABUx%2BSABR101x%2.... It's obviously focused on baseball and you have to understand baseball to get much out of it, but many of the lessons on how to map performance statistics to actual game results can probably be applied, at least conceptually, to other sports.
Interesting that EA distributes importance of skill differently for right and left sided players (eg, passing 57% for right mids, and 41% for left mids, while dribbling is more important for left mids than right mids (54% v 38%)).
Might reflect impact of actual players (Robben being a right-sided left footed impact winger).
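To make the left/right asymmetry concrete, here is a rough sketch in Python. The four percentages are the ones quoted above; treating them as weights in a simple linear score is my assumption, not something the article or EA confirms, and the two example players are invented. Other attributes are left out entirely.

```python
# The four percentages are the ones quoted in the comment above.
# Using them as linear weights is an assumption for illustration only.
POSITION_WEIGHTS = {
    "RM": {"passing": 0.57, "dribbling": 0.38},
    "LM": {"passing": 0.41, "dribbling": 0.54},
}


def position_score(attributes, position):
    """Weighted sum of a player's attributes using the position's weights."""
    weights = POSITION_WEIGHTS[position]
    return sum(weight * attributes[name] for name, weight in weights.items())


# Two hypothetical wingers with mirrored strengths.
passer = {"passing": 85, "dribbling": 70}
dribbler = {"passing": 70, "dribbling": 85}

for position in ("RM", "LM"):
    print(
        position,
        "passer:", round(position_score(passer, position), 2),
        "dribbler:", round(position_score(dribbler, position), 2),
    )
```

Under these assumed weights the same two players swap order depending on the flank: the passer rates higher at right mid and the dribbler rates higher at left mid.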
I often played FIFA on my phone, and the trick was just to hire the best possible scout and then constantly send him out, and your team would be unstoppable pretty quickly. Scouts found outrageously good players with unrealistic frequency.
This is a great experiment for computer vs computer simulations. When I played FIFA, I tried to buy cheap players with attributes that worked well with my playing abilities. For me, at least until FIFA14 (I did not try beyond that), a player's speed had the biggest net positive effect on my scoring ability, and I used to buy the fastest cheap players instead of really skilled but slow ones[1].
Awesome write up, but one thing I'm wondering, since I don't play FIFA, does the game heavily weight victory in favor of the player? I have to imagine the game would do that in order to make players happy, and if they do it would confuse the numbers here.
As far as my 10 years of experience with FIFA are concerned, no, it does not.
The dynamics are very weird and sometimes downright stupid. There is a concept of home and away matches in the league and even with a great team beating everyone easily at home, I have found myself losing consistently away from home to even relegation contenders.
I hence stopped simulating away games, as FIFA seems to weight the opponent's home advantage much more heavily than form, team strength and tactics.
Never seen it taken this far, but I do the same thing in Pro Evolution. I've had a few players with bad ratings that suit my gameplay, players with unused skills (fast defenders who can shoot), etc. My biggest gripe with any of these games is that they don't do more with their own data in terms of immersion. I'd trade the recorded commentary for generated text fed through a TTS engine.
There's a bunch of interesting tactics I've heard of people employing in the FIFA economy, including cornering certain low value cards as well as "investing" in players like Eric Dier who are likely to be upgraded in the next update so that they can sell an overrated silver who is now only attainable as a midlevel gold.
Their scouts aren't that great and their approach wasn't the best either. Trying to replace a player like Luis Suarez, who single-handedly carried the team, with a player like Daniel Sturridge doesn't really work, unfortunately.
As interesting as it is to run a few regressions... in all honesty, it's pretty easy to take the shittiest team in the game to win the CL in a few seasons. It's pretty much my standard playthrough every year, when losing 0 games in a few seasons in a row on the top difficulty gets boring.
Multiplayer is where the challenge is at. The AI is tons of fun, but if you've played football games for a while it's not too hard.
Ah I didn't even catch that, thanks. Makes it a lot more interesting, although again these games are geared towards winning, like most games are. If you simulate every pokemon matchup and let them take random actions, eventually your pokemon level up such that they can beat anyone. There's no real way to lose; your pokemon can only get better.
FIFA is much like that: buy any young player who shows potential (for which you really don't need to run these regressions), let them play, and they become amazing a lot of the time, even completely unknown players. The game's built like that, you don't need to do anything fancy.
I'd be most interested if he ran a control group experiment, i.e. just pick players himself instead of letting the model pick em, or hell even pick random players, and compare how much better he does with his model. He'll surely do better with the model, but he'll also surely win the CL with the control group in a few seasons, which in and of itself is not much of an achievement.
As others have mentioned, doing this in FM would be a lot more fun.
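The control-group idea above could be sketched roughly like this in Python. Everything here is synthetic: the player pool, the "true quality" numbers and the size of the model's error are all assumptions, and average squad quality is only a stand-in for what you'd really want to compare (results over several simulated seasons).

```python
import random
from statistics import mean


def make_pool(rng, size=200):
    """Synthetic players: a hidden true quality plus a noisy model estimate."""
    pool = []
    for i in range(size):
        true_quality = rng.uniform(40, 90)
        # The model sees the truth plus some error; the error size is assumed.
        estimate = true_quality + rng.gauss(0, 5)
        pool.append({"id": i, "true": true_quality, "estimate": estimate})
    return pool


def model_squad(pool, squad_size=11):
    """Treatment group: take the players the model rates highest."""
    return sorted(pool, key=lambda p: p["estimate"], reverse=True)[:squad_size]


def random_squad(pool, rng, squad_size=11):
    """Control group: pick players uniformly at random."""
    return rng.sample(pool, squad_size)


def squad_quality(squad):
    """Average hidden true quality of a squad."""
    return mean(p["true"] for p in squad)


rng = random.Random(0)  # fixed seed so the run is repeatable
pool = make_pool(rng)

model_quality = squad_quality(model_squad(pool))
# Average over many random squads so one lucky draw doesn't mislead.
control_quality = mean(squad_quality(random_squad(pool, rng)) for _ in range(1000))

print(f"model: {model_quality:.1f}  random control: {control_quality:.1f}")
```

The interesting number is the gap between the two, not the model's score on its own. If a hand-picked or random squad also wins the CL in a few seasons, the gap is what tells you how much the model actually added.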
[1]: https://en.wikipedia.org/wiki/Football_Manager