PageRank Algorithm Reveals Soccer Teams' Strategies (technologyreview.com)
132 points by Anon84 on July 3, 2012 | 32 comments



Yeah, I'm not sold on this one. This sounds like it's trying to make a story up after the fact. Of course Xavi gets the ball the most, he plays the point of a diamond midfield, or the "trequartista" role. Since the point of soccer is to move forward, he is in the position most likely to touch the ball the most (middle of the pitch). The left back (Jordi Alba in this year's Euro) won't touch the ball too much ... because he plays left back.

Similarly, for Italy, I don't need any graph theory to know that Pirlo gets the ball the most because (A) he's their best player and (B) because he plays in a deep position in the midfield so it's easy for him to get the ball from the goalkeeper, defenders, or other midfielders.

Long story short - teams set themselves up so their best players get the ball the most. You don't need to be a mathemagician to dig that nugget up.


Actually, in the final Pirlo did not have the ball the most for Italy because he was closely marked by Xavi. A central defender, Barzagli, got more touches than him. It would be interesting to see the graph of that match.

Also your final assertion that teams set themselves up so their best players get the ball the most seems a bit off to me. Consider FC Barcelona. Lionel Messi is arguably their best player but you would be hard pressed to find a game in which he has the ball the most since he plays high up on the pitch as an attacker.

I agree with you that single-match analysis using this technique doesn't really produce any new insights. I think this analysis would be very interesting when comparing different games, however: for example, comparing Italy's match against England, where Pirlo was effectively given free rein of the pitch, to their match against Spain, where he was shut down for most of the match. I would love to get my hands on some raw data from Optasports!


On a tangent... I'd love to see a heat map of Messi's movement for Barcelona (and Argentina). Whenever I watch him he's all over the place. And to my eyes, he's most dangerous when picking the ball up either from the midfield line or on the touchline halfway up the field. In both cases, when he runs at pace with the ball he's diabolically good. Any data/visualization to track that would be fantastic to see, especially filtering only on those instances where he causes real damage to the opponent (e.g. a shot on goal for him or a teammate)...


For any game from a major league, you can go to the soccernet game summary and click on GameCast to see a heat map for any player. Here's Messi in the Barcelona 3 - 1 Real Madrid from December: http://i.imgur.com/XEvpW.jpg

(From http://soccernet.espn.go.com/gamecast?id=323887&cc=5901 )



A trequartista is someone between an attacking midfielder and a second striker. Xavi's more of a central midfielder. He certainly wasn't at the tip of the midfield diamond--out of Spain's normal midfield six, Xavi plays ahead of Busquets and Xabi Alonso, but behind Iniesta, Fabregas, and Silva.

If anyone, the closest Spain have to a trequartista is Fabregas, though he was really more of a false nine. The idea of a false nine is that you play an attacking midfielder or second striker, but there's no striker in front of him. This pulls the defenders out of position so you can get around them.

You might say that Iniesta, Fabregas, and Silva were listed as forwards. Actually, Spain didn't usually play any forwards. It might have been listed as a 4-3-3, but it was more of a 4-6-0 with three attacking midfielders. There are two reasons for this: one is that David Villa is injured and Fernando Torres has had shaky form for the past two years, and the other is that Spain has plenty of great attacking midfielders. Individually, David Silva, Andres Iniesta, Cesc Fabregas, and Juan Mata are all major stars and are (or were in the case of Cesc Fabregas and Arsenal) the main creative force for their respective clubs.


I would be skeptical about the usefulness of this data to soccer. It can answer questions like "how many passes" or how connected a player is, but it doesn't tell you how to play like Spain or how to beat Spain. It just confirms what you can tell by watching Spain on TV: they pass a lot, and they're really skillful. The number of completed passes, distance run, assists, shots on/off target, goals, etc. are already heavily tracked (http://www.optasports.com/sports/football.html).

As an aside, the prevalence of the tiki-taka philosophy is largely due to the legendary Barcelona youth training camp - La Masia (http://en.wikipedia.org/wiki/La_Masia). The whole way of thinking about soccer differs massively in the British Isles compared to continental Europe - in England and Ireland it's the outdated 'kick-and-rush' style football (aka "hoof it up to the big striker") versus the more patient possession-focused, passing game in continental Europe.


It's also a matter of a team making the best of its constraints. The best players in Spain are diminutive and not as physically strong as those of other teams. They can't compete with the physical game style of England or the stronger players from Germany (who, by the way, are also technically gifted, thanks to a total overhaul of their footballing structure which started 10 years ago [1]). Hence the tiki-taka philosophy of football, which deprives the other team of possession. On average this year at the Euros, Spain had possession of the ball about 60% of the time in all their games (except the final). They conceded just 1 goal in the entire tournament.

[1]http://www.guardian.co.uk/football/blog/2010/jul/02/world-cu...


>"..in England and Ireland it's the outdated 'kick-and rush' style football (aka "hoof it up to the big striker") versus the more patient possession-focused, passing game in continental Europe. "

This is the very reason England fails to perform in the big competitions: lack of creativity.

> "...but it doesn't tell you how to play like Spain or how to beat Spain."

Exactly. This data might provide some statistics on the formation and player agility, but a team's performance is largely dependent on the gameplay and the strategy of the manager.

I find this site: http://www.zonalmarking.net/ to be much more useful when it comes to tactical analysis.


It was going to be interesting to see how long it took in this thread before someone mentioned Zonal Marking.

It is interesting that the analysis doesn't mention Barcelona. That might be because reactive, kick-long football has beaten Barca twice in the Champions League in the past 4 years. Mind you, Barca are one of the best club teams of the past 30 years at least.

Inter Milan and Chelsea had dramatically less of the possession and passing than the tiki-taka of Barca, but both managed to get over the top of Barca.

A team's performance is also dependent on the personnel. The Barca core is also the core of Spain, and they are phenomenal. Who knows when another midfield like Xavi, Busquets and Iniesta (and Messi for Barca) will appear.

Or as someone said about Italy vs Spain, Spain won because they had 3 players at least as good as Pirlo.


Zonal Marking is a fantastic site for tactically minded football fans. Highly recommended. Not overly surprised that it's referenced 2 or 3 times on HN. Switzerland is the closest example of Chelsea/Inter in the case of Spain. Switzerland shut down the midfield pretty effectively; however, Spain were particularly awful in that game, and Spain did remedy that situation later on.

In watching Euro 2012, I have found it amusing that Spain have been called boring by plenty of authors. The only way teams have found to stop Spain has been to park the bus and let Spain have the ball for long periods. The negativity of opposing teams was the reason for the lack of activity in the final third of the field. The result against Italy in the final was a great example of what happens when a team tries to attack Spain. They get beaten soundly.


You'd think that tiki-taka relies on unattainable levels of coordination and high training between players, but it's not that simple. Barcelona have believed in a high-possession, fluid mentality since Johan Cruyff, but their tactics and style under Frank Rijkaard were far more flashy and flair-driven than Guardiola's patient, possession-driven short-passing game.

It also turns out that a high-possession short passing style is within the grasp of many teams. Look at Swansea in the Premier League this past season.

And it oversimplifies things to say that England is all kick-and-rush and continental Europe is more focused on possession. For one, some English clubs have favored possession for decades, including Arsene Wenger's Arsenal. For another, counterattacking, as opposed to patient, possession-driven build-up play, is very popular in Italy. If you saw much of Italy before the final, they did appear to "hoof it up to the big striker"--but really, they were making pinpoint passes down the length of the field right as Balotelli broke the offside trap. And the teams that do beat Barcelona, continental and otherwise, seem to do so by abandoning possession and "parking the bus"--Jose Mourinho's Inter would even clear the ball as soon as they managed to win it so as to hold their defensive shape, rather than try to make attacking moves and be drawn out of position so they were vulnerable to Barcelona's attack.


I think this is only one dimension. Player height, weight, left-footedness or right-footedness might also be crucial in figuring out how to beat Spain. At the very least, you know that Xavi is a key player in the entire network - of course, from mere observation, many have that assumption already but from the graph, you'd know who are the top 3 passers to stop.
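A minimal sketch of how "the top 3 passers to stop" could be read off such a graph. The pass counts are made up for illustration (they are not the article's data), and `pagerank` is a hypothetical helper written as a plain power iteration, not necessarily what the authors used:

```python
def pagerank(passes, damping=0.85, iterations=100):
    """Rank players by PageRank over a weighted passing graph.

    passes maps (passer, receiver) -> number of completed passes.
    """
    players = sorted({player for pair in passes for player in pair})
    n = len(players)

    # Total passes made by each player, used to normalise edge weights.
    passes_made = {player: 0 for player in players}
    for (passer, _receiver), count in passes.items():
        passes_made[passer] += count

    rank = {player: 1.0 / n for player in players}
    for _ in range(iterations):
        # Rank held by players who never pass is spread evenly over the team.
        dangling = sum(rank[p] for p in players if passes_made[p] == 0)
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in players}
        for (passer, receiver), count in passes.items():
            new_rank[receiver] += damping * rank[passer] * count / passes_made[passer]
        rank = new_rank
    return rank


# Hypothetical pass counts, invented for this example.
passes = {
    ("Busquets", "Xavi"): 20,
    ("Xavi", "Busquets"): 10,
    ("Xavi", "Iniesta"): 10,
    ("Iniesta", "Xavi"): 15,
    ("Iniesta", "Busquets"): 5,
    ("Alba", "Xavi"): 8,
    ("Alba", "Iniesta"): 2,
}

ranks = pagerank(passes)
top3 = sorted(ranks, key=ranks.get, reverse=True)[:3]
print(top3)
```

With real match data the edge weights would come from a provider's pass logs; the ranking is only as good as those counts.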

I'd also take into account the positioning of the players on the field, the runs that they make, etc. It'd be interesting if someone could build a simple simulation of Spain's positional play (measuring distances between the players) based on tracking the last few years of Del Bosque's management.
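As a rough starting point for the distance-measuring part, here is a sketch that computes pairwise distances between players in a single tracking frame. The coordinates (in metres) and the helper names are assumptions for illustration, not real tracking data:

```python
from itertools import combinations
from math import dist


def pairwise_distances(positions):
    """Return {(player_a, player_b): distance} for every pair of players."""
    return {
        (a, b): dist(positions[a], positions[b])
        for a, b in combinations(sorted(positions), 2)
    }


def mean_spacing(positions):
    """Average distance between players, a crude measure of compactness."""
    distances = pairwise_distances(positions)
    return sum(distances.values()) / len(distances)


# One hypothetical frame: player -> (x, y) position on the pitch in metres.
frame = {
    "Busquets": (44.0, 34.0),
    "Xavi": (50.0, 34.0),
    "Iniesta": (53.0, 38.0),
}

distances = pairwise_distances(frame)
print(distances[("Iniesta", "Xavi")])
print(mean_spacing(frame))
```

Running this over every frame of a match would give a time series of spacing that a simulation could then try to reproduce.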


You wouldn't miss Xavi, but I'd say it would be a lot easier to miss Busquets. His style of play doesn't get as much attention, because he does very little, but what little he does has the biggest impact: http://www.youtube.com/watch?v=ijDdpNxPyPU


I write for MLSsoccer.com and I do some very similar types of analysis.

Here is something recent including similar network graphics and player average position. http://www.mlssoccer.com/news/article/2012/04/10/central-win...

I've done a bunch of stuff with centrality and other stuff, but that isn't exactly my target audience at the moment.


For someone who follows soccer closely, none of the revelations were surprising, apart from the centrality of Capdevila, which again can possibly be explained by the fact that Spain play a passing game right down to the goalkeeper. So while the goalkeepers of other teams hoof the ball down the pitch, Spain's goalkeeper prefers to pass it to one of his available defenders. Capdevila, being the left back, is a usual suspect. I am sure the right back would also have a high network centrality. Taking nothing away from the analysis though, it's still pretty cool.


Capdevila was fantastic, but I'd say Jordi Alba managed to improve on him at this Euros. Incredible LB.

Villa and Puyol haven't been properly replaced though. Playing Ramos as CB left Arbeloa on RB and he's not at the same level.


Villa would come back for the World Cup, and Jordi Alba was just plain awesome. The bigger concern for me would be how to replace Xavi, who would be 34 by the time of the World Cup.


Pirlo for Italy was 34 for this year's Euros, and look at the tournament he had. It's possible, if Xavi is rested appropriately, that Spain can have the same effect in Brazil. Scary actually.

Stats are important to football, but pure data is useless to sport in general without a way of applying the data. How the data is applied is more interesting than anything else. Can comparisons be drawn between teams across eras to statistically determine who the best team of all time is?


One does not simply replace a player of the quality of Xavi. He and Iniesta are the big difference makers for Spain (and Barcelona for that matter). They've had a historic run the last 5 years for club and country. It might be asking too much to continue it another 2 years, but as it is, it's been great to watch.

Anyway, Iniesta was mesmerizing in the first 15 minutes of the Euro final. When the game was still a contest, he and Xavi were lethal.


Spain's success is not about passing accuracy alone, nor can one reduce team quality to measuring the number of successful passes. Sure it's generally a good statistic, as is possession, but many other factors count and completely different styles can be equally successful.

People tend to overreact to the latest result.

Spain's biggest success was being able to adapt to a system without a reference striker (Villa being injured). Puyol was also out and Sergio Ramos played very well as CB, leaving Arbeloa as RB despite being a dead spot supporting the attack. Xabi Alonso had a lot fewer of his long pinpoint passes (his best quality) and had a more defensive role, while Iniesta held the ball more to link Xavi and Silva. All in all it was very surprising that they won despite missing two of their very best players for the tournament. However, had they lost to Portugal in the penalty shoot-out (and they play a COMPLETELY different style with far fewer passes) then we wouldn't have this article in FP right now.


I have strong doubts about the effectiveness of this method: Why do I need complex algorithms to know Xavi is the pivot that drives Spain forward? That can be ascertained by either watching the game, or looking at his passes received/made stats.

But if we already found ourselves talking about soccer, I would like to shamelessly plug a wonderful soccer tactics website: http://www.zonalmarking.net/2012/04/17/bayern-munich-2-1-rea... (I linked to what I think is the best recent article)


Because complex algorithms can reveal information that isn't immediately obvious from basic stats, let alone "gut feelings" that you get from merely watching a game. This type of thinking is what leads to disparities between who the best players actually are and who the fans think the best players are (see: baseball All-Star voting).


I wholeheartedly agree with you on the gut feeling vs. stats issue, but this complex algorithm in and of itself doesn't seem to reveal any information that isn't obvious from basic stats: Xavi is clearly seen as the "dynamo" from the basic stats perspective (in fact he routinely breaks passes received/made records). If someone came up with a complex algorithm that revealed something new (to me, to the media), that would be really exciting for me.

Since you seem to have an interest in such situations (complex techniques claiming discrepancy between consensus and actual performance), I'll go off on a tangent and shamelessly plug (again! don't people have any shame these days?) what I think is a very good example: http://wagesofwins.com/wins-produced/how-to-calculate-wins-p... http://wagesofwins.com/faq/


I'm all for deep analysis of stats - for example, tracking which players are most often involved in the buildup to a goal or a chance, or determining which players are the best at receiving long passes. But this particular method doesn't seem to provide any fresh insight.
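A sketch of the first idea, counting how often each player touches the ball in a move that ends in a shot. The possession sequences and the `ended_in_shot` flag are a made-up format for illustration, not any data provider's schema:

```python
from collections import Counter


def buildup_involvement(sequences):
    """Count, per player, the shot-ending possessions they took part in."""
    involvement = Counter()
    for sequence in sequences:
        if sequence["ended_in_shot"]:
            # A player counts once per possession, however many touches he had.
            involvement.update(set(sequence["players"]))
    return involvement


# Hypothetical possessions, listed in order of touches.
sequences = [
    {"players": ["Busquets", "Xavi", "Iniesta"], "ended_in_shot": True},
    {"players": ["Alba", "Xavi", "Alba"], "ended_in_shot": False},
    {"players": ["Xavi", "Silva", "Xavi", "Fabregas"], "ended_in_shot": True},
]

involvement = buildup_involvement(sequences)
print(involvement.most_common(3))
```

Dividing each count by the player's total possessions would turn this into a rate, which is fairer to players who see less of the ball.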


The network shows Torres upfront for Spain despite his not appearing in the final until the second overtime (106th minute), but shows Van Bronkhorst who was substituted in the 105th minute?

From all that analysis, it doesn't really follow that the 2010 World Cup Final was Spain 1-0 Netherlands with the sole goal coming in the 116th minute after each team used all their allotted substitutes. Though Spain had more of each, corner kicks, shots and shots on target were similar for both teams.

The real telling statistic is fouls - which ultimately led to the Dutch being a man down (Heitinga) when the Spanish scored, through the center back position which he played.


Semi-tangent: Are there startups who do work like this, analyzing sports games or developing technology for sports (other than fitness trackers)? I'd be interested in learning more about them if so.


Yes, there's a whole world of startups devoted to that.

The one I know is a Uruguayan startup called Kizanaro.

http://www.kizanaro.com/web/index.php?lang=en

They work with the Uruguayan national team (American champions and #2 ranked squad behind Spain), although for now much of their analytics are human-assisted.

The MIT Sloan Sports Analytics conference is the mecca of this kind of company:

http://www.sloansportsconference.com/


Check out StatDNA


Interesting approach but I'm not sure how significant these graph theoretical methods are when the networks are so very small. This would be more revealing on a metrized graph.


In other sports both teams are expected to score tens, even hundreds of times. In soccer one goal often is enough. Collecting stats for analysis seems entertaining for the observers, but how is that useful for the teams, in a sport that is so dependent on luck?


Hundreds of times? 100 points in an NBA game is about 50 times on average--you miss 1 out of 2 free throws about as often as you get a three point play, probably. 10 runs is high for a baseball game. 50 points in an NFL game is about ten scoring opportunities, if you figure five touchdowns and five field goals. And yet statistics are meaningful for all of these games, just as they are for soccer.
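The back-of-the-envelope numbers above, written out. The point values are the assumptions stated in the comment: a basket averages out to 2 points, a touchdown with the extra point is worth 7, and a field goal is worth 3.

```python
# Assumed average value of one NBA scoring play (missed free throws and
# three-point plays roughly cancel out, per the comment above).
POINTS_PER_BASKET = 2
nba_scores = 100 / POINTS_PER_BASKET

# Assumed NFL values: touchdown plus extra point = 7, field goal = 3.
touchdowns, field_goals = 5, 5
nfl_points = touchdowns * 7 + field_goals * 3
nfl_scores = touchdowns + field_goals

print(nba_scores, nfl_points, nfl_scores)
```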



