What’s the idea?
Several years ago, as the United States started to have players in multiple top-tier leagues across Europe, I became frustrated with the fact that I had no idea how impressed we should be with a guy playing for the worst team in the Dutch Eredivisie, or the best team in Ligue 2 in France. Then, even if we could figure out roughly how the various European leagues compared, how would we know whether a guy playing ten minutes every other game in Hungary is actually more impressive than the guy starting every game for the Portland Timbers? I wanted to come up with a way to compare what our players were doing on the field, for their clubs, across the globe. The hope wasn’t that it would be some perfect answer as to who is the best American player. I was just looking for a shorthand that could give us an idea, at a glance, of how these guys compared.
How do we fix this?
Initially, I had really ambitious ideas. I was going to create this very complicated formula that used goals and assists and playing time, and combined that with scores for both teams playing in the game, and the competition the game was a part of, and it was going to be glorious and game-changing. Not only that, but I would need some kind of decay regression that would make the individual game scores count less as they faded into the past, which would give me a constantly evolving rating. I quickly realized I didn’t have the math skills to even begin to figure out how to accomplish this, and even if I did, I couldn’t figure out where to get the data to feed into the non-existent formula. I would dust off the idea from time to time, but I couldn’t ever crack it, and I would set it aside in frustration. Then, on February 25, 2020, HalfSpaces.com published a post that broke the idea open for me.
In this post, it was like HalfSpaces had been reading my mind. He came up with a really simple formula. First, you find how Bob’s club rates according to FiveThirtyEight.com’s Global Club Soccer Rankings. Then, you determine what percentage of the available minutes Bob has played for his club. Then, you create a composite player grade score by averaging Bob’s ratings at WhoScored.com, FotMob.com, and SofaScore.com. Multiply those three numbers together, and you’ve got a new number for Bob in the Bundesliga that you can easily compare to Mitch in MLS, and you’ve got a snapshot of how the two match up.
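As a rough illustration, the three-part multiplication described above could be sketched like this. The function name, the argument names, and every sample number are assumptions made up for the example; they are not taken from the HalfSpaces post or from any of the rating sites.

```python
from statistics import mean


def halfspaces_style_score(club_rating, minutes_played, minutes_available, site_ratings):
    """Club rating x share of available minutes played x average player grade."""
    if minutes_available <= 0:
        raise ValueError("minutes_available must be positive")
    minutes_share = minutes_played / minutes_available
    composite_grade = mean(site_ratings)  # average of the three site ratings
    return club_rating * minutes_share * composite_grade


# Made-up inputs: Bob plays half the minutes at a highly rated club,
# Mitch plays every minute at a club rated half as high.
bob = halfspaces_style_score(80.0, 1350, 2700, [7.0, 7.0, 7.0])
mitch = halfspaces_style_score(40.0, 2700, 2700, [7.0, 7.0, 7.0])
print(bob, mitch)
```

With these invented inputs the two players land on the same number, which is the kind of cross-league comparison the single score is meant to make possible.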
Clearly, HalfSpaces was a genius. However, the more I thought about it, the more I thought the concept needed tweaking. For one, playing for Celtic is great, and if you’re starting for them against Manchester City in the Champions League, you should get a ton of credit. But what if you’re at Celtic and you only ever play against Ross County in mid-week games? Surely there should be some way to distinguish between the two scenarios. Second, the original formula was pretty static. You’d have to pick how far back you were going to go in terms of the percentage of minutes played, and the club rating was just telling you what FiveThirtyEight thought of the team on the one day you pulled it. We needed a formula that would differentiate between individual games, and would constantly be looking to the changes in the team ratings to keep things fresh. So I decided to work on my own, tweaked version.
Formula 1.0
For rating the player’s individual performances, I couldn’t come up with a better version than HalfSpaces’s original idea. The only change is that I would average the three sites’ scores for each individual game. So the first element was this average, which would still be on a ten-point scale, with most players scoring between six and seven points in most games.
Similarly, I couldn’t improve on the FiveThirtyEight rankings. I looked around to see if there was some similar, competing model that I could average in the same way I was averaging the player scores, but I couldn’t find anything out there that really worked across the major leagues. However, I wanted to capture the difference between a matchup of continental juggernauts and a mid-week early-round domestic cup game. My initial thought was to simply average the ratings of the two teams involved in the game, but I eventually settled on making the game score two-thirds the player’s team score and one-third the opponent’s team score.
Next, I kept the playing percentage score, but applied it to the individual game rather than to a specific time period. After creating several test scores by multiplying those three elements together, I decided to divide the final numbers by five because it made most of the scores two-digit numbers, and I just liked the way that looked.
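Putting the three Formula 1.0 elements together, a single-game score might be computed along these lines. This is only a sketch under stated assumptions: the function name, the 90-minute match length, the cap on minutes, and the sample ratings are all illustrative and not taken from the actual spreadsheets.

```python
from statistics import mean


def game_score_v1(site_ratings, team_rating, opponent_rating, minutes_played, match_minutes=90):
    """Per-game score: player grade x weighted match rating x minutes share, divided by 5."""
    player_grade = mean(site_ratings)  # average of the three site ratings
    # Two-thirds the player's own team, one-third the opponent.
    match_rating = (2 * team_rating + opponent_rating) / 3
    # Assumption: minutes are capped at match_minutes so the share never exceeds 1.
    minutes_share = min(minutes_played, match_minutes) / match_minutes
    return player_grade * match_rating * minutes_share / 5


# Made-up example: a 6.5-rated full game for a 75-rated team against a 60-rated team.
full_game = game_score_v1([6.5, 6.5, 6.5], 75.0, 60.0, 90)
half_game = game_score_v1([6.5, 6.5, 6.5], 75.0, 60.0, 45)
print(full_game, half_game)
```

In this version, halving the minutes halves the score, because playing time enters the product as a straight percentage.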
Finally, I had to decide what to do with those scores for individual games. On the one hand, I liked the idea of capturing a player’s proven ability when he is given the opportunity to actually play. On the other, I liked the idea of “punishing” a player who never played, or who only ever got one-minute stoppage-time appearances. As Bill Parcells (supposedly) said, your best ability is your availability. How can the USMNT count on you to start against Mexico if you rarely ever see game action for your club? This led me to track two different averages.
The RAW average would take the last 25 club games that Bob actually played in, and average them together to give his RAW score. The idea behind RAW is that it shows the average club performance Bob has turned in over roughly the last nine months’ worth of games. Bob will get “punished” for games where he only plays a few minutes, but if he doesn’t appear at all for any reason, that game is excluded from the RAW average.
The HOT average only looks at the club’s last 10 games, but it includes every one of them, even if Bob missed all 10. HOT is the “what have you done for me lately” score. The idea here is to show, heading into a specific qualifying window or national competition, who has actually been performing for their club team.
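The difference between the two averages can be sketched in a few lines. The function names and the data layout (a list of per-game scores ordered oldest to newest, with `None` for a club game the player did not appear in) are assumptions for illustration. Counting a missed game as zero in HOT, and averaging over fewer games when the history is shorter than the window, are also assumptions rather than anything spelled out above.

```python
def raw_average(game_scores, window=25):
    """Average of the most recent `window` games the player actually appeared in."""
    played = [score for score in game_scores if score is not None]
    recent = played[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def hot_average(game_scores, window=10):
    """Average over the club's most recent `window` games, appearances or not."""
    recent = game_scores[-window:]
    if not recent:
        return 0.0
    # Assumption: a game the player missed contributes a score of zero.
    return sum(0.0 if score is None else score for score in recent) / len(recent)


# Made-up history, oldest first: Bob played two of his club's last four games.
scores = [60.0, None, 80.0, None]
print(raw_average(scores), hot_average(scores))
```

Here RAW ignores the two missed games while HOT is dragged down by them, which is the "availability" penalty the two-average setup is meant to expose.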
Once I had these initial ideas down, I built some spreadsheets, went back to January 1st, 2020, and started recording individual game scores for the top players in the pool.
Formula 2.0
Once I got enough data to start sharing the averages and showing how players compared, the biggest constructive critique I saw was that the minutes played component was punishing players too harshly. Many people suggested ditching the distinction entirely, but I considered that a non-starter. Eliminating playing time entirely would result in too much lost information. Eventually, I had a brainstorming session with one of my smarter, more math-y friends, and he helped me come up with an elegant change. Rather than using the percentage of minutes played in a single match, we decided to use the square root of the minutes played, and then divide that by 9. Using the square root allows the “punishment” to slowly curve down, rather than just going straight down at 45 degrees. At the time, I thought of this as the Christian Pulisic tweak, because he was playing for one of the top clubs in the world, Chelsea, but he was on the bench pretty frequently, and the original formula was dragging him down below even part-time MLS starters.
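To see how much gentler the square-root version is, the two playing-time factors can be compared side by side. The function names and the 90-minute denominator for the original percentage are illustrative assumptions; the square root divided by 9 is as described above.

```python
import math


def minutes_factor_linear(minutes_played, match_minutes=90):
    """Original approach: straight percentage of the match played."""
    return minutes_played / match_minutes


def minutes_factor_sqrt(minutes_played):
    """Revised approach: square root of minutes played, divided by 9."""
    return math.sqrt(minutes_played) / 9


# Compare the two factors for a late cameo, a half, and longer shifts.
for minutes in (1, 10, 45, 81, 90):
    linear = minutes_factor_linear(minutes)
    curved = minutes_factor_sqrt(minutes)
    print(f"{minutes:>2} min  linear={linear:.3f}  sqrt={curved:.3f}")
```

A 45-minute appearance keeps about three-quarters of the credit under the square-root factor instead of half. One side effect worth knowing is that the factor hits exactly 1 at 81 minutes, so a full 90 comes out slightly above 1 (about 1.05).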
Less frequently, people would say the score format was nonsensical. A score of 100 was very good, but it wasn’t actually perfect, and at the time Sergino Dest would regularly score above 100 in individual games for Barcelona. I decided to re-work the scale entirely, adjusting it so that an “ideal” game where Bob started and played the entire game for Bayern Munich against Chelsea, where Bob averaged an 8.0 rating across the three player rating sites, would result in a score of 1.000. As an example, this took Gio Reyna’s scores as a Borussia Dortmund starter from something like 65.5 to their current .554. It wasn’t a substantive change, but I liked the way it looked and felt like it made the “tier” groupings pop.
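One way to get that kind of scale is to divide every game score by the score of the "ideal" reference game. The sketch below assumes that approach; the actual modifier used in the spreadsheets is not given here. The function names and the two reference club ratings are hypothetical placeholders, not real FiveThirtyEight values.

```python
import math


def unscaled_score(player_grade, team_rating, opponent_rating, minutes_played):
    """Player grade x weighted match rating x square-root minutes factor."""
    match_rating = (2 * team_rating + opponent_rating) / 3
    return player_grade * match_rating * math.sqrt(minutes_played) / 9


# Hypothetical placeholder ratings for the two reference clubs.
REFERENCE_TEAM_RATING = 93.0
REFERENCE_OPPONENT_RATING = 90.0

# The "ideal" game: an 8.0 average grade over a full 90 minutes.
REFERENCE_SCORE = unscaled_score(8.0, REFERENCE_TEAM_RATING, REFERENCE_OPPONENT_RATING, 90)


def scaled_score(player_grade, team_rating, opponent_rating, minutes_played):
    """Game score expressed as a fraction of the ideal reference game."""
    return unscaled_score(player_grade, team_rating, opponent_rating, minutes_played) / REFERENCE_SCORE


ideal = scaled_score(8.0, REFERENCE_TEAM_RATING, REFERENCE_OPPONENT_RATING, 90)
typical = scaled_score(6.8, 80.0, 70.0, 75)
print(f"{ideal:.3f} {typical:.3f}")
```

Because the rescaling is a single constant divisor, it changes how the numbers look without changing the order of any two scores.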
Formula 3.0 (UPDATED 6.26.23)
538 stopped updating their team ratings on 6.14.23 (presumably related to their layoffs and restructuring). A year ago, that would have been the end of the formula unless I had gotten really creative. Luckily, early this year, Opta debuted their own take, and they cover FAR more clubs than 538 did (over 13,000 as of today). Starting the second week of February 2023, I’ve been recording the 2.0 scores using 538, and a 3.0 score using Opta. My plan had been to transition everything over to Opta as the European leagues started back up this fall, but 538’s move forced me to make the change now.
The basic idea is the exact same, but I’ve obviously swapped Opta’s scores for the two teams in for 538’s, and I tweaked the modifier slightly to get the resulting 3.0 scores in the same ballpark as the 2.0 scores.
The immediate changes you’ll notice with the Monday Update are that non-Top 5 scores have bumped up a bit, and Top 5 scores have dropped a bit. You’ll also notice that several of our guys at the biggest clubs this past year took a huge hit. That’s really an issue with lack of data, as guys like Chris Richards or Christian Pulisic lost credit for games before the second week of February. You’ll see those scores jump back up quickly as soon as they start playing club games this fall. The same data issues forced four guys out of the 500 Club. Again, I expect them all to quickly jump back in when the new season kicks off.
One final note: Opta is much closer to what I was imagining when I started this whole journey. I would highly recommend reading about their methodology, and playing around with their rankings. You can find them here: https://theanalyst.com/na/2023/06/who-are-the-best-football-team-in-the-world-opta-power-rankings/
Substack
Up to now, I’ve mainly shared these ratings on Reddit. I keep up with it because I’m interested, but I’ve always assumed there was at least a small audience out there who might also get something out of it. Now I’ve decided to formalize things a bit. The plan is to post the weekly Monday update here on Substack. If you want to subscribe, it will come directly to your mailbox. Those Monday updates will always be free. Eventually, if there’s a big enough audience and people start asking for more content, I might add a paid tier, but the basic Monday updates would stay free. I also plan to post the link to those Monday posts to Reddit, so that people who don’t want to subscribe can still have access.
Thanks for reading, and let’s all cheer on the USMNT.
RAW and HOT aren't acronyms, they're just names. Simple version is that RAW is the last 25 club games in which the player actually touched the field, HOT is the last 10 games the club played.
RAW is trying to show what the player has actually accomplished on the field for the last season-ish worth of games he's played in, HOT is trying to show what he's accomplished, or failed to accomplish, in the last couple of months.
Longer explanation here: https://usmntteamsheet.substack.com/p/formula-background
I’m glad you’re enjoying it. I was already doing it for my own curiosity, figured I might as well put it out there for other like-minded folks.