Guest Article: The State of Smash and Rankings

Claim: The Smash community needs a specialized ranking algorithm to be developed for the community’s growth to continue.

Introduction

Hello fellow Smashers, I’m THuGz, a long-time Smasher from the Midwest. I’ve been an active member of the community for about 7 years, living in Nebraska, Michigan, and NorCal during that time. I attended the University of Michigan, where I became good friends with Juggleguy and the rest of the MI community, regularly attending tournaments and sometimes helping run them. More importantly, I’ve studied mathematics my entire life, graduating last year with a degree in Pure Math. I believe this background gives me some unique insight into certain issues in the community and how we might resolve them. Let’s start with the issue of rankings in Smash.

Why We Need Better Rankings

Recently, the need for an objective, reliable ranking system in Smash has become evident for a number of reasons. Leffen recently tweeted that top players should be allowed to skip the pools phase of large tournaments so that they do not have to play too many games. Leffen’s visa troubles were also exacerbated by our lack of a centralized ranking system, which left him with no easy way to prove that he is an internationally known top player who ought to be considered a professional athlete. Add to this the age-old problem of seeding large tournaments, compounded by our ever-growing scene, and it’s clear that many issues in the Smash community could be relieved by an automatic ranking system.

Aside from the many practical applications of a centralized ranking system, there is also a lack of agreement on how we currently rank players. After all, the ultimate goal of tournaments is to determine who the best players are, yet our tournaments yield much more data than is ever used for ranking purposes. Currently, tournament data is mostly relevant only for the top tier of players, culminating in an annual MIOM top 100 ranking list. However, many have pointed out the obvious shortcomings of this list, such as the strong bias towards American players, or more specifically the perceived bias towards members of the California Smash community. Of course such biases are impossible to avoid, and it has been pointed out that the lack of European or Japanese players in the MIOM top 100 is a result of a lack of overlapping data between the regions, yet it is a shortcoming nonetheless. Moreover, the top 100 is generated once per year and as such cannot always be an accurate reflection of the perceived skill of our top players. For example, for a significant part of last summer Leffen was widely considered a top 2 player, if not the best in the world; however, the top 100 list was not produced during this period, and so there are no rankings reflecting this point in history.

Outside of the top 100, there are few well-regarded lists for ranking players. We as a community usually default to the regional PRs listed on Smashboards or other media, but those are generated by a handful of panelists and are not without their own controversies and biases. It is often realized only after a tournament has been seeded that one of the regional PRs used was faulty or outdated, which can significantly affect the quality of tournaments for mid-level players.

Why Better Rankings are Hard

As many of you may know, there have been several attempts at objective ranking algorithms for use in the Smash community. The largest-scale attempt was a site called SSBPD, which relied on TOs to report the results of their tournaments and used an Elo rating system to generate a score corresponding to the rank of each player. While the tool succeeded in being objective and in including a large portion of the community, some faults became immediately obvious. Leaving aside its reliance on TOs to report the results (many of whom were simply not motivated to do so), the algorithm itself did not accurately reflect the community consensus on skill. The most obvious example of this was Kels’s placement on SSBPD; while Kels is a skilled player, his Elo rating was inflated past what most would consider reasonable (you can see for yourself on the SSBPD wiki). Kels peaked at 5th place (if my memory is correct), and was ranked 8th as of the last active date of the site. The reason for this inaccuracy is that Kels rarely traveled out of state, yet was fairly active in his own region. He “farmed” many local tournaments as the best player in that region. The sheer volume of tournaments he won contributed to an inflated score, ranking him higher than players such as S2J and Lovage, who were widely considered to be top players at the time.

The underlying issue is one of regionalization. Smash is unlike most other competitive games in that it can only be played in its true form in person. This introduces an incredible amount of regionalization in the competitive data of Smash, far different from the data that you would see in online competitive games, such as StarCraft or League, which have fairly successful and accurate MMR algorithms. These point-based ranking systems are limited in that the only inputs taken by the algorithm are the current scores of the players and who wins a game. If a 1600-rated player beats a 1300-rated player, the number of points gained or lost by either player is exactly the same, no matter their region. This works accurately because there is essentially only one region in online play, and matchmaking treats all players of similar skill as essentially the same. The competitive data for these online games is very evenly distributed as a result. (Note: some online matchmaking algorithms take region into account for connectivity reasons, but this level of regionalization is much coarser than that of Smash. Moreover, some games handle this by having separate rankings for each continent.)
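To make the 1600 vs. 1300 example concrete, here is a minimal sketch of the standard Elo update. The K-factor of 32 and the function names are my own assumptions for illustration; this is not a reconstruction of the exact parameters SSBPD used. Note that the only inputs are the two ratings and the result; nothing about region appears anywhere.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation: probability that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner, loser, k=32.0):
    """Return (new_winner, new_loser) after one game.

    k=32 is an assumed K-factor, chosen only for illustration.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    # The winner gains exactly what the loser gives up.
    return winner + gain, loser - gain


new_winner, new_loser = elo_update(1600, 1300)
gain = new_winner - 1600
# The favorite gains about 4.8 points, and the underdog loses the same
# amount, whether the two players share a region or have never met before.
print(round(new_winner, 1), round(new_loser, 1))
```

Because each win against a much weaker opponent still adds a few points, a player who wins a large number of local tournaments can keep climbing without ever facing the players ranked around them.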

Such regionalization in the competitive data for Smash makes it impossible to ever use a point-based ranking algorithm such as Elo, Glicko, or TrueSkill. While the results listed in SSBPD may seem relatively accurate, the results of a tournament could be badly skewed by even one top player who is seeded far too high or far too low. Moreover, since almost no data connects NA and European players outside of the very top (not to mention Japan), an objective score-based ranking algorithm will likely not be possible as long as physical tournaments are our primary mode of gathering data.

The Resolution

The reason that an algorithm which resolves these issues has yet to be invented is not necessarily one of difficulty, but a lack of expertise and necessity. The simple fact is that most relevant E-Sports of today are largely based online, and the few tournaments held are for only the very top tier of players. There has not been a compelling financial reason thus far for someone to develop an algorithm which takes into account more than a simple score. Further, there isn’t exactly a wealth of mathematicians in the world who have the free time to develop such an algorithm, not to mention the small overlap of such mathematicians who are familiar with regionalized competitive games.

Realizing this, I decided to spend my last semester at the University of Michigan researching ranking algorithms and developing one which can be used specifically in the case of regionalized competitive data; that is, a ranking algorithm specifically tailored for the Smash community. Through this research I devised an approach to ranking which views competitive data as a network of players (i.e., a directed graph of players), which allows us to make estimates of differences in skill between any two players. Though a final algorithm has yet to be completed, I can assure you that the approach is sound and is in fact not very complicated. There is certainly hope for an algorithm to be developed that can meet these needs and for Smash to be accurately ranked.
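To illustrate only the "network of players" idea, and not the actual algorithm (which is not described in this post), here is a rough sketch of how match results could be stored as a directed graph, with an edge from each winner to each loser. The player tags, the `build_graph` and `win_chain` names, and the use of a breadth-first search for a chain of wins are all assumptions made for this example.

```python
from collections import defaultdict, deque


def build_graph(results):
    """Map each winner to the set of players they have beaten."""
    graph = defaultdict(set)
    for winner, loser in results:
        graph[winner].add(loser)
    return graph


def win_chain(graph, start, target):
    """Shortest chain of wins leading from start to target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        # sorted() only makes the search order deterministic.
        for beaten in sorted(graph.get(path[-1], ())):
            if beaten not in seen:
                seen.add(beaten)
                queue.append(path + [beaten])
    return None


# Hypothetical results: two regions that never play each other.
results = [("WestA", "WestB"), ("WestB", "WestC"), ("EastA", "EastB")]
graph = build_graph(results)

print(win_chain(graph, "WestA", "WestC"))  # a chain exists within one region
print(win_chain(graph, "WestA", "EastB"))  # no chain across isolated regions
```

In a picture like this, regionalization shows up as clusters with few or no edges between them, and any estimate comparing two players in different clusters has to pass through the handful of players who have competed in both.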

The other pressing issue is a lack of a centralized database of game results. There are currently certain methods employed to scrounge together sufficient data to accomplish tasks such as making regional PRs or TafoStats, but there is no truly central and automatic means of preparing data that could be digested without much effort by an algorithm. The aforementioned SSBPD relied on TO community effort, something which is less and less likely in our growing scene. I personally ran into this issue, spending a significant amount of time hand-coding results (or calls to the Challonge API) to get enough data to be able to run any kind of tests. Before any final algorithm is developed by me or anyone else, there must be a more straightforward and centralized data repository of tournament results.

Conclusion

I hope that this post has demonstrated the need for an accurate, objective ranking algorithm, as well as the difficulty of building one. The key to achieving any goal on a large scale is first awareness, and so I hope some good discussion can begin around this subject. Aside from the mathematical challenges that I will continue to work on in my spare time, the organizational and programming challenges are admittedly not in my area of expertise. There is a need for TOs and web developers to begin tackling the challenges presented in centralizing the competitive data behind Smash. I believe this is key to driving the continued growth of Smash. We can see from the legacy of data-driven sports (such as baseball) that statistics and amassing data are very powerful tools in encouraging discussion, the lifeblood of any competitive community.

Though I did not discuss the specifics of my approach to solving the ranking problem, I will do so in an upcoming article on MIOM, along with some suggestions on how we can help to centralize the flow of data to make automatic ranking an achievable reality.

Thanks for reading!

7 Comments

Austin Uhr June 8, 2016 at 12:26 pm

Sounds great! Excited to hear about the rest of it!
Max June 8, 2016 at 1:01 pm

Awesome work! It’s so indicative of Smash’s diverse community that a math major would be passionate enough to take up this work. Thank you so much! I hope that the repository you dream of one day coalesces; your algorithm seems too useful to be left without data. We appreciate your commitment to the scene.
Bernal June 8, 2016 at 1:38 pm

A suggestion for ensuring TOs are punctual and motivated in submitting results to a database would be to provide an incentive for them to do so. Perhaps MIOM, as the premier community news hub, could demand that TOs hand complete and detailed results of their tournaments into a database if they wish to have their tournament advertised or featured in an article. The punishment for not doing so could be not having your tournaments promoted on MIOM for a certain period. I feel like having your regional/national advertised on MIOM is a great advantage for any TO and entirely worth the inconvenience of having to upload the results promptly.

Just a random idea.
- Arche June 8, 2016 at 6:35 pm
  
  While this is not a bad idea, it is one which won’t affect a large part of the competitive scene. Small or recluse tournaments won’t be promoted on MIOM, and as such, the TOs won’t have much of an incentive (at least not this one) to report the results.
Dylan Tate June 8, 2016 at 5:05 pm

Thank God we have geniuses like this dude in our community. I hope you succeed in figuring out the ins-and-outs to make this ranking system a reality!
Dhruv June 9, 2016 at 12:48 pm

What do you think of a simpler, tennis style placement-based ranking?
Evan June 15, 2016 at 12:26 am

Just some thoughts from an incoming freshman CS Major
-Although there isn’t really a centralized place for “official” score reporting, it seems like between ssbwiki.com and wiki.teamliquid.net, most tournaments of significance are reported. Couldn’t it be feasible to regularly (daily or so) scrape these platforms for tournament data?
-Any universal database of smashers would have to face the problem of name uniqueness. This would need some kind of id associated with each smasher. It could be made simpler that each name immediately gets an ID of #1 and others are only added if necessary. (i.e. Mew2King#1, Armada#1, Hungrybox#1 and if someone else wants to have the tag Hungrybox, they would default to Hungrybox#2, etc.)
-Are you at the point where you’re looking for actual help from coders/web developers/tournament organizers and the like? When you are at that point, will you reach out to people you know or ask the community for volunteers? I’d love to play a part in this