> Most games aren't chess -- where the only variance is picking who's black and who's white -- in fact, they might include dozens of RNG mechanics (from critical strikes to ability rolls, to spawn points). These mechanics (while fun and well-designed) might pollute your "idealized" model. There's also the problem of RPS (rock-paper-scissors) mechanics or pick-counter-pick mechanics which will also heavily skew win rates. For instance, given a slow combo Magic deck, you will most likely auto-concede to mono red aggro (regardless of skill level). If you're using Elo, this will pollute your model. (Hint: you shouldn't be using Elo.)
None of which matters? All that means is that the results of individual games are a bit higher variance. Elo handles that by design. If you lose a certain proportion of Magic games to less-skilled players then this should be considered a reflection of your skill, because the only reasonable definition of skill at the game is the rate at which you actually win it; anything else can be gamed and so should be ignored.
> Most games also don't have chess' high skill ceiling. Chess has such a high skill ceiling for a number of reasons -- it's one of the oldest games still being actively played, for one. Suppose your "game" is simply the flip of a coin (everyone wins 50% of the time). Zero skill involved. Trying to model win-loss-ratios using a sigmoid curve is silly. Obviously, no game is going to be a coin flip, but there's a world of difference between chess and DOTA.
That's also something that Elo handles just fine? If every game is a coin flip then everyone will end up with the same Elo. If player A has x more Elo points than player B, then they win y% of their games. If your game has a skill ceiling where even a complete beginner always wins, say, 20% of their games, then that just means no-one will ever be able to rise above a corresponding Elo rating.
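For reference, the "x more points means y% wins" relationship is the logistic Elo curve. A minimal Python sketch, assuming the usual chess convention of a 400-point scale and no draws (the function names are hypothetical, not from any library):

```python
import math


def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score (win probability, ignoring draws) of player A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def rating_gap(win_rate: float, scale: float = 400.0) -> float:
    """Rating difference at which the stronger player wins `win_rate` of games."""
    return scale * math.log10(win_rate / (1.0 - win_rate))


if __name__ == "__main__":
    # Equal ratings: each side is expected to score one half.
    print(expected_score(1000, 1000))
    # Gap implied when the weaker player still wins 20% of games.
    print(round(rating_gap(0.8), 1))
```

Under those assumptions, a game where the weakest player still takes 20% off the strongest can't sustain a gap much beyond about 241 points between them, which is the "corresponding Elo rating" ceiling described above.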
> That's also something that Elo handles just fine? If every game is a coin flip then everyone will end up with the same Elo. If player A has x more Elo points than player B, then they win y% of their games. If your game has a skill ceiling where even a complete beginner always wins, say, 20% of their games, then that just means no-one will ever be able to rise above a corresponding Elo rating.
That's not how it works. The distribution you end up with will not be uniform, it will look like this (just ran Elo with a coinflip; 11 players, 1000 matches): https://imgur.com/9O82pRj
In the long run, I think this will tend to a geometric distribution with a low p value.
If you're matchmaking players against equal-ranked players, then each match is just +/- 50 points, so you'll get a binomial distribution, which tends to normal as n gets large (assuming a large player pool so each player's results are independent). If players play players with different ratings, then that will tend to push their ratings back towards neutral. You certainly don't get a geometric distribution, because the rating algorithm is completely symmetric.
This only happens in the rare cases where you're matching players against (exactly) equally-ranked players. You can mitigate this by always trying to match as "close as possible," but it's only a mitigation. Try simulating random matchmaking with Elo, and you'll get something like this: https://i.imgur.com/1Y08jUB.png (1000 players, 100,000 games). In my simulation, I set k (the Elo constant) = 50.
So you've patched this library somehow? Because when I run your code I get a result that's just full of 0 ratings.
But in any case I'm not at all convinced that your charts don't just show the normal distribution that we'd expect, just in some weird way. (Did you test your plotting methodology against some simpler rating system before using it to draw conclusions about Elo?). Plot a normal histogram, or a density plot if you're feeling fancy: https://towardsdatascience.com/histograms-and-density-plots-... . I'm betting the result is just the bell curve that we'd want and expect.
Author decided to do something fancy which will only work when number of players is less than 1/2 * starting Elo rating.
> But in any case I'm not at all convinced that your charts don't just show the normal distribution that we'd expect, just in some weird way.
As mentioned, you end up with a geometric distribution. I covered a similar phenomenon in a blog post I wrote last year[1]. See Theorem 3.3 in this paper: https://kconrad.math.uconn.edu/blurbs/analysis/entropypost.p... But in short, the geometric distribution has maximal entropy over (0,∞) given a known mean (in our case, the mean will always be 1000).
> As mentioned, you end up with a geometric distribution. I covered a similar phenomenon in a blog post I wrote last year[1]. See Theorem 3.3 in this paper: https://kconrad.math.uconn.edu/blurbs/analysis/entropypost.p.... But in short, the geometric distribution has maximal entropy over (0,∞) given a known mean (in our case, the mean will always be 1000).
Another reply already told you that's irrelevant to Elo, because Elo can go negative (and if it couldn't then the mean wouldn't always be 1000). It's probably going to be normal, and drawing an actual histogram of a simulation like yours comes out looking pretty much like a bell curve: https://imgur.com/YBDp4uI .
As far as I can see none of your claims about Elo stand up. Why do you think you've shown the things that you're claiming?
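The disagreement is easy to check independently. Here is a minimal sketch, assuming standard Elo updates on a 400-point scale, uniformly random pairings, a fair coin deciding every game, and the parameters quoted earlier in the thread (1000 players, 100,000 games, k = 50). It is not the code being discussed, and the function names are made up for illustration:

```python
import random
import statistics


def simulate_coinflip_elo(n_players=1000, n_games=100_000, k=50.0,
                          start=1000.0, seed=0):
    """Run Elo on a zero-skill game with uniformly random matchmaking."""
    rng = random.Random(seed)
    ratings = [start] * n_players
    for _ in range(n_games):
        # Pick two distinct players at random (no rating-based matchmaking).
        a, b = rng.sample(range(n_players), 2)
        expected_a = 1.0 / (1.0 + 10.0 ** ((ratings[b] - ratings[a]) / 400.0))
        # Coin flip: player A wins with probability 1/2 regardless of rating.
        score_a = 1.0 if rng.random() < 0.5 else 0.0
        delta = k * (score_a - expected_a)
        # Zero-sum update: whatever A gains, B loses.
        ratings[a] += delta
        ratings[b] -= delta
    return ratings


def skewness(xs):
    """Population skewness; near 0 for a symmetric distribution."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)


if __name__ == "__main__":
    ratings = simulate_coinflip_elo()
    print("mean:", statistics.fmean(ratings))
    print("stdev:", statistics.pstdev(ratings))
    print("skewness:", skewness(ratings))
    print("min/max:", min(ratings), max(ratings))
```

Because every update adds to one player exactly what it subtracts from the other, the mean stays at the starting rating, and nothing stops a rating from dropping below zero. The update rule also treats both players the same way, so a skewness close to zero is what you'd see from a symmetric, bell-shaped spread, whereas a geometric-looking distribution would be strongly right-skewed.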