Streaks & Tilt in Heads-Up SNGs Pt.4



Quantifying Variance is a biweekly column in which we’ll take a look at some of the math underlying poker, with the goal of understanding just how probable or improbable various occurrences actually are, and how to tell the difference between what is random and what is not.


Since this series started, my goal has been to find clear statistical evidence for what I’m calling long-term agential variance, which is basically a fancy way of saying “tilt,” although it also potentially includes “reverse tilt,” or confidence effects.


Last time around, we took three high-volume heads-up players as case studies, and found that their win rates did fluctuate to a statistically significant extent over the course of a winning or losing streak. The trouble was that these fluctuations appeared quite individual; for each of them, the effects cropped up to different extents, after streaks of varying length, and sometimes even affected them in opposite ways.


My next question was whether, if you looked at a group of players together, there would be some tendency displayed in the aggregate. That is, I wondered whether it’s fair to say that people in general tilt at a certain point, even if an individual player might actually be clutching just then, instead. It turns out that the answer is yes.


The relevant metric

In doing the case studies, I was looking at the players’ actual win rates after streaks of a given length, since that’s ultimately what we mean by agential variance: the component of variance that comes from variations in how the person is playing. In order to calculate the win rate, I had to convert from exact streaks (as given by Sharkscope) to cumulative streaks.


To be clear, the difference is that an exact streak of, say, 10 wins means that we have a loss, followed by exactly 10 wins, followed by another loss. A cumulative streak of 10 wins includes all the times that the player won at least 10 games in a row. I call this cumulative because it is equal to the player’s exact 10-game streaks, plus their exact 11-game streaks, plus their exact 12-game streaks, etc.
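The conversion from exact to cumulative streaks can be sketched in a few lines of Python. This is only an illustration: the function name and the sample counts are made up for the example and are not real Sharkscope data.

```python
def cumulative_from_exact(exact_counts):
    """Convert exact streak counts to cumulative ("at least this long") counts.

    exact_counts maps a streak length to the number of streaks of exactly
    that length. The result maps each of those lengths to the number of
    streaks of that length or longer.
    """
    cumulative = {}
    running_total = 0
    # Walk from the longest streak down, adding each exact count to the total.
    for length in sorted(exact_counts, reverse=True):
        running_total += exact_counts[length]
        cumulative[length] = running_total
    return cumulative


# Hypothetical exact winning-streak counts for one player (illustrative only).
exact = {1: 50, 2: 30, 3: 12, 4: 5, 5: 3}
cumulative = cumulative_from_exact(exact)
print(cumulative[3])  # exact 3-game + 4-game + 5-game streaks
```

With these sample numbers, the cumulative count at length 3 is 12 + 5 + 3 = 20, and the cumulative count at length 1 is simply the total number of winning streaks.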


In collecting and analyzing the data, I noticed one interesting fact, which is that most winning players have significantly more (exact) 2-game losing streaks than 2-game winning streaks. This is initially surprising, because with more wins than losses overall, you’d intuitively expect more winning streaks of all lengths. The reason this is not the case is fairly obvious when you think about it, however: a winning player is more likely to extend a winning streak than a losing streak, so a greater proportion of their wins than their losses will come in the course of streaks longer than two games.


Once I’d thought about it like that, I realized that one of the reasons that it has seemed surprisingly hard to get a clear look at tilt is that the exact streaks presented by Sharkscope are inherently deceptive. When a player has fewer streaks of a given length than you expect, there are two possible and contradictory reasons for this: the first is that they might be reaching that point less often than you’d expect, but the other is that they might be extending the streak at that point more often than you’d expect, rather than terminating it there. Unless we can distinguish one possibility from the other, a disproportionate number of streaks of a given length doesn’t tell us whether the player is running hot or running cold.


By looking at cumulative streaks instead, we eliminate this confusion by including those streaks which were extended, and everything becomes much clearer. If the player has more cumulative streaks than you’d expect, then they’re reaching that point more often. If they have fewer, then they’re terminating their streaks earlier.
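To make “more than you’d expect” concrete, here is a minimal sketch of one possible baseline. It assumes every game is independent with a fixed win probability, which is a deliberately tilt-free assumption and not necessarily the model used for this column. The function names and the sample results string are hypothetical.

```python
def expected_cumulative_win_streaks(n_games, p_win, k):
    """Expected number of winning streaks of length >= k in n_games.

    ASSUMPTION: games are independent with a fixed win probability p_win.
    This is a no-tilt baseline, not necessarily the column's model.
    """
    if k > n_games:
        return 0.0
    q = 1.0 - p_win
    # A streak of at least k wins begins at game 1 with probability p^k,
    # or at any later game (preceded by a loss) with probability q * p^k.
    return p_win ** k + (n_games - k) * q * p_win ** k


def count_cumulative_win_streaks(results, k):
    """Count streaks of at least k consecutive wins in a string like 'WWLWL'."""
    win_runs = [len(run) for run in results.split("L") if run]
    return sum(1 for length in win_runs if length >= k)


# Hypothetical results string: 'W' is a win, 'L' is a loss.
results = "WWLWWWLWLLWW"
actual = count_cumulative_win_streaks(results, 2)
expected = expected_cumulative_win_streaks(len(results), 0.55, 2)
print(actual, round(expected, 2))
```

If the actual count sits well above the baseline, the player is reaching that streak length more often than an untilted player would; if it sits well below, streaks are ending early.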

And when we look at a group of 15 high-volume players through that lens, we find that there is a pretty clear picture indeed.


Tilt, plain as day

Note: For all the graphs we’re going to look at, the vertical axis is the percentage difference between the player’s actual number of streaks of a given length or longer, and the number predicted by a computer model. So, for instance, if the player was predicted to have 100 streaks of a given length or longer, +10% means they actually had 110, while -10% means they actually had 90.
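The calculation behind that vertical axis is simple enough to write out. The function name below is just for illustration; the numbers are the ones from the note above.

```python
def percent_difference(actual, predicted):
    """Percentage by which the actual streak count differs from the prediction."""
    return (actual - predicted) / predicted * 100.0


print(percent_difference(110, 100))  # 10.0
print(percent_difference(90, 100))   # -10.0
```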