Pokeit Paper #1 – Quantifying the Bias in Observed Hands
We are all familiar with the idea that the distribution of hands shown at showdown is different from the distribution of all hands dealt. The specter of this bias has confounded the poker botting community and led many to eschew estimating hand ranges altogether. While my scouring of the pokerai.org archives was not totally exhaustive, I don’t believe that anyone else has tried to quantify the bias in showed hands in a systematic way. In this analysis we will use two econometric models, one created from a dataset revealing every hand and the other from a dataset limited to hands shown at showdown, to predict a player’s hand range distribution in several different game states. By comparing the showdown equity of the match-up between an arbitrary hand and the ‘all hands’ and ‘showed’ hand range distributions, we can estimate the bias in terms of its effect on showdown equity. In a meta-analysis of more than 45,000 game states, equity estimates derived from the dataset limited to hands shown at showdown were 1.34% ± 2.1% lower on average than those derived from the full dataset, indicating a slight upward bias in the strength of observed hands.
A naïve look at the bias
Observed hands are subject to selection. Players fold the majority of hands before they actually reach a showdown, and even then, the loser can muck his hand if he’s beaten. In a sample of 1,225,010 games of low limit NL-Holdem 6-max, only 69,187 hands (5.68%) were actually observed at showdown. Because players get to choose which hands they want to play, on average they will choose hands with a higher expected value over those with a lower expected value. If hands with higher expected values are more likely to be played, then they are also more likely to be shown at showdown. Because this selection is non-random, it introduces a bias.
To refine what we mean by better or worse hands, we will use the Sklansky Hand groups to order all of the starting hands in Texas Hold’em into 9 ranked categories. While the Sklansky groups aren’t perfect, they have the admirable attribute of being well known:

Using my buddy Joe’s personal database, we can identify the size of the selection bias by comparing the frequency of observing each hand group in the full dataset of 219,680 hands to the frequency of observing each hand group in the limited dataset of 8,329 hands shown at showdown. Plotted out, the frequencies for each category give us the hand range distributions for showed hands vs. all hands:

Figure 1: A plot comparing the frequency of observing a hand in each of the 9 Sklansky Hand Groupings for all hands dealt vs. only those shown at a showdown. The probability distribution for all hands is indicated by the black line, while the probability distribution for shown hands is indicated by the blue line. The yellow bars show the net difference between the frequencies of each group.
The observed hand selection bias appears to be quite large. Crappy group 9 hands represent 61% of all hands dealt, but only 20% of hands that are observed at showdown. Likewise, you are more than five times as likely to observe a group 1 hand like Aces or Kings in a sample of observed hands than you are in a population with all hands revealed. The showed hands distribution is roughly flat before dipping around groups 6-8 and rising up for group 9. Meanwhile, the revealed hands distribution is heavily left skewed, with the majority of hands coming from the garbage group 9 category.
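The comparison behind Figure 1 can be sketched in a few lines. This is only an illustration, assuming each hand has already been mapped to its Sklansky group (1-9); the function names and the toy input lists are hypothetical and are not Joe’s data.

```python
from collections import Counter

def group_frequencies(groups, n_groups=9):
    """Share of hands falling in each Sklansky group 1..n_groups."""
    if not groups:
        raise ValueError("need at least one hand")
    counts = Counter(groups)  # missing groups count as 0
    return [counts[g] / len(groups) for g in range(1, n_groups + 1)]

def net_difference(all_groups, shown_groups, n_groups=9):
    """Shown-hand frequency minus all-hand frequency, per group."""
    all_freq = group_frequencies(all_groups, n_groups)
    shown_freq = group_frequencies(shown_groups, n_groups)
    return [s - a for a, s in zip(all_freq, shown_freq)]

# Toy inputs (hypothetical): one Sklansky group number per hand.
all_hands = [9, 9, 9, 9, 9, 9, 5, 3, 1, 7]
shown_hands = [9, 1, 3, 5]
diff = net_difference(all_hands, shown_hands)
```

A positive entry in `diff` means the group is over-represented among shown hands; because both inputs are probability distributions, the entries sum to zero.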
Have these results just shot a massive hole through our plan of modeling opponent hand range distributions? In a word, no.
The above chart is a bit misleading. While our revealed hand dataset includes hands that are shown and hands that are folded somewhere along the way, it also includes hands that are immediately folded pre-flop. When trying to model an opponent’s hand range from their actions & the game state, we’re not really concerned about the hand range of a player who just folded. Slimming down our dataset to only those hands that are played is the first step towards looking specifically at the bias in the conditional probability generated by using only showed hands.
Conditional probability bias
Defining a hand range is all about conditional probability. For example, if we hold Kings in the cut-off, we may want to know the conditional probability that the guy who just raised x3bb under the gun has Aces. If A is the event that the player has Aces, and B is our set of 2 game state variables (his bet size of x3bb and his position, UTG), then P(A|B) is the conditional probability that our opponent has Aces, given that he raised x3bb UTG.
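As a rough illustration of P(A|B), the simplest estimate is a plain frequency count: of the hands where B was observed, what share also satisfy A? The sketch below assumes hand records stored as dictionaries; the field names and the toy records are hypothetical, and this counting approach is not the econometric model used in the rest of this paper.

```python
def conditional_probability(records, is_a, is_b):
    """Estimate P(A|B) as count(A and B) / count(B)."""
    matching_b = [r for r in records if is_b(r)]
    if not matching_b:
        return None  # B was never observed, so P(A|B) is undefined
    return sum(1 for r in matching_b if is_a(r)) / len(matching_b)

# Toy records (hypothetical field names and values).
records = [
    {"position": "UTG", "raise_bb": 3, "hand": "AA"},
    {"position": "UTG", "raise_bb": 3, "hand": "KQs"},
    {"position": "UTG", "raise_bb": 3, "hand": "AA"},
    {"position": "CO", "raise_bb": 3, "hand": "AA"},
    {"position": "UTG", "raise_bb": 3, "hand": "77"},
]

p_aces_given_utg_raise = conditional_probability(
    records,
    is_a=lambda r: r["hand"] == "AA",
    is_b=lambda r: r["position"] == "UTG" and r["raise_bb"] == 3,
)
```

Raw counting like this breaks down quickly as more game state variables are added to B, since each combination is observed only a handful of times. That is the usual motivation for fitting a model instead.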
The set of B game state factors is whatever you decide to put in your model. It’s limited only by your creativity, grasp of the game, available data, and computing power. My simple pre-flop model (described in detail here and here) uses the following inputs to model the hand range:
♦ Player action on each pre-flop ‘round’ of betting (call/check or raise)
♦ Action behind the player (call, raise, 3-bet, etc.)
♦ A variable that combines position with # of players at the table
♦ Amount bet on each pre-flop ‘round’
♦ An interaction between player action and action behind
♦ An interaction between player action and position/# players
♦ An interaction between player action and amount bet
And the output variable is a categorical variable 1-9 corresponding to groups in the Sklansky starting hand rankings. Since the model is based off of the hands of one particular player, Joe, no player type variables are needed.
A database of 219,680 revealed hands was used to produce a dataset of 45,039 hands which were not folded pre-flop, and a dataset of 8,329 hands which were shown at showdown. Comparing the results produced by the ‘all hands’ dataset to the results of the ‘showed’ dataset should show us just how biased showed hands are.
Example situations
We will examine this bias by modeling the hand ranges of a few example game states of 6-max No-Limit Hold’em:
♦ CO limps
♦ CO calls MP’s x3bb raise
♦ UTG raises x3bb
♦ Button 3-bets MP’s x3bb raise
♦ A weighted average of all game states
The hand ranges derived from the ‘all hands’ and ‘showed’ datasets will be plotted for each game state with the net bias being the displacement between the two distributions.
Estimating an opponent’s hand range is only the first step towards quantifying the bias in showed hands. What ultimately matters when we’re at the table is how our hand holds up against our opponent’s range on average. Multiplying the probability of our opponent holding each Sklansky hand group by the probability of our hand winning against that group, and summing across the groups, gives us our equity against our opponent’s hand range. From now on we’ll call this the range-equity. We can quantify the selection bias for any given game state by comparing the range-equity produced by the ‘all hands’ model to the range-equity produced by the ‘showed’ model.
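The range-equity calculation is a probability-weighted sum, and the bias is the ‘showed’ range-equity minus the ‘all hands’ range-equity. Here is a minimal sketch; the function names are mine, and the two hand ranges and the equity vector are made-up placeholder numbers rather than output from either model.

```python
def range_equity(hand_range, equity_vs_group):
    """Sum over groups of P(opponent holds group) * our equity vs. that group."""
    return sum(p * e for p, e in zip(hand_range, equity_vs_group))

def range_equity_bias(all_range, showed_range, equity_vs_group):
    """'showed' range-equity minus 'all hands' range-equity."""
    return (range_equity(showed_range, equity_vs_group)
            - range_equity(all_range, equity_vs_group))

# Placeholder inputs for groups 1..9 (hypothetical, for illustration only).
all_range    = [0.05, 0.05, 0.05, 0.10, 0.10, 0.10, 0.15, 0.10, 0.30]
showed_range = [0.10, 0.05, 0.05, 0.10, 0.15, 0.10, 0.15, 0.10, 0.20]
our_equity   = [0.20, 0.30, 0.35, 0.40, 0.45, 0.50, 0.50, 0.55, 0.60]

bias = range_equity_bias(all_range, showed_range, our_equity)

# A hand with the same equity against every group shows no bias at all,
# no matter how different the two ranges are.
flat_bias = range_equity_bias(all_range, showed_range, [0.30] * 9)
```

With these placeholder numbers `bias` comes out negative, because the ‘showed’ range moves weight from group 9, where our equity is highest, to stronger groups. `flat_bias` is zero up to rounding, which is why a hand whose equity barely varies across groups produces almost no range-equity bias.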
CO limps

Here the game state is CO limping after 2 folds behind. This is a fairly weak play, as evidenced by the majority of hands coming from the lower ranked hand groups. The ‘all hands’ distribution has peaks at group 9 (34%) and at group 7 (20.9%). The ‘showed’ model is also left skewed, but group 9 is lower by 14% and the middle peak is spread across groups 5-7. This is the first evidence that the conditional probability is biased, and it confirms our suspicion that the bias is shifted towards stronger hands. We can obtain range-equity estimates for our ‘all hands’ and ‘showed’ models by multiplying the hand-range distribution by the equity a particular hand has against each Sklansky group and summing the results.

To illustrate the breadth of the bias, we matched the distributions with the hand that generated the greatest bias in range-equity (max), the hand that generated the smallest bias (min), and an average of every hand’s equity against each Sklansky group (average). For CO limps, the range-equity of the showed dataset would be biased by as much as -1.89% if you held 44. That the bias is negative indicates that the ‘showed’ model predicts a stronger hand for the CO and a lower corresponding equity for your hand. Take a look at the equities and it’s not hard to see why 32o generates the smallest bias. If you held 32o, your equity against each group is uniformly bad, so biases in the hand range between the groups change the range-equity very little. Meanwhile, 44 has a greater equity against group 9 (63%) than against any other group. Since the greatest bias (14%) is found in group 9, this should have a large impact on the range-equity estimates. The average bias, or the range-equity bias you would get if you took an average of every hand’s equity against each Sklansky group, is -1.07%.
CO calls MP’s x3bb raise

The most likely holdings for an opponent cold-calling a raise in this situation are drawing hands and playable hands that aren’t good enough for a 3-bet. Groups 3&4 contain mid-pairs (99, 88), big unsuited aces (AQ, AJ) and some nice suited overs (JTs, QJs, KJs, T9s, QTs, 98s, J9s, KTs). The smaller peak around groups 6&7 also contains drawing hands like the pairs 66-22, dangerous overs (AT, KT, QT) and suited connectors (86s, 76s, 54s). If we compare the distribution of the CO cold-call to the distribution of the CO limp, we see that the hump around groups 6&7 is present in both but that our opponent is now more likely to be playing groups 3&4 over the chaff in category 9.
The shift of the ‘showed’ hands distribution towards stronger hands is also quite clear for this game state. It’s almost as if the blue ‘showed’ distribution was nudged one group over from the black ‘all hands’ distribution. The hand match-up resulting in the largest bias is Q9o with -1.95%, the lowest is 32o with -0.03%, and the average is -0.87%.
UTG raises x3bb

Here we have the classic UTG raise x3bb. This is a play that represents strength and our hand range estimates show that. The ‘all hands’ model indicates that 51% of our opponent’s likely hands are in the top 3 groups while the ‘showed’ model tells us that 61% of our opponent’s likely hands are in the top 3 groups. There is a second hump around groups 6-7 showing that our opponent may also be mixing it up with weaker cards when he raises x3bb UTG. The maximum bias of -3.89% comes from holding tens in this spot. This is to be expected since the bias is greatest in group 1, and the over pairs in that group pose the biggest problem for tens. The lowest bias of -0.47% comes from holding aces, and the average bias across all hands is -1.96%.
Button 3-bets MP’s x3bb raise

The Button 3-bet is essentially a more right skewed version of the x3bb UTG open. The ‘all hands’ model tells us there’s a 65% chance that the Button has a hand from groups 1-3 and that percentage rises to 71% if we use the ‘showed’ model. The max range-equity bias of -2.33% comes from holding tens, the minimum of -0.38% comes from holding Aces and the average across all hands is -1.04%.
Weighted average of all game states
Our primary goal here is to estimate the overall bias created by the selection in showed hands. The bias in the range-equity of showed hands is both a function of the game state and the hole cards used in the match-up. Game states occur with varying frequencies. For instance, you might see a ton of opponents limp, but only a few 5-bets pre-flop. In order to get a picture of the overall bias in showed hands, we can take a weighted average of the hand ranges generated by all the game states we observed.
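One way to picture the weighting: each game state contributes its hand range in proportion to how often that state was observed. The sketch below assumes the weights are simple occurrence counts, which is my reading of “weighted average” here. The function name is mine, and the toy ranges use only 3 groups to keep the arithmetic readable; they are hypothetical.

```python
def weighted_average_range(ranges, weights):
    """Average several hand ranges, weighting each by how often its game state occurs."""
    if len(ranges) != len(weights) or not ranges:
        raise ValueError("need one weight per hand range")
    total = sum(weights)
    n_groups = len(ranges[0])
    return [
        sum(w * r[i] for r, w in zip(ranges, weights)) / total
        for i in range(n_groups)
    ]

# Toy example (hypothetical): a common limp and a rare 3-bet, 3 groups only.
limp_range = [0.1, 0.3, 0.6]
three_bet_range = [0.7, 0.2, 0.1]
avg_range = weighted_average_range([limp_range, three_bet_range], weights=[9, 1])
```

Because the limp is seen nine times as often, the averaged range stays close to the limp range, and it still sums to 1.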

The above chart shows the average hand range distribution of the 45,039 game states derived using the ‘all hands’ model and the ‘showed’ model. Multiplying each distribution by the equity of the max (TT), min (AA) and average equity of all hands gives us the range-equity for both models.[1]

The average range-equity bias of pocket tens is highest among hands at -2.38% ± 3.62%, while the lowest bias is for Aces at -0.26% ± 0.68% (using the 95% confidence interval, μ ± 2σ). The average equity of all hands matched with the weighted average of all the game states gives us range-equity estimates of 41.57% for ‘all hands’ and 40.83% for ‘showed’. The resulting range-equity bias for the average hand is -1.34% ± 2.1% and the distribution is displayed in the above chart.
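The μ ± 2σ summary can be reproduced from a list of per-game-state biases. This sketch assumes the bias has already been computed for each game state, and it uses the population standard deviation; whether the figures above used the population or the sample formula is not something I have pinned down here, so treat that choice as an assumption. The function name and input values are toy numbers.

```python
from statistics import mean, pstdev

def summarize_bias(biases):
    """Return (mean, 2 * standard deviation) for a list of per-state biases."""
    mu = mean(biases)
    sigma = pstdev(biases)  # assumption: population standard deviation
    return mu, 2 * sigma

# Toy per-game-state biases (hypothetical), expressed as fractions of equity.
per_state_bias = [-0.02, -0.01, 0.0, -0.03]
mu, half_width = summarize_bias(per_state_bias)
```

The result is read as `mu ± half_width`, matching the way the biases are reported in this section.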
Conclusion
What’s the takeaway, you ask? Let’s begin with the caveats. This analysis was done on the hands of 1 player, and the hand range estimates derived speak only to his play in a particular situation. As of yet, it’s not clear how different strategies affect the bias of observed hands. Also, we do not know how the simplification made in assigning outcome hands to the Sklansky groups rather than to each of the 169 pre-flop hands affects our estimates. Since the model used to predict hand ranges is fairly simple in its construction, further refinement may reduce the estimated bias. These questions require further research.
Assuming that these caveats don’t severely alter our results, our findings indicate that the bias in observed hands is not disastrously large. Running the model on a weighted average of game states and showdown equities produced a bias of -1.34% ± 2.1%. The small size of the bias can be explained in a few ways. First, the bias in range-equity is a function of both the hand range distribution and the showdown equity of the hand matched up against the distribution. If the variation in the showdown equity is relatively low between hand groups, the bias in the hand range estimates has little effect on the resulting range-equity. Hands that come to mind are Aces, which do well against all other hands, and 32o, which is uniformly bad. Second, while a good hand is more likely to go to showdown than a bad hand, a good hand is also played differently than a bad hand. It’s one thing to say players show down better hands. It’s another thing to say that the hand range of a TAG who raises x3bb UTG against a loose table is biased since he is more likely to show down one part of his distribution over another AND a random hand’s equity against the biased parts of his distribution is sufficiently large to change a decision from being +EV to –EV. The implications of the first statement are pretty intuitive. The implications of the second statement are not.
In the coming weeks, I’ll be working to make the model more robust, which includes estimating for all 169 hand categories rather than the nine Sklansky groups, as well as layering on additional input variables. I’ve also got a few statistical tricks in my back pocket which may help correct for a large part of this bias.
Stay tuned…
[1] When we were only looking at one game state, we could subtract the ‘all hands’ range-equity from the ‘showed’ range-equity to derive the net bias. Since our hand ranges are the product of an average of 45,039 game states, we calculate the bias in all 45,039 hands first and then take the mean to get the average bias derived from the model.