A Tale of Two Bootstraps OR How to significantly improve ACBL masterpoint allocations in a way that most everyone will hate…

Team Gawrys won the 2018 Spingold by a convincing 33 IMPs over 60 boards. While I am happy to join everyone else in congratulating Gawrys on their victory, I cannot help but note that I am only about 76% sure that they “should” have won.

There’s a lot of luck intrinsic to the game of bridge. Tournament organizers do as much as they can to minimize the role that luck plays in the game through creative ideas like “Duplicate” or “team” matches. Even so, I think that everyone reading this recognizes that some days the bridge gods smile on you. Others they don’t.

Here’s a practical example: Suppose that you are playing a strong club system. You may very well expect that your board results will be better when you open one of your limited major-suit openings than when you are forced to open a strong club. If you’re lucky and the card gods deal you a lot of 1M openings, you might expect to score better than normal. Alternatively, if you get dealt way more strong club openings, then you might expect your score to suffer.

For kicks and giggles, I decided to use a statistical technique called a bootstrap to analyze a couple of matches from the 2018 Spingold. I chose the 60-board final that Gawrys played against Rosenthal because it had a fairly large spread between the two teams (Gawrys won by over 0.5 IMPs per board). I also chose the semi-final match between Rosenthal and Gupta, which was very close (Rosenthal won by 2 IMPs).

I used a bootstrap to run one million virtual matches that share the same statistical properties as the two matches in question and analyzed the results. (I’ll discuss bootstraps at the close of this posting.) I wanted to count the number of times each side “won” the match, the average margin of victory, as well as the standard deviation. For anyone who cares, the bootstrap took about 5 seconds to run on my Mac.

Gawrys versus Rosenthal

Gawrys win: 761,673 matches
Tie: 6,882 matches
Rosenthal win: 231,445 matches
Mean result: +31.94997 IMPs for Gawrys (Vugraph records seem to be off by 1 IMP in R1)
Standard deviation: 44.1962 IMPs

<Big takeaway: even with a “convincing” win like the one that we saw in the finals, we’d expect Rosenthal to win about 23% of the time and tie about 0.7% of the time>

Gupta versus Rosenthal

Rosenthal win: 513,564 matches
Tie: 7,646 matches
Gupta win: 478,790 matches
Mean result: +2.028112 IMPs for Rosenthal
Standard deviation: 50.98339 IMPs

<Big takeaway: for all intents and purposes, that Rosenthal “win” was the result of a coin toss>
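To put a rough number on “coin toss”: a back-of-the-envelope significance check (my own addition, not part of the original analysis) treats the 60-board margin as approximately normal with the bootstrap mean and standard deviation reported above, and asks how far a 2-IMP edge sits from zero:

```r
# Rough sketch: is a 2-IMP margin distinguishable from a dead heat?
# Uses the bootstrap mean (~2.03 IMPs) and sd (~50.98 IMPs) reported above.
z = 2.028112 / 50.98339          # margin in standard-deviation units
p_two_sided = 2 * pnorm(-abs(z)) # two-sided normal tail probability
z            # about 0.04 standard deviations
p_two_sided  # about 0.97 -- nowhere near significant
```

A margin of roughly 0.04 standard deviations is indistinguishable from noise, which is exactly what the win/loss counts above show.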

FWIW, here are a couple of conclusions that I draw from this analysis:

1. I’d argue that masterpoint allocations should be weighted by our certainty that the correct team won the match. For example, in the case of the Spingold final, Gawrys should receive ~77% of the 1st place award and ~23% of the second place award. Rosenthal should get the converse.
2. The Gupta / Rosenthal match did not run long enough. For KO-type formats, we should insist on a statistically significant margin of victory.
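The weighting in point 1 falls straight out of the bootstrap counts. A minimal sketch, assuming ties are split evenly between the two teams (my assumption; the post doesn’t say how a tie should be credited):

```r
# Share of the 1st-place award for each team, from the bootstrap counts
# for the Gawrys - Rosenthal final. Ties are split 50/50 (an assumption).
wins_gawrys    = 761673
ties           = 6882
wins_rosenthal = 231445
total = wins_gawrys + ties + wins_rosenthal

share_gawrys    = (wins_gawrys    + ties / 2) / total  # ~0.77
share_rosenthal = (wins_rosenthal + ties / 2) / total  # ~0.23
```

Any other tie-handling convention (e.g. crediting ties to neither team) shifts these shares only slightly, since ties are under 1% of the simulated matches.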

Background information on the Bootstrap.

A bootstrap is a statistical technique that uses sampling with replacement to construct new datasets that are not identical to the original but share the same expected moments.

In this example, I entered the board results for the Rosenthal – Gupta match. I sampled with replacement 60 times from the set of board results, creating a new dataset of the same length that consists of nothing but board results from the original match. I repeated this process a million times and calculated summary statistics.

R Code

#Enter match results (board-by-board IMP swings, Rosenthal – Gupta)
foo = c(-7,0,0,6,-1,-3,-1,8,12,4,2,-5,-13,1,-7,0,-13,-13,-1,0,6,1,-5,13,8,-2,5,8,12,-1,0,7,1,-9,0,0,-1,-13,0,12,-13,0,0,0,13,-5,1,3,0,0,-1,-1,0,2,-15,10,0,-2,-1,0)

#Board-by-board IMP swings, Gawrys – Rosenthal
foo2 = c(11,0,-6,7,2,1,1,-3,0,7,0,12,-1,0,0,10,-1,-7,4,11,10,0,-11,0,0,4,-7,-13,0,0,0,0,0,7,-12,0,5,-1,13,-10,-3,-5,5,-7,0,0,-4,10,-3,0,0,0,4,-5,0,0,0,0,0,7)

#Check data entry
cumsum(foo)

#Bootstrap: simulate one million 60-board matches by resampling
#board results with replacement (swap in foo2 for the other match)

my_data = matrix(0, 1000000)

for (i in 1:1000000)
{
  boot = sample(foo, size = length(foo), replace = TRUE)
  bar = cumsum(boot)
  my_data[i] = bar[60] #final margin of the simulated match
}

mean(my_data)
sd(my_data)
length(my_data[my_data > 0])  #wins for one team
length(my_data[my_data == 0]) #ties
length(my_data[my_data < 0])  #wins for the other team
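One quick sanity check on the bootstrap output (my addition, not part of the original script): for a sum of n boards resampled with replacement, the bootstrap standard deviation should come out close to sqrt(n) times the population standard deviation of the board results. A small sketch using a toy vector of board swings (hypothetical data, chosen just for illustration):

```r
# Toy board results (hypothetical, for illustration only)
x = c(-7, 0, 6, -1, 3)
n = length(x)

# Population sd (divide by n, not n-1, since resampling treats x as the population)
sigma = sqrt(mean((x - mean(x))^2))
analytic_sd = sqrt(n) * sigma  # predicted sd of a bootstrapped n-board sum

# Compare against a direct bootstrap of the match total
set.seed(1)
sums = replicate(20000, sum(sample(x, n, replace = TRUE)))
sd(sums)     # should land close to analytic_sd
analytic_sd
```

Running the same check on foo (n = 60) reproduces a value close to the 50.98-IMP standard deviation reported above, which is a reassuring cross-check on the million-iteration loop.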