Conditional probabilities and Bayes' Theorem

Recently there have been several threads that include debates about how to apply probability theory in certain situations. Since I am too lazy to give a full mathematical exposition in each such thread, I have decided to write this separate article instead. I will present some general mathematical tools and also some examples for their application.

This is only supposed to be a brief overview, so I will not bother to give detailed definitions for random experiments and other terminology - you will easily find definitions in mathematical textbooks on the subject. However, one concept that is crucial for everything that follows is the so-called conditional probability, and it is necessary to address this concept first.

A conditional probability is the probability of an event Y under a condition X; we will henceforth use the notation P(Y|X). In simple language it means restricting our analysis to only those cases where the situation X is present and then discussing probabilities under this restriction. (In contrast, the "pure" probabilities P(Y) are derived from an initial state - in bridge only the assumption that the hand was shuffled and dealt properly - without further restrictions.)

It is important to understand that, when we speak of probabilities in bridge, we usually mean probabilities with respect to the state of knowledge at a given time, even if we do not say so. Our state of knowledge may be viewed as a sequence of conditions X_1, X_2, ..., where each X_i is more precise than the previous one X_(i-1). In that sense, the probability of an event actually depends on which X_i we base our analysis on.

In fact, the probability for an event Y evolves as a sequence P(Y|X_1), P(Y|X_2), and so on. The most accurate probability is always the one with the most recent state of knowledge. As a consequence, the odds for a certain event can change in time. A typical example is the application of the Theory of Vacant Spaces when we are interested in the location of a specific card: Initially the probability that one particular defender holds the card in question is 1/2, but if we learn more about the defenders' hands, the odds will change, sometimes significantly.

When we discuss probabilities in bridge hands, we tend to use sloppy language. This has often led to misconceptions and errors about probabilities in specific cases. In this article I will introduce some precise mathematical tools instead and explain how they can be applied to avoid such misconceptions.

I will describe the theory only from the perspective of declarer (the methods can essentially be used for the defenders as well). Suppose we have to make a critical decision at a certain point; let's say there are several possible layouts, and the success of our choice will depend on which of those layouts is present.

We denote the layouts by Y_1, Y_2, etc. What we are interested in are the conditional probabilities P(Y_i|X), where X is the state of the hand at the time the decision must be made; this state X reflects our knowledge about the hand we have accumulated. One key tool is Bayes' Theorem which tells us that we can compute the probabilities as follows:

P(Y_i|X) = P(X|Y_i) * P(Y_i) / P(X).

This does not seem like a major improvement because the right-hand side contains another conditional probability P(X|Y_i). This expression P(X|Y_i) can be understood as the probability that the play would have unfolded as it did if the layout Y_i were present. This probability does not come from the randomness of layouts due to the shuffling of cards, but from the choices the defenders would make.

The second critical element in the above formula is the pure probability P(X). It describes the probability that we would arrive at X following our line of play without any restrictions on the layout in the defenders' hands. Following the Law of Total Probability, it can be computed as follows:

P(X) = P(X|Y_1) * P(Y_1) + P(X|Y_2) * P(Y_2) + ...,

where the sum is taken over all possible layouts.
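The two formulas above amount to a single normalization step. Here is a minimal Python sketch of that step, not a definitive implementation; the function name `posterior` and the use of exact fractions are my own choices for illustration.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return P(Y_i|X) for every layout Y_i.

    priors[i]      is P(Y_i), the a priori probability of layout Y_i
    likelihoods[i] is P(X|Y_i), the chance that the play unfolds as X under Y_i
    """
    # Law of Total Probability: P(X) = sum of P(X|Y_i) * P(Y_i)
    joint = [Fraction(l) * Fraction(p) for l, p in zip(likelihoods, priors)]
    p_x = sum(joint)
    if p_x == 0:
        raise ValueError("X is impossible under every listed layout")
    # Bayes' Theorem: P(Y_i|X) = P(X|Y_i) * P(Y_i) / P(X)
    return [j / p_x for j in joint]
```

Since P(X) is built from the same products that appear in the numerators, the priors only need to be correct relative to each other; any common factor cancels in the division.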

A few remarks about the wording "all possible layouts" are in order. There are 10400600 ways to distribute the 26 missing cards among the two defenders. It is not practical in the least to compute P(X) as a sum of ten million separate probabilities. Fortunately, the computation can be simplified.

Sometimes we are not interested in the full layout, only in the location of a few specific cards or the like, for instance some missing honors. A priori there are 2^n possible layouts Y_i for n cards if we ignore the rest of the hand. Now, technically we must bear in mind that not all these layouts are equally likely, so the terms P(Y_i) are not equal. However, if n is very small, the rounding errors are small, too, so it is plausible to just enter the same value for each P(Y_i). For example, for n=2 the Theory of Vacant Spaces tells us that either 1-1 break is more likely than either 2-0 break (26% versus 24% each).
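The numbers quoted above can be checked by counting deals. The following is a rough sketch under the assumption that nothing is known yet about the 26 unseen cards; the function name `split_probability` and its parameters are hypothetical, chosen only for this illustration.

```python
from math import comb

def split_probability(cards_with_lho, missing=2, unknown=26, hand=13):
    """Probability that one specific set of `cards_with_lho` of the `missing`
    cards lies with LHO while the other missing cards lie with RHO."""
    rest = unknown - missing
    # LHO's remaining places are filled from the cards we do not care about.
    return comb(rest, hand - cards_with_lho) / comb(unknown, hand)

total_layouts = comb(26, 13)     # ways to split 26 cards into two hands of 13
one_one = split_probability(1)   # one specific 1-1 break
two_zero = split_probability(2)  # both cards with LHO
```

Here `total_layouts` is 10400600, `one_one` comes out at 0.26 and `two_zero` at 0.24, so treating the four layouts as equally likely (0.25 each) is off by one percentage point per layout.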

As the hand develops, the rounding errors can grow, so near the end of a hand one should reconsider whether the above simplification is still sensible. Still, it is a reasonable starting point.

What is more important is that all the terms corresponding to layouts we have already found impossible in the earlier play can be removed from the equation. For example, after eight tricks the defenders have only 10 cards left, so there are at most 252 remaining layouts (the number of ways to split 10 cards 5-5), in practice far fewer, and those can usually be put in groups conveniently. The sum over all P(X|Y_i)*P(Y_i) is thus reduced to something manageable.

Another important aspect: We tend to denote spot cards simply by x's, and occasionally these x's are mishandled in the subsequent analysis. In general, the sound technical way to handle the spot cards is not to use x's in the first place. If we are dealing with a situation where the spots truly don't matter, that fact is reflected somewhere in the equation.

For instance, suppose we cash a winner and an opponent follows suit. Do not think of it as an x. Treat it as a specific card. If we are later discussing a layout Y_i where the player had three small cards, the factor 1/3 will occur in the term P(X|Y_i), because the player had a choice between three (presumably) equal cards to play. Executing all the computations with x's instead of real cards may work in some cases (and typically appears easier), but sometimes it does not work and leads to the wrong numbers in the end.

Enough theory, let's do this for a practical example - the most popular example, in fact, namely the classic Restricted Choice scenario. Suppose we have A10964 in our hand, K852 in dummy (observe that I avoid using x's), and we want to play the suit for zero losers. When we cash the king, RHO follows with the three and LHO with the queen. We order a second round from dummy, and RHO plays the seven. Should we take the finesse or play for the drop?

A preliminary analysis shows that there are three relevant layouts where LHO drops an honor in the first round and our choice in the second round matters, namely LHO holding the stiff jack, the stiff queen or both honors. The "standard" argument now is that the second round finesse will win in two of these layouts and lose only in the third. Therefore, the odds favor the finesse by 2:1.

The argument is correct to some extent, but it is essentially based on the assumption that our LHO will pick the honor he plays to the first trick randomly if he has them both. The technically correct way to make the decision is the following:

Right before we make our decision - this is the state X in Bayes' formula - there are only two possible layouts left, namely (originally) Q opposite J73 or QJ opposite 73. The case of J opposite Q73 is no longer relevant because we have already seen LHO play a card which is inconsistent with this layout. We denote the first layout Q opposite J73 by Y_1 and the second layout QJ opposite 73 by Y_2.

Let us assume that LHO does in fact choose his card from QJ completely at random. In that case we have P(X|Y_2) = 1/2 because we would arrive in our situation X only half the time (in half the cases we would see LHO following with the jack instead). At the same time we have P(X|Y_1) = 1 since LHO does not have a choice to make at all with the stiff queen. All other layouts Y_3, Y_4, ... in this suit are no longer relevant; for example, we cannot arrive at X from an initial holding of Q7 opposite J3. In our formula this can be seen as P(X|Y_i)=0 for all further Y_i.

This leaves:

P(X) = 1 * P(Y_1) + 1/2 * P(Y_2)

and

P(Y_1|X) = 1 * P(Y_1) / (1 * P(Y_1) + 1/2 * P(Y_2)).

Now we make the previously mentioned simplification that all P(Y_i) are equal and obtain:

P(Y_1|X) = 1 / (1 + 1/2) = 2/3.

In words, the queen is bare in LHO's hand 2/3 of the time. As we can see, the original argument yields the correct result in this case.

But what happens if LHO always follows up the line? In that case we still have P(X|Y_1)=1, but now the second conditional probability is P(X|Y_2)=0 - LHO would never play the queen if he also had the jack. Hence we obtain:

P(X) = 1 * P(Y_1) + 0 * P(Y_2) = 1 * P(Y_1)

and

P(Y_1|X) = 1 * P(Y_1) / (1 * P(Y_1)) = 1.

Not surprisingly, the finesse is certain to work if LHO is known to follow up the line and plays the queen in the first round.
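Both results are special cases of one computation in which LHO's habit with QJ is a parameter. The following is a small sketch; the function name and the parameter `q_from_qj` are hypothetical labels of mine, and the equal priors are the simplification made above.

```python
from fractions import Fraction

def stiff_queen_probability(q_from_qj, prior_q=1, prior_qj=1):
    """P(Y_1|X): LHO started with the bare queen, given that he played the queen.

    q_from_qj is the assumed probability P(X|Y_2) that LHO plays the queen
    when he holds both queen and jack. prior_q and prior_qj are the relative
    a priori weights of Y_1 and Y_2 (equal by default).
    """
    y1 = Fraction(prior_q)                          # P(X|Y_1) = 1
    y2 = Fraction(q_from_qj) * Fraction(prior_qj)   # P(X|Y_2) * P(Y_2)
    return y1 / (y1 + y2)

random_lho = stiff_queen_probability(Fraction(1, 2))   # 2/3
up_the_line = stiff_queen_probability(0)               # 1
```

If we drop the equal-priors simplification and nothing is known about the other suits, a specific 1-3 layout and a specific 2-2 layout have relative weights 11:12, and `stiff_queen_probability(Fraction(1, 2), 11, 12)` gives 11/17, roughly 65%, instead of 2/3. The finesse is still the clear favorite.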

As you can imagine, LHO's strategy will also affect our probabilities if he plays the jack in the first round. Let us denote by Y_3 the holding of J opposite Q73. One can easily check that the finesse is still a 2:1 favorite if LHO chooses his card randomly and produces the jack; the computation is completely analogous (this time Y_1 is the impossible layout) and yields P(Y_3|X)=2/3.

However, if LHO follows up the line again, he always plays the jack from QJ, so we have P(X|Y_2)=1 in addition to P(X|Y_3)=1, and we get:

P(Y_3|X) = 1 * P(Y_3) / (1 * P(Y_2) + 1 * P(Y_3)) = 1/2,

assuming again for simplicity that all P(Y_i) are equal. This should not come as a surprise; if LHO always plays the jack if he has it, we cannot deduce anything about the queen, so we end up with a coin-flip decision.
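The random case and the up-the-line case fit into one formula. As a sketch, write p for the probability that LHO plays the jack when he holds QJ (the symbol p is my own notation for this illustration) and assume equal priors as before:

```latex
P(Y_3 \mid X)
  = \frac{1 \cdot P(Y_3)}{p \cdot P(Y_2) + 1 \cdot P(Y_3)}
  = \frac{1}{1 + p},
\qquad
p = \tfrac{1}{2} \;\Rightarrow\; \tfrac{2}{3},
\qquad
p = 1 \;\Rightarrow\; \tfrac{1}{2}.
```

The more reliably LHO plays the jack from QJ, the less the appearance of the jack tells us about the queen.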

If we believe our opponents are players who always play random cards from equals, the computation via Bayes' formula seems to be unnecessarily complicated. If the approach of counting initial holdings works as well, why do we take the long route? Well, apart from my belief that it is usually better to know the full theory than just half of it, it appears that the "shortcut" does not always work.

Consider the following scenario. With AJ1098 in dummy and 42 in hand, we play small towards dummy and finesse; East wins the king. What are the odds that the second finesse works? For the sake of completeness I must give you the cards LHO plays, so let us say he contributes the five to the first trick and the six to the second trick.

Using the shortcut we can argue as follows: The case that West has both honors can be ignored, and of the remaining three layouts (East holding the king, the queen or both), the second finesse works in two out of three cases. Therefore, the odds are 2:1 that LHO holds the queen. Note that this is no longer about finding bare honors in a hand since there are more outstanding spot cards than in the previous example; it is merely about the location of the two honors in the defenders' hands.

As before, the shortcut treats king and queen as equals, but in practice they might not be for RHO. Let us denote the honor holding as follows: Y_1 means RHO holds only the king, Y_2 means RHO holds the king and the queen, Y_3 means RHO holds only the queen. Now, RHO has no reason to duck with both honors, but he may be tempted to duck with only one honor.

Of course, it is necessary to see his holding in context of a full deal to decide how likely it is that RHO would duck. Imagine, for example, that we are playing a notrump contract, the given suit is an important source of tricks and dummy has no side entry. Ducking may be necessary from RHO's point of view to cut off the long tricks in dummy, so in contrast to the previous RC example we cannot be sure about P(X|Y_1)=1 or P(X|Y_3)=1.

Furthermore it may be more difficult for RHO to duck with the queen than with the king. Unless RHO must urgently win and shift to a different suit, ducking with Kxx is probably risk-free. On the other hand, ducking with Qxx is extremely risky; after all, declarer could have decided to play a first-round finesse with Kxx in his hand.

It is hard to give precise numbers, but P(X|Y_1) seems to be somewhat smaller than P(X|Y_3) and probably considerably smaller than 1. In such cases - when cards appear to be equal from declarer's point of view but may not be equals in the eyes of the defenders - merely counting initial holdings can be misleading.
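To see how strongly RHO's ducking habits can move the odds, here is a rough sketch. All numbers fed into it are hypothetical; the function name and parameters are mine, and I assume equal priors for Y_1 and Y_2 and that RHO never ducks with both honors.

```python
from fractions import Fraction

def second_finesse_probability(win_with_k_only, k_from_kq=Fraction(1, 2)):
    """P(LHO holds the queen | RHO won the first finesse with the king).

    win_with_k_only: assumed P(X|Y_1), the probability that RHO takes the
                     king at once when he holds it without the queen
    k_from_kq:       assumed probability that RHO wins with the king rather
                     than the queen when he holds both (this is P(X|Y_2),
                     since he never ducks with both)
    Y_3 (queen only) is excluded because RHO has shown the king.
    """
    y1 = Fraction(win_with_k_only)
    y2 = Fraction(k_from_kq)
    return y1 / (y1 + y2)

never_ducks = second_finesse_probability(1)                    # 2/3
ducks_half_the_time = second_finesse_probability(Fraction(1, 2))  # 1/2
```

If RHO always wins the king when he has it, we recover the 2:1 shortcut. If he would duck with the lone king half the time, the second finesse is only an even-money proposition.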

The following example is taken from another post, but I have modified the deal slightly to make my point. (Originally declarer had only 12 tricks in six spades, and East had an important decision to make at trick two. I wanted to take this element out of the equation.)

North: S KJ63, H K652, D A83, C A7
West: S (void), H J1098743, D 95, C K1064
East: S 10984, H (void), D J762, C J9852
South: S AQ752, H AQ, D KQ104, C Q3

Contract: 7NT by South. Opening lead: jack of hearts.

South is declarer in 7NT. Let us assume he has shown 5-2-4-2 shape in the bidding, so that the defenders know which suits they can safely discard. West leads a heart, and the play goes as follows: Declarer cashes his winners in the majors and the ace of clubs; West gives up two clubs and three hearts on the run of the spades, East throws four clubs. No diamonds have been discarded.

After king and ace of diamonds at tricks ten and eleven, a small diamond from dummy is led; East follows small again. At this point, South has seen 23 of the 26 cards the defenders held originally, and he knows that the last heart is in the West hand. What he does not know is whether West has the king of clubs and East the jack of diamonds (as shown above), or vice versa.

If you now start counting original holdings of Jxxx in diamonds, holdings of Kxxx in clubs and so on, or if you compute the percentages of East holding 4-0-4-5 shape as opposed to 4-0-3-6 shape, you can easily go astray. Initially there were 10400600 possible distributions of the 26 missing cards, and all the percentages you can come up with apply only to the initial state of knowledge with all 10400600 layouts still possible.

After East has followed to the small diamond from dummy, there are only two layouts left. Let Y_1 denote the layout shown in the diagram and Y_2 the other layout, with the jack of diamonds and the king of clubs exchanged. As before, the state of the hand that has been reached shall be denoted by X.

For all other layouts Y_i, we have P(X|Y_i)=0, so we can ignore them. Note that it is pointless to count holdings of, say, Jxxx in diamonds. There are no x's on the cards. West has shown up with 95 of diamonds, East with 762. The only holdings that are still possible are J95 opposite 762 and 95 opposite J762. The same holds for the club suit.

Since we are talking about complete layouts, in this case we have P(Y_1)=P(Y_2)=1/10400600, so it remains to determine P(X|Y_1) and P(X|Y_2). It was our initial assumption that both players know perfectly well which suits to discard. As long as they had to follow suit, they had no relevant choices to make. The player with the jack of diamonds had to keep it at all costs; the player with the king of clubs had to keep it at all costs. Therefore, we obtain:

P(X|Y_1) = P(X|Y_2) = 1

and

P(Y_1|X) = P(X|Y_1) / (P(X|Y_1) + P(X|Y_2)) = 1/2.

We change the layout in the diamond suit:

North: S KJ63, H K652, D A83, C A7
West: S (void), H J1098743, D 105, C K1064
East: S 10984, H (void), D J762, C J9852
South: S AQ752, H AQ, D KQ94, C Q3

Contract: 7NT by South. Opening lead: jack of hearts.

The play goes exactly as before. At trick eleven, declarer can see the fall of the ten of diamonds, so he must decide if it was bare or from J10. As before, we denote by Y_1 the given layout and by Y_2 the alternative layout where West has the jack of diamonds and East the king of clubs in return.

Again we have P(X|Y_1)=1, since neither defender had much of a choice up to this point. However, if West chooses randomly between the jack and the ten of diamonds when he holds both, we obtain P(X|Y_2)=1/2. This gives:

P(Y_1|X) = 1 / (1 + 1/2) = 2/3.

Hence the finesse is the better line with 2:1 odds. This is once more the typical RC result.

In the preceding examples, I was always assuming that signalling is not an issue for the defenders. If one player sees fit to signal his holding in a critical suit (e.g. the possession of the king of clubs), this will clearly alter the odds.

Even under this assumption, you could say that my argument about the players having no choices to make is wrong. When following suit, the defenders must decide in which order to play their spot cards. There is some truth in it.

For example, East had to play five clubs over the first nine tricks. If he knows those clubs to be equals, he can play them in 120 different orders, so the factor 1/120 should show up in each of the expressions P(X|Y_i). Fortunately, this factor will occur identically in every such expression - assuming that the possession of the unknown cards, in particular the king of clubs, has no impact on the player's decision in which order to play his cards.

If the factor 1/120 is the same in each P(X|Y_i), they will completely cancel each other out, so the rest of the computation remains unchanged. The same holds for the order of the play of spot cards in every suit.
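The cancellation is easy to verify numerically. This short sketch uses the values P(X|Y_1)=1 and P(X|Y_2)=1/2 from the deal with the ten of diamonds, together with equal priors; the helper `normalize` is a hypothetical name of mine.

```python
from fractions import Fraction
from math import factorial

def normalize(weights):
    """Divide each weight P(X|Y_i) * P(Y_i) by their sum P(X)."""
    total = sum(weights)
    return [w / total for w in weights]

orderings = factorial(5)          # orders in which five equal clubs can be played
common = Fraction(1, orderings)   # the factor 1/120 shared by every P(X|Y_i)

without_factor = normalize([Fraction(1), Fraction(1, 2)])
with_factor = normalize([common * 1, common * Fraction(1, 2)])
```

Both lists come out as 2/3 and 1/3, so including or omitting the common factor makes no difference to the final odds.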

Another modification:

North: S KJ63, H K652, D A83, C AK
West: S 5, H J109873, D 95, C Q1064
East: S 10984, H (void), D J762, C J9852
South: S AQ72, H AQ4, D KQ104, C 73

Contract: 7NT by South. Opening lead: jack of hearts.

As you can see, declarer's side now holds the king of clubs as well (it is in dummy), so if we pretend once again that declarer's shape is known from the bidding, all the club cards in the defenders' hands are worthless (and they know it). I took away declarer's long spade in return so that the trick count is right again.

The changed club situation affects the probabilities in our computation. What are the odds that the defenders played the clubs as they did (I am referring only to the choice of cards, not their order)? In the above layout Y_1, East still had no choice, but West had to decide which club to keep. If they are all equal to him, he will keep the queen in one out of four cases, so we have P(X|Y_1)=1/4.

Likewise, we can wonder what the odds for East are to keep the queen from QJ9852. If he treats all six clubs as equals, the probability is 1/6, and this is what we get for P(X|Y_2) - in this layout West has no choice to make. This leads us to

P(Y_1|X) = 1/4 / (1/4 + 1/6) = 3/5.
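The same arithmetic in exact fractions, as a small sketch. It assumes, as in the text, equal priors and that each defender treats all his clubs as equals; the helper name `keep_one_of` is hypothetical.

```python
from fractions import Fraction

def keep_one_of(n):
    """Chance that one particular card is the one kept when n cards are
    treated as equals and all but one of them are played."""
    return Fraction(1, n)

p_x_y1 = keep_one_of(4)   # West keeps the queen from Q1064
p_x_y2 = keep_one_of(6)   # East keeps the queen from QJ9852
p_y1 = p_x_y1 / (p_x_y1 + p_x_y2)   # P(Y_1|X) with equal priors
```

Here `p_y1` evaluates to 3/5. The ratio depends only on the club lengths: a defender with fewer equal cards is more likely to end up holding any particular one of them.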

Hence the diamond finesse is a favorite to succeed. If you do the numbers carefully, you may get the same result from counting holdings where the clubs (and the small diamonds, etc.) are all treated as x's. I prefer to think in terms of conditional probabilities, but that's something everyone has to decide for themselves.
