All comments by Richard Willey
I've seen all sorts of comments / complaints about the way that seeding is currently done… It happens all the time.

I have also heard that the same set of pros are unwilling to invest in the basic type of record keeping that would be necessary to implement such a system. (For example, recording the scores of individual rounds from team games rather than simply noting the aggregate score.)
Oct. 12
I return to my original comment in this thread:

Folks need to decide what the purpose of a rating system is.

1. If you want a rating system that is supposed to be an accurate measure of performance, then you need some way to evaluate its accuracy. And it seems self-evident that you want to be predicting how well people play.

2. If you want a rating system to make people feel good about themselves or convince them to spend more money in order to become a platinum life master or go “Clear” or become “Operating Thetan” or … well, then you're welcome to make up whatever you want
Oct. 12
Sorry to have ignored these questions for a bit… Flu shot hit me kinda hard.

1. With respect to your question about black box algorithms… I don't know what the right answer is. I certainly appreciate that end users are going to want to understand how ratings are generated and may very well be skeptical of black box algorithms. With this said and done, in my experience even simple linear models are far too complicated for the average end user to understand. So, if we're screwed either way, we might as well go down with style.

2. With respect to your question about accuracy:

The approach that I suggest is intended to generate a model that can be used for prediction.

If some set of pairs who are included in the model were to play a session against one another tomorrow, I want to be able to generate as accurate a set of MP scores as possible for those worthies.

As an outgrowth of this effort, it's also possible to create a “rating”. This rating would reflect the score that a given pair would expect to achieve against a random field.
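
To make the first part concrete, here is a minimal sketch (in Python) of turning head-to-head predictions into a predicted set of session percentages for a specific field. `predict_mp_frac` is a placeholder for whatever fitted model ends up providing the pairwise prediction:

```python
def predicted_session_percentages(field, predict_mp_frac):
    """field: the pairs playing tomorrow's session.
    predict_mp_frac(a, b): placeholder for the fitted model's expected
    matchpoint fraction (0..1) for pair a against pair b.
    Returns each pair's predicted session percentage against this field."""
    predictions = {}
    for a in field:
        fracs = [predict_mp_frac(a, b) for b in field if b != a]
        predictions[a] = 100.0 * sum(fracs) / len(fracs)
    return predictions
```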
Oct. 12
Hi Michael,

I took a look at the pages that you are referencing. I'm used to seeing much more explicit testing than what is provided here.

The only thing that is being presented is a set of tables showing that the PR ratings are more accurate in ranking the results of different types of games than Match Point totals. While this is all fine and dandy, it doesn't seem like a particularly hard bar to meet.

From my own perspective, I would much rather see

1. A system that was able to generate predictions about what percentage game people would have, rather than an ordering of results (see the sketch after this list)

2. This information being used to parameterize the model. I don't consider these issues to be minor details. Getting these types of questions correct is critical to producing an accurate model. (And the difficulty in specifying them all is why I suspect that a non-parametric modeling technique is the right way to go.)
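
To illustrate point 1, here is a minimal sketch of the kind of accuracy check I have in mind, assuming we already have predicted and actual session percentages for a field of pairs. The mean absolute error rewards getting the percentages right; the rank correlation only rewards getting the ordering right:

```python
import numpy as np
from scipy.stats import spearmanr

def evaluate_session_predictions(predicted_pct, actual_pct):
    """predicted_pct / actual_pct: each pair's predicted and actual game
    percentage for one session, in the same order. MAE scores the
    magnitudes; Spearman's rho only scores the ordering."""
    predicted_pct = np.asarray(predicted_pct, dtype=float)
    actual_pct = np.asarray(actual_pct, dtype=float)
    mae = float(np.mean(np.abs(predicted_pct - actual_pct)))
    rho, _ = spearmanr(predicted_pct, actual_pct)
    return {"mean_abs_error": mae, "spearman_rho": float(rho)}

# e.g. evaluate_session_predictions([55.2, 48.7, 61.0], [53.4, 50.1, 58.9])
```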
Oct. 11
I am shocked! Shocked, I say, to hear allegations that electronic signalling devices are being used to cheat in card games.
Oct. 11
These are all reasons why it is better to start by rating pairs rather than individual players.
Oct. 10
FWIW, I spent a bit of time thinking about how a bridge rating system might be implemented in this day and age. Here's the approach that I would want to experiment with.

Note that this presumes that

1. The primary goal is to create an accurate ratings system
2. End users are willing to accept a black box model

My natural inclination is to treat this as a hidden variable model

1. We have a whole bunch of categorical data: The names of different pairs
2. We have a whole bunch of board results: What were the results when pair foo1 played pair foo2 at such and such a time

Can we use this to generate a good predictive model? (What result do we expect if pair foo1 played pair foo3, and how accurate are these predictions?)

This problem is compounded by the fact that some boards are naturally flat while others have a lot of opportunity for swings, so we'll want to include the variance of the board results as an input as well as just the pair IDs. (If we wanted to be really sophisticated we could even include data about the hands themselves. Is this hand suitable for a weak NT opening? Is this hand suitable for a strong club opening?…)

I suspect that the best option is to use some of the modern heavy-duty machine learning algorithms. (A recurrent neural network or some kind of dynamic Bayesian network could both be reasonable choices.)

We can train the network on some subset of the data then evaluate the accuracy of its predictions on the remainder…
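
I'm not going to pretend the following is the RNN or dynamic Bayesian network mentioned above; it's a deliberately simple stand-in that shows the shape of the exercise. Each pair gets one hidden strength, the predicted matchpoint fraction on a head-to-head board is a logistic function of the strength difference scaled by that board's volatility, and accuracy is measured on a held-out subset. The names and the data layout are my own assumptions:

```python
import numpy as np

def fit_latent_strengths(results, n_pairs, lr=0.05, epochs=200):
    """results: list of (ns_id, ew_id, volatility, ns_mp_frac) tuples, one
    per head-to-head board: integer pair ids, a per-board volatility weight,
    and NS's matchpoint fraction (0..1) on the board. Learns one hidden
    strength per pair so that the predicted NS fraction is
    sigmoid(volatility * (strength[ns] - strength[ew]))."""
    s = np.zeros(n_pairs)
    for _ in range(epochs):
        for ns, ew, vol, y in results:
            p = 1.0 / (1.0 + np.exp(-vol * (s[ns] - s[ew])))
            g = lr * vol * (p - y)          # gradient of the cross-entropy loss
            s[ns] -= g
            s[ew] += g
    return s

def predict_frac(s, ns, ew, vol):
    """Predicted NS matchpoint fraction for a future head-to-head board."""
    return 1.0 / (1.0 + np.exp(-vol * (s[ns] - s[ew])))

def holdout_mae(results, n_pairs, test_frac=0.2, seed=0):
    """Train on a random subset of the board results and report the mean
    absolute error of the predictions on the held-out remainder."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(results))
    cut = int(len(results) * (1 - test_frac))
    train = [results[i] for i in idx[:cut]]
    test = [results[i] for i in idx[cut:]]
    s = fit_latent_strengths(train, n_pairs)
    return float(np.mean([abs(predict_frac(s, ns, ew, vol) - y)
                          for ns, ew, vol, y in test]))
```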

Presuming that we are able to create a good predictive model, we can then use this to generate a rating scheme. Hypothetically, I could use the predictive model to generate an enormous number of virtual tournaments.

Select 54 pairs at random
Generate a set of boards (actually the variance for a set of boards)
Have the pairs play virtual rounds against one another

Repeat this a very large number of times, calculate each pair's average score across this set of events, and use this as their rating…
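
As a sketch (again in Python), assuming we already have a fitted head-to-head predictor along the lines of the one above:

```python
import numpy as np

def monte_carlo_rating(pair, all_pairs, predict_mp_frac, board_volatilities,
                       rng, n_events=1000, rounds_per_event=26):
    """Rate `pair` by dropping it into many virtual events: draw a random
    set of opponents and a random set of board volatilities, predict the
    pair's matchpoint fraction for each round, and average everything.
    predict_mp_frac(a, b, vol) is a placeholder for whatever head-to-head
    predictor the training step produced (e.g. a wrapper around the
    predict_frac sketch above)."""
    others = [p for p in all_pairs if p != pair]    # needs >= rounds_per_event entries
    event_scores = []
    for _ in range(n_events):
        opps = rng.choice(others, size=rounds_per_event, replace=False)
        vols = rng.choice(board_volatilities, size=rounds_per_event)
        fracs = [predict_mp_frac(pair, o, v) for o, v in zip(opps, vols)]
        event_scores.append(np.mean(fracs))
    return 100.0 * float(np.mean(event_scores))     # rating as an expected percentage
```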
Oct. 10
> All this worry about variance is overblown

I'm not sure that this is true.

I would guess that an awful lot of bridge occurs between members of relatively closed populations. To some extent, this can be compensated for if you have a small number of people who travel between clusters. However, the system is going to be very exposed if these people have good/bad days.
Oct. 10
Speaking strictly for myself, my big issue with Power Ratings is that I don't want to rely on “gut feel” for evaluating the accuracy of a ratings system. Rather, I want a system in which I can backtest the data and use this to evaluate the accuracy of the system.

At a practical level, I want a system that can be used to generate predictions regarding events before they happen. If I have these 27 E/W pairs playing against these 27 N/S pairs, what are my expectations regarding which pair places where, and by how much? How much variance is there in the actual system?

In turn, if I have a system that can do this, it's pretty trivial to generate some kind of “rating” for different pairs.

Related to this, I want to be able to understand spatial and temporal clustering patterns in the data. Imagine a world in which players from the East Coast and players from the West Coast never competed against one another. It's far from clear whether we can safely combine the two pools of players into one ratings pool or, alternatively, whether we need two separate ratings pools. (Hopefully, in the real world, there is enough sloshing between buckets that this isn't a problem; however, it's far from clear whether or not this is true.)
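
The backtest itself can be a very plain walk-forward loop; `fit`, `predict_event`, and `error` below are stand-ins for whatever model and accuracy metric get chosen:

```python
def walk_forward_backtest(events, fit, predict_event, error):
    """events: chronologically ordered list of (event_data, actual_results).
    For each event, train only on what happened before it, predict it, and
    record the prediction error. fit / predict_event / error are stand-ins
    for whatever model and accuracy metric get chosen."""
    errors = []
    for t in range(1, len(events)):
        model = fit([data for data, _ in events[:t]])
        event_data, actual = events[t]
        errors.append(error(predict_event(model, event_data), actual))
    return errors
```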
Oct. 10
Comment 1: Statistical methods work better with large data sets

Comment 2: Ceteris paribus, if your partnership with Rodwell consistently produces better scores than your partnership with Meckstroth, then your partnership with Rodwell should be rated higher.
Oct. 10
From my own perspective, I think that it is a mistake to try and rate individual members of a partnership before “solving” the easier problem of rating partnerships.

FWIW, I had the chance to discuss this topic with Mark Glickman a few years back and he agreed with this claim.
Oct. 10
> I see no reason that having one should require more
> than a man-month of effort by a reasonably competent
> data scientist, and I don't think any new data would
> need to be gathered.

I am reckoned to be a reasonably competent data scientist…

I agree with your estimate regarding the degree of effort required to create a rating system; however, it's important to note that the overwhelming majority of the work on most data science projects involves collecting / validating / cleaning data, not the actual analysis thereof. Equally significantly, the vast majority of the techniques that I would want to use are much, much more accurate when you have relatively large amounts of data available.

As a practical example, in an ideal world, for pairs events I'd want to have the full matrix of scores for each board that was played during an event. This provides much more information for an algorithm to chew on than a summary statistic like the % game that a given pair achieved.
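
As an illustration of what the full matrix buys you: from the complete set of results on a single board you can recover every pair's matchpoint fraction on that board (the sketch below uses standard matchpointing, one point per result beaten and half per tie), whereas the session summary only keeps the average of these across all the boards:

```python
def matchpoint_fractions(board_scores):
    """board_scores: dict mapping NS pair id -> NS score on one board,
    covering every table that played it. Returns each NS pair's matchpoint
    fraction (0..1) on that board."""
    pairs = list(board_scores)
    top = len(pairs) - 1
    fractions = {}
    for p in pairs:
        mp = sum(1.0 if board_scores[p] > board_scores[q]
                 else 0.5 if board_scores[p] == board_scores[q] else 0.0
                 for q in pairs if q != p)
        fractions[p] = mp / top if top else 0.0
    return fractions
```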

You might be able to use tournament results from online events on BBO to prototype your efforts. BBO is in a good position to collect lots of data at a relatively low cost. Use this information to test various approaches. Compare the accuracy of a system that operates on a complete set of board results with one that is “only” able to see what score a given partnership achieved. You can then use this information to make an informed decision regarding the speed with which a ratings system is able to converge onto something relatively accurate and whether you want to start collecting more comprehensive data in the real world.
Oct. 9
FWIW, I agree that there doesn't need to be any rush to remove ratings from players who have passed on. With this said and done, does anyone know whether Power Ratings are comparable across relatively long periods of time?

Suppose one team of players is rated as a “68” in year 1 and a different team is rated as a “68” in year six. If we transported the first team five years into the future, would we expect the two teams to tie one another?
Oct. 9
Pray tell, Randy, if any of this were true, why do you bother tithing to Memphis?

If players don't care about rating systems, masterpoints, and the like, why does a tightwad like yourself bother to send $$$ to the ACBL for sanction fees?
Oct. 9
Four quick observations:

1. Jeff's post points to what is (probably) the biggest issue with developing a ratings system; that being deciding what the purpose of a ratings system actually is.

I think that there is an enormous disconnect between folks who want an accurate ratings system and those who are looking at this as a marketing gimmick.

2. If folks actually want an accurate ratings system then go out and hire someone with a good background in Statistics or Machine Learning.

3. If the ACBL is serious about creating an accurate ratings system, then they really need to start by improving the way in which they collect data. Ideally you want a system that is recording the results of every board that gets played.

4. If this is just some new improved marketing ploy, make up whatever set of lies you want…
Oct. 9
Thanks for the quick reply (I had somehow misconstrued the original statement and assumed that the new committee would be established to look into this. It feels kind of weird to delegate this to a senior group rather than a group of experts).

I personally would want to make sure that folks from Lovebridge, BBO, and the like were the ones creating this proposal…
Oct. 8
I would be quite interested in understanding who comprises the management committee for the World e-Bridge Championship.
Oct. 8
> Do the producers have any experience working with
> Amazon, Netflix, or YouTube in delivering the movie to a wide audience?

Did you bother searching any of these venues to see how the movie is being distributed before writing this?

I personally suspect that the producers have at least some experience distributing content via Amazon Prime…
Oct. 2
No Don…

I am merely pointing out that your claim that “there are bridge playing multi millionaires” willing and able to fund projects like this seems flawed.

And, while I share your disgust with the competency of the ACBL (as well as the organizational structure that is trying to support the game of bridge), I think that your desire to bash ACBL management is leaving you a bit out over your skis on this one.

Once again, if this movie is successful in convincing ACBL management how badly FUBARed things are right now this could easily be money well spent.
Oct. 2
> there are many bridge-playing multi millionaires.
> That's who could have supported it.

In theory sure…

Hell, in THEORY the producers might have done this all for free out of the goodness of their hearts.

In practice? Well, that's a whole different ball of wax.

FWIW, my understanding is that Gates agreed to pony up some serious $$$ a few years back if folks could bring him a worthwhile project. No one was ever able to do so… Given the track record of the ACBL and the like in managing projects, I don't find this at all surprising.
Oct. 2