All comments by Richard Willey
Thanks for preparing/presenting this.

Need to spend a bit of time re-reading and thinking about it.
Oct. 24
I tend to use a slightly different scoring system when using minibridge.

Declarer can look at their hand and dummy and decide whether to play

1NT
two of either major
three of either minor
Any game
Any slam

And will get the appropriate plus score if successful
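
For concreteness, here is a minimal sketch of how that plus score could be computed under standard duplicate scoring; the function and constant names are my own, and penalties for failing contracts aren't modelled.

```python
# Duplicate-style scoring for the minibridge variant described above:
# declarer names a contract and, if enough tricks are taken, receives
# the normal plus score for it.
TRICK_VALUE = {"C": 20, "D": 20, "H": 30, "S": 30, "NT": 30}

def plus_score(level, strain, tricks_taken, vulnerable=False):
    """Return declarer's score if the chosen contract makes, else 0."""
    needed = 6 + level
    if tricks_taken < needed:
        return 0                                   # contract failed; penalties not modelled

    trick_score = level * TRICK_VALUE[strain]
    if strain == "NT":
        trick_score += 10                          # first NT trick is worth 40

    score = trick_score
    score += (500 if vulnerable else 300) if trick_score >= 100 else 50   # game / partscore bonus
    if level == 6:
        score += 750 if vulnerable else 500        # small slam bonus
    elif level == 7:
        score += 1500 if vulnerable else 1000      # grand slam bonus

    score += (tricks_taken - needed) * TRICK_VALUE[strain]                # undoubled overtricks
    return score

# e.g. choosing 3NT and taking ten tricks, not vulnerable: 90 + 10 + 300 + 30 = 430
assert plus_score(3, "NT", 10) == 430
```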
Oct. 22
That a good score and a plus score are not the same thing
Oct. 22
1NT - 2 1N = balanced
3 - 3 3 = 3=2=3=5 shape
3N - 4 3N = 6 slam points
4N - 5 4N = Controls in C/S/D, no heart control
5 - 5 5 = no second control in Clubs
5 5 = no second control in Spades

At this point in time, you know that the 1NT bidder has 3 Aces
You aren't going to be able to ask about the 10 of Clubs and still sign off in 6N

There was the option to bid 4 instead of 3N which would have been RKCB for Clubs…
Oct. 21
yada yada yada relay sequence yada yada
Oct. 21
> We are talking about a rating system for a
> minority hobby not major league baseball.

If it's worth doing, it's worth doing right

Part of the reason that things are as fubared as they are is a never-ending series of decisions to half-ass things…
Oct. 18
> I agree with your above comment but please expand pretend
> I put u in charge of this task.
> Estimated time before ready?

The first thing that I would say is that while this is a good question, neither I nor (I think) anyone else is in a good position to answer it.

The first stage of any such project needs to be some up-front work to provide reasonable estimates of cost, time requirements, etc.

>Would you test before rolling out and if so example of test.

I think that there needs to be a validation stage before any significant investment gets made.

The “expensive” part of an endeavor like this one will be the work required to collect, clean, and store data. Developing the actual algorithms will be relatively easy in comparison. So, I would start by looking for a representative dataset that is already available that folks can experiment with and then submit their algorithms for some sort of “bake off”.

At this point in time, it looks like there are three different data repositories that might be good to work with.

1. ACBL Online games on BBO (non robot)
2. The board results that the EBU used for generating its own NGS ratings
3. The hands that get used for the “Common Game”

I would proceed as follows

1. Make a year's worth of data available to people to develop their algorithms. (Give people roughly six months to do this work.)
2. Provide the researchers with the “next” six months' worth of data to use as a validation set. Let folks fine-tune their methods.
3. Test the algorithms against the next 2-3 months' worth of data.

Evaluate which of the approaches is doing best, by how much, and (potentially) whether any additional complexity in whatever approach wins yields a significant improvement in the accuracy of the results. A rough sketch of this chronological split follows.
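
Something along these lines, assuming the board results arrive as one row per pair per session with a session date (the pandas layout, column names, and function name are my own assumptions, not a description of any actual export):

```python
import pandas as pd

def chronological_split(results: pd.DataFrame, date_col: str = "session_date"):
    """Split board results into training / validation / test sets by date."""
    results = results.copy()
    results[date_col] = pd.to_datetime(results[date_col])
    start = results[date_col].min()

    # first 12 months: training data for developing the algorithms
    train = results[results[date_col] < start + pd.DateOffset(months=12)]
    # next 6 months: validation data for fine-tuning
    valid = results[(results[date_col] >= start + pd.DateOffset(months=12)) &
                    (results[date_col] < start + pd.DateOffset(months=18))]
    # following 2-3 months: held-out test data for the bake off
    test = results[(results[date_col] >= start + pd.DateOffset(months=18)) &
                   (results[date_col] < start + pd.DateOffset(months=21))]
    return train, valid, test
```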

> Lastly to be worldwide would this be easy or would it be
> easier just to focus on ACBL and, once working, then expand.

Once again, the difficult issue here involves data collection. I suspect that the larger the number of Zonal organizations that one is dealing with, the more difficult this becomes.

If I could wave my magic wand, I'm not sure that I would even start with the ACBL. I think that the USBF team trials, major events like the Vanderbilt and the Spingold, and maybe the Bermuda Bowl and the like might be a better way to proceed.

1. You have orders of magnitude fewer pairs to worry about (and, along with this, fewer pair-versus-pair comparisons)
2. Competitors tend to belong to fixed partnerships
3. You have far fewer boards to worry about
4. Folks care more about seeding and issues like that
Oct. 18
I'm not sure that detailed discussion regarding technical design is the right way to go at this point in time. This is an implementation detail and the last thing that you need to worry about.

It's much more important to get agreement around:

1. What are the goals of the project (why do you actually want a rating system?)
2. How are you going to evaluate whether a proposed solution successfully meets these goals

When I see software projects fail, it is almost inevitably because people weren't aligned around requirements, not because the developers couldn't produce appropriate code.
Oct. 18
@Barry

The major cost in rolling out any kind of rating system is going to be collecting and collating data. If the ACBL were to migrate to the NGS system, it would require precisely the same up-front investment. I find your claims that the marginal cost of using a different algorithm would be millions of dollars completely ridiculous.

For all I know, people in the UK like the NGS. That doesn't mean that it can't be improved upon.

I think that it's possible to do some very low-cost trials to evaluate the relative accuracy of the NGS algorithm compared to a more modern approach.

Starting to wonder why you are so opposed to this idea, especially when you start suggesting that the ACBL should pay licensing fees for the NGS code.
Oct. 18
Randy, you are really grasping at straws here

> Can you say lawsuit by players who can still play
> elsewhere but are banned in ACBL.

First and foremost, you are conflating the existence of a rating system with a judicial process to ban some hypothetical pair. I don't see any such link…

> How would you seed Funtune? They have a WBF ranking
> but can't play in ACBL land?

If Fantunes is banned from playing in ACBL land then the whole question of seeding them becomes moot. If we lived in some hypothetical world in which

1. Pairs like Fantunes were convicted of collusive cheating but were still allowed to play
2. We had a rating system

Then one could simply re-run the numbers, excluding those events in which Fantunes was convicted of cheating (or even those in which they were believed to be cheating)

> Easy claim their personal ratings hurt by this system and is costing them money.

In order to bring such a suit, they would need to demonstrate that their ratings were artificially low for some reason and that this was costing them $$$. I suspect that there are far easier ways for such a hypothetical pair to bring a lawsuit.

Do you seriously believe that some pair is going to make the claim:

We were convicted of cheating in event foo. The powers that be aren't using those results in their ratings system, which hurts our seeding and is therefore costing us money?

If you're going to sue, sue based on the fact that being stripped of your titles is costing you money… MUCH easier to prove.

> Do you think WBF will honor your system or say we already
> have a seeding points process and not looking to change ours.

I suspect that this all depends on the accuracy of the competing approaches
Oct. 18
We could do this, but

1. Designing and building one off hardware is expensive
2. Shipping these tables all over the place is even more so (look at what the WBF spends shipping screens around)

If you want to go down this path, having people play on tablets is the way to go, especially since you can physically separate players as well…
Oct. 17
If you are in the world of non parametric modelling, you let the Neural Network or the Dynamic Bayesian Network or whatever figure out what is what.

If you are in a world of parametric modeling, you need to specify how you want to go and treat this.

In either case, this will almost certainly include a mixture of

1. Assigning a provisional rating to the unknown pair (or the pair that contains an unknown player)

2. Decreasing the level of certainty that you have regarding the provisional pair's rating, which means that

A. The ratings change of the unknown pair will be more dynamic
B. The ratings change of the “known” pair should be less dynamic
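
To make that concrete, here is a small Glicko-flavoured sketch of my own (not any federation's actual algorithm): each pair carries a rating plus an uncertainty, and after a comparison the pair we know less about moves more.

```python
from dataclasses import dataclass

@dataclass
class PairRating:
    rating: float = 50.0     # expected matchpoint percentage
    sigma: float = 10.0      # how unsure we are about that rating

PROVISIONAL_SIGMA = 20.0     # an unknown pair starts with a deliberately wide sigma

def update(known: PairRating, unknown: PairRating, observed_diff: float) -> None:
    """observed_diff = (unknown pair's result) - (known pair's result) on shared boards."""
    surprise = observed_diff - (unknown.rating - known.rating)
    total = known.sigma ** 2 + unknown.sigma ** 2

    unknown.rating += surprise * (unknown.sigma ** 2 / total)   # provisional pair: big move
    known.rating   -= surprise * (known.sigma ** 2 / total)     # established pair: small move

    # certainty grows (sigma shrinks) as more boards are observed
    unknown.sigma *= 0.95
    known.sigma   *= 0.99

# a brand-new pair would start out as PairRating(sigma=PROVISIONAL_SIGMA)
```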
Oct. 17
There are all sorts of companies that provide blacklisting as a service.

The company that I work for (Akamai) has several products in this space.
Oct. 16
I agree that there is an enormous amount of variance in bridge.

I also agree that simpler models are to be preferred to complex ones. If it turns out that the NGS system / Power Ratings / whatever is able to provide comparable accuracy to a more sophisticated approach then I am all for using the simpler model.

However, if we are going to pretend that we care about an accurate ratings system then I think that we should also evaluate the accuracy of the various options.
Oct. 16
Once again, before trialing either of these proposals I think that it makes sense to evaluate their accuracy compared to alternative methods.

There have been incredible advances in data science over the past 20 years. I cannot help but believe that it's possible to improve over these sorts of naive algorithms.
Oct. 16
@marty

First: Some, but certainly not all, ACBL Online games use robots. As I mentioned earlier, you could in theory develop three different models

1. Robot games only
2. Human games only
3. Both game types combined

I find type 2 most interesting

Second: The Common Game matchpoints across a large field; however, players only play boards against other pairs at their own club. I worry that this lack of mixing would result in a model that worked great for the Common Game but might encounter real problems if someone from club A was suddenly playing boards against a different set of opposing pairs
Oct. 14
I've never really taken a close look at The Common Game

If they provide scores on a per-board basis, I don't see why it couldn't be used; however, I do have some concerns that there are a large number of separate pools of players and that it might be difficult to construct accurate global ratings. So, you might be able to develop very accurate ratings of how well player foo does at their normal club and have no idea how well player foo might do playing in some different club.

In contrast, the online games would seem to have one very large pool of players with much more mixture between players.
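
One cheap way to quantify that mixing concern: treat pairs as graph nodes, connect two pairs whenever they are directly compared on a board in the same field, and look at the connected components. If each club sits in its own component, cross-club ratings rest on very little shared evidence. The record layout and function name below are assumptions, and networkx is used purely as a sketch.

```python
import networkx as nx

def comparison_components(board_results):
    """board_results: iterable of (board_id, field_id, pair_id) tuples."""
    by_board = {}
    for board_id, field_id, pair_id in board_results:
        by_board.setdefault((board_id, field_id), []).append(pair_id)

    g = nx.Graph()
    for pairs in by_board.values():
        g.add_nodes_from(pairs)
        for a, b in zip(pairs, pairs[1:]):
            g.add_edge(a, b)          # a chain is enough to establish connectivity
    return list(nx.connected_components(g))
```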
Oct. 14
I am suggesting that we objectively measure Power ratings, the EBU National Grading System, whatever and then be able to make an informed decision about which of these systems is worth building upon.

Why don't I trust any of the existing systems?

Well, the big reason is that none of them seems cognizant of the fact that the world has changed in the last 20 years, and approaches that might have looked really good a couple of decades back now seem laughable…

When it comes to data science, “Hey look what we came up with 15 years ago” really isn't much of a recommendation
Oct. 14
Couple quick thoughts here

1. I think that some folks on this thread are gravely mistaken about the degree to which working on a ratings system would preempt other work. It's not as if the folks that you'd want doing the heavy lifting here are the same ones that you need working on propping up sagging clubs or doing marketing

2. I think that the ACBL could get a significant “bang” for a fairly limited investment.

Here's how I'd proceed

A. Partner with BBO to create a data set that folks could use to validate different approaches for generating a ratings system. From my perspective, you'd want BBO to do a few things

First: Provide you with, say, one year's worth of data from the ACBL's online pair games


Second: Anonymize this same dataset (change the names of various players to some random string). It's probably not absolutely necessary, but this doesn't cost anything and it might preempt a few complaints

Third: Publish the data from these events and allow anyone who wants to download it

Finally: Announce that you'll be running a contest in, say, six months' time to evaluate the accuracy of whatever ratings systems people have developed.

The first year's worth of data will be your training set.
The next six months' worth of data will be your validation set.
Then use the next month's worth of tournament results as your test set.

If you want to generate some external interest, post the contest on Kaggle or some such and put up a prize… Say, $15K or so.
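
As for how contest entries could be scored, one simple option (purely illustrative; the data structures are assumptions) is to have each entrant predict a matchpoint percentage for every pair-session in the test window and rank entries by root-mean-square error against what actually happened:

```python
import math

def rmse(predicted: dict, actual: dict) -> float:
    """predicted/actual map (pair_id, session_id) -> matchpoint percentage."""
    keys = predicted.keys() & actual.keys()
    return math.sqrt(sum((predicted[k] - actual[k]) ** 2 for k in keys) / len(keys))
```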
Oct. 14
> Masterpoints was a brilliant invention back in
> the day and while it is far from perfect in assessing
> a players true ability, nothing proposed comes close.

I think that your claim is ridiculous.

1. Masterpoints accumulate over time. Players' skill levels top out and eventually decline
2. Masterpoint allocations are wildly inconsistent over time, geography, and venue

It's hard to think of a worse way of describing performance
(As a way to convince people to tithe to Memphis… Probably quite a bit better)

> It seems to me that those proposing the rating
> system just want some way to show how good
> they are without having to earn oodles of MP's.

I certainly don't. Rather, I find the problem itself interesting…
Oct. 14