All comments by Ping Hu
I found a link about the Lehman rating. It describes a simplified version and also links to the detailed version.
http://www-personal.umich.edu/~bpl/oksimple.html
As you can see in the detailed version, it rests on a lot of mathematical assumptions, and I cannot agree with all of them. I would not say it is completely wrong, but some of them are questionable. For example, if I were to divide the result of a board between two players, I would give more weight to the declarer than to the dummy; on the defending side, I would give slightly more weight to the opening leader.

I think this effort to divide a rating between players introduces a lot of subjective factors.
April 16, 2015
Elo ratings have been widely used in chess and many other sports and games. It is a proven methodology for measuring competitive performance.

I don't know the detailed calculation of the Lehman rating, so I cannot comment on it. A simple web search showed that it tries to measure “individual” performance. I personally think this is a flawed assumption: bridge is a partnership game, so performance has to be measured at the partnership level. As Robert Lass pointed out, you can play with statistics to come up with something about “individual performance”, but I'm not confident in that without knowing all the assumptions going into the calculation.
April 16, 2015
Robert, I like your comments about statistics.

I don't rate players; I rate partnerships. Some people claim they can derive a player rating from the different partnerships a player plays in. I'm not convinced.

I also don't have “underrated” players in my system. Any players it rates should be properly rated by the system.
April 15, 2015
Ping Hu edited this comment April 15, 2015
It might be a good idea to have different types of masterpoints by grade/stratum. Right now masterpoints are classified by regional/sectional/club. What is the real difference between sectional Flight B and regional Flight B? Right now probably the only type of masterpoint that means something is platinum.

By the way, in chess there are rules against sandbagging (players deliberately performing poorly to make their rating drop so they can have a shot at prize money). This is one reason there is a rating floor.
April 14, 2015
Ping Hu edited this comment April 14, 2015
Chess ratings handle this by assigning a provisional rating to new players. After a certain number of games, a player gets an established rating. Once the rating reaches a certain level, there is a rating floor: if the rating drops, it can never fall below that floor.

I use a similar method. New players have a provisional rating that can change quickly. Once they have played more than 200 boards, the rating is established and more stable. All past games are summarized in the current rating, so there is no need to dig out old games and recompute.
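The provisional/established distinction and the chess-style rating floor described above can be sketched roughly as follows. The 200-board threshold is from the text; the specific K values and floor mechanics are illustrative assumptions of mine, not the actual formulas used.

```python
def k_factor(boards_played: int) -> float:
    """Large K while a pair is provisional (rating moves quickly),
    smaller K once established (illustrative values, not the real ones)."""
    return 32.0 if boards_played < 200 else 8.0

def apply_floor(rating: float, floor: float) -> float:
    """A rating never drops below its floor, as in the chess rule above."""
    return max(rating, floor)

print(k_factor(50))              # provisional: 32.0
print(k_factor(250))             # established: 8.0
print(apply_floor(1900.0, 2000.0))  # floored at 2000.0
```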

IMO an individual rating might be derivable from declarer play or other means, but that would need more detailed data.
April 13, 2015
Chess tournaments often allow a player to move up one level. A lot of young players take that opportunity to play against strong players and improve their game. If attendance is limited, there might be only 2 or 3 sections. It is not much different from bridge.

The problem with masterpoints is that only a small portion of players earn a masterpoint award in any given game. With a rating, everyone who plays better than his/her level can gain rating even without finishing in the top 35%. In addition, some players mistakenly think masterpoints are equivalent to a rating. I played as a substitute in a BBO ACBL Speedball yesterday. My partner opened 1 of a major. I had a good hand (16 HCP) with support. I checked his convention card and found he does not play Jacoby 2NT, so I decided to start with a 2-over-1 as an unpassed hand. My partner passed! I ended up playing at the two level in a 4-2 trump fit instead of the 5-5 fit. While I was playing, he commented that his regular partner would never bid like I did. I said no reasonable bridge player would pass a forcing 2-over-1 bid. He asked who had the higher rating here (he had 9 and I had 7). I said that number means nothing; it only means he has spent more money than I have.
April 13, 2015
Since the rating is at the partnership level, you are always tied to your partner; the rating is for the partnership, not the individual. If you don't like your partner, switch to a new partnership and get a new rating. A player can have multiple independent ratings with different partners.

Teammates' performance has only a limited effect. With duplicated boards, your results are compared with all other tables, and your teammates' table is just one of them. This is similar to matchpoint play, where a weak pair affects the entire field rather than hitting or benefiting a single pair. If you played well and your teammates played badly, you could still gain rating while losing the match. I have examples of this in my Bermuda Bowl study.
April 13, 2015
Ping Hu edited this comment April 13, 2015
This depends on implementation details. A large K factor makes the rating fluctuate widely. A good rating system should measure a player's ability accurately and not let the rating change drastically because of one or two bad/good results. The only reason a rating should change is that a player consistently performs above/below their current rating over a series of events.
April 13, 2015
Danny: if a board is not duplicated, it gets less weight in the rating calculation. In my calculation, a board played at only two tables (as in a team game) has 1/3 the weight of a board played at many tables.
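The weighting rule just described might look like this as a sketch. The 1/3 weight for a two-table board is from the text; the function name and the sharp two-table cutoff are my own assumptions about how it is applied.

```python
def board_weight(num_tables: int) -> float:
    """Relative weight of one board in the rating update."""
    if num_tables <= 2:
        return 1.0 / 3.0   # non-duplicated board, e.g. a typical KO match
    return 1.0             # board duplicated across many tables

print(board_weight(2))    # KO board: one third weight
print(board_weight(22))   # round-robin board: full weight
```

A direct consequence is that it takes roughly three times as many non-duplicated boards to move a rating as far as duplicated ones would.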

A pair, or even both pairs of a team, playing only KOs is not a problem. As long as their opponents are rated, they can be rated correctly; it just takes more games if the boards are not duplicated. A team with more than two pairs can also help, because who played against whom makes a difference in the rating calculation. As my Bermuda Bowl example shows, pairs from the same team can have quite different ratings.

The ideal solution is one universal rating, like the FIDE rating in chess. We don't need each local bridge organization having its own rating; otherwise you have to deal with the problem of how to convert between them. In chess, FIDE has rules about which games can be rated: a tournament needs a certain minimum number of rated players to be eligible for FIDE rating.

A rating system could be especially beneficial for the Spingold/Vanderbilt, which have a lot of foreign players. If all bridge players were measured with one rating system, it would be very easy to seed them. Right now each event has its own seeding rules, and seeding cannot be done by program.
April 12, 2015
If this is actual play in a club, it may not be so simple. Declarer needs to find a pitch on the fifth round, and in the diamond play that follows he needs to unblock the 10 under the ace; otherwise he cannot take the finesse.
April 11, 2015
As I did for the 2005 Bermuda Bowl, I took every board result from the Round Robin through the Final. In the round robin every team played the same boards every round, so each board was played at 22 tables. I found the pair at each table and calculated accordingly. So even though this is a team event, I could calculate pair ratings once I had the detailed board results.

The same method could be used for the Spingold and Vanderbilt if board results were available. If a board is not duplicated and is played at only two tables (a typical ACBL KO), the rating change would be very small, so it would take more boards to get enough statistics.

In my Bermuda Bowl example, most pairs' ratings were determined by their Round Robin results (21 rounds of 20 boards each). Even though everyone was assigned a default initial rating of 2400, the calculation quickly converged to each pair's “true” rating after about 100 boards. As the list showed, the final ratings for the 64 pairs ranged from 2000 to 2700.
April 11, 2015
Robin,

I discussed this with Chris Champion and compared our rating results on some sample data. My understanding is that his rating is mostly based on the most recent 18 months of games; the EBU system has a predefined weighting that favors recent results.

My system has a built-in weighting. After a tournament, a player's rating is adjusted by their performance, so the final rating is determined by two factors: the pre-event rating and the performance in the event. You can think of the pre-event rating as the weight of all past games, and the performance adjustment as the weight of the current one. The performance adjustment is scaled by a K factor, which depends on how many games the players have played, how many boards are in the current tournament, and the players' ratings: lower-rated players get a larger K value, and for master-level players the adjustment is usually very small. This is similar to the chess rating system. The K factor can be fine-tuned; the USCF has run its rating system for 70 years but still adjusted its K-factor formula two years ago.

As you pointed out, volatility is not good for a rating system; it needs some stability. I think the Elo methodology is sound in principle, and bridge scores can be handled better than chess results. In chess a game result is only a win, loss, or draw, and in each tournament a player plays only 4 or 5 games, so the results are very limited: a player who gets all wins or all losses can see a large rating change. In bridge you rarely see a 24-IMP score; at matchpoints most games fall between 40% and 60%, and players play 20-30 boards at a time, so bridge scores provide better statistics than chess results. My system differs from others in that I calculate from board results, while others mostly use the overall result of a 20-30 board game. In my case I can find out exactly who played the same board and what their ratings were; other systems can only make an average adjustment for field strength. This affects how fast the rating converges to a pair's “true” rating.
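The per-board idea above can be sketched as an Elo-style update in which the “score” is the pair's matchpoint fraction on one board and the expectation is averaged over the other pairs who actually held the same cards. This is my own illustrative reading of the described approach; the function names, the averaging scheme, and the per-board K of 4 are assumptions, not the system's real parameters.

```python
def expected(r_a: float, r_b: float) -> float:
    """Classical Elo expected score of a pair rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def per_board_delta(rating: float, field_ratings: list[float],
                    mp_fraction: float, k: float = 4.0) -> float:
    """Rating change from one board: matchpoint fraction achieved on the
    board, compared with the expectation against the whole comparison field."""
    exp = sum(expected(rating, r) for r in field_ratings) / len(field_ratings)
    return k * (mp_fraction - exp)

# A 2400 pair scoring 60% on a board against a field of 2400-rated pairs
# gains only a small amount, so single boards move the rating gently:
print(per_board_delta(2400, [2400, 2400, 2400], 0.60))
```

Because each board contributes such a small adjustment, only a consistent run of above- or below-expectation boards moves the rating noticeably, which matches the stability argument above.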

In my example from the 2005 Bermuda Bowl, I recalculated ratings after each round's results. For the highest-rated pair, Norberto Bocchi and Giorgio Duboin, the rating after their first round (a 20-board result) was 2706. After that it varied between a low of 2607 and a high of 2763. After 200 boards it was 2727, and at the end of the tournament (476 boards) it stood at 2714.
April 11, 2015
There are a few ways to handle initial rating assignment. If most of the pairs in a tournament already have ratings and the new pair has played at least 15 boards against rated pairs, I calculate an estimate from the score and the average opponent rating and use that in the rating calculation. If most of the pairs are unrated, I use a default initial rating, which can differ by type of event. In my Bermuda Bowl calculation I used a default of 2400, which is appropriate for masters. For a club open game I would choose 1300 (an average player's rating), and for a limited game perhaps a lower number like 1100.
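One common way to turn “score plus average opponent rating” into an initial estimate is the linear performance-rating approximation used in chess. This is a hedged sketch of that idea; the 400-point linear formula is a standard chess approximation and may differ from the calculation actually used here.

```python
def initial_rating_estimate(avg_opp_rating: float, score: float) -> float:
    """Rough performance rating from a score fraction in [0, 1]:
    average opponent rating plus 400 * (2*score - 1), the common
    linear approximation from chess."""
    return avg_opp_rating + 400.0 * (2.0 * score - 1.0)

print(initial_rating_estimate(2400, 0.55))  # slightly above the field: 2440.0
print(initial_rating_estimate(2400, 0.50))  # exactly average: 2400.0
```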
April 10, 2015
Yes, there is always some luck in bridge, but over a large number of boards it should statistically average out. It is just like an 8-board match: the stronger team may not always win, but it is much more likely to win a 64-board match.
As I explained in the methodology, at least 4 pairs can affect one rating calculation. If a board is flat or one player acts irrationally, it can create some noise, but not every board will be like that, and we have to assume players act rationally. The only boards I would throw out are those with adjusted scores (like a game won by forfeit in chess).
April 10, 2015
A rating system is not a replacement for masterpoints. Masterpoints are about reward, and players want to be rewarded for their play. The problem is that using masterpoints to determine which flight/stratum a player should play in is not accurate: a player is eventually pushed up to a flight/stratum beyond his ability, which can only hurt his continued participation because he/she has no chance of winning.
A rating system that measures ability, used for stratification, would put players in groups of similar playing strength. It could encourage players to play more by creating a better playing environment: even a player with a lot of masterpoints could continue to play against players of his own strength rather than being forced to face stronger ones.
April 10, 2015
Does the EBU use per-board results, or just the overall percentage and an average of field strength?
April 10, 2015
This is basically a pair-team format. I have seen a number of sectional/regional tournaments use it. The scoring was matchpoints for the pair game and board-a-match for the team game, but you could certainly create other combinations, like cross-IMPs for pairs and VPs for teams.
March 27, 2015
As I said earlier, this kind of problem can only happen under certain conditions. To avoid it, a 6-round Swiss teams event needs 12 or more teams.
March 24, 2015
Masterpoints are just a marketing tool for the ACBL. It is like giving candy or toys to kids as a reward; for adults, you give them masterpoints.
Instead of looking at how many masterpoints an event pays or how many a player has, maybe we should advertise masterpoints per dollar and use masterpoints/dollar to evaluate how good a player is.
March 23, 2015