# NCAA Lacrosse Bradley-Terry


**Posted by: jtwcornell91**(Moderator)

**Date:**May 08, 2007 04:50PM

I finally got around to calculating Bradley-Terry (BT) ratings (known in the college hockey context as KRACH) for NCAA D1 lacrosse. As I'm sure most of us know, straight-up BT puts Cornell, as an undefeated team with a non-insular schedule, at the top, with an infinite rating:


```
 #  Team              BT     RRWP  W- L  W/L    SOS
 1  Cornell           infin  1.00  13- 0 infin  N/A
 2  Duke              22032  .961  14- 2 7.000  3147
 3  Virginia          6357   .918  12- 3 4.000  1589
 4  Georgetown        3714   .889  11- 2 5.500  675.3
 5  Johns Hopkins     3005   .876   9- 4 2.250  1336
 6  Maryland          2153   .852  10- 5 2.000  1076
 7  Albany            2016   .848  14- 2 7.000  288.1
 8  Navy              1769   .837  11- 3 3.667  482.5
 9  Princeton         1752   .837  10- 3 3.333  525.5
10  North Carolina    1695   .834   9- 5 1.800  941.7
11  Notre Dame        855.7  .774  11- 3 3.667  233.4
12  Loyola            530.0  .725   7- 5 1.400  378.6
13  UMBC              513.0  .721  10- 5 2.000  256.5
14  Colgate           466.3  .711  11- 5 2.200  212.0
15  Delaware          443.5  .705  11- 5 2.200  201.6
16  Towson            403.4  .695   8- 6 1.333  302.5
17  Drexel            293.1  .658  11- 5 2.200  133.2
18  Bucknell          283.0  .654  11- 4 2.750  102.9
19  Syracuse          272.4  .649   5- 8 .6250  435.9
20  Ohio State        206.2  .616   9- 5 1.800  114.6
21  Stony Brook       187.3  .604   8- 5 1.600  117.1
22  Yale              161.7  .586   7- 6 1.167  138.6
23  Rutgers           156.9  .582   6- 6 1.000  156.9
24  Fairfield         140.7  .569   6- 6 1.000  140.7
25  Massachusetts     137.6  .566   7- 7 1.000  137.6
26  Pennsylvania      133.5  .563   6- 7 .8571  155.7
27  Harvard           111.9  .541   5- 7 .7143  156.6
28  Denver            108.7  .537   9- 7 1.286  84.51
29  Brown             105.3  .534   7- 7 1.000  105.3
30  Penn State        89.82  .514   5- 8 .6250  143.7
31  Dartmouth         89.30  .514   5-10 .5000  178.6
32  Hofstra           76.70  .495   6- 8 .7500  102.3
33  Army              67.40  .480   6- 9 .6667  101.1
34  Binghamton        44.18  .432   4- 9 .4444  99.40
35  Hobart            35.63  .408   5- 9 .5556  64.13
36  St. John's        32.04  .397   5- 8 .6250  51.27
37  Villanova         30.68  .392   7- 7 1.000  30.68
38  Lehigh            28.24  .383   4- 9 .4444  63.54
39  Holy Cross        15.33  .323   6- 8 .7500  20.44
40  Vermont           11.59  .297   4-10 .4000  28.97
41  Quinnipiac        7.779  .262   6- 7 .8571  9.075
42  Air Force         7.323  .257   2-10 .2000  36.61
43  Bellarmine        6.846  .252   3-10 .3000  22.82
44  Siena             6.792  .251   9- 6 1.500  4.528
45  Sacred Heart      4.688  .221   4- 8 .5000  9.375
46  Providence        3.681  .203   7- 9 .7778  4.733
47  Manhattan         2.921  .186   6- 8 .7500  3.895
48  Saint Joseph's    2.177  .167   6-12 .5000  4.354
49  Marist            2.041  .163   6- 9 .6667  3.061
50  Canisius          1.900  .158   6- 8 .7500  2.533
51  Hartford          1.671  .151   2-13 .1538  10.86
52  Mount St. Mary's  1.397  .141   4-10 .4000  3.493
53  Robert Morris     0      .045   2- 9 .2222  0
54  Lafayette         0      .036   1-12 .0833  0
55  VMI               0      .027   1-11 .0909  0
56  Wagner            0      .000   0-15 0      N/A
```

This is nicely illustrated with the following graph:

For "everybody else", see entries 2 to 52 in the table above.

Edited 1 time(s). Last edit at 05/08/2007 04:52PM by jtwcornell91.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Ronald '09**(---.nys.biz.rr.com)

**Date:**May 08, 2007 05:09PM

Well, the good news for the committee is that only one team (Colgate) has a legitimate argument, from these rankings, for making the tournament. The field includes the top 16 teams minus Colgate plus Providence.

Georgetown and Navy have bigger complaints than we do. Georgetown would be a top 4 seed, although they would play Hopkins in the second round anyway. And Navy would be a top 8 seed, but they get to play a team they beat by 11 anyway.

So other than our seeding, which is ridiculous but really doesn't matter, and those other couple of small issues, it does appear that the committee did a reasonable job choosing the field of 16.


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: jtwcornell91**(Moderator)

**Date:**May 08, 2007 05:17PM

Okay, so there is an argument for why you might not conclude from the game results that an undefeated team is automatically the best. Before the season starts you are in a state of ignorance about the strengths of the various teams. (Okay, so you're really not, but to be fair to everyone you should put aside whatever prior expectations you have.) Every result gives you a little more information. Suppose Team A beats Team B and a bunch of cream puffs; since wins over cream puffs don't tell you much, you've gained only one game's worth of interesting results. Meanwhile Team B goes out and plays a bunch of other teams that rack up good records against a cross-section of opponents: say Team B loses to one other strong team but racks up six wins against tough opponents, plus some array of easy wins that don't tell you much. The information you have to work with is then one win by Team A against a tough opponent, and six wins and two losses by Team B against similar competition. It could be that those eight games tell you more about Team B than the one game does about Team A.

Well, the good news is we can make all of this quantitative, since this is exactly what Bayesian statistics tells us to do: start with some prior expectation about the likelihood that unknown quantities (in this case teams' Bradley-Terry ratings) take on certain values, and modify those priors based on observational data (game results) to get a posterior probability distribution.

It turns out, if you use what's known as a Jeffreys prior, a uniform probability distribution in the logarithm of each team's BT rating, the maximum of the posterior probability distribution will be the usual set of ratings predicted by KRACH or its equivalent. But this is problematic, since it lets things run off to infinity, and basically represents the wrong kind of ignorance. For instance, if we ask the question "what fraction of games do we expect a given team to win against a team with a fixed rating, say 100", the prior probability distribution has infinitely sharp peaks at 0 and 1; basically, there's an infinite amount of room for a team's rating to be arbitrarily higher or lower than 100.

A more well-behaved prior is to pick some reference rating, like 100, and say that a team is a priori equally likely to have any expected head-to-head winning percentage against that team. This gives a probability distribution in log(BT rating) which is peaked around log(100). And, as it turns out, when you use this prior and construct the posterior probability distribution considering the results of the games, there is always a single peak at nice finite values of all the ratings, and it's basically the equivalent of a KRACH rating with two "fictitious games" (one win and one loss) for each team against that hypothetical team with a rating of 100. So we can calculate the maximum likelihood ratings with the usual software, and get the following (now "W/L" actually means (W+1)/(L+1), and likewise SOS includes the fictitious games):


```
 #  Team              BT     RRWP  W- L  "W/L"  SOS
 1  Cornell           3384   .946  13- 0 14.00  241.7
 2  Duke              1889   .910  14- 2 5.000  377.8
 3  Virginia          992.5  .853  12- 3 3.250  305.4
 4  Albany            809.5  .830  14- 2 5.000  161.9
 5  Georgetown        804.7  .830  11- 2 4.000  201.2
 6  Johns Hopkins     661.5  .806   9- 4 2.000  330.8
 7  Princeton         513.8  .773  10- 3 2.750  186.8
 8  Navy              511.7  .773  11- 3 3.000  170.6
 9  Maryland          478.8  .763  10- 5 1.833  261.2
10  North Carolina    430.9  .748   9- 5 1.667  258.5
11  Notre Dame        427.3  .747  11- 3 3.000  142.4
12  UMBC              286.8  .686  10- 5 1.833  156.4
13  Colgate           273.4  .678  11- 5 2.000  136.7
14  Delaware          269.8  .676  11- 5 2.000  134.9
15  Loyola            255.5  .667   7- 5 1.333  191.7
16  Bucknell          230.9  .650  11- 4 2.400  96.23
17  Towson            218.0  .640   8- 6 1.286  169.5
18  Drexel            215.1  .637  11- 5 2.000  107.5
19  Stony Brook       177.8  .604   8- 5 1.500  118.6
20  Ohio State        167.9  .594   9- 5 1.667  100.7
21  Syracuse          150.9  .575   5- 8 .6667  226.4
22  Yale              143.4  .566   7- 6 1.143  125.5
23  Rutgers           128.6  .546   6- 6 1.000  128.6
24  Denver            128.3  .546   9- 7 1.250  102.7
25  Fairfield         124.8  .540   6- 6 1.000  124.8
26  Massachusetts     123.6  .539   7- 7 1.000  123.6
27  Pennsylvania      121.8  .536   6- 7 .8750  139.1
28  Brown             112.1  .521   7- 7 1.000  112.1
29  Harvard           106.6  .512   5- 7 .7500  142.1
30  Dartmouth         91.79  .484   5-10 .5455  168.3
31  Penn State        88.88  .478   5- 8 .6667  133.3
32  Hofstra           84.20  .469   6- 8 .7778  108.3
33  Army              81.36  .462   6- 9 .7000  116.2
34  Villanova         68.51  .431   7- 7 1.000  68.51
35  Binghamton        63.55  .418   4- 9 .5000  127.1
36  St. John's        59.35  .405   5- 8 .6667  89.03
37  Hobart            57.41  .400   5- 9 .6000  95.69
38  Siena             51.16  .379   9- 6 1.429  35.81
39  Lehigh            50.99  .379   4- 9 .5000  102.0
40  Holy Cross        43.91  .353   6- 8 .7778  56.45
41  Vermont           38.36  .330   4-10 .4545  84.40
42  Quinnipiac        38.08  .329   6- 7 .8750  43.52
43  Providence        32.08  .302   7- 9 .8000  40.11
44  Sacred Heart      29.77  .290   4- 8 .5556  53.59
45  Bellarmine        29.32  .288   3-10 .3636  80.62
46  Manhattan         27.90  .280   6- 8 .7778  35.87
47  Air Force         27.89  .280   2-10 .2727  102.3
48  Canisius          27.04  .275   6- 8 .7778  34.76
49  Saint Joseph's    22.17  .246   6-12 .5385  41.17
50  Marist            21.41  .241   6- 9 .7000  30.59
51  Mount St. Mary's  21.34  .241   4-10 .4545  46.94
52  Hartford          12.50  .173   2-13 .2143  58.34
53  Robert Morris     11.33  .163   2- 9 .3000  37.76
54  Lafayette         7.304  .120   1-12 .1538  47.48
55  VMI               3.277  .065   1-11 .1667  19.66
56  Wagner            1.241  .026   0-15 .0625  19.85
```

One of the nice things about this method is that it also comes with a probability distribution for the ratings, and so you can estimate the uncertainties in each of them. I've got the raw numbers for those, but I don't have time to put them into a nice form tonight. But I have some ideas for cool graphs...
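The fictitious-games scheme described above is straightforward to compute yourself. Here's a minimal sketch of the standard KRACH-style fixed-point iteration with one fictitious win and one fictitious loss against a rating-100 reference team; the three-team schedule at the bottom is invented purely for illustration:

```python
def bt_ratings(games, ref=100.0, tol=1e-10, max_iter=100000):
    """Bradley-Terry ratings via the KRACH fixed-point iteration.

    games: list of (winner, loser) pairs. Each team also gets one
    fictitious win and one fictitious loss against a hypothetical
    team with fixed rating `ref`, which keeps every rating finite,
    even for undefeated (or winless) teams.
    """
    teams = {t for game in games for t in game}
    wins = {t: 1.0 for t in teams}      # start from the fictitious win
    opponents = {t: [] for t in teams}  # one entry per real game played
    for winner, loser in games:
        wins[winner] += 1.0
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    ratings = {t: ref for t in teams}
    for _ in range(max_iter):
        new = {}
        for t in teams:
            # Two fictitious games against the reference team...
            denom = 2.0 / (ratings[t] + ref)
            # ...plus one term per real game: 1 / (r_i + r_j)
            for opp in opponents[t]:
                denom += 1.0 / (ratings[t] + ratings[opp])
            new[t] = wins[t] / denom
        done = all(abs(new[t] - ratings[t]) < tol for t in teams)
        ratings = new
        if done:
            break
    return ratings

# Tiny invented round-robin: A sweeps, C goes winless, and yet
# every rating stays finite thanks to the fictitious games.
ratings = bt_ratings([("A", "B"), ("A", "C"), ("B", "C")])
```

With the uniform (Jeffreys) prior instead, the same iteration would send the undefeated team's rating off to infinity; the two fictitious games are what pin everything down.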

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Swampy**(---.219.128.131.dhcp.uri.edu)

**Date:**May 08, 2007 05:35PM

jtwcornell91

Well, the good news is we can make all of this quantitative, since this is exactly what Bayesian statistics tells us to do: start with some prior expectation about the likelihood that unknown quantities (in this case teams' Bradley-Terry ratings) take on certain values, and modify those priors based on observational data (game results) to get a posterior probability distribution. [...]

Yeah, but look at where the selection committee went to school. They'd never understand this!

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: jeh25**(---.ri.ri.cox.net)

**Date:**May 09, 2007 11:53AM

Swampy

Yeah, but look at where the selection committee went to school. They'd never understand this!

Hell, 5 years ago I demonstrated (in detail) on laxpower how the committee didn't even follow their own published selection criteria. If memory serves, the committee took Hofstra over Yale despite Yale clearly coming out ahead. Alas, the laxpower archives don't go back that far anymore.

Anyway, the take-home message is that hockey fans don't realize how lucky they were/are to have a deterministic (if occasionally flawed) NCAA selection criterion that isn't a smoky back room full of the old boys' club. Consider that SU's and JHU's storied histories wouldn't be quite so bright without 30+ years of favorable seedings and/or outright selections.

Maybe Al can confirm but I seem to remember that back in the 4 team tourney days, Cornell, the defending champion, didn't even get a tourney bid because the committee felt Navy was more deserving in spite of Cornell having a better record.

___________________________

Cornell '98 '00; Yale 01-03; UConn 03-07; Brown 07-09; Penn State faculty 09-

Work is no longer an excuse to live near an ECACHL team...


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Al DeFlorio**(---.hsd1.nh.comcast.net)

**Date:**May 09, 2007 12:28PM

The only year Cornell wasn't invited to "defend" was 1972: 10-3 overall; 6-0 Ivy; but losses to Navy (12-9), Cortland (14-8), and Hobart (11-10). The non-Ivy wins were Hofstra, Adelphi, Syracuse (a legitimate cupcake back then), and Fairleigh-Ridiculous. I'm sure it was the old "strength of schedule" issue, but the losses to non-Ivies hurt (and helped Cortland make the tournament). Cortland beat Navy in the quarters but was beaten by Virginia--the eventual champ--in the semis. The 1973 team finished at 8-3, 5-1 Ivy, with losses to Navy, Hopkins, and Brown to open the season. Non-Ivy wins were Hobart, Syracuse, and Cortland. The 1974 through 1979 teams all made the tournament.

jeh25

Maybe Al can confirm but I seem to remember that back in the 4 team tourney days, Cornell, the defending champion, didn't even get a tourney bid because the committee felt Navy was more deserving in spite of Cornell having a better record.

The tournament started with eight teams in 1971, and has since expanded to twelve and then, very recently, sixteen.

___________________________

Al DeFlorio '65


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: jeh25**(---.ri.ri.cox.net)

**Date:**May 09, 2007 01:03PM

Ok, so Navy had SOS and H2H (and a southern bias?) while Cornell had WinPct, meaning it isn't quite as egregious as I had remembered. But still, to take a .5714 team over a .7692 team when the latter is the defending champ seems a little shady. I mean, 9 losses isn't anything to write home about, "quality teams" or not.

Also, I've always thought most fans and many coaches are too quick to overweight the importance of H2H: even a blind squirrel finds a nut once in a while. But of course, that's just the statistician in me showing through - most people don't think about measurement error on a daily basis.

As far as selecting Cortland over Cornell in '72, they split H2H and WinPct - would SOS have been comparable or would Cortland have used the flexibility of a non-Ivy schedule to fit in more "quality" southern teams?


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Al DeFlorio**(---.hsd1.ma.comcast.net)

**Date:**May 09, 2007 01:24PM

12-9 was the score of the Cornell-Navy game--not Navy's season record. Playing 21 lacrosse games in a regular season would probably in itself earn an invitation.

jeh25

I mean, 9 loses isn't anything to write home about, "quality teams" or not.

___________________________

Al DeFlorio '65


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Beeeej**(Moderator)

**Date:**May 09, 2007 01:28PM

jtwcornell91

It turns out, if you use what's known as a Jeffreys prior, a uniform probability distribution in the logarithm of each team's BT rating, the maximum of the posterior probability distribution will be the usual set of ratings predicted by KRACH or its equivalent.

Henceforth, on this forum, we will refer to it as a Beeeej's Prior.

___________________________

Beeeej, Esq.

"Cornell isn't an organization. It's a loose affiliation of independent fiefdoms united by a common hockey team."

- Steve Worona


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: ugarte**(38.136.14.---)

**Date:**May 09, 2007 01:36PM

Or Jeffffrey's Prior.

Beeeej

jtwcornell91

It turns out, if you use what's known as a Jeffreys prior, a uniform probability distribution in the logarithm of each team's BT rating, the maximum of the posterior probability distribution will be the usual set of ratings predicted by KRACH or its equivalent.

Henceforth, on this forum, we will refer to it as a Beeeej's Prior.

___________________________

Jokes and stuff


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: KeithK**(---.external.lmco.com)

**Date:**May 09, 2007 02:37PM

Why should being the defending champ have any impact on whether you make the tournament or not? Applying any kind of carryover effect from previous seasons (even if restricted to championships) is exactly the kind of bias we are railing against here.

jeh25

But still to take a .5714 team over the .7692 team when the later is the defending champ seems a little shady.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: jeh25**(---.ri.ri.cox.net)

**Date:**May 09, 2007 04:12PM

Sorry, wasn't clear - yes, I agree that a name-branding effect is to be avoided. But if you're gonna throw rational criteria out the window and go with gut feeling, as they did in the bad old days, and only use an ephemeral "quality program" standard, not inviting the defending champion is a crock.

But anyway, you can ignore my post, because 12-9 was the score of the game, not Navy's win record. And this is my last post on the topic since my defense is in <4 weeks.

*john crawls back into his cave*


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Hillel Hoffmann**(---.usb.temple.edu)

**Date:**May 09, 2007 04:31PM

Holy shit, what terrible timing. If Cornell makes it to... to... I can't say it, but if Cornell makes it far, I expect to see you there anyway.

jeh25

And this is my last post on the topic since my defense is in <4 weeks.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: KeithK**(---.external.lmco.com)

**Date:**May 09, 2007 04:46PM

It's not like defenses are that important anyway. By the time you get there you know you're going to pass it. Of course, if the draft isn't done yet it's a different story...

Hillel Hoffmann

Holy shit, what terrible timing. If Cornell makes it to... to... I can't say it, but if Cornell makes it far, I expect to see you there anyway.

jeh25

And this is my last post on the topic since my defense is in <4 weeks.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Rita**(---.agry.purdue.edu)

**Date:**May 09, 2007 04:59PM

KeithK

It's not like defenses are that important anyway. By the time you get there you know you're going to pass it. Of course, if the draft isn't done yet it's a different story...

Hillel Hoffmann

Holy shit, what terrible timing. If Cornell makes it to... to... I can't say it, but if Cornell makes it far, I expect to see you there anyway.

jeh25

And this is my last post on the topic since my defense is in <4 weeks.

Yeah, they do not let you schedule the defense unless they are certain you will pass. Here is some "unsolicited advice": for the draft that you give your committee, do not get too hung up on formatting issues. They will most likely have changes and edits that they want you to make, and you will have to re-format it anyway.

I also think it is some sort of "badge of honor" to only get ~ 15 hr of sleep in the last week of writing your thesis.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: CowbellGuy**(Moderator)

**Date:**May 09, 2007 05:14PM

KeithK

Of course, if the draft isn't done yet it's a different story...

**We're talking about Hayes here.**

*IF?*

___________________________

"[Hugh] Jessiman turned out to be a huge specimen of something alright." --Puck Daddy


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: jtwcornell91**(Moderator)

**Date:**May 12, 2007 04:19PM

jtwcornell91

One of the nice things about this method is that it also comes with a probability distribution for the ratings, and so you can estimate the uncertainties in each of them. I've got the raw numbers for those, but I don't have time to put them into a nice form tonight. But I have some ideas for cool graphs...

Okay, so as promised. What you end up predicting is a probability for each possible set of values for all the ratings of the 56 teams, meaning that it's a function of 56 variables. But you can look at the probability distribution for one team's rating by "marginalizing" over all the other ratings; here are the marginalized probability distributions for the top six teams, each scaled so that the maximum is one:

If we want to normalize the distributions, so that the total probability in each case adds up to 1, we have to scale the broader curves down relative to the more sharply-peaked ones:
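For anyone curious how curves like these can be produced, here's a toy sketch (not the actual code behind the plots): a random-walk Metropolis sampler over all the log-ratings at once, using the "one fictitious win, one fictitious loss against a rating-100 team" prior from earlier in the thread, which is logistic in each log-rating. A team's marginalized distribution is then just the histogram of that team's coordinate across the samples. The four-game schedule is invented for illustration:

```python
import math
import random

X0 = math.log(100.0)  # log of the reference rating

def log_posterior(x, games):
    """x maps team -> log(BT rating); games is a list of (winner, loser).

    Prior per team: logistic density in log-rating centered on log(100),
    i.e. one fictitious win and one fictitious loss against a rating-100
    team. Likelihood per game: r_w / (r_w + r_l), computed in log form.
    """
    lp = 0.0
    for xi in x.values():
        d = abs(xi - X0)
        lp += -d - 2.0 * math.log1p(math.exp(-d))  # log logistic density
    for w, l in games:
        m = max(x[w], x[l])  # log-sum-exp, done stably
        lp += x[w] - (m + math.log(math.exp(x[w] - m) + math.exp(x[l] - m)))
    return lp

def posterior_samples(games, n_samples=5000, step=0.4, seed=7):
    """Random-walk Metropolis over every team's log-rating at once."""
    rng = random.Random(seed)
    teams = sorted({t for game in games for t in game})
    x = {t: X0 for t in teams}  # start everyone at the prior peak
    cur = log_posterior(x, games)
    samples = []
    for _ in range(n_samples):
        prop = {t: xi + rng.gauss(0.0, step) for t, xi in x.items()}
        lp = log_posterior(prop, games)
        if rng.random() < math.exp(min(0.0, lp - cur)):  # accept/reject
            x, cur = prop, lp
        samples.append(dict(x))
    return samples

games = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "C")]
samples = posterior_samples(games)
# The marginalized distribution for A is just a histogram of these values:
a_marginal = [s["A"] for s in samples]
```

A real run on the 56-team schedule would want many more samples (and some burn-in and thinning), but the structure is the same.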

Edited 4 time(s). Last edit at 05/12/2007 05:21PM by jtwcornell91.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Jacob '06**(---.dhcp.psdn.ca.charter.com)

**Date:**May 12, 2007 04:22PM

Is the lower probability center for Cornell due to their weaker SOS, or less exposure to OOC teams? (By lower I meant the center of our distribution is only at ~.5 and we have a wider distribution)

Edited 1 time(s). Last edit at 05/12/2007 04:22PM by Jacob '06.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: jtwcornell91**(Moderator)

**Date:**May 12, 2007 04:35PM

Jacob '06

Is the lower probability center for Cornell due to their weaker SOS, or less exposure to OOC teams? (By lower I meant the center of our distribution is only at ~.5 and we have a wider distribution)

It's just that in the second plot everything's scaled so that the area under all of the curves is the same. Since ours is broader, the peak is lower. The broader distribution means our rating is less precisely determined. That may be because of the slightly weaker schedule, but it may also just be that our rating is farther from 100, where everybody's prior was peaked at the start of the season. Note that Duke's distribution is the second-broadest.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: jtwcornell91**(Moderator)

**Date:**May 12, 2007 04:36PM

Yeesh, let me finish...

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: ebilmes**(---.resnet.cornell.edu)

**Date:**May 12, 2007 04:44PM

JTW for President!

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: DeltaOne81**(---.bos.east.verizon.net)

**Date:**May 12, 2007 04:46PM

ebilmes

JTW for President!

... of the NCAA lacrosse committee

**Cornell vs Duke: posterior probabilities**

**Posted by: jtwcornell91**(Moderator)

**Date:**May 12, 2007 05:05PM

Okay, so actually marginalizing over everything is not quite the right thing to do. That is, if you want to ask what we know about the strengths of Cornell and Duke, you have to think not just about the overall probability of Cornell having a certain rating and of Duke having a certain rating, but rather about the probabilities of combinations of their ratings. (If we'd used the Jeeeeffreys prior, this would be even more extreme, since we'd actually know nothing about the product of two teams' ratings, the overall scale being undefined.) Take, for example, Cornell and Duke: our knowledge of their ratings is correlated; if you marginalize over all the other teams' ratings and plot the probability distribution of combinations of Cornell's and Duke's ratings, it looks like this:

What's really interesting is the ratio of these two teams' ratings, which tells you, e.g., the expected odds that one will win a game against the other. The probability distribution of that (marginalizing over everything else) looks like this:

About three times as much of the area under that curve lies to the right of one as lies to the left, which says there's a 25.5% chance that Duke is actually stronger than Cornell and just got unluckier in their game results. Conversely, it's 74.5% likely that Cornell is indeed stronger than Duke, though not necessarily by the factor of two that the most likely ratings indicate; it could be more or less.

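To make the "ratio tells you the expected odds" point concrete, here's a two-line sketch using the point ratings from the table earlier in the thread (illustration only; the posterior spread discussed here is exactly what this point estimate leaves out):

```python
# Under the Bradley-Terry model, P(A beats B) = r_A / (r_A + r_B),
# so the ratio of two ratings is the expected odds of one beating the other.
r_cornell, r_duke = 3384.0, 1889.0  # maximum-likelihood ratings from above

odds = r_cornell / r_duke                 # ~1.79: roughly the "factor of two"
p_win = r_cornell / (r_cornell + r_duke)  # expected chance Cornell beats Duke
```

Note the distinction: this win probability (about 64%) is the expected chance of winning a single game at the most likely ratings, while the 74.5% above is the probability that Cornell's true rating is the higher one.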

**Cornell vs Virginia, posterior probabilities**

**Posted by: jtwcornell91**(Moderator)

**Date:**May 12, 2007 05:09PM

If we do the same thing with Virginia, the contour plot looks like this:

And the ratio of the ratings is distributed as follows.

There's only a 9.3% chance that UVa is actually stronger than Cornell.


**Cornell vs Johns Hopkins: posterior probabilities**

**Posted by: jtwcornell91**(Moderator)

**Date:**May 12, 2007 05:14PM

Finally, let's do the same thing with Hopkins (although they're actually only the fifth strongest team according to this evaluation).

There's only a 3.8% chance that Johns Hopkins is actually better than Cornell.


Edited 1 time(s). Last edit at 05/13/2007 01:28AM by jtwcornell91.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Jacob '06**(---.dhcp.psdn.ca.charter.com)

**Date:**May 12, 2007 05:18PM

Please forward to the person in charge of the selections. Of course that person will probably have no clue what you are talking about.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Jim Hyla**(---.twcny.res.rr.com)

**Date:**May 12, 2007 07:27PM

> **Jacob '06**
>
> Please forward to the person in charge of the selections. Of course that person will probably have no clue what you are talking about.

And we do?

___________________________

"Cornell Fans Made the Timbers Tremble", Boston Globe, March/1970

Cornell lawyers stopped the candy throwing. Jan/2005


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Swampy**(---.ri.ri.cox.net)

**Date:**May 14, 2007 12:39AM

> **jtwcornell91**
>
> > **Jacob '06**
> >
> > Is the lower probability center for Cornell due to their weaker SOS, or less exposure to OOC teams? (By lower I meant the center of our distribution is only at ~.5 and we have a wider distribution)
>
> It's just that in the second plot everything's scaled so that the area under all of the curves is the same. Since ours is broader, the peak is lower. The broader distribution means our rating is less precisely determined. That may be because of the slightly weaker schedule, but it may also just be that our rating is farther from 100, where everybody's prior was peaked at the start of the season. Note that Duke's distribution is the second-broadest.

Wouldn't the spread reflect the total number of games played? The standard deviation is inversely proportional to the sample size, and the curves have a more than passing resemblance to a normal curve.

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: French Rage**(---.hsd1.ca.comcast.net)

**Date:**May 14, 2007 12:44AM

What's a nubian?

___________________________

03/23/02: Maine 4, Harvard 3

03/28/03: BU 6, Harvard 4

03/26/04: Maine 5, Harvard 4

03/26/05: UNH 3, Harvard 2

03/25/06: Maine 6, Harvard 1


**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: jtwcornell91**(Moderator)

**Date:**May 14, 2007 03:22AM

> **Swampy**
>
> > **jtwcornell91**
> >
> > > **Jacob '06**
> > >
> > > Is the lower probability center for Cornell due to their weaker SOS, or less exposure to OOC teams? (By lower I meant the center of our distribution is only at ~.5 and we have a wider distribution)
> >
> > It's just that in the second plot everything's scaled so that the area under all of the curves is the same. Since ours is broader, the peak is lower. The broader distribution means our rating is less precisely determined. That may be because of the slightly weaker schedule, but it may also just be that our rating is farther from 100, where everybody's prior was peaked at the start of the season. Note that Duke's distribution is the second-broadest.
>
> Wouldn't the spread reflect the total number of games played? The standard deviation is inversely proportional to the sample size, and the curves have a more than passing resemblance to a normal curve.

Yeah, the curves are all Gaussians in log(rating). The maximum likelihood equations are exact, but working with the actual shape of the posterior as a function of 56 variables would be impractical. (Marginalizing over the other 54 teams' ratings would mean doing a 54-dimensional numerical integration!) So all of the probabilities actually use the Taylor expansion of log(Posterior) to second order in log(ratings), which is a Gaussian.
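A one-team toy version of that second-order expansion (my own sketch for intuition, assuming all games are against the fixed reference team rated 100; this is not the actual fitting code):

```python
from math import log, sqrt

def laplace_fit(wins, losses):
    """Gaussian approximation to the posterior of x = log(rating/100) for a
    team that went wins-losses against the reference team rated 100.
    The log-likelihood is wins*log(p) + losses*log(1-p) with p = 1/(1+e^-x);
    expanding log-likelihood to second order about its peak gives a Gaussian
    with the mean and width below."""
    p = wins / (wins + losses)                        # ML head-to-head winning pct
    mean = log(p / (1 - p))                           # peak of the posterior in x
    sigma = 1 / sqrt((wins + losses) * p * (1 - p))   # curvature sets the width
    return mean, sigma

# Note this blows up for an undefeated team (p = 1), just as straight-up
# maximum likelihood sends Cornell's rating off to infinity.
mean, sigma = laplace_fit(9, 4)
```

The `(wins + losses) * p * (1 - p)` factor is exactly the games-times-HHWP-products structure of the full matrix formula: more games tighten the width, but so does being evenly matched.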

Anyway, the standard deviation depends on the number of games played but also on the lopsidedness of the games, according to the maximum-likelihood values for the ratings. If N(A,B) is the number of games played by A against B, HHWP(A,B) is the predicted head-to-head winning percentage for A vs B according to the ML ratings, and HHWP(A,0) and HHWP(0,A) are the corresponding quantities for team A vs the reference team with a rating of 100, the inverse-sigma-squared matrix is

`invsigsq(A,A) = 2 * HHWP(A,0) * HHWP(0,A) + sum_B [ N(A,B) * HHWP(A,B) * HHWP(B,A) ]`

`invsigsq(A,B!=A) = -N(A,B) * HHWP(A,B) * HHWP(B,A)`

The widths of the Gaussians are the square roots of the diagonal elements of the inverse of this matrix. So increasing the number of games played does scale up the inverse-sigma-squared matrix and therefore scale down the relevant sigmas. But the products of ML HHWPs come into play as well. Note that `HHWP(A,B) * HHWP(B,A)` is a maximum (1/2 * 1/2 = 1/4) when A and B have the same ML rating and goes to zero (0 * 1 = 0) when the ratio of the ratings goes to zero or infinity. So the `2 * HHWP(A,0) * HHWP(0,A)` term will give a higher inverse-sigma-squared and therefore a tighter distribution for a team whose rating is closer to 100. Also, the `HHWP(A,B) * HHWP(B,A)` inside the sum means that a game does more to tighten a team's distribution when the two teams playing are of similar strengths. (So for stronger teams, a strong schedule is a way to more precisely nail down the ratings, which makes intuitive sense. I learn more about Maryland's strength as a team from a game against Georgetown than from a game against Bellarmine.)
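That recipe can be turned into a short sketch (my own illustration of the formulas above, not the author's actual code): build the inverse-sigma-squared matrix from the ML ratings and game counts, invert it, and read the widths off the diagonal.

```python
import numpy as np

def hhwp(ra, rb):
    """Predicted head-to-head winning percentage for a team rated ra vs rb."""
    return ra / (ra + rb)

def log_rating_sigmas(ratings, n_games, ref=100.0):
    """ratings: ML Bradley-Terry ratings; n_games[a][b]: games a played vs b.
    Returns each team's Gaussian width in log(rating)."""
    t = len(ratings)
    invsigsq = np.zeros((t, t))
    for a in range(t):
        # prior term: two notional games against the reference team rated 100
        invsigsq[a, a] = 2 * hhwp(ratings[a], ref) * hhwp(ref, ratings[a])
        for b in range(t):
            if b != a:
                w = n_games[a][b] * hhwp(ratings[a], ratings[b]) * hhwp(ratings[b], ratings[a])
                invsigsq[a, a] += w   # diagonal accumulates every opponent
                invsigsq[a, b] = -w   # off-diagonal couples the two teams
    # widths are square roots of the diagonal of the inverse of this matrix
    return np.sqrt(np.diag(np.linalg.inv(invsigsq)))
```

Because every team gets the notional prior games against the reference, the matrix is strictly diagonally dominant and therefore safely invertible, even with an undefeated team like Cornell in the field.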

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: RichH**(216.195.201.---)

**Date:**May 14, 2007 09:56AM

jtwcornell91

but working with the actual shape of the posterior as a function of 56 variables would be impractical.

Don't I know it!

*rimshot*

**Re: NCAA Lacrosse Bradley-Terry**

**Posted by: Hillel Hoffmann**(---.usb.temple.edu)

**Date:**May 15, 2007 04:18PM

John, you rule.
