RPI primer (brackets for dummies): Why #2/#3 Cornell may seed lower

Started by billhoward, February 26, 2005, 02:07:32 PM


ninian '72

Bill, let me jump in on this.  Newman may want to follow up.  Explained variance of 19% means that the KRACH model can only explain 19% of the "variability" of outcomes in the dataset of all games played thus far.  That means that for any PARTICULAR game there are variables in addition to those measured by KRACH that may predict outcome, that there is randomness in the outcome that can't be explained by any variable, or - most probably - both.  It doesn't mean that KRACH can predict the outcome of only one out of five games.  What it does mean is that KRACH helps us predict the outcome of a game better than flipping a coin, but I wouldn't bet the mortgage based on what it tells us.

DeltaOne81

[Q]billhoward Wrote:
I wonder if one could determine that a team playing a solid defensive game that results in 2-0 and 2-1 scores has more, better, and/or more consistent outcomes than offensive flyers who win 8-5, 7-3, and occasionally lose 6-5? If so, then there's solid math behind Schafer's madness. [/q]
Or perhaps there's solid madness behind Schafer's math ;-)

billhoward

Second try: KRACH can account for about one-fifth of the explanation of how any game turns out (for games among ~closely matched teams)? As opposed to (my bad interpolation) it can account for the outcome of one of every five games among closely matched teams - you're saying the second interpretation isn't the same as the first?

jtwcornell91

Okay, I have to admit I haven't been following this discussion in detail.  I know Ken played around a little with this a few years back and found, as you did, that the confidence intervals were pretty broad.

One thing I have noticed by skimming the discussion is that y'all seem, at least in the explanations, to be taking a Bayesian perspective on what is fundamentally a frequentist rating system.  The confidence intervals don't bound the possible values of a team's KRACH rating, they bound the range of values which would predict results consistent with the ones we see.  I've thought about a Bayesian analogue to KRACH, especially to handle situations like Division I-A football this season where there are multiple undefeated teams, but I had trouble coming up with an "obvious" prior when there were more than two teams involved.

KeithK

I love how a thread titled "RPI primer (brackets for dummies)" ends up with a discussion of Bayesian vs. frequentist ratings, to the point that it might go over the head of someone who has a PhD...

jkahn

I think there's been a lot of KRACH overanalysis here.  Let's get back to the original thread subject.  First, here's my try at explaining KRACH for dummies.
For Teams A, B, C, D etc., KRACH ratings of a, b, c, d, etc. represent the relative strength of those teams, such that the probability of Team A winning against Team B (with a tie counting as half a win) is a/(a+b).  The KRACH ratings are solved for based upon the results to date.  
It won't, of course, predict winners in each individual game, but it does correlate perfectly (by definition) with each Team's total number of wins (tie being 1/2) over the entire season, based upon the schedule they played.
Let's assume there are only three teams, and each plays the other four times.  Team A is 6-2 and Teams B and C are each 3-5.  The KRACH ratings would be 300, 100 and 100 (or 3x, x and x as it's the multiplicative relationship which matters).  Based on its schedule, Team A's expected number of wins would be 3 of 4 vs. Team B (300/(300+100)) and 3 of 4 vs. Team C, resulting in a total expected value of 6 wins.  Teams B and C have the same KRACH, so their expected wins would be 1 of 4 against Team A and 2 of 4 against each other, for a total of 3 wins in their 8-game schedules.  What KRACH does is solve for the relative team strength that yields an expected number of wins which equals the actual number.  The KRACH ratings are the same regardless of whether Team A actually was 3-1 against both B and C or 4-0 and 2-2, as long as the schedule was the same and the total records are the same.
KRACH takes into account all your games, your strength of schedule, etc. - they're just not broken out as separate line items.
Let's say that you believe that a certain team's KRACH rating is overrated compared to the other teams.  Well, if you lowered the KRACH for that team, what you are saying is that you believe that that team's expected value of the number of wins is less than what they actually have turned out to be.  So, they must have outperformed your rating, and therefore should be rated higher - and the only solution that works back to calculating the actual number of wins is the KRACH rating.
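For anyone who wants to check the three-team example numerically, here is a minimal sketch in Python. It is not the code behind any published KRACH table. The function names (krach, expected_wins) and the simple repeated-update scheme are my own assumptions, and it only works if every team has at least one win and one loss. Each pass resets a team's rating to its actual wins divided by the sum of games/(own rating + opponent rating), which is just the "expected wins equal actual wins" condition rearranged.

```python
def krach(wins, games, iterations=200):
    """Solve for ratings whose expected wins match the actual wins.

    wins:  team -> total wins (a tie counts as half a win)
    games: (team, team) -> number of games played between that pair
    Assumes every team has at least one win and one loss.
    """
    rating = {team: 1.0 for team in wins}
    for _ in range(iterations):
        new = {}
        for team in wins:
            # Sum of n / (own rating + opponent rating) over this team's schedule.
            denom = sum(
                n / (rating[a] + rating[b])
                for (a, b), n in games.items()
                if team in (a, b)
            )
            new[team] = wins[team] / denom
        # Only the ratios matter, so rescale the lowest rating to 100.
        low = min(new.values())
        rating = {team: 100.0 * r / low for team, r in new.items()}
    return rating


def expected_wins(team, rating, games):
    """Expected wins for `team`: sum of n * a / (a + b) over its schedule."""
    total = 0.0
    for (a, b), n in games.items():
        if team in (a, b):
            other = b if team == a else a
            total += n * rating[team] / (rating[team] + rating[other])
    return total


# The three-team example: everyone plays everyone else four times.
games = {("A", "B"): 4, ("A", "C"): 4, ("B", "C"): 4}
wins = {"A": 6, "B": 3, "C": 3}

ratings = krach(wins, games)
for team in wins:
    print(team, round(ratings[team], 2), round(expected_wins(team, ratings, games), 2))
```

The ratings should settle at roughly 300, 100 and 100, with expected wins of 6, 3 and 3. Note that only the total wins and the schedule go in, not who beat whom.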
Jeff Kahn '70 '72

Newman

[Q]billhoward Wrote:

 Second try: KRACH can account for about one-fifth of the explanation of how any game turns out (for games among ~closely matched teams)? As opposed to (my bad interpolation) it can account for the outcome of one of every five games among closely matched teams - you're saying the second interpretation isn't the same as the first? [/q]

An approximate although not exact analogy can be explained thinking about a dice game: two sides each roll five dice, and the side with the highest total wins.

However, suppose instead of totally random dice, each die represents a factor: inherent skill of the team, officiating calls in your favor, luck, etc. Some of the dice are rolled fresh every game, such as luck or officiating, but some are based on skill, coaching, or other persistent factors and are not actually rolled for each game, but predetermined beforehand. However, no one actually can see the individual dice when you play; the totals are counted by a referee, and a winner (or tie) is determined without anyone knowing the exact counts.  Now, two sides match up, and they only roll four dice each, and get points from the fifth (skill) die at the assigned levels, with the winner having the highest total from all five dice. With KRACH ratings, we can sort of reverse engineer the results and estimate the inherent skill die for each side after a reasonable number of games, since it is theoretically the same every time. (This is why KRACH is unavailable early in the season.) We still know nothing about the other dice in each game.

When two sides have very similar (identical) KRACH ratings, then their assigned dice values are the same, and the game is a toss-up - we cannot predict the outcome any better than if we didn't have KRACH ratings. When two sides have vastly different ratings, such that one side has a six and the other side has a one, then we know that the side with the six will win more often. When we say that KRACH explains 19% (about 20%, or 1/5) of the variance, it's similar to saying KRACH tells us the outcome of one of five dice.
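The dice game is easy to simulate, and doing so shows why "explains about one-fifth of the variance" is not the same as "calls one game in five." The sketch below is only an illustration of the analogy, not an analysis of real KRACH data. The league setup (600 teams, equal numbers at each skill level 1 through 6, random pairings) and the names dice_league and r_squared are assumptions made up for the example, and it measures variance in the margin of the dice totals rather than in win/loss results.

```python
import random


def dice_league(n_teams=600, n_games=50_000, seed=1):
    """Play the five-dice game: one fixed skill die plus four fresh dice per side."""
    rng = random.Random(seed)
    # Hypothetical league: equal numbers of teams at each skill level 1..6.
    skill = [(i % 6) + 1 for i in range(n_teams)]
    skill_gaps, margins = [], []
    for _ in range(n_games):
        a, b = rng.sample(range(n_teams), 2)
        fresh_a = sum(rng.randint(1, 6) for _ in range(4))
        fresh_b = sum(rng.randint(1, 6) for _ in range(4))
        skill_gaps.append(skill[a] - skill[b])
        margins.append((skill[a] + fresh_a) - (skill[b] + fresh_b))
    return skill_gaps, margins


def r_squared(x, y):
    """Squared correlation: share of the variance in y explained by x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy * sxy / (sxx * syy)


gaps, margins = dice_league()
explained = r_squared(gaps, margins)

# Among games with a winner and unequal skill dice, how often does
# the side with the higher skill die come out on top?
decided = [(g, m) for g, m in zip(gaps, margins) if g != 0 and m != 0]
favorite_win_rate = sum(1 for g, m in decided if (g > 0) == (m > 0)) / len(decided)

print("share of margin variance explained by skill:", round(explained, 3))
print("win rate of the higher-skill side:", round(favorite_win_rate, 3))
```

The explained share should land near one-fifth (one die out of five), while the higher-skill side should win more than half of the decided games. Knowing one die in five moves the odds in every game; it does not mean only one game in five is predictable.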

When the KRACH table is calculated, the values in it are a mathematical best guess at what the number is: equivalent to saying "the skill die for Cornell is most likely a five" but we cannot tell with great precision. What to my knowledge was never done before yesterday was any analysis of how good a guess that table was. What the new table I made adds is upper and lower bounds for that guess, the rough equivalent of saying "the skill die for Cornell is most likely a five, but it could be as high as six or as low as four." Since we can never see the individual components, we can't really tell with great precision what the KRACH values are.

I hope this helps. And I'd also point out that Newman being my real name and an affinity for Seinfeld characters are not mutually exclusive events.

ugarte

[Q]Newman Wrote: With KRACH ratings, we can sort of reverse engineer the results and estimate the inherent skill die for each side after a reasonable number of games, since it is theoretically the same every time. (This is why KRACH is unavailable early in the season.) We still know nothing about the other dice in each game.[/q]Needless to say, I understood this better than [q]jtwcornell91 Wrote: One thing I have noticed by skimming the discussion is that y'all seem, at least in the explanations, to be taking a Bayesian perspective on what is fundamentally a frequentist rating system. The confidence intervals don't bound the possible values of a team's KRACH rating, they bound the range of values which would predict results consistent with the ones we see.[/q]Though after reading Newman, I better understood Whelan. Thank you to both of you for trying to stoop to my level.

DeltaOne81

Although to be fair, this thread was not started as KRACH for dummies, as there really is no such thing. Even basic KRACH is fairly complex mathematically, though easy enough to read.

RPI and PWR for dummies however, is much more doable.

billhoward

Most online threads degenerate to unpleasantness and name-calling. Here it moves higher and upward in the noblest of Ivy League traditions. While still leaving us time to name-call the Harvard swells.

KRACH may be complex mathematically, but it may be possible to explain what it does so that people of normal abilities (e.g. liberal arts majors) have some clue as to what's going on. You don't have to know about stoichiometric combustion in order to step on the gas when the light turns green.

Cool thread and again, thank you to all who offered ideas. You saw how some people didn't know what RPI meant if it wasn't wearing red, or that TUC wasn't the German safety certification.

David Harding


Robb

[Q]KeithK Wrote:
Bill, you must really have a different Cornell hockey fandom frame of reference than I do if you question whether someone is really named Newman.  [/q]

I don't even know this '98 Newman.  Perhaps related to the Lab Newmans, based on his/her postings?  Certainly beyond what I'd expect from any of *my* recent Newman relatives (no offense, gang).  Some of my Potter relatives, on the other hand...

Let's Go RED!

Beeeej

He's my little brother's little brother's little brother's little brother, if I remember correctly...

Beeeej
Beeeej, Esq.

"Cornell isn't an organization.  It's a loose affiliation of independent fiefdoms united by a common hockey team."
   - Steve Worona

Newman

[Q]Beeeej Wrote:
 He's my little brother's little brother's little brother's little brother, if I remember correctly...
Beeeej[/q]
I think you're short one, but you get the gist. And for those of you who don't know me: I was a tuba player back in the day, and now my schooling at Northwestern keeps me far from Lynah most of the time, but I've been to the Michigan games in recent years.

Beeeej

[Q]Newman Wrote:
[Q2]Beeeej Wrote:
He's my little brother's little brother's little brother's little brother, if I remember correctly...[/Q]
I think you're short one, but you get the gist.[/q]

Hm... Beeeej->Stu->Ralph->Scrog->Newman isn't correct?  Who am I missing?  :-{)}

Beeeej
Beeeej, Esq.

"Cornell isn't an organization.  It's a loose affiliation of independent fiefdoms united by a common hockey team."
   - Steve Worona