# Mathematical Models

**Mathematical Models**

**Posted by:** Jeff Hopkins '82 (---.102.128.104.res-cmts.sm.ptd.net)

**Date:** February 26, 2018 07:13PM

For all of those who want to piss and moan about mathematical models, predictions, and similar subjects, do it here.

**Re: Mathematical Models**

**Posted by:** RichH (---.mycingular.net)

**Date:** March 03, 2018 02:50PM

This is the funniest thread.

**Re: Mathematical Models**

**Posted by:** Jeff Hopkins '82 (1.102.1.---)

**Date:** March 03, 2018 08:05PM

RichH

This is the funniest thread.

Irony is funny, isn't it?

**Re: Mathematical Models**

**Posted by:** CU77 (---.sb.sd.cox.net)

**Date:** March 03, 2018 10:24PM

KRACH win probabilities would be more accurate with an additional win and loss for each team against a fictitious "average" team. This is explained by Wes Colley in Section 3 of this white paper on his ranking method (which is similar to KRACH):

[www.colleyrankings.com]
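For concreteness, here is a minimal sketch of that add-a-win-and-a-loss adjustment (the records are toy numbers, not Colley's actual implementation):

```python
# Sketch of the "add a win and a loss" adjustment (the Bayes-Laplace rule
# of succession) applied to a raw winning percentage. Records are invented.

def raw_win_pct(wins, losses):
    return wins / (wins + losses)

def adjusted_win_pct(wins, losses):
    # One fictitious win and one fictitious loss against an "average" team:
    # (W + 1) / (N + 2) keeps an unbeaten team away from a certainty of 1.
    return (wins + 1) / (wins + losses + 2)

print(raw_win_pct(10, 0))       # 1.0 -- undefeated looks "perfect"
print(adjusted_win_pct(10, 0))  # ~0.917 -- pulled back toward 0.5
print(adjusted_win_pct(5, 5))   # 0.5 -- a .500 team is unchanged
```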

**Re: Mathematical Models**

**Posted by:** jfeath17 (---.bstnma.fios.verizon.net)

**Date:** March 03, 2018 11:14PM

The Dealing With Perfection section on this page discusses this. It used to be used in KRACH, but better ways to handle it have since been found. [elynah.com]

**Re: Mathematical Models**

**Posted by:** jtwcornell91 (Moderator)

**Date:** March 04, 2018 10:23AM

jfeath17

The Dealing With Perfection section on this page discusses this. It used to be used in KRACH, but better ways to handle it have since been found. [elynah.com]

I don't know that I'd say better. Ken originally put in the fictitious games to give everyone finite ratings, but we later worked out a workaround to allow the computation to run: [www.arxiv.org]. There's still the question of whether it makes sense to let the data tell you to expect perfection, which has been a subject of debate for a long time. (The add-a-win-and-a-loss trick is a version of the Bayes-Laplace rule of succession, which was spelled out in 1814.) For that matter, it's not clear what the appropriate number of "fictitious games" is. You might guess two, one or even zero, but it's basically a measure of how much parity you expect: how mismatched are teams likely to be? My student and I recently found that for Major League Baseball, for example, the level of parity is rather high, corresponding to something like 50 fictitious games: [arxiv.org]
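As a rough illustration of how the number of fictitious games acts as a parity dial (my sketch, not from the paper; the (W + C/2)/(N + C) smoothing form is an assumption):

```python
# Sketch: C "fictitious games", split evenly between wins and losses,
# smooth a team's winning percentage as (W + C/2) / (N + C). The post
# suggests C around 50 fits MLB's high parity; records here are invented.

def smoothed_win_pct(wins, games, fictitious=2):
    return (wins + fictitious / 2) / (games + fictitious)

# An 8-2 start looks very different under different parity assumptions:
print(smoothed_win_pct(8, 10, fictitious=0))   # 0.8  -- data at face value
print(smoothed_win_pct(8, 10, fictitious=2))   # 0.75 -- Bayes-Laplace
print(smoothed_win_pct(8, 10, fictitious=50))  # 0.55 -- high parity, MLB-like
```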

**Re: Mathematical Models**

**Posted by:** abmarks (---.hsd1.ma.comcast.net)

**Date:** March 04, 2018 03:49PM

jtwcornell91

jfeath17

The Dealing With Perfection section on this page discusses this. It used to be used in KRACH, but better ways to handle it have since been found. [elynah.com]

I don't know that I'd say better. Ken originally put in the fictitious games to give everyone finite ratings, but we later worked out a workaround to allow the computation to run: [www.arxiv.org]. There's still the question of whether it makes sense to let the data tell you to expect perfection, which has been a subject of debate for a long time. (The add-a-win-and-a-loss trick is a version of the Bayes-Laplace rule of succession, which was spelled out in 1814.) For that matter, it's not clear what the appropriate number of "fictitious games" is. You might guess two, one or even zero, but it's basically a measure of how much parity you expect: how mismatched are teams likely to be? My student and I recently found that for Major League Baseball, for example, the level of parity is rather high, corresponding to something like 50 fictitious games: [arxiv.org]

JTW- I read the paper about MLB and I have a question about table 5. Is the error cited the error for a specific team? Is it the error found for all individual teams? Or is it the error for all games and all teams?

Meaning: Is the April 15 error of 8.82 specific to a chosen single team (say the Yanks), or is it saying that when you look at all teams individually you found 8.82 games of error across the board? etc.

Thanks!

**Re: Mathematical Models**

**Posted by:** CU77 (---.sb.sd.cox.net)

**Date:** March 04, 2018 05:36PM

Right, but two fictitious games is the minimum. This corresponds to a flat prior on win probability for each team. Less than two, and the prior sags in the middle and rises at the ends. This predicts that winning percentages will cluster around zero and one; in real life, this never ever happens; winning percentages (at the end of the season) always cluster around 0.5. Reproducing this in KRACH requires more than two fictitious games. And as you found in baseball, it can be a lot more than two.

jtwcornell91

it's not clear what the appropriate number of "fictitious games" is. You might guess two, one or even zero, but it's basically a measure of how much parity you expect.
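The prior shapes being debated can be checked numerically. This sketch assumes C fictitious games correspond to a symmetric Beta(C/2, C/2) prior on win probability (my framing, not anyone's implementation):

```python
# Sketch of the prior shapes: C = 2 fictitious games gives the flat
# Beta(1, 1) prior; C < 2 sags in the middle and rises at the ends;
# C > 2 peaks in the middle.
from math import gamma

def beta_pdf(p, a):
    # Density of the symmetric Beta(a, a) distribution at p
    return gamma(2 * a) / gamma(a) ** 2 * p ** (a - 1) * (1 - p) ** (a - 1)

for c in (1, 2, 4):  # number of fictitious games
    a = c / 2
    shape = ("sags in the middle" if beta_pdf(0.5, a) < beta_pdf(0.1, a)
             else "flat or peaked in the middle")
    print(c, shape)
```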

**Re: Mathematical Models**

**Posted by:** jtwcornell91 (Moderator)

**Date:** March 04, 2018 07:07PM

abmarks

jtwcornell91

jfeath17

The Dealing With Perfection section on this page discusses this. It used to be used in KRACH, but better ways to handle it have since been found. [elynah.com]

I don't know that I'd say better. Ken originally put in the fictitious games to give everyone finite ratings, but we later worked out a workaround to allow the computation to run: [www.arxiv.org]. There's still the question of whether it makes sense to let the data tell you to expect perfection, which has been a subject of debate for a long time. (The add-a-win-and-a-loss trick is a version of the Bayes-Laplace rule of succession, which was spelled out in 1814.) For that matter, it's not clear what the appropriate number of "fictitious games" is. You might guess two, one or even zero, but it's basically a measure of how much parity you expect: how mismatched are teams likely to be? My student and I recently found that for Major League Baseball, for example, the level of parity is rather high, corresponding to something like 50 fictitious games: [arxiv.org]

JTW- I read the paper about MLB and I have a question about table 5. Is the error cited the error for a specific team? Is it the error found for all individual teams? Or is it the error for all games and all teams?

Meaning: Is the April 15 error of 8.82 specific to a chosen single team (say the Yanks), or is it saying that when you look at all teams individually you found 8.82 games of error across the board? etc.

Thanks!

It's the average of the errors for all the teams. See equation (25) on the same page. (The sd column shows the spread--measured as a standard deviation--of the errors among the teams, as defined in equation (26).)

**Re: Mathematical Models**

**Posted by:** jtwcornell91 (Moderator)

**Date:** March 05, 2018 12:32AM

CU77

Right, but two fictitious games is the minimum. This corresponds to a flat prior on win probability for each team. Less than two, and the prior sags in the middle and rises at the ends. This predicts that winning percentages will cluster around zero and one; in real life, this never ever happens; winning percentages (at the end of the season) always cluster around 0.5. Reproducing this in KRACH requires more than two fictitious games. And as you found in baseball, it can be a lot more than two.

jtwcornell91

it's not clear what the appropriate number of "fictitious games" is. You might guess two, one or even zero, but it's basically a measure of how much parity you expect.

I don't think there's necessarily a minimum. In the simpler case of Bernoulli trials with a single probability, the Jeffreys prior splits the difference between the Bayes-Laplace uniform-in-probability and the Haldane uniform-in-log-odds-ratio, and corresponds to one rather than two "fictitious trials". The distribution is peaked at p=0 and 1, but if you change variables to ln(p/(1-p)) it's peaked at zero (i.e., p=1/2).

If you consider a situation where there's no parity among the competitors (people drawn at random off the street arm-wrestling or something), you might imagine that if you grab a pair of people and have them compete again and again, one person is likely to win a high percentage of the time. Winning percentages don't cluster at zero and one because schedules are constructed to match competitively comparable teams. Consider this year's Olympic women's hockey: if they hadn't divided the eight teams into a "strong group" and a "weak group", you'd have had two teams (USA and CAN) that won about 92% of their games, four teams with winning percentages close to 50% (FIN, OAR, SUI and SWE), and two teams with long-term winning percentages around 8% (JPN and COR). Something similar might have happened with Men's DI around 2000 when the MAAC teams became full DI, but their winning percentages were inflated because they mostly played each other. (Compare winning percentage to RRWP in [www.elynah.com] )

Also, note that the quantity that has a uniform prior with two fictitious games is not the expected winning percentage against a balanced schedule, but the expected winning percentage against an average team. So in the extreme Haldane-like situation where the team's KRACH strengths are drawn from a uniform-in-log prior, you'd expect to get a situation where each team is infinitely better than the teams below them and infinitely worse than the teams above them. That gives you a uniform distribution of winning percentages (the kth best team out of n beats the n-k teams below them and loses to the k-1 teams above, for a winning percentage of (n-k)/(n-1) = 1 - (k-1)/(n-1)), not one that clusters around 0 and 1.
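That last claim is easy to verify numerically; a minimal sketch of the extreme no-parity round robin:

```python
# Sketch: in the extreme no-parity limit, the kth-best of n teams beats
# the n-k teams below it and loses to the k-1 teams above it, giving a
# winning percentage of (n-k)/(n-1) -- an even spread, not a pile-up at
# 0 and 1.

def round_robin_win_pcts(n):
    return [(n - k) / (n - 1) for k in range(1, n + 1)]

print(round_robin_win_pcts(5))  # [1.0, 0.75, 0.5, 0.25, 0.0] -- uniform
```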

**Re: Mathematical Models**

**Posted by:** Trotsky (---.washdc.fios.verizon.net)

**Date:** March 05, 2018 07:41AM

jtwcornell91

CU77

Right, but two fictitious games is the minimum. This corresponds to a flat prior on win probability for each team. Less than two, and the prior sags in the middle and rises at the ends. This predicts that winning percentages will cluster around zero and one; in real life, this never ever happens; winning percentages (at the end of the season) always cluster around 0.5. Reproducing this in KRACH requires more than two fictitious games. And as you found in baseball, it can be a lot more than two.

jtwcornell91

it's not clear what the appropriate number of "fictitious games" is. You might guess two, one or even zero, but it's basically a measure of how much parity you expect.

I don't think there's necessarily a minimum. In the simpler case of Bernoulli trials with a single probability, the Jeffreys prior splits the difference between the Bayes-Laplace uniform-in-probability and the Haldane uniform-in-log-odds-ratio, and corresponds to one rather than two "fictitious trials". The distribution is peaked at p=0 and 1, but if you change variables to ln(p/(1-p)) it's peaked at zero (i.e., p=1/2).

If you consider a situation where there's no parity among the competitors (people drawn at random off the street arm-wrestling or something), you might imagine that if you grab a pair of people and have them compete again and again, one person is likely to win a high percentage of the time. Winning percentages don't cluster at zero and one because schedules are constructed to match competitively comparable teams. Consider this year's Olympic women's hockey: if they hadn't divided the eight teams into a "strong group" and a "weak group", you'd have had two teams (USA and CAN) that won about 92% of their games, four teams with winning percentages close to 50% (FIN, OAR, SUI and SWE), and two teams with long-term winning percentages around 8% (JPN and COR). Something similar might have happened with Men's DI around 2000 when the MAAC teams became full DI, but their winning percentages were inflated because they mostly played each other. (Compare winning percentage to RRWP in [www.elynah.com] )

Also, note that the quantity that has a uniform prior with two fictitious games is not the expected winning percentage against a balanced schedule, but the expected winning percentage against an average team. So in the extreme Haldane-like situation where the team's KRACH strengths are drawn from a uniform-in-log prior, you'd expect to get a situation where each team is infinitely better than the teams below them and infinitely worse than the teams above them. That gives you a uniform distribution of winning percentages (the kth best team out of n beats the n-k teams below them and loses to the k-1 teams above, for a winning percentage of (n-k)/(n-1) = 1 - (k-1)/(n-1)), not one that clusters around 0 and 1.

tldr.

**Re: Mathematical Models**

**Posted by:** Swampy (---.ri.ri.cox.net)

**Date:** March 05, 2018 10:28AM

jtwcornell91

CU77

jtwcornell91

it's not clear what the appropriate number of "fictitious games" is. You might guess two, one or even zero, but it's basically a measure of how much parity you expect.

I don't think there's necessarily a minimum. In the simpler case of Bernoulli trials with a single probability, the Jeffreys prior splits the difference between the Bayes-Laplace uniform-in-probability and the Haldane uniform-in-log-odds-ratio, and corresponds to one rather than two "fictitious trials". The distribution is peaked at p=0 and 1, but if you change variables to ln(p/(1-p)) it's peaked at zero (i.e., p=1/2).

If you consider a situation where there's no parity among the competitors (people drawn at random off the street arm-wrestling or something), you might imagine that if you grab a pair of people and have them compete again and again, one person is likely to win a high percentage of the time. Winning percentages don't cluster at zero and one because schedules are constructed to match competitively comparable teams. Consider this year's Olympic women's hockey: if they hadn't divided the eight teams into a "strong group" and a "weak group", you'd have had two teams (USA and CAN) that won about 92% of their games, four teams with winning percentages close to 50% (FIN, OAR, SUI and SWE), and two teams with long-term winning percentages around 8% (JPN and COR). Something similar might have happened with Men's DI around 2000 when the MAAC teams became full DI, but their winning percentages were inflated because they mostly played each other. (Compare winning percentage to RRWP in [www.elynah.com] )

Also, note that the quantity that has a uniform prior with two fictitious games is not the expected winning percentage against a balanced schedule, but the expected winning percentage against an average team. So in the extreme Haldane-like situation where the team's KRACH strengths are drawn from a uniform-in-log prior, you'd expect to get a situation where each team is infinitely better than the teams below them and infinitely worse than the teams above them. That gives you a uniform distribution of winning percentages (the kth best team out of n beats the n-k teams below them and loses to the k-1 teams above, for a winning percentage of (n-k)/(n-1) = 1 - ((k-1)/(n-1)), not one that clusters around 0 and 1.

FYP. (-5%)

**Re: Mathematical Models**

**Posted by:** Trotsky (---.washdc.fios.verizon.net)

**Date:** March 05, 2018 10:33AM

Swampy

That gives you a uniform distribution of winning percentages (the kth best team out of n beats the n-k teams below them and loses to the k-1 teams above, for a winning percentage of (n-k)/(n-1) = 1 - ((k-1)/(n-1)), not one that clusters around 0 and 1.

FYP. (-5%)

Wrong. Read it again. And don't tug on Superman's cape.

Edited 1 time(s). Last edit at 03/05/2018 10:34AM by Trotsky.

**Re: Mathematical Models**

**Posted by:** Swampy (---.ri.ri.cox.net)

**Date:** March 05, 2018 10:42AM

Trotsky

jtwcornell91

CU77

jtwcornell91

it's not clear what the appropriate number of "fictitious games" is. You might guess two, one or even zero, but it's basically a measure of how much parity you expect.

I don't think there's necessarily a minimum. In the simpler case of Bernoulli trials with a single probability, the Jeffreys prior splits the difference between the Bayes-Laplace uniform-in-probability and the Haldane uniform-in-log-odds-ratio, and corresponds to one rather than two "fictitious trials". The distribution is peaked at p=0 and 1, but if you change variables to ln(p/(1-p)) it's peaked at zero (i.e., p=1/2).

If you consider a situation where there's no parity among the competitors (people drawn at random off the street arm-wrestling or something), you might imagine that if you grab a pair of people and have them compete again and again, one person is likely to win a high percentage of the time. Winning percentages don't cluster at zero and one because schedules are constructed to match competitively comparable teams. Consider this year's Olympic women's hockey: if they hadn't divided the eight teams into a "strong group" and a "weak group", you'd have had two teams (USA and CAN) that won about 92% of their games, four teams with winning percentages close to 50% (FIN, OAR, SUI and SWE), and two teams with long-term winning percentages around 8% (JPN and COR). Something similar might have happened with Men's DI around 2000 when the MAAC teams became full DI, but their winning percentages were inflated because they mostly played each other. (Compare winning percentage to RRWP in [www.elynah.com] )

Also, note that the quantity that has a uniform prior with two fictitious games is not the expected winning percentage against a balanced schedule, but the expected winning percentage against an average team. So in the extreme Haldane-like situation where the team's KRACH strengths are drawn from a uniform-in-log prior, you'd expect to get a situation where each team is infinitely better than the teams below them and infinitely worse than the teams above them. That gives you a uniform distribution of winning percentages (the kth best team out of n beats the n-k teams below them and loses to the k-1 teams above, for a winning percentage of (n-k)/(n-1) = 1 - (k-1)/(n-1)), not one that clusters around 0 and 1.

tldr.

Unfortunately, jtw's comments do make mathematical sense. I must admit, I'm either unfamiliar with or have forgotten (when you get to be as old as me, you'll understand) Haldane, but it's certainly true that a log-odds transformation converts probabilities from a range of [0,1] to (-oo, +oo), with the probability distribution for two teams having an equal chance of winning having a peak at 0. The first part of that sentence does have an error in that the peak is at p=.5, not p=0 and 1.

In my earlier comments on this subject I deliberately simplified things by tacitly assuming each team's probability of winning is independent of their opponent. Obviously this is wrong, and jtw is correct in pointing out that any good predictor of outcomes should take the two teams into account as a pair rather than treating the winner as independent of who the opponent is. Even more, I'd say it needs to take into account the styles of play in any given match-up. If two teams emphasize defense, but one is better at it than the other, then the better one would be more likely to win. OTOH, if the opponent is a high-scoring team with so-so defense, the game's more likely to be a toss-up.

Oh, BTW, thanks for the link. Everyone on this list knows Harvard sucks, but it's nice every once in a while to be reminded that even popular culture knows how pretentious the assholes that go to Harvard can be.

Edited 1 time(s). Last edit at 03/05/2018 10:46AM by Swampy.

**Re: Mathematical Models**

**Posted by:** Swampy (---.ri.ri.cox.net)

**Date:** March 05, 2018 10:44AM

Trotsky

Swampy

That gives you a uniform distribution of winning percentages (the kth best team out of n beats the n-k teams below them and loses to the k-1 teams above, for a winning percentage of (n-k)/(n-1) = 1 - ((k-1)/(n-1)), not one that clusters around 0 and 1.

FYP. (-5%)

Wrong. Read it again. And don't tug on Superman's cape.

Wait. I just counted parentheses: your original has two left parentheses and three right parentheses. How can this be correct, Kal El?

Edited 1 time(s). Last edit at 03/05/2018 10:46AM by Swampy.

**Re: Mathematical Models**

**Posted by:** adamw (---.phlapa.fios.verizon.net)

**Date:** March 05, 2018 10:49AM

I don't understand more than 2% of what John said - but I think I get the gist .... If you read the KRACH Explanation on College Hockey News, it's basically my "English" translation of everything that John explained to me. That's my role.

I think John's remarks though are very welcome and go a long way in explaining why no "better model" magically appears when it comes to forecasting. There are competing opinions on the models, and more to take into account than what others would lead to believe. That's why I always push back when people vehemently insist KRACH is "wrong" and there must be a better way. (see other thread)

John - I think you had a student who wanted to work with me on a better model at one time - but I wrote to the student, and then never heard back. That might've been a couple years ago - maybe more, I don't remember, time flies. ... But if anyone is interested, please pass them along.


**Re: Mathematical Models**

**Posted by:** Swampy (---.ri.ri.cox.net)

**Date:** March 05, 2018 11:02AM

Swampy

Trotsky

Swampy

That gives you a uniform distribution of winning percentages (the kth best team out of n beats the n-k teams below them and loses to the k-1 teams above, for a winning percentage of (n-k)/(n-1) = 1 - ((k-1)/(n-1)), not one that clusters around 0 and 1.

FYP. (-5%)

Wrong. Read it again. And don't tug on Superman's cape.

Wait. I just counted parentheses: ~~your~~ the original has two left parentheses and three right parentheses. How can this be correct, Kal El?

OK. Now I see what you're talking about. I think both *The Elements of Style* and *The Chicago Manual of Style* would want something to separate the equation from the parenthesis marking the end of the parenthetical elaboration. Maybe put the equation on its own line, use brackets for one or the other, or at least leave some spaces between the two closing parentheses. No matter what, an equation like

(n-k)/(n-1) = 1 - (k-1)/(n-1)

is making some strong assumptions about the precedence of the operators. Does the equation want [1-(k-1)]/(n-1) or 1-[(k-1)/(n-1)]? Some computer languages would interpret the equation one way (i.e. they use left-to-right precedence), and others would interpret it the other way (i.e. they give * and / higher precedence than + and - ). I understand that mathematics generally uses the latter, but there's nothing lost, except ambiguity, by adding brackets.
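The two readings can be made concrete in code; here Python stands in for the "multiplication and division bind tighter" camp, with the left-to-right reading simulated by explicit brackets:

```python
# Two readings of "1 - (k-1)/(n-1)". Python, like standard math
# convention, gives / higher precedence than binary -, so the bare
# expression means 1 - [(k-1)/(n-1)], not the left-to-right reading.
n, k = 5, 2

identity_lhs = (n - k) / (n - 1)       # 0.75
math_reading = 1 - (k - 1) / (n - 1)   # 1 - [(k-1)/(n-1)] = 0.75
ltr_reading = (1 - (k - 1)) / (n - 1)  # [1-(k-1)]/(n-1) = 0.0

print(identity_lhs == math_reading)  # True
print(identity_lhs == ltr_reading)   # False
```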

All this is complicated by the fact that we're writing on a forum that will interpret a parenthesis preceded by a quotation mark as a smiley ")."

Edited 3 time(s). Last edit at 03/05/2018 11:08AM by Swampy.

**Re: Mathematical Models**

**Posted by:** Trotsky (---.washdc.fios.verizon.net)

**Date:** March 05, 2018 11:09AM

This happens remarkably often and was definitely a Bad Decision by the Bureau of Smiley Standards at OSF or wherever.

Swampy

All this is complicated by the fact that we're writing on a forum that will interpret a parenthesis preceded by a quotation mark as a smiley ")."

**Re: Mathematical Models**

**Posted by:** Trotsky (---.washdc.fios.verizon.net)

**Date:** March 05, 2018 11:14AM

That scene gets my award for "best scene in worst movie."

Swampy

Oh, BTW, thanks for the link. Everyone on this list knows Harvard sucks, but it's nice every once in a while to be reminded that even popular culture knows how pretentious the assholes that go to Harvard can be.

Barrett in the penalty box may be #2.

**Re: Mathematical Models**

**Posted by:** jtwcornell91 (Moderator)

**Date:** March 05, 2018 12:07PM

Swampy

Swampy

Trotsky

Swampy

That gives you a uniform distribution of winning percentages (the kth best team out of n beats the n-k teams below them and loses to the k-1 teams above, for a winning percentage of (n-k)/(n-1) = 1 - ((k-1)/(n-1)), not one that clusters around 0 and 1.

FYP. (-5%)

Wrong. Read it again. And don't tug on Superman's cape.

Wait. I just counted parentheses: ~~your~~ the original has two left parentheses and three right parentheses. How can this be correct, Kal El?

OK. Now I see what you're talking about. I think both *The Elements of Style* and *The Chicago Manual of Style* would want something to separate the equation from the parenthesis marking the end of the parenthetical elaboration. Maybe put the equation on its own line, use brackets for one or the other, or at least leave some spaces between the two closing parentheses.

You're right, most journal style guides tell you to change one level of parentheses to brackets in that case.

No matter what, an equation like

(n-k)/(n-1) = 1 - (k-1)/(n-1)

is making some strong assumptions about the precedence of the operators. Does the equation want [1-(k-1)]/(n-1) or 1-[(k-1)/(n-1)]? Some computer languages would interpret the equation one way (i.e. they use left-to-right precedence), and others would interpret it the other way (i.e. they give * and / higher precedence than + and - ). I understand that mathematics generally uses the latter, but there's nothing lost, except ambiguity, by adding brackets.

The principle I learned on operator precedence is "multiplication takes precedence over addition, and put everything else in parentheses to be sure".

**Re: Mathematical Models**

**Posted by:** abmarks (---.hsd1.ma.comcast.net)

**Date:** March 05, 2018 06:00PM

jtwcornell91

The principle I learned on operator precedence is "multiplication takes precedence over addition, and put everything else in parentheses to be sure".

7th grade algebra class FTW here.

**Re: Mathematical Models**

**Posted by:** jtwcornell91 (Moderator)

**Date:** March 08, 2018 11:31PM

adamw

I don't understand more than 2% of what John said - but I think I get the gist .... If you read the KRACH Explanation on College Hockey News, it's basically my "English" translation of everything that John explained to me. That's my role.

I think John's remarks though are very welcome and go a long way in explaining why no "better model" magically appears when it comes to forecasting. There are competing opinions on the models, and more to take into account than what others would lead to believe. That's why I always push back when people vehemently insist KRACH is "wrong" and there must be a better way. (see other thread)

John - I think you had a student who wanted to work with me on a better model at one time - but I wrote to the student, and then never heard back. That might've been a couple years ago - maybe more, I don't remember, time flies. ... But if anyone is interested, please pass them along.

If it's the student I'm thinking of, he ended up looking more into the game theory of how to design schedules for college football assuming a reasonable rating system. I'm planning to try to recruit another student soon, but I think it's likely to end up being fall at the earliest due to the semester schedule. There are basically two ways that KRACH is over-certain: first, as the OP points out, it should have a prior that keeps expected winning percentages from running away to 0 or 1. But also, even if everyone's rating is finite, they're still uncertain, and that uncertainty should be included in the estimated win probability.
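The second point can be sketched numerically. This assumes the Bradley-Terry/KRACH form P(A beats B) = K_A / (K_A + K_B); the ratings and the amount of scatter are invented for illustration:

```python
# Sketch: even with finite ratings, their uncertainty should feed into
# the win probability. Plugging in point ratings ignores the uncertainty;
# averaging over plausible ratings pulls the prediction toward 0.5.
# Ratings (400 vs 100) and the log-normal spread (0.8) are made up.
import random

random.seed(1)

def win_prob(ka, kb):
    return ka / (ka + kb)

# Plug-in estimate from point ratings:
plug_in = win_prob(400.0, 100.0)  # 0.8

# Average the win probability over scatter around each rating:
samples = [win_prob(400.0 * random.lognormvariate(0, 0.8),
                    100.0 * random.lognormvariate(0, 0.8))
           for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(plug_in, round(mean, 2))  # the averaged estimate sits below 0.8
```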

**Re: Mathematical Models**

**Posted by:** KenP (137.75.68.---)

**Date:** March 09, 2018 01:21PM

It sounds like you are trying to replace a deterministic KRACH rating with a probability distribution. It wouldn't be a normal distribution; instead, its shape would depend on the rating: higher ratings would have their PDF weighted downwards, and vice versa for near-zero ratings. From there you could assess the uncertainty in absolute chance-of-winning statements.

**Re: Mathematical Models**

**Posted by:** jtwcornell91 (Moderator)

**Date:** March 10, 2018 12:32AM

KenP

It sounds like you are trying to replace a deterministic KRACH rating with a probability distribution. It wouldn't be a normal distribution; instead, its shape would depend on the rating: higher ratings would have their PDF weighted downwards, and vice versa for near-zero ratings. From there you could assess the uncertainty in absolute chance-of-winning statements.

Well, it's not a matter of non-deterministic randomness, but of uncertain knowledge. KRACH is the maximum likelihood estimate of each team's strength based on their game results. This estimate has some uncertainty associated with it, although the interpretation of that from the frequentist perspective underlying maximum likelihood is not directly translatable into a probability distribution. But in the Bayesian framework, the game results give you a posterior probability distribution for the team strengths, and the width (and shape) of this distribution influence future predictions.

**Re: Mathematical Models**

**Posted by:** KenP (---.washdc.fios.verizon.net)

**Date:** March 10, 2018 08:00AM

My point is that a “perfect rating” would assess the true worthiness of a team vis a vis others. KRACH gets us as close as we can with the data but it still is not “perfect”. The perfect rating for a team would be a random variable with mean/median of KRACH and a PDF that leans more towards the mean for extremely good or bad teams.

jtwcornell91

KenP

It sounds like you are trying to replace a deterministic KRACH rating with a probability distribution. It wouldn't be a normal distribution; instead, its shape would depend on the rating: higher ratings would have their PDF weighted downwards, and vice versa for near-zero ratings. From there you could assess the uncertainty in absolute chance-of-winning statements.

Well, it's not a matter of non-deterministic randomness, but of uncertain knowledge. KRACH is the maximum likelihood estimate of each team's strength based on their game results. This estimate has some uncertainty associated with it, although the interpretation of that from the frequentist perspective underlying maximum likelihood is not directly translatable into a probability distribution. But in the Bayesian framework, the game results give you a posterior probability distribution for the team strengths, and the width (and shape) of this distribution influence future predictions.

**Re: Mathematical Models**

**Posted by:** Swampy (---.ri.ri.cox.net)

**Date:** March 10, 2018 11:22AM

KenP

My point is that a “perfect rating” would assess the true worthiness of a team vis a vis others. KRACH gets us as close as we can with the data but it still is not “perfect”. The perfect rating for a team would be a random variable with mean/median of KRACH and a PDF that leans more towards the mean for extremely good or bad teams.

jtwcornell91

KenP

It sounds like you are trying to replace a deterministic KRACH rating with a probability distribution. It wouldn't be a normal distribution; instead, its shape would depend on the rating: higher ratings would have their PDF weighted downwards, and vice versa for near-zero ratings. From there you could assess the uncertainty in absolute chance-of-winning statements.

Well, it's not a matter of non-deterministic randomness, but of uncertain knowledge. KRACH is the maximum likelihood estimate of each team's strength based on their game results. This estimate has some uncertainty associated with it, although the interpretation of that from the frequentist perspective underlying maximum likelihood is not directly translatable into a probability distribution. But in the Bayesian framework, the game results give you a posterior probability distribution for the team strengths, and the width (and shape) of this distribution influence future predictions.

If KRACH is an MLE, wouldn't a resampling estimate of the PDF's characteristics yield something like what KenP is advocating?
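For what it's worth, here is a toy sketch of what a resampling estimate looks like: a nonparametric bootstrap over a hypothetical 7-3 record (all numbers made up). The spread it produces measures the variability of the estimator, which is a different thing from a posterior over the true strength:

```python
import random

# Hypothetical 7-3 season: 1 = win, 0 = loss.
random.seed(0)
results = [1] * 7 + [0] * 3

# Resample the games with replacement and recompute the win fraction
# each time; the scatter of these replicates estimates the sampling
# variability of the win-fraction estimator.
boot = []
for _ in range(10_000):
    sample = random.choices(results, k=len(results))
    boot.append(sum(sample) / len(sample))

boot.sort()
lo, hi = boot[249], boot[9749]   # rough 95% percentile interval
```

With only ten games the interval is very wide, which is the point: the point estimate of 0.7 carries a lot of estimator variability.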

**Re: Mathematical Models - WSJ on RPI**

**Posted by: billhoward**(---.nwrk.east.verizon.net)

**Date:**March 10, 2018 01:10PM

[www.wsj.com]

Jo Craven McGinty, WSJ

Hardcore basketball nerds go nuts this time of year, not because the NCAA reveals which teams will compete in the Division I men’s basketball tournament, but because a controversial metric known as the ratings percentage index plays a significant role in the selection.

Don’t think of it as March Madness. Think of it as Methodology Mania.

RPI has been used to rank men’s college basketball teams since 1981, and that’s the problem: It’s outdated. It’s crude. And more sophisticated measures exist.

But the RPI persists.

Sixty-eight teams play in the NCAA’s premier basketball tournament, which begins next week and runs through early April. Thirty-two of those teams automatically qualify by winning their postseason conference tournaments; the other 36 schools are chosen by a 10-member selection committee that relies on “team sheets,” a sort of report card, with RPI rankings and other data.

Back in the day, RPI was the only tool designed to help the committee make objective decisions.

“It was pretty impressive for its time,” said Ken Pomeroy, a basketball analytics guru who now advocates ditching the metric. “Before that, there was no objective statistical process for selecting the field.”

RPI combines three factors: a team’s winning percentage, which counts for 25% of the score; the opponents’ winning percentage, which is 50%; and the opponents’ opponents’ winning percentage, which is 25%. The sum of the three factors is the RPI, and the team with the highest RPI is ranked No. 1.

One criticism of the original formula was that it didn’t acknowledge home-court advantage, even though teams tend to win about twice as many games at home as they do away. In 2004, the formula was tweaked so that now, home wins count 0.6, or just over half a win, while road victories count as 1.4 wins. The opposite is true for losses: Home losses count 1.4 while road losses count 0.6. Wins and losses at neutral sites each count as 1.

[continues (paywall)]
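The weighting scheme the article describes is simple to express in code. A minimal sketch (toy schedule and made-up opponent percentages; the site weights are applied only to the team's own winning percentage, with opponents' percentages passed in unweighted):

```python
# RPI per the article: 25% own (weighted) winning percentage,
# 50% opponents' WP, 25% opponents' opponents' WP.
WIN_W = {"H": 0.6, "A": 1.4, "N": 1.0}   # home wins count less than road wins
LOSS_W = {"H": 1.4, "A": 0.6, "N": 1.0}  # home losses count more than road losses

def weighted_wp(games):
    """games: list of (won: bool, site) for one team, site in H/A/N."""
    w = sum(WIN_W[s] for won, s in games if won)
    l = sum(LOSS_W[s] for won, s in games if not won)
    return w / (w + l)

def rpi(own_games, opp_wp, opp_opp_wp):
    """opp_wp / opp_opp_wp: plain (unweighted) winning percentages."""
    return 0.25 * weighted_wp(own_games) + 0.5 * opp_wp + 0.25 * opp_opp_wp

# Toy 3-1 schedule: home win, road win, home loss, neutral win.
games = [(True, "H"), (True, "A"), (False, "H"), (True, "N")]
value = rpi(games, 0.55, 0.50)   # ≈ 0.5705
```

Note how the home loss (weight 1.4) drags the weighted winning percentage below the raw 3-1 = 0.75 record.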

**Re: Mathematical Models - WSJ on RPI**

**Posted by: marty**(---.nycap.res.rr.com)

**Date:**March 10, 2018 01:56PM

The weights accounting for home-court advantage are applied to wins and losses before calculating the teams’ winning percentage. But those weights are not applied to the opponents’ or opponents’ opponents’ wins and losses, another point of contention.

billhoward

[www.wsj.com]


[continues (paywall)]

Because most of the RPI is based on the strength-of-schedule measures, some argue that a team with a poor winning percentage could still earn a respectable RPI by playing good teams, or, on the flipside, playing weaker opponents could perhaps unfairly reduce a team’s RPI.

To help make up for the RPI’s shortcomings, this year, for the first time, NCAA team sheets include five additional measures: two results-oriented metrics and three predictive metrics. The new measures are Michigan State University Assistant Athletic Director Kevin Pauga’s index; ESPN’s strength of record metric and basketball power index; Mr. Pomeroy’s rankings; and Jeff Sagarin’s ratings for USA Today.

While RPI, a results-oriented measure, is confined to wins, losses and strength of schedule, the basketball power index, as an example, is a predictive metric that considers factors such as the coach’s past performance and player injuries and setbacks.

(To what extent predictive measures should be used to pick teams for the tournament is another debate.)

The March Madness teams, their seeds and the tournament bracket will be announced this coming Sunday, and it could be the last year that RPI, at least in its current form, plays a leading role.

“We are in process of evaluating a different metric to potentially replace RPI,” said David Worlock, director of media coordination and statistics for the NCAA. “We hope to have something done prior to the 2018-19 season.”

The move would please many analysts and hoops fans, who would be happy to say “RIP” to the RPI.

Write to Jo Craven McGinty at Jo.McGinty@wsj.com

**Re: Mathematical Models**

**Posted by: jtwcornell91**(Moderator)

**Date:**March 10, 2018 02:26PM

Swampy

KenP

My point is that a “perfect rating” would assess the true worthiness of a team vis a vis others. KRACH gets us as close as we can with the data but it still is not “perfect”. The Perfect rating for a team would be a random variable with mean/median of KRACH with a PDF that leans more towards the mean for extremely good or bad teams.

jtwcornell91

KenP

It sounds like you are trying to replace a deterministic KRACH rating with a probabilistic distribution. It wouldn't be a normal distribution; instead, its shape would depend on the rating: higher ratings would have a PDF weighted downwards, and vice versa for ratings near zero. From there you could assess the uncertainty in absolute chance-of-winning statements.

Well, it's not a matter of non-deterministic randomness, but of uncertain knowledge. KRACH is the maximum likelihood estimate of each team's strength based on their game results. This estimate has some uncertainty associated with it, although under the frequentist interpretation that underlies maximum likelihood, that uncertainty is not directly translatable into a probability distribution. But in the Bayesian framework, the game results give you a posterior probability distribution for the team strengths, and the width (and shape) of this distribution influences future predictions.

If KRACH is an MLE, wouldn't a resampling estimate of the PDF's characteristics yield something like what KenP is advocating?

The standard error associated with the MLE, whether calculated analytically or estimated via resampling or some Monte Carlo method, is still a measure of the variability of the estimator, not the range of plausible values of the quantity being estimated. I.e., it says: if my true strength is 100, what's the spread of KRACH values I could expect due to the randomness of game outcomes? Or you could use it to generate a confidence interval and ask: what's the range of true strengths for which the spread of probable KRACH values includes 100? But what you really want is: given the game results, what's the range of plausible underlying team strengths? And that's the Bayesian posterior distribution.
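To make the contrast concrete, here is a toy sketch of the Bayesian version for a single head-to-head matchup: a Metropolis sampler over the log-strength difference in a Bradley-Terry model, with a standard-normal prior standing in for the "fictitious games" parity assumption. The 7-3 record, prior width, and sampler settings are all illustrative:

```python
import math
import random

random.seed(1)
wins_a, wins_b = 7, 3                 # hypothetical: A beat B 7 times in 10 games

def log_post(d):
    """Log posterior of d = log(R_A / R_B), with a N(0, 1) prior on d."""
    p = 1 / (1 + math.exp(-d))        # Bradley-Terry: P(A beats B)
    return wins_a * math.log(p) + wins_b * math.log(1 - p) - d * d / 2

# Random-walk Metropolis over d.
d, samples = 0.0, []
for i in range(20_000):
    prop = d + random.gauss(0, 0.5)
    delta = log_post(prop) - log_post(d)
    if delta >= 0 or random.random() < math.exp(delta):
        d = prop
    if i >= 5_000:                    # discard burn-in
        samples.append(d)

mean_d = sum(samples) / len(samples)
# Posterior-averaged win probability, not just the plug-in MLE value 0.7.
p_win = sum(1 / (1 + math.exp(-s)) for s in samples) / len(samples)
```

The prior pulls the posterior mean of d below the MLE value log(7/3) ≈ 0.85, so the predicted win probability lands below 0.7: exactly the shrinkage-toward-parity that the fictitious-games trick produces.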

**Re: Mathematical Models**

**Posted by: adamw**(---.phlapa.fios.verizon.net)

**Date:**March 10, 2018 02:46PM

jtwcornell91

If it's the student I'm thinking of, he ended up looking more into the game theory of how to design schedules for college football assuming a reasonable rating system. I'm planning to try to recruit another student soon, but I think it's likely to end up being fall at the earliest due to the semester schedule.

I'll take any help I can get. Send them over whenever you/they can. Just make sure they already are equipped with your stellar knowledge of the topic at hand.

**Re: Mathematical Models**

**Posted by: billhoward**(---.nwrk.east.verizon.net)

**Date:**March 10, 2018 02:49PM

Creating a better formula to rank teams is God's work.

As is Adam's translating this stuff to English.

