Bracketology for 2020 NCAAs

Started by dbilmes, December 13, 2019, 06:03:04 AM


KenP

Quote from: BearLover
Quote from: adamw
Quote from: Robb
That's why there would be a range of final PWR ratings for each team - in some of the trials, Cornell will be unlucky and lose a bunch of games to worse teams, and in some trials they won't.  That's where the histogram comes in - so you can see the range of possibilities and get an estimate of how likely each of those possibilities appears to be (again, assuming that each team really is exactly as good as its current KRACH rating).

Maybe it really is "most likely" that Cornell winds up 3rd, but if the range of possibilities is that we could end up anywhere from 1st to 20th, then 3rd might only happen in 15% of the trials, so then you'd get a sense that it's far more likely that Cornell finishes worse than 3rd even though 3rd is the most likely individual result.

The Matrix shows a possibility of Cornell finishing anywhere from 1 to 13, currently. I'm really bad with graphical thingies - hence the Matrix, as opposed to a histogram. No difference really.

https://www.collegehockeynews.com/ratings/probabilityMatrix.php

As I said above - the gaps in RPI are significant from 2 to 3 and, more so, 3 to 4. That is one reason Cornell is pretty stable in that spot. Doesn't mean it will definitely happen.

But for the umpteenth straight year, BearLover is focusing on the wrong thing. The simulation is what it is. Obviously it only goes by past results. It can't do anything else. The gripe you have is with how KRACH works, not how the simulation works. Ratings are based on relatively small samples of past results, and can't possibly take into account many things. Therefore, adding an "uncertainty" node into the algorithm will smooth things out a bit -- though it's a somewhat generic addition, and isn't based on any sort of analysis of team strength, injuries, or whatever.

Getting hung up on trashing the simulations really misses the point. Though, as I've said, we're definitely trying to "improve" it.
Sorry, I don't agree. KRACH is not meant to be predictive. Therefore, a simulation based entirely on KRACH to predict future outcomes is flawed. The simulation is called the "Pairwise Probability Matrix" and assigns a percentage to all possible outcomes. To a casual viewer, like me, there is no note or disclaimer anywhere on the page that suggests the model is NOT predictive. In fact, below the chart there is a line that reads: "these numbers accurately reflect each team's possible finish to a high rate of precision."

I don't believe many people reading the matrix understand that it is limited to extrapolating existing KRACH over the rest of the season, or what this reliance on KRACH entails. I have seen dozens of posts on this forum over the past several years that look to the percentages as good predictive data. I don't think the simulation provides good predictive data.

I don't mean to sound overly harsh. CHN is one of my most-visited websites and I'd love to see the simulation improved. But in its current form, the simulation is not particularly helpful and I'd argue it is actively misleading.
KRACH has two main assumptions: (a) all games and results are reflective of overall quality of each team, and (b) that quality is consistent through the entire season including future games. You may wish for a system that incorporates more data to address those assumptions... but given those two statements KRACH absolutely is meant to be predictive.
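KRACH is a Bradley-Terry rating, so the "predictive" reading KenP describes has a simple closed form. A minimal illustration in Python (the ratings are the Cornell/St. Lawrence snapshot quoted later in this thread, not current values):

```python
# Bradley-Terry / KRACH head-to-head expectation:
#   P(A beats B) = K_A / (K_A + K_B)
def krach_win_prob(k_a: float, k_b: float) -> float:
    return k_a / (k_a + k_b)

# Cornell (526) vs. St. Lawrence (11), per KGR11's post below:
print(krach_win_prob(526.0, 11.0))  # ~0.98
```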

BearLover

Quote from: KenP
Quote from: BearLover
Quote from: adamw
Quote from: Robb
That's why there would be a range of final PWR ratings for each team - in some of the trials, Cornell will be unlucky and lose a bunch of games to worse teams, and in some trials they won't.  That's where the histogram comes in - so you can see the range of possibilities and get an estimate of how likely each of those possibilities appears to be (again, assuming that each team really is exactly as good as its current KRACH rating).

Maybe it really is "most likely" that Cornell winds up 3rd, but if the range of possibilities is that we could end up anywhere from 1st to 20th, then 3rd might only happen in 15% of the trials, so then you'd get a sense that it's far more likely that Cornell finishes worse than 3rd even though 3rd is the most likely individual result.

The Matrix shows a possibility of Cornell finishing anywhere from 1 to 13, currently. I'm really bad with graphical thingies - hence the Matrix, as opposed to a histogram. No difference really.

https://www.collegehockeynews.com/ratings/probabilityMatrix.php

As I said above - the gaps in RPI are significant from 2 to 3 and, more so, 3 to 4. That is one reason Cornell is pretty stable in that spot. Doesn't mean it will definitely happen.

But for the umpteenth straight year, BearLover is focusing on the wrong thing. The simulation is what it is. Obviously it only goes by past results. It can't do anything else. The gripe you have is with how KRACH works, not how the simulation works. Ratings are based on relatively small samples of past results, and can't possibly take into account many things. Therefore, adding an "uncertainty" node into the algorithm will smooth things out a bit -- though it's a somewhat generic addition, and isn't based on any sort of analysis of team strength, injuries, or whatever.

Getting hung up on trashing the simulations really misses the point. Though, as I've said, we're definitely trying to "improve" it.
Sorry, I don't agree. KRACH is not meant to be predictive. Therefore, a simulation based entirely on KRACH to predict future outcomes is flawed. The simulation is called the "Pairwise Probability Matrix" and assigns a percentage to all possible outcomes. To a casual viewer, like me, there is no note or disclaimer anywhere on the page that suggests the model is NOT predictive. In fact, below the chart there is a line that reads: "these numbers accurately reflect each team's possible finish to a high rate of precision."

I don't believe many people reading the matrix understand that it is limited to extrapolating existing KRACH over the rest of the season, or what this reliance on KRACH entails. I have seen dozens of posts on this forum over the past several years that look to the percentages as good predictive data. I don't think the simulation provides good predictive data.

I don't mean to sound overly harsh. CHN is one of my most-visited websites and I'd love to see the simulation improved. But in its current form, the simulation is not particularly helpful and I'd argue it is actively misleading.
KRACH has two main assumptions: (a) all games and results are reflective of overall quality of each team, and (b) that quality is consistent through the entire season including future games. You may wish for a system that incorporates more data to address those assumptions... but given those two statements KRACH absolutely is meant to be predictive.
Okay, pardon me for my poor verbiage. Practically speaking though, KRACH is meant to be used as a way of ranking/seeding teams, which depends entirely on past performance. It does a terrible job of predicting future outcomes (at least over a sample as small as 25 or so games).

scoop85

To me Bracketology is just something fun to check out this time of year for amusement, nothing more than that.

Too many people seem to put way too much stock in this stuff (Adam excluded because it's what he does for a living--or at least part of a living).

ugarte

Quote from: scoop85
To me Bracketology is just something fun to check out this time of year for amusement, nothing more than that.

Too many people seem to put way too much stock in this stuff (Adam excluded because it's what he does for a living--or at least part of a living).
yeah i look at bracketology and say "neat. that would be ___ for us."

adamw

Quote from: BearLover
Sorry, I don't agree. KRACH is not meant to be predictive. Therefore, a simulation based entirely on KRACH to predict future outcomes is flawed. The simulation is called the "Pairwise Probability Matrix" and assigns a percentage to all possible outcomes. To a casual viewer, like me, there is no note or disclaimer anywhere on the page that suggests the model is NOT predictive. In fact, below the chart there is a line that reads: "these numbers accurately reflect each team's possible finish to a high rate of precision."

I don't believe many people reading the matrix understand that it is limited to extrapolating existing KRACH over the rest of the season, or what this reliance on KRACH entails. I have seen dozens of posts on this forum over the past several years that look to the percentages as good predictive data. I don't think the simulation provides good predictive data.

I don't mean to sound overly harsh. CHN is one of my most-visited websites and I'd love to see the simulation improved. But in its current form, the simulation is not particularly helpful and I'd argue it is actively misleading.

I think you get way too hung up on this. Most people do, in fact, know this, since most people understand that nothing can entirely accurately predict the future. As someone else pointed out, it IS predictive - just not perfectly so - nothing is. The only thing anyone can do is try to feed better and better data into the machine. There is literally no other way to predict the future any better.

I just don't know what you're looking for, or what you expect to find with any model that predicts future results. My guess is, you probably don't really know. I think you are getting hung up, and lack an understanding of what is being done, or can be done.

If you have data that's better than KRACH to rely upon - let me know.
College Hockey News: http://www.collegehockeynews.com

adamw

Quote from: Dafatone
Could things be more complex? Sure. You could use goal differentials or advanced possession/shot metrics to estimate team quality. You could also weight for recency (Cornell's been weaker over the last month or so than the rest of the season, so maybe we're "worse" than our total record suggests). But that's all more complicated and not necessarily more accurate.

I do actually want to add in a "recency bias" - working on it.  But, like you say, that's a subjective decision being introduced into the model, and there's no way of knowing that it's more or less accurate. Anything that tries to predict the future is going to be incomplete, obviously. Until future species master quantum mechanics.
College Hockey News: http://www.collegehockeynews.com
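One way a "recency bias" could be folded into a KRACH-style rating is to down-weight older games with an exponential decay before running the usual Bradley-Terry fixed point. A minimal sketch of that idea - this is an illustration, not CHN's actual implementation, and the decay constant tau is an arbitrary choice:

```python
import math
from collections import defaultdict

def weighted_krach(games, tau=30.0, iters=500):
    """games: list of (winner, loser, days_ago) tuples.
    Returns team -> rating on an arbitrary scale."""
    w_wins = defaultdict(float)   # recency-weighted win totals
    matchups = defaultdict(list)  # team -> [(opponent, weight), ...]
    for winner, loser, days_ago in games:
        w = math.exp(-days_ago / tau)  # newer games count more
        w_wins[winner] += w
        matchups[winner].append((loser, w))
        matchups[loser].append((winner, w))
    ratings = {t: 1.0 for t in matchups}
    for _ in range(iters):
        for t in ratings:
            # Bradley-Terry fixed point: K_t = W_t / sum_g w_g/(K_t + K_opp)
            denom = sum(w / (ratings[t] + ratings[opp])
                        for opp, w in matchups[t])
            ratings[t] = w_wins[t] / denom
    return ratings

# Toy cycle of results, (winner, loser, days_ago): recency breaks the tie.
print(weighted_krach([("Cornell", "SLU", 3), ("SLU", "Clarkson", 40),
                      ("Clarkson", "Cornell", 60)]))
```

(Like real KRACH, this sketch misbehaves for undefeated or winless teams, whose ratings run off to infinity or zero; production implementations typically regularize with fictitious ties.)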

KenP

Quote from: adamw
Quote from: Dafatone
Could things be more complex? Sure. You could use goal differentials or advanced possession/shot metrics to estimate team quality. You could also weight for recency (Cornell's been weaker over the last month or so than the rest of the season, so maybe we're "worse" than our total record suggests). But that's all more complicated and not necessarily more accurate.

I do actually want to add in a "recency bias" - working on it.  But, like you say, that's a subjective decision being introduced into the model, and there's no way of knowing that it's more or less accurate. Anything that tries to predict the future is going to be incomplete, obviously. Until future species master quantum mechanics.
Two thoughts: (1) add discrete events to split up a season, e.g. "New BU Goalie" or "Player Injured".  Somehow identify the difference in KRACH pre- and post-event?  (2) I think there are statistical tests to look at the data and determine if more than one regression line is appropriate.  Paired t-test?  You'd have to ask an actual statistician.  But potentially you could pinpoint those pivot moments and identify the "recency" bias from the data.
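On KenP's thought (2): the textbook test for "one regression line or two?" is a Chow test rather than a paired t-test, but for a simple pre/post win-loss split, Fisher's exact test is the quick first look. A minimal sketch - the event and the records below are made up for illustration:

```python
from scipy.stats import fisher_exact

# Hypothetical (wins, losses) before and after a discrete event,
# e.g. "New BU Goalie" -- these counts are invented for illustration.
pre_event = [12, 3]
post_event = [2, 6]

_, p_value = fisher_exact([pre_event, post_event])
print(f"p = {p_value:.3f}")  # a small p-value suggests the pre- and
                             # post-event teams really do differ
```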

adamw

Quote from: BearLover
Okay, pardon me for my poor verbiage. Practically speaking though, KRACH is meant to be used as a way of ranking/seeding teams, which depends entirely on past performance. It does a terrible job of predicting future outcomes (at least over a sample as small as 25 or so games).

It's not that KRACH does a poor job... EVERYTHING does a poor job predicting the future. KRACH is the best tool we have. If you had anything better, you'd win a lot of money in Vegas.
College Hockey News: http://www.collegehockeynews.com

adamw

Quote from: KenP
Adam, I have a suggestion.  Cornell has guaranteed itself an ECAC home playoff series, i.e. our worst-case ECAC rank is 8.  Can you do something similar for NCAA participation?  Specifically:

Create a worst-case simulation for each team.  Set their remaining games to losses, rerun the Monte Carlo simulation for the rest of the field, and grab the lowest PWR from the modified simulation.  Add that as a column.

Everyone likes to know when their team officially "clinches" a playoff berth... this approach would give fans that answer.

P.S. other comments / suggestions:
* check your number formatting.  Some values round up to "1.0%" while others round down to "1%"
* repeat table header at the bottom.  
* not sure if you can have a slider bar at both the top and bottom but that would be nice too
* consider similar best-case scenarios.  playoffstatus.com has that, which is why I saw Vermont is officially out of the race.
* for the best-case and worst-case data... either show as additional columns or use that information and enter "x" for impossible year-end rankings.  ("Impossible" may not be the right word given this is only 20,000 simulations... but hopefully you and your readers will get the point.)

All good suggestions. I'm not minimizing them. Just - time is an issue.
College Hockey News: http://www.collegehockeynews.com
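KenP's "clinched a berth" idea above is straightforward to bolt onto any Monte Carlo engine. A rough sketch - `simulate_pwr` and `schedule.remaining_games` here are hypothetical stand-ins for CHN's simulator, not real CHN code:

```python
def worst_case_rank(team, schedule, simulate_pwr, trials=20000):
    """Force all of `team`'s remaining games to losses, simulate the
    rest of the field as usual, and return the worst PWR rank observed."""
    forced = {game: "loss" for game in schedule.remaining_games(team)}
    worst = 0
    for _ in range(trials):
        ranks = simulate_pwr(schedule, forced_results=forced)
        worst = max(worst, ranks[team])
    return worst

# If worst_case_rank("Cornell", ...) still clears the tournament cut
# line, Cornell has clinched a berth no matter what it does on the ice.
```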

marty

Quote from: BearLover
Quote from: adamw
Quote from: Robb
That's why there would be a range of final PWR ratings for each team - in some of the trials, Cornell will be unlucky and lose a bunch of games to worse teams, and in some trials they won't.  That's where the histogram comes in - so you can see the range of possibilities and get an estimate of how likely each of those possibilities appears to be (again, assuming that each team really is exactly as good as its current KRACH rating).

Maybe it really is "most likely" that Cornell winds up 3rd, but if the range of possibilities is that we could end up anywhere from 1st to 20th, then 3rd might only happen in 15% of the trials, so then you'd get a sense that it's far more likely that Cornell finishes worse than 3rd even though 3rd is the most likely individual result.

The Matrix shows a possibility of Cornell finishing anywhere from 1 to 13, currently. I'm really bad with graphical thingies - hence the Matrix, as opposed to a histogram. No difference really.

https://www.collegehockeynews.com/ratings/probabilityMatrix.php

As I said above - the gaps in RPI are significant from 2 to 3 and, more so, 3 to 4. That is one reason Cornell is pretty stable in that spot. Doesn't mean it will definitely happen.

But for the umpteenth straight year, BearLover is focusing on the wrong thing. The simulation is what it is. Obviously it only goes by past results. It can't do anything else. The gripe you have is with how KRACH works, not how the simulation works. Ratings are based on relatively small samples of past results, and can't possibly take into account many things. Therefore, adding an "uncertainty" node into the algorithm will smooth things out a bit -- though it's a somewhat generic addition, and isn't based on any sort of analysis of team strength, injuries, or whatever.

Getting hung up on trashing the simulations really misses the point. Though, as I've said, we're definitely trying to "improve" it.
Sorry, I don't agree....

Just "add" nauseam and repeat. ;-)
"When we came off, [Bitz] said, 'Thank God you scored that goal,'" Moulson said. "He would've killed me if I didn't."

KGR11

Here's an example of the issue with using KRACH to predict final pairwise:

Cornell (KRACH Rating: 526) is playing St. Lawrence (KRACH Rating: 11) in a few weeks. My understanding is that the ratings can be used to come up with a pseudo-record between the two teams (Cornell with 526 wins, St. Lawrence with 11). The Monte Carlo simulation uses this record to determine how often Cornell wins. In this case, the model predicts they win 526/(526+11) ≈ 98% of the time.

jfeath's regression analysis from 2 years ago shows that a team with a KRACH winning percentage of 100% theoretically wins about 83.4% of the time, roughly 14.5 percentage points lower than what KRACH implies for the Cornell-St. Lawrence game.

I think it makes sense for 83.4% to be an upper bound on winning percentage. Any goalie can have an incredible day, or an incredibly bad one. Also, the fact that there are ties in hockey means that the winning percentage should be weighted more toward 50% than in a sport where you can't have ties.

Ideally, jfeath's regression analysis would be an in-between step in the Pairwise Probability Matrix, converting the KRACH winning percentages (which show what happened to date) into predictive winning percentages.
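The "in-between step" KGR11 proposes could be as simple as a monotone map that shrinks the raw KRACH percentage toward 50%. A sketch under two stated assumptions: the 83.4% ceiling quoted from jfeath's regression, and a linear shape for the map (the actual fitted curve isn't reproduced in this thread):

```python
RAW_P = 526 / (526 + 11)  # KRACH expectation for Cornell vs. SLU, ~0.98

def predictive_win_prob(p_raw: float, ceiling: float = 0.834) -> float:
    """Linearly map [0.5, 1.0] onto [0.5, ceiling] (and symmetrically
    below 0.5), so a 100% KRACH favorite only wins ~83.4% of the time."""
    return 0.5 + (p_raw - 0.5) * (ceiling - 0.5) / 0.5

print(round(RAW_P, 3), "->", round(predictive_win_prob(RAW_P), 3))
# 0.98 -> 0.82
```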

Jeff Hopkins '82

Quote from: marty
Quote from: BearLover
Quote from: adamw
Quote from: Robb
That's why there would be a range of final PWR ratings for each team - in some of the trials, Cornell will be unlucky and lose a bunch of games to worse teams, and in some trials they won't.  That's where the histogram comes in - so you can see the range of possibilities and get an estimate of how likely each of those possibilities appears to be (again, assuming that each team really is exactly as good as its current KRACH rating).

Maybe it really is "most likely" that Cornell winds up 3rd, but if the range of possibilities is that we could end up anywhere from 1st to 20th, then 3rd might only happen in 15% of the trials, so then you'd get a sense that it's far more likely that Cornell finishes worse than 3rd even though 3rd is the most likely individual result.

The Matrix shows a possibility of Cornell finishing anywhere from 1 to 13, currently. I'm really bad with graphical thingies - hence the Matrix, as opposed to a histogram. No difference really.

https://www.collegehockeynews.com/ratings/probabilityMatrix.php

As I said above - the gaps in RPI are significant from 2 to 3 and, more so, 3 to 4. That is one reason Cornell is pretty stable in that spot. Doesn't mean it will definitely happen.

But for the umpteenth straight year, BearLover is focusing on the wrong thing. The simulation is what it is. Obviously it only goes by past results. It can't do anything else. The gripe you have is with how KRACH works, not how the simulation works. Ratings are based on relatively small samples of past results, and can't possibly take into account many things. Therefore, adding an "uncertainty" node into the algorithm will smooth things out a bit -- though it's a somewhat generic addition, and isn't based on any sort of analysis of team strength, injuries, or whatever.

Getting hung up on trashing the simulations really misses the point. Though, as I've said, we're definitely trying to "improve" it.
Sorry, I don't agree....

Just "add" nauseam and repeat. ;-)

::deadhorse::

nshapiro

Quote from: marty
Quote from: BearLover
Quote from: adamw
Quote from: Robb
That's why there would be a range of final PWR ratings for each team - in some of the trials, Cornell will be unlucky and lose a bunch of games to worse teams, and in some trials they won't.  That's where the histogram comes in - so you can see the range of possibilities and get an estimate of how likely each of those possibilities appears to be (again, assuming that each team really is exactly as good as its current KRACH rating).

Maybe it really is "most likely" that Cornell winds up 3rd, but if the range of possibilities is that we could end up anywhere from 1st to 20th, then 3rd might only happen in 15% of the trials, so then you'd get a sense that it's far more likely that Cornell finishes worse than 3rd even though 3rd is the most likely individual result.

The Matrix shows a possibility of Cornell finishing anywhere from 1 to 13, currently. I'm really bad with graphical thingies - hence the Matrix, as opposed to a histogram. No difference really.

https://www.collegehockeynews.com/ratings/probabilityMatrix.php

As I said above - the gaps in RPI are significant from 2 to 3 and, more so, 3 to 4. That is one reason Cornell is pretty stable in that spot. Doesn't mean it will definitely happen.

But for the umpteenth straight year, BearLover is focusing on the wrong thing. The simulation is what it is. Obviously it only goes by past results. It can't do anything else. The gripe you have is with how KRACH works, not how the simulation works. Ratings are based on relatively small samples of past results, and can't possibly take into account many things. Therefore, adding an "uncertainty" node into the algorithm will smooth things out a bit -- though it's a somewhat generic addition, and isn't based on any sort of analysis of team strength, injuries, or whatever.

Getting hung up on trashing the simulations really misses the point. Though, as I've said, we're definitely trying to "improve" it.
Sorry, I don't agree....

Just "add" nauseam and repeat. ;-)

Most people enjoy a good magic show.  Some people look at it, think "it's all just fake," and dismiss it.  They will never be satisfied until the magic is real, and critics of predictive tools will never be satisfied unless they are perfect.  Magic will never be real, and predictive tools will never be perfect, and those types will live their lives unsatisfied.
When Section D was the place to be

adamw

Quote from: KGR11
Here's an example of the issue with using KRACH to predict final pairwise:

Cornell (KRACH Rating: 526) is playing St. Lawrence (KRACH Rating: 11) in a few weeks. My understanding is that the ratings can be used to come up with a pseudo-record between the two teams (Cornell with 526 wins, St. Lawrence with 11). The Monte Carlo simulation uses this record to determine how often Cornell wins. In this case, the model predicts they win 526/(526+11) ≈ 98% of the time.

jfeath's regression analysis from 2 years ago shows that a team with a KRACH winning percentage of 100% theoretically wins about 83.4% of the time, roughly 14.5 percentage points lower than what KRACH implies for the Cornell-St. Lawrence game.

I think it makes sense for 83.4% to be an upper bound on winning percentage. Any goalie can have an incredible day, or an incredibly bad one. Also, the fact that there are ties in hockey means that the winning percentage should be weighted more toward 50% than in a sport where you can't have ties.

Ideally, jfeath's regression analysis would be an in-between step in the Pairwise Probability Matrix, converting the KRACH winning percentages (which show what happened to date) into predictive winning percentages.

1st - this kind of disparity is extreme. St. Lawrence is historically lousy right now. I'd venture to say Cornell probably would win 98% of the time.

2nd - ties are taken into consideration with a hard-coded 19% chance - which is already acting as a smoothing mechanism. It's probably not accurate to hard-code that value for every matchup - but it does act as a "smoother," so to speak. This lowers Cornell's chance of a win closer to the 83% you're referring to. Cornell is winning 98% of the 81% of non-ties. So, roughly 79% - plus 9.5%'s worth of win value (i.e. points) via the tie.
College Hockey News: http://www.collegehockeynews.com
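adamw's tie arithmetic, written out (all numbers come from his post: the hard-coded 19% tie chance and the ~98% KRACH expectation for this matchup):

```python
p_tie = 0.19                 # hard-coded tie probability in the simulation
p_win_given_decided = 0.98   # KRACH expectation, applied to non-ties only

p_win = p_win_given_decided * (1 - p_tie)  # outright wins: ~0.794
points = p_win + 0.5 * p_tie               # ties count half: ~0.889
print(round(p_win, 3), round(points, 3))

# Ceiling: even a 100% KRACH favorite gets at most
# 0.81 + 0.095 = 0.905 -- KGR11's "maximum winning percentage of 90%" below.
```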

KGR11

Quote from: adamw
Quote from: KGR11
Here's an example of the issue with using KRACH to predict final pairwise:

Cornell (KRACH Rating: 526) is playing St. Lawrence (KRACH Rating: 11) in a few weeks. My understanding is that the ratings can be used to come up with a pseudo-record between the two teams (Cornell with 526 wins, St. Lawrence with 11). The Monte Carlo simulation uses this record to determine how often Cornell wins. In this case, the model predicts they win 526/(526+11) ≈ 98% of the time.

jfeath's regression analysis from 2 years ago shows that a team with a KRACH winning percentage of 100% theoretically wins about 83.4% of the time, roughly 14.5 percentage points lower than what KRACH implies for the Cornell-St. Lawrence game.

I think it makes sense for 83.4% to be an upper bound on winning percentage. Any goalie can have an incredible day, or an incredibly bad one. Also, the fact that there are ties in hockey means that the winning percentage should be weighted more toward 50% than in a sport where you can't have ties.

Ideally, jfeath's regression analysis would be an in-between step in the Pairwise Probability Matrix, converting the KRACH winning percentages (which show what happened to date) into predictive winning percentages.

1st - this kind of disparity is extreme. St. Lawrence is historically lousy right now. I'd venture to say Cornell probably would win 98% of the time.

2nd - ties are taken into consideration with a hard-coded 19% chance - which is already acting as a smoothing mechanism. It's probably not accurate to hard-code that value for every matchup - but it does act as a "smoother," so to speak. This lowers Cornell's chance of a win closer to the 83% you're referring to. Cornell is winning 98% of the 81% of non-ties. So, roughly 79% - plus 9.5%'s worth of win value (i.e. points) via the tie.

Hard-coding ties like that is a huge deal. You effectively changed the maximum winning percentage from 100% to 90%. That makes the simulation framework way better than I thought, since your max winning percentage is closer to jfeath's maximum winning percentage of 83%. The next time you publish a primer for the probability matrix, it might be worth including this.

I pointed to an extreme-disparity game because those are the games where KRACH overestimates the favorite team's winning percentage (per jfeath's analysis). That analysis shows that KRACH lines up pretty well when the favorite has a projected winning percentage of 70% or less. Somewhat of a moot point given the 19% tie input.

I'm tempted to propose a 50-1 bet for the Cornell-SLU game. Best case scenario, Cornell wins and I lose $2. Worst case scenario, Cornell loses and I win $100. Just seems a bit sacrilegious.