Or the pep band and marching band.
Quote from: Trotsky
The statement that the probabilities are too high has not been demonstrated empirically. They may be, they may not be, but "feeling" they are too high is not a mathematical argument.
Easy enough to run the numbers. If people want to put some reality behind their gut feelings, they should.
Quote from: Trotsky
A 2-loss season and finishing with a 9-game winning streak wouldn't be bad.
+1
But man this would be disappointing.
Yes, it's the prudent thing to do.
Stupid prudence.
Quote from: adamw
I'd put the odds of having fans at Lake Placid or any NCAA games at 0.47%
math
Quote from: adamw
Quote from: KGR11
Here's an example of the issue with using KRACH to predict final pairwise:
Cornell (KRACH Rating: 526) is playing St Lawrence (KRACH Rating: 11) in a few weeks. My understanding is that the ratings can be used to come up with a pseudo-record between the two teams (Cornell with 526 wins, St Lawrence with 11). The Monte Carlo simulation uses this record to determine how often Cornell wins. In this case, the model predicts they win 98% of the time.
jfeath's regression analysis from two years ago shows that a team with a KRACH winning percentage of 100% theoretically wins about 83.4% of the time, roughly 14.5 percentage points lower than what KRACH implies for the Cornell-St. Lawrence game.
I think it makes sense for 83.4% to be an upper bound on winning percentage. Any goalie can have an incredible (or incredibly bad) day. Also, the existence of ties in hockey means that winning percentages should be weighted more toward 50% than in a sport where ties aren't possible.
Ideally, jfeath's regression analysis would be an in-between step in the Pairwise probability matrix, converting the KRACH winning percentages (which describe what has happened to date) into predictive winning percentages.
1st - this kind of disparity is extreme. St. Lawrence is historically lousy right now. I'd venture to say Cornell probably would win 98% of the time.
2nd - ties are taken into consideration with a hard-coded 19% chance, which already acts as a smoothing mechanism. It's probably not accurate to hard-code that value for every matchup, but it does act as a "smoother," so to speak. This lowers Cornell's chance of a win to closer to the 83% you're referring to: Cornell wins 98% of the 81% of games that aren't ties, so roughly 79%, plus 9.5 percentage points' worth of win value (i.e., points) via the tie. The arithmetic is spelled out in the sketch below.
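A minimal sketch of the mechanics in this exchange: the KRACH pseudo-record turned into a head-to-head probability, the regression shrink KGR11 proposes (borrowing the y = 0.749x + 0.126 fit jfeath17 posts later in the thread as a stand-in for the older analysis), and the hard-coded tie split. All names and coefficients are illustrative, not anyone's actual implementation:

```python
# Sketch of the conversions discussed above. The 526/11 ratings and the
# 19% tie rate come from the posts; the linear shrink borrows the
# y = 0.749x + 0.126 fit posted later in this thread as a stand-in for
# the older regression. Treat everything as illustrative.

P_TIE = 0.19  # hard-coded tie chance applied to every matchup

def krach_win_prob(rating_a, rating_b):
    """Head-to-head probability implied by the KRACH pseudo-record."""
    return rating_a / (rating_a + rating_b)

def shrunk_win_prob(p, slope=0.749, intercept=0.126):
    """Pull the raw KRACH probability back toward observed win rates."""
    return slope * p + intercept

p_raw = krach_win_prob(526, 11)   # ~0.98 for Cornell-St. Lawrence
p_adj = shrunk_win_prob(p_raw)    # ~0.86 after the regression shrink

# The tie-smoothing arithmetic on the raw number:
p_win = p_raw * (1 - P_TIE)       # ~0.79: win 98% of the 81% non-ties
p_points = p_win + 0.5 * P_TIE    # ~0.89 expected share of points
print(p_raw, p_adj, p_win, p_points)
```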
Quote from: BearLover
Casually getting screwed every single year.
Yeah, that decision is screwed up.
Quote from: adamw
Quote from: Trotsky
Seems to me that HC winning would be a godsend for the committee, since it locks down a presumptive patsy and a site. St. Cloud gets the Bemidji 2009 Memorial Reward and goes to Worcester, and the committee only has to worry about the 1/4 permutations of the other 3 sites.
It's ironclad that the committee will never cross over a tier line, right? (E.g., they would never move the #4 overall to a 2-seed to duck a North Dakota "4-seed home game.") If that's set in stone, I don't see a lot of chaos -- just somebody getting inevitably and deterministically well and truly fucked. I would expect that to be the 4. If Mankato sneaks into the 4, I'd practically guarantee it -- Mankato is closer to Sioux Falls than NoDak is (as is, for that matter, St. Cloud; the West is big).
That is correct - not crossing over is sacrosanct - though I've argued many times it shouldn't be, in extreme cases.
You're right that it's not chaos, in the sense that it makes it obvious who goes where, but it is messier, for sure. Even if North Dakota doesn't make it but Holy Cross does, then instead of going to (relatively) nearby Sioux Falls, St. Cloud will be forced to go to Worcester. We already know St. Cloud can't go to Sioux Falls if North Dakota is there - Holy Cross' involvement would be one more scenario where St. Cloud can't go to Sioux Falls.
Quote from: jfeath17
While you can't perfectly verify a prediction model, you can get an idea of its performance by separating the past data into training and testing sets. The fact that we are trying to predict probabilities and not a simple classification makes it much more difficult to evaluate performance. For classification problems the predictor is either right or wrong, so it is easy to state an accuracy percentage. We, however, cannot directly observe the outcome probability of some matchup, only the outcome of one trial of that matchup. This brings us to what I am attempting to do: by looking at the outcomes of many games with a similar KRACH-predicted winning percentage, we can come up with an estimate of the actual winning percentage of a team in such a matchup.
My methodology was to take a Gaussian-weighted average of the games, centered at varying KRACH probabilities. I calculated the winning percentage using these weights. I also used the weights to compute the average KRACH probability (this doesn't necessarily line up with the center of the Gaussian, particularly at the endpoints, where all the games are to one side or the other). This is basically the logical extrapolation of the binning suggested earlier in the thread. Binning was actually the first analysis I did, but the data didn't look good. I think the major improvement here is not the Gaussian weighting itself (that is probably overkill) but using the weighted average of the KRACH probabilities rather than the center of the bin. What the trend line looks like can be changed significantly by changing the standard deviation of the Gaussian (effectively changing the bin size): the larger the bin, the more underfit; the smaller, the more overfit.
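A minimal sketch of that smoothing step, under the assumption that p_krach holds the KRACH-implied probability for each past game and won holds the 1/0 outcome for the favored side; the names and the 0.1 bandwidth are illustrative:

```python
import numpy as np

def gaussian_local_average(p_krach, won, centers, sigma=0.1):
    """Gaussian-weighted local win rates, as described above."""
    p_krach = np.asarray(p_krach, dtype=float)
    won = np.asarray(won, dtype=float)
    xs, ys = [], []
    for c in centers:
        w = np.exp(-0.5 * ((p_krach - c) / sigma) ** 2)  # Gaussian weights
        w = w / w.sum()
        # Weighted mean of the KRACH probabilities, *not* the bin center --
        # the improvement over plain binning noted in the post.
        xs.append(float(np.sum(w * p_krach)))
        ys.append(float(np.sum(w * won)))                # weighted win rate
    return np.array(xs), np.array(ys)

# e.g. centers = np.linspace(0.5, 1.0, 26)
```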
I also sought to measure the performance of KRACH another way, by looking at R^2 (the coefficient of determination); the Wikipedia page is a pretty good explanation of it. The R^2 value can be read as the percentage of variance in the dependent variable (game outcomes) that can be explained by the independent variable (KRACH probability, etc.). These values are all very low, which makes sense, since there is a lot of variability in the outcome of hockey games. (A sketch of the computation follows the table.)
Independent Variable                           | R^2
-----------------------------------------------|------
KRACH Probability                              | 0.023
Logistic Regression                            | 0.100
Linear Fit on Gaussian Average (std. dev. 0.1) | 0.112  (y = 0.749x + 0.126)
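For reference, a short sketch of the R^2 computation described above, with the same assumed won array of 0/1 outcomes; any of the three predictors' outputs can stand in for p_hat:

```python
import numpy as np

def r_squared(won, p_hat):
    """Coefficient of determination of predicted probabilities
    against observed 0/1 game outcomes."""
    won = np.asarray(won, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    ss_res = np.sum((won - p_hat) ** 2)       # residual sum of squares
    ss_tot = np.sum((won - won.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot
```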
Another improvement I made was to include the inverse of each game (prob = 1 - prob, with wins and losses swapped). This improves the fit around 0.5, since it would be a little nonsensical if the matchup of two equal-KRACH teams didn't come out at 0.5 in a model dependent only on KRACH.
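The mirroring step as a sketch, same assumed arrays as above:

```python
import numpy as np

def symmetrize(p_krach, won):
    """Append the mirror image of every game: the complement probability
    with wins and losses swapped. A model fit on the augmented data is
    forced through (0.5, 0.5) for evenly matched teams."""
    p_krach = np.asarray(p_krach, dtype=float)
    won = np.asarray(won, dtype=float)
    return (np.concatenate([p_krach, 1.0 - p_krach]),
            np.concatenate([won, 1.0 - won]))
```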
One final point, which I think has been established by now, but I want to make sure we are all on the same page: predictive models are going to have some subjectivity built into them. It is great that KRACH has no subjectivity and is a mathematically pure ranking for its goal of NCAA seeding, since seeding needs to be "fair" and should be based on the actual outcomes of the games. When creating a predictive model, however, we unfortunately do not have the luxury of a mathematically pure system. There are parameters and methods that must be chosen, both when designing a predictor and when measuring its performance. It is the designer's goal to choose these such that the predictor is neither overfit nor underfit and has no biases built in.
Quote from: Trotsky
Quote from: adamw
Quote from: abmarks
It would be interesting to run a KRACH computation for last year's full NHL regular season, for example, and then see if the numbers pass people's gut checks or not.
This is honestly an unnecessary exercise. For past results, it's hard to improve on KRACH: the ratings are defined so that, if you replayed the schedule that already happened, each team's expected wins would match its actual wins. That's the whole point of KRACH's existence.
How about this exercise: number all the NC$$ games in chronological order. Calculate KRACH from the odd-numbered games. Now compare how the even-numbered games turned out against KRACH's "predictions."
I know, I know, I know that KRACH reviews a data set and is not designed to be predictive. But... can you do that for shits and giggles? I'm not even sure how we'd interpret the results. What constitutes a reliable or unreliable percentage of accuracy? I mean, hopefully it's over 50%. Hopefully it's better than just taking whoever has the better winning percentage excluding games against each other.
Another method: start at game 1 and march through the list, constantly recalculating KRACH and using it as the prediction for the next game. Or, since KRACH obviously gets better as the season goes on, iterate through, say, the first 10% of games and only start predicting after that. Now get your accuracy score. That's truly abusing KRACH as predictive. (The odd/even version is sketched below.)
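A toy version of both halves of this exchange: the fixed-point property adamw describes and Trotsky's odd/even split. It ignores ties and the fictitious-team adjustment real KRACH uses to keep winless teams finite; season and the function names are placeholders:

```python
def krach(games, iters=500):
    """Toy KRACH: `games` is a list of (winner, loser) pairs. Iterates the
    Bradley-Terry fixed point where each team's expected wins against its
    actual schedule equal its actual wins."""
    teams = {t for g in games for t in g}
    wins = {t: 0.0 for t in teams}
    opps = {t: [] for t in teams}
    for w, l in games:
        wins[w] += 1
        opps[w].append(l)
        opps[l].append(w)
    rating = {t: 1.0 for t in teams}
    for _ in range(iters):
        # Rating = actual wins / expected wins per unit of rating.
        rating = {t: wins[t] / sum(1.0 / (rating[t] + rating[o]) for o in opps[t])
                  for t in teams}
    return rating

def odd_even_backtest(season):
    """Trotsky's experiment: fit on the odd-numbered games, judge on the
    even-numbered ones. Assumes every team appears in the training half."""
    train, test = season[0::2], season[1::2]
    rating = krach(train)
    hits = sum(rating[w] > rating[l] for w, l in test)  # KRACH favorite won
    return hits / len(test)  # accuracy; hopefully above 0.5
```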
Quote from: BearLover
I think this discussion is getting old too, but since some people keep saying those criticizing the model are doing so based on "feel," I just want to say that we really aren't. (a) jfeath17 already showed KRACH overstates the chances of higher-ranked teams winning an individual game. (b) When you combine several artificially inflated individual probabilities (Cornell's chances of winning the quarters, semis, and finals) into one joint probability (Cornell winning the ECAC tournament), you end up with a very, very inflated likelihood (the 55% chance of Cornell winning the ECAC). (c) There are no betting odds for any NHL game that come close to the odds this model is assigning many games every weekend.
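Point (b) is just compounding. Purely for illustration, reusing the 98%-vs-83.4% gap from the Cornell-St. Lawrence discussion above and applying it to three straight rounds:

```python
# How a per-game overstatement compounds across a three-round tournament.
# The 0.98 vs 0.834 figures reuse the earlier Cornell-St. Lawrence gap,
# applied to every round purely for illustration.
p_stated, p_shrunk = 0.98, 0.834
print(p_stated ** 3)  # ~0.94: joint odds the raw model would report
print(p_shrunk ** 3)  # ~0.58: joint odds under the adjusted per-game number
```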
Quote from: adamw
The beauty of it is, what anyone thinks of Cornell's chances vs. one team or another doesn't matter. KRACH is what it is, and there is nothing better at perfectly capturing PAST results. There are many "flaws," if you will, in the model when it comes to projecting odds of winning future games - but I don't think they're flaws; they're just incomplete. All of the reasons stated are valid. But, as someone said, you'd need to come at the issue with actual valid math, and a better algorithm, before getting all hot and bothered about it. Until then, saying you "feel" that 52% chance is "flawed" is just as "flawed" an argument as anything else.
Polls are a shit-ton more flawed than this is - doesn't stop Jim from posting them... I've given him grief about it in the past, but all in fun. Would never tell him to stop.
Feel free to point out problems all you want. But until you have a better model and are willing to program it, jeebus h. criminy, let people discuss it. It does a pretty fair job of giving you a portrait of what could happen. I think everyone here (unlike many other places) is smart enough to take it with a grain of salt. But it's as good a guideline as you've got.