Messages - jfeath17

#1
Hockey / Re: 2023 ECAC Post Season
March 17, 2023, 07:51:29 PM
Quote from: dag14
If anyone is posting from LP, can you please let the rest of us know when our game will start?  Thanks

~8:37
#2
Hockey / Re: 2023 ECAC Post Season
March 15, 2023, 02:21:01 PM
FYI from Cornell ticket site:

"Cornell fans are encouraged to purchase tickets in Section 3 - which can be accessed by visiting the Lake Placid Olympic Center ticketing site, clicking the lock icon on the top of the screen and entering the access code 'CORNELL23'"

You also save $10 using the code.
#3
Hockey / Re: Cornell - Northeastern
March 30, 2019, 02:51:14 PM
Tickets from Cornell are in the 105-106 corner
#4
Hockey / Re: Lake Placid 2019 Roll Call
March 22, 2019, 08:27:53 PM
I forget exactly what I needed to do and I don't have a computer around to try again.

I know that you do need to enable flash for the site and then I think there should be a button that appears to select seats. I think it's above the select price button. You may have to click the seat map.
#5
Hockey / Re: Ticket info & Who's going to Worcester?
March 18, 2018, 08:22:27 PM
Section 125
#6
Hockey / Re: Bracketology Starts
March 17, 2018, 10:20:07 PM
#7
Hockey / Re: Mathematical Models
March 03, 2018, 11:14:38 PM
The "Dealing With Perfection" section on this page discusses this. It used to be used in KRACH, but better ways to handle it were found. http://elynah.com/tbrw/tbrw.cgi?krach
#8
Quote from: jkahn
Quote from: adamw
Quote from: jfeath17
Quote from: adamw
BTW - I can't read your charts, so I have no idea what they're telling me. If there's an English translation, feel free.

My one sentence summary is that the current KRACH probabilities are biased towards the higher ranked team, a very simple modification which would greatly increase the accuracy is to change the formula to P(A Winning) = .749*(KRACH_A/(KRACH_A+KRACH_B)) + .126

Wait, that's English? :)  ... If you want to work with me on something going forward, feel free to drop me a line. adamw@collegehockeynews.com (same for anyone else who has chimed in here with something concrete to offer)
Basically, what's being suggested here is to use KRACH for 75% of the probability and split the other 25% equally.  I was actually thinking of suggesting a lesser dampening effect, such as using 90% KRACH and then adding that to 5% for each team, just based upon the feeling that even the weakest team should have at least a 5% chance - but that's just based upon gut feel, no data analysis.
Nevertheless, I do appreciate and enjoy the KRACH model, and do believe it's the fairest way of ranking teams.

Yes, exactly this. A much better simplification than mine. :)
#9
Quote from: KGR11
Many thanks to jfeath for this awesome work. One thing I'd be curious to know is how the R-squared for the KRACH, Logistic Regression, and Linear Fit on Gaussian Average varies from year to year. This could give an indication of the extent (if any) that KRACH is a lesser predictor than your outputs.

I can try this out as a sanity check to make sure the models aren't overfitting. I may even do it on a completely new season to be sure.
#10
Quote from: adamw
BTW - I can't read your charts, so I have no idea what they're telling me. If there's an English translation, feel free.

My one-sentence summary is that the current KRACH probabilities are biased towards the higher-ranked team. A very simple modification that would greatly increase the accuracy is to change the formula to P(A Winning) = .749*(KRACH_A/(KRACH_A+KRACH_B)) + .126
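
For anyone who wants to try the adjusted formula, here is a minimal sketch. The function names are hypothetical (they do not come from the original analysis), and the ratings in the example are made up.

```python
def krach_win_prob(krach_a: float, krach_b: float) -> float:
    """Standard KRACH probability that team A beats team B."""
    return krach_a / (krach_a + krach_b)


def adjusted_win_prob(krach_a: float, krach_b: float,
                      slope: float = 0.749, intercept: float = 0.126) -> float:
    """Linear adjustment from the post: slope * KRACH probability + intercept."""
    return slope * krach_win_prob(krach_a, krach_b) + intercept


# Made-up ratings with a 4-to-1 KRACH ratio.
raw = krach_win_prob(400.0, 100.0)          # 400 / 500
adjusted = adjusted_win_prob(400.0, 100.0)  # 0.749 * raw + 0.126
print(f"raw={raw:.4f} adjusted={adjusted:.4f}")
```

Note that with two equally rated teams this gives 0.749 * 0.5 + 0.126 = 0.5005 rather than exactly 0.5, so the quoted coefficients are presumably rounded.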
#11
Quote from: KGR11
Quote from: Trotsky
Another method: start on game 1 and just march through the list constantly recalculating KRACH and using that as the prediction against the next game.  Or since obviously KRACH gets better as the season goes on, iterate through say the first 10% of games and only start predicting after that.  Now get your accuracy score.  That's truly abusing KRACH as predictive.  :)

I think your second method is essentially what jfeath did, right?

Yes, that is basically what I did except I stepped through week by week and started in January. I now have updated data which includes 4 seasons and steps through on a daily basis starting in January.

Quote from: BearLover
Quote from: adamw
Is KRACH not empirically based?
The future win probabilities inferred from KRACH aren't, because they're not verifiable by observation/experience.

What future probabilities of any kind are verifiable?

While you can't perfectly verify a prediction model, you can get an idea of its performance by separating the past data into training and testing sets. The fact that we are trying to predict probabilities rather than a simple classification does make it much more difficult to evaluate the performance. For classification problems the predictor is either right or wrong, so it is easy to state an accuracy percentage. We, however, cannot directly observe the outcome probability of a matchup, only the outcome of one trial of that matchup. This brings us to what I am attempting to do. By looking at the outcomes of many games with a similar KRACH-predicted winning percentage, we can come up with an estimate for the actual winning percentage of a team in this matchup.

My methodology for this was to use a Gaussian-weighted average of the games centered at varying KRACH probabilities. I calculated the winning percentage using these weights. I also used the weights to come up with an average KRACH probability (this doesn't necessarily line up with the center of the Gaussian, particularly at the endpoints where all the games are to one side or the other). This is basically the logical extension of the binning that was suggested earlier in the thread. The binning was actually the first analysis I did, but the data didn't look good. I think the major improvement here is not that I am using the Gaussian to come up with weights (that is probably overkill), but that I am using the weighted average of the KRACH probabilities rather than the center of the bin. What this trend line looks like can be changed significantly by changing the std dev of the Gaussian (effectively changing the bin size). Basically, the larger the bin, the more underfit the curve, and the smaller the bin, the more overfit.
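
The Gaussian-weighted averaging described above can be sketched roughly as follows. This is an illustration under assumptions, not the original code: the function name is hypothetical, ties are coded as 0.5, and the default std dev of 0.1 is taken from the "(0.1)" label in the results table, which presumably refers to the same parameter.

```python
import numpy as np


def gaussian_smoothed_curve(krach_prob, outcome, centers, sigma=0.1):
    """For each center, weight every game by a Gaussian in KRACH probability.

    Returns two arrays with one entry per center: the weighted mean KRACH
    probability (the x value) and the weighted winning percentage (the y value).
    outcome is 1 for a win, 0 for a loss, 0.5 for a tie (assumed coding).
    """
    krach_prob = np.asarray(krach_prob, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    xs, ys = [], []
    for center in centers:
        weights = np.exp(-0.5 * ((krach_prob - center) / sigma) ** 2)
        total = weights.sum()
        # Use the weighted mean of the probabilities, not the center itself.
        xs.append((weights * krach_prob).sum() / total)
        ys.append((weights * outcome).sum() / total)
    return np.array(xs), np.array(ys)
```

A straight line can then be fit through the resulting points, for example with `np.polyfit(xs, ys, 1)`. A larger `sigma` acts like a wider bin and a smaller one like a narrower bin.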


I also sought to measure the performance of KRACH in another way by looking at the R^2 (coefficient of determination); the Wikipedia page is a pretty good explanation of this. The R^2 value can be looked at as the percentage of variance in the dependent variable (game outcomes) that can be explained by the independent variable (KRACH probability, etc.). These values are all very low, which makes sense since there is a lot of variability in the outcome of hockey games.

Independent Variable                 | R^2
--------------------------------------------------
Krach Probability                    | 0.023
Logistic Regression                  | 0.100
Linear Fit on Gaussian Average (0.1) | 0.112
        (y=.749x+.126)
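
For reference, a minimal sketch of an R^2 calculation for predicted probabilities against game outcomes. The post does not say which exact definition was used, so the standard 1 - SS_res/SS_tot form is an assumption here, as is the function name.

```python
import numpy as np


def r_squared(outcome, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    outcome = np.asarray(outcome, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((outcome - predicted) ** 2)       # unexplained variation
    ss_tot = np.sum((outcome - outcome.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot
```

A model that always predicts the overall average outcome scores 0, and a model that predicts every game perfectly scores 1.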


Another improvement I made was to include the inverse of each game (prob = 1-prob and swap wins and losses). This improves the fit around 0.5 since it is a little nonsensical if the matchup of two equal KRACH teams is not 0.5 in a model only dependent on KRACH.
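
The mirroring step can be sketched like this (the function name is hypothetical, and ties are assumed to be coded as 0.5 so that they stay 0.5 when mirrored):

```python
import numpy as np


def add_mirrored_games(krach_prob, outcome):
    """Append each game again from the other team's point of view."""
    krach_prob = np.asarray(krach_prob, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    mirrored_prob = 1.0 - krach_prob    # prob = 1 - prob
    mirrored_outcome = 1.0 - outcome    # wins become losses and vice versa
    return (np.concatenate([krach_prob, mirrored_prob]),
            np.concatenate([outcome, mirrored_outcome]))
```

After mirroring, both the average probability and the average outcome are 0.5 by construction, which is what pulls the fit toward 0.5 for evenly matched teams.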


One final point, which I think has been established by now, but I want to make sure we are all on the same page. Predictive models are going to have some subjectivity built into them. It is great that KRACH has no subjectivity and is a mathematically pure ranking for its goal of NCAA seeding, since that needs to be "fair" and should be based on the actual outcomes of the games. However, when creating a predictive model, we unfortunately do not have the luxury of a mathematically pure system. There are parameters and methods that must be chosen both when designing a predictor and when measuring its performance. It is the designer's goal to choose these such that the predictor is not over/underfit and does not have any biases built in.
#12
Hockey / Re: USCHO PWR discussion
February 20, 2018, 10:22:58 AM
I think they were talking more about recent results. Cornell hasn't won by more than 1 goal since Dartmouth, and their average goal differential over the past 6 games is only 0.5.
#13
Hockey / NCAA D1 Mens Hockey Revenues
February 19, 2018, 10:11:14 PM
This was posted on the r/collegehockey subreddit and I thought it may be of interest here.
https://i.imgur.com/12AdhE3.png
#14
Hockey / Re: 2018 ECAC Permutations
February 19, 2018, 10:04:14 PM
Some more details:

I used data from the past two complete seasons. I hope to add more seasons, but the dates for games on USCHO from 3 years ago seem to be in a mix of d/m/y and m/d/y, which kinda breaks things. (On that note, if anyone knows of an easily parsable database of game results, that would be great, since I'm currently copying the table from USCHO into Excel and exporting it as a CSV.)

I step through the schedule week by week and update the KRACH rating for every team. Then I calculate the KRACH-based projection of winning percentage for the upcoming week's games. I always use the higher-ranked team's winning likelihood, so they are all within the 0.5-1 range. I then save the result of the game (tie, higher-ranked team won, higher-ranked team lost) along with the KRACH-based likelihood. I start this process at the beginning of January to ignore the early-season variability of KRACH (and to avoid the complexities of calculating KRACH for undefeated teams). Now that I am typing this up, I realize that stepping through on a weekly basis isn't really necessary, and I am thinking about changing it to a day-by-day step.
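
As a rough illustration of the rating step described above, here is a sketch of a basic KRACH calculation using the usual fixed-point iteration, with ties counted as half a win. It is not the code used for this analysis: the function names, the game tuple layout, and the rescaling to an average rating of 100 are all assumptions. It also does nothing special for winless or undefeated teams, which is the complication avoided above by starting in January.

```python
from collections import defaultdict


def krach_ratings(games, iterations=200):
    """games: list of (team_a, team_b, result_a), result_a = 1, 0.5 or 0."""
    wins = defaultdict(float)
    played = defaultdict(lambda: defaultdict(int))
    for team_a, team_b, result_a in games:
        wins[team_a] += result_a
        wins[team_b] += 1.0 - result_a
        played[team_a][team_b] += 1
        played[team_b][team_a] += 1

    ratings = {team: 100.0 for team in played}
    for _ in range(iterations):
        updated = {}
        for team in ratings:
            # K_i = W_i / sum_j( N_ij / (K_i + K_j) )
            denom = sum(n / (ratings[team] + ratings[opp])
                        for opp, n in played[team].items())
            updated[team] = wins[team] / denom
        # Rescale so the average rating stays at 100 (arbitrary choice).
        scale = 100.0 * len(updated) / sum(updated.values())
        ratings = {team: r * scale for team, r in updated.items()}
    return ratings


def favorite_prob(krach_a, krach_b):
    """KRACH win probability of the higher-rated team (always >= 0.5)."""
    p = krach_a / (krach_a + krach_b)
    return max(p, 1.0 - p)
```

To mimic the stepping, one would call `krach_ratings` on only the games played before a given date and then record `favorite_prob` together with the actual result for each game on that date.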

Now I am left with two variables: the KRACH-predicted win likelihood and the actual result of the game. I tried a couple of different things here to find a good correlation between the two. Based on my research, I think the best way is to use a logistic regression, and those are the results shown in the above post. I don't consider myself an expert in this stuff at all, so I very well could be making some bad assumptions here. If anyone has a better method to compare them, I'm interested to hear.
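
For anyone who wants to reproduce this kind of fit, here is a minimal sketch of a one-variable logistic regression using Newton's method in NumPy. It is an illustration only: the function names are hypothetical, and how ties were handled in the original fit is not stated (coding them as 0.5 or dropping them would both work with this code).

```python
import numpy as np


def fit_logistic(krach_prob, outcome, iterations=25):
    """Fit P(win) = sigmoid(b0 + b1 * krach_prob) by Newton's method."""
    x = np.asarray(krach_prob, dtype=float)
    y = np.asarray(outcome, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # intercept column + predictor
    beta = np.zeros(2)
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


def predict(beta, krach_prob):
    """Fitted win probability for a given KRACH probability."""
    z = beta[0] + beta[1] * np.asarray(krach_prob, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))
```

Evaluating `predict` on a grid such as 0.50, 0.55, ..., 1.00 gives a KRACH-versus-result table of fitted values.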

If anyone has any questions or suggestions for further things to try out, I'd love to hear them.
#15
Hockey / Re: 2018 ECAC Permutations
February 19, 2018, 09:32:07 PM
I started doing some analysis on how accurate KRACH is at predicting results. I will follow up in the next post with some more details on what I did, but the basic gist is that I collected the KRACH-based projected winning percentage for the better-ranked team and the result of the game (win/tie/loss). I then fit a logistic regression to this data using the KRACH prediction as the independent variable. I used the results of 1129 games over the two previous seasons.

KRACH | Result
------+-------
 0.50 | 0.5390
 0.55 | 0.5750
 0.60 | 0.6102
 0.65 | 0.6444
 0.70 | 0.6771
 0.75 | 0.7081
 0.80 | 0.7374
 0.85 | 0.7647
 0.90 | 0.7899
 0.95 | 0.8131
 1.00 | 0.8343