2015-16

Started by Trotsky, March 13, 2015, 10:21:21 AM

Tom Lento

Quote from: Towerroad
Quote from: Tom Lento
Quote from: Towerroad
Quote from: pfibiger
Absolutely. Here's the spreadsheet:

https://dl.dropboxusercontent.com/u/233950/scoring%20offense%20year-by-year.xlsx

Thanks. The R squared does not tell the whole story. The t stats on the regression coefficients are significantly different at well beyond the 99% level. There is a significant difference between Cornell's scoring trend and D1 College Hockey as a whole.

Correlation is not causation. The cause is still open to discussion, the difference is not.

What? You can't make an assertion like that on 10 high variance data points for a team average metric compared with 10 data points representing a national average. For one thing, this is totally un-normalized. Cornell's numbers are, well, Cornell's numbers against Cornell's record. The league average is the global average across all teams. It is NOT the expected goal scoring for a league average team against Cornell's schedule, which is more or less what you'd need to draw meaningful conclusions about Cornell's offensive execution during that time span.

The best you can discern from this is Cornell's overall pattern is to be somewhere around a league average offense. That's it. Really.

Also, everyone keeps talking about 2015 dragging the slope down as an outlier year, but nobody has yet pointed out that 2003 does the same. These numbers are very noisy, and comparing slopes of raw averages over time is not going to help you understand much about what's going on or, at this data volume, even if anything meaningful is going on.

Sorry, but you can make those inferences at least on a statistical basis.

First of all, there are 14 data points for each series. I would like more data, who would not, but the stats work at some basic level.

Secondly, both coefficients are meaningfully different from the null hypothesis of 0 and differ from each other by meaningful amounts.

As for differing variance: yes, they differ, and they should, but neither data set demonstrates heteroskedasticity, so there is no bias in the standard error of estimate.

You can make too much of this sort of analysis but you can't dismiss it. Our trend over the period is significantly different from the D1 average trend.

Back when I was in grad school my thesis advisor liked to go through a workflow where he would have his students compare zodiac signs to reported frequency of sexual intercourse. It turns out, statistically speaking, Sagittarians (I think, I'm not certain of the actual sign) got significantly more love than anybody else during the 1980s. Of course this was garbage - the samples involved were small enough that a chance shift in gender distribution caused reports from people born under that sign to have a significantly higher value than one would expect based strictly on population averages. Everything looked good in that analysis, too, but the data was still shit.

What you've just done is to quote your statistics 101 textbook and tell me, tautologically, that there is a statistically significant difference between two measures and therefore those measures are, statistically speaking, significantly different. What I'm telling you is that while that's true, it's also meaningless, because a comparison of the general aggregate against an individual team is inherently problematic.

The ECAC-only charts are a bit more reasonable for interpretation, but they still aren't properly normalized. Also, raw GFA is, by itself, a pretty bad measure. Maybe I'll actually take some time to look at some normalized trends and try to get a useful comparison together. I'm more interested in goal difference trends and evaluating the relative importance of offense to defense in general, particularly for the top 15-20% of teams, but maybe I can find a way to render Cornell-only results meaningful as anything other than an interesting sideline for all of us here.

None of this really has anything to do with next year, so I'll stop here unless I have some data that might be predictive of next year's results. If I do the general analysis I'm thinking about maybe I'll move it to a new thread where we can geek out about badly normalized team aggregate metrics.

Towerroad

I don't think we actually disagree. I used to teach stats and build econometric models in a different life, and doing parametric stats on modest datasets is of limited value. On the other hand, the data you have is the data you have.

The only conclusion that I come to is that, given the coach's statement about going back to the old ways and the very clear scoring trend, there is no reason to believe that Cornell is going to be a goal scoring powerhouse next year. We probably did not need fancy or not so fancy stats to tell us that, but it is the off-season.
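
For anyone who wants to poke at the slope comparison argued over above, here is a minimal sketch of one common way to do it: fit an ordinary least squares line to each series, then divide the difference in slopes by the combined standard error. Everything in it is an assumption for illustration. The numbers are made up (they are not the values in pfibiger's spreadsheet), the function names are hypothetical, and the statistic treats the two series as independent, which a team and the league it plays in are not.

```python
import math

def slope_and_se(xs, ys):
    """Return the OLS slope of ys on xs and the slope's standard error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    se = math.sqrt(ss_res / (n - 2) / sxx)
    return slope, se

def slope_difference_t(xs, series_a, series_b):
    """t-like statistic for slope(a) - slope(b), assuming independent series."""
    slope_a, se_a = slope_and_se(xs, series_a)
    slope_b, se_b = slope_and_se(xs, series_b)
    return (slope_a - slope_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# HYPOTHETICAL goals-per-game values for 14 seasons, invented for this sketch.
seasons = list(range(14))
team = [3.4, 3.6, 3.0, 3.1, 2.8, 2.9, 2.6, 2.9, 3.0, 2.5, 2.6, 2.4, 2.5, 1.9]
league = [3.0, 3.0, 2.9, 2.9, 2.9, 2.8, 2.8, 2.8, 2.8, 2.8, 2.7, 2.7, 2.7, 2.7]

t_stat = slope_difference_t(seasons, team, league)
print("team slope:", slope_and_se(seasons, team)[0])
print("league slope:", slope_and_se(seasons, league)[0])
print("t statistic for the difference:", t_stat)
```

A large statistic here only says the two fitted lines differ. It says nothing about schedule strength or normalization, which is the objection raised earlier in the thread.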

KeithK

I look at this data and see a powerhouse squad from 2002 to 2003 which scored a lot (relatively speaking) followed by a trend that is pretty close to that for the ECAC average from 2004 through 2014, then finally an out of family low value in 2015.

Is it fair to only look at 2004-2014 to assess the overall trend instead of 2002-2015?  Or 1996-2015? *shrug* There's so much variability here that one can make up whatever model he wants to try to learn something but it's not like there's a clear underlying process we're identifying.

I think everyone here is concerned about the extreme lack of scoring this year and would agree that if it were to continue, Cornell will not be competitive going forward.  I'm not convinced that the 2015 offensive numbers were due to a structural problem in the program. I would fully expect the team to regress back to the longer term trend, and if so I think the team will be very competitive in 2016. It is definitely true that Schafer is almost certainly not going to ever field an offensive powerhouse, but as others have suggested, this year's defense and a middle of the pack offense probably puts us in the tournament.
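
The point about the choice of window can be made concrete with a rough sketch. The series below is invented to match only the shape described above (two high seasons, a flat stretch, one low final season); it is not the real Cornell data, and `slope_in_window` is a hypothetical helper name.

```python
def ols_slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def slope_in_window(years, values, start, end):
    """Fit the trend using only seasons with start <= year <= end."""
    kept = [(y, v) for y, v in zip(years, values) if start <= y <= end]
    return ols_slope([y for y, _ in kept], [v for _, v in kept])

# HYPOTHETICAL goals per game, 2002 through 2015, invented for this sketch.
years = list(range(2002, 2016))
goals = [3.5, 3.6, 2.8, 2.9, 2.7, 2.9, 2.8, 2.7, 2.9, 2.8, 2.7, 2.8, 2.7, 2.0]

full_slope = slope_in_window(years, goals, 2002, 2015)
trimmed_slope = slope_in_window(years, goals, 2004, 2014)
print("2002-2015 slope:", full_slope)
print("2004-2014 slope:", trimmed_slope)
```

With numbers shaped like this, nearly all of the downward slope comes from the seasons at the two ends of the window, so the fitted trend depends heavily on where the window starts and stops.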

Towerroad

I think the proper time span is probably 2000-2015, which would be teams that were all recruited and trained by Schafer. You are correct, the slope is heavily influenced by the performance of the teams in the early 2000s. But the reality is that our offense has been well below the national average for 4 of the last 5 years.

I agree with your assessment that a team with a top 10 defense and middle of the pack offense has a good shot at the tournament. That belies the fact that we have a long way to go to get to an average offense and begs the question of whether there are defensive tradeoffs that would have to be made to improve the scoring. The gap between us and the national average is on the order of 0.5 goals per game (1 goal every 2 games). The big question is what you have to do to achieve this nontrivial improvement. Nothing about our recent history suggests that we have an answer to that question.

KeithK

Quote from: Towerroad
I agree with your assessment that a team with a top 10 defense and middle of the pack offense has a good shot at the tournament. That belies the fact that we have a long way to go to get to an average offense and begs the question of whether there are defensive tradeoffs that would have to be made to improve the scoring. The gap between us and the national average is on the order of 0.5 goals per game (1 goal every 2 games). The big question is what you have to do to achieve this nontrivial improvement. Nothing about our recent history suggests that we have an answer to that question.

We had a 0.5 goal per game drop this past season. While there may have been some structural reasons for that drop (Schafer has said he tried certain things that didn't work) I suspect that a lot of that drop (most of it?) was random variation. We had a bad year. I think that we'll score a bunch more goals next year due to regression, maybe enough to put us back near the middle of the pack offensively.

Does this mean that the coaching staff should sit back and trust that things will be better next year?  Of course not.  But I think that we as fans should hesitate before overreacting to what might be a one year outlier.

BearLover

Before the .5 GPG drop we were already awful on offense, though--last year just happened to be spectacularly bad.  Regression to our shitty mean isn't something we want. Plus, the defense may regress too.

Trotsky

Quote from: KeithK
But I think that we as fans should hesitate before overreacting to what might be a one year outlier.
2015 may have sharpened the concern, but it has been building long before this year.

There's no reason why a team can't have a punishing defense and then rely on counter attacks and exploitation of forced errors.  But the offense might have been hurt paradoxically by the absence of defensively solid players.  At their best, you always knew a Schafer team was going to force the opponent to cough up the puck again and again in their own end, at worst disrupting their breakout and at best giving us turnovers with a clear route to the net.  The last few years, it's been the opponent who has been doing that to us.  We used to "get bigger" as the game went along and the opponent was ground into dust by constant harassment.  Now we look like a wheezing big guy who just ran 5 miles, weak as a kitten and mentally fragile.

I'm not asking for Moulsons.  I'll settle for Babys (Babies?).

KeithK

Oh, I'm not saying everything is rosy.  But we're on here analyzing trends in offensive production and extrapolating to utter gloom and doom. Which I think is overreacting, particularly when trends are heavily influenced by one year that looks like an outlier.

Tom Lento

Quote from: BearLover
Quote from: KeithK
Quote from: Towerroad
I agree with your assessment that a team with a top 10 defense and middle of the pack offense has a good shot at the tournament. That belies the fact that we have a long way to go to get to an average offense and begs the question of whether there are defensive tradeoffs that would have to be made to improve the scoring. The gap between us and the national average is on the order of 0.5 goals per game (1 goal every 2 games). The big question is what you have to do to achieve this nontrivial improvement. Nothing about our recent history suggests that we have an answer to that question.
We had a 0.5 goal per game drop this past season. While there may have been some structural reasons for that drop (Schafer has said he tried certain things that didn't work) I suspect that a lot of that drop (most of it?) was random variation. We had a bad year. I think that we'll score a bunch more goals next year due to regression, maybe enough to put us back near the middle of the pack offensively.

Does this mean that the coaching staff should sit back and trust that things will be better next year?  Of course not.  But I think that we as fans should hesitate before overreacting to what might be a one year outlier.
Before the .5 GPG drop we were already awful on offense, though--last year just happened to be spectacularly bad.  Regression to our shitty mean isn't something we want. Plus, the defense may regress too.

Incidentally, 0.5 GPG is almost exactly the difference between Ferlin/Lowry's goal scoring production last year and Ferlin/Lowry's goal scoring production this year. I don't think this season was structural because it looks an awful lot like bad luck. It's entirely possible that a fully healthy senior class plus Ferlin scores at around the national average.

The 4 out of 5 bad years on offense isn't necessarily an alarming long term trend. It starts with a hangover year after a hugely talented class departed in 2010, and then you've got, essentially, one bad recruiting cycle. Is Cornell's performance over the last 3-5 seasons cause for concern? Sure. I think it's totally legitimate to be concerned, in 2015, that Schafer is not capable of returning the team to its winning ways of 2002-2010. But at the moment I don't think it makes a lot of sense to declare the program dead so long as Schafer is at the helm, which is the general tenor of a lot of the discussion these days.

I'd expect 2016 to be an improvement over this year on the offensive end, if only because it can't get much worse, but given the returning scoring on the team I think league average is about the *best* you can hope to get.

The interesting question, for me, is whether 2016 is a 2.2 - 2.1 goal difference kind of year with some promise on the horizon* or an effective repeat of this year with no clear talent spark in the freshman/sophomore classes to provide substantial hope for the near future.

* think Cornell, 2000-2001: 33 games, 16-12-5, 73 goals scored, 72 goals allowed, hugely talented sophomore class

marty

Quote from: Trotsky
Quote from: Towerroad
Projections are a tricky business but we could be in real trouble in 20 years or so.
When we're scoring -2.5 goals per game?  ;-)

Paul Ehrlich would be proud!
"When we came off, [Bitz] said, 'Thank God you scored that goal,'" Moulson said. "He would've killed me if I didn't."

RichH

All I can say at this point is that it's a shame that jtwcornell91 doesn't check in here more often. Because this thread.

Robb

Hey, I fit this line to these two points, and OMG MY R-SQUARED IS ONE!!!!  IT MUST MEAN SOMETHING!!1!1!
Let's Go RED!
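
The joke holds up arithmetically: a least squares line through any two points with different x values passes through both, so the residuals are zero and R-squared is 1 no matter what the points are. A small sketch, with made-up points and hypothetical function names:

```python
def fit_line(xs, ys):
    """Ordinary least squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Two arbitrary, made-up points: the fit is perfect by construction.
two_point_r2 = r_squared([2002, 2015], [3.5, 2.0])

# Add one made-up point off the line and the perfect fit disappears.
three_point_r2 = r_squared([0, 1, 2], [3.0, 2.0, 3.0])

print("two points:", two_point_r2)
print("three points:", three_point_r2)
```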

Jim Hyla

Quote from: RichH
All I can say at this point is that it's a shame that jtwcornell91 doesn't check in here more often. Because this thread.

Because this thread needs him(?).
"Cornell Fans Made the Timbers Tremble", Boston Globe, March/1970
Cornell lawyers stopped the candy throwing. Jan/2005

Rosey

Quote from: Jim Hyla
Quote from: RichH
All I can say at this point is that it's a shame that jtwcornell91 doesn't check in here more often. Because this thread.

Because this thread needs him(?).

"Because noun". Language evolution.

Dafatone

Quote from: Kyle Rose
Quote from: Jim Hyla
Quote from: RichH
All I can say at this point is that it's a shame that jtwcornell91 doesn't check in here more often. Because this thread.

Because this thread needs him(?).

"Because noun". Language evolution.

Eh.  It's just a dropped/implied "of".  It's a language shift, sure, but it's not as big a deal as some people make it out to be.  At least in my opinion.