USCHO.com poll, 2/17

Started by Section A, February 17, 2003, 03:50:34 PM

Previous topic - Next topic

Keith K

Well, I've managed to get part of the problem set up as a regression problem.  But it doesn't quite work right (no surprise there): my results currently say that Dartmouth received -183.667 first place votes, which counter their 40 votes for each of 2nd, 3rd, 4th, 7-9th and 12-14th places.  Dartmouth is really popular!

(Haven't quite figured out how to mathematically restrict each team to one vote per ballot yet. And the pesky integer thing...)

Shorts

I think the way to go about trying to solve this would be to set up a three dimensional array of variables, x[i,j,k].  Let i iterate through each ballot/voter, let j iterate through each possible placings (ie 1 - 15), and let k iterate through each team.  A total of 40*15*21 = 12,600.  A little much for my student edition solver, but not that far out.


The constraints would be:

For each voter-placing [i,j], the sum over k must equal 1.  (600 constraints).  There might be a way to simplify this, but I'm not entirely sure.
For each team k, the sum over all voters and placings, multiplied by the appropriate weights, is equal to the number of points they got (21 constraints).
Each x[i,j,k] is required to be either 0 or 1.

If you wanted to, you could try to set it up as a simple linear algebra problem, where x is a 12600 entry vector, b is a vector with 621 entries (the first 600 would be 1's, the next 21 would be the various numbers of points received).  The A matrix would have 621 rows, and 12600 columns, simply containing the coefficients for each of the constraints.

However, this would not do the trick at all, because of the symmetry between the different voters.  Trying to just work it out by hand might be easier.

pat

The goal is to determine all of the unknown votes, right? So that's 14*40 = 560 unknowns.

Of the knowns, we have the sum of each teams' votes, which can be expressed in the form

139 = 14*cc_2  + 13*cc_3  + 12*cc_4  + ... + cc_15
403 = 14*cor_2 + 13*cor_3 + 12*cor_4 + ... + cor_15
 :
  1 = mu_15.

(We can get rid of the 1st place vote variable, since those are known.)

You can also say that each voter can only cast 1 vote per place. In other words:

40 = cc_2  + cor_2  + me_2  + ... + yu_2
40 = cc_3  + cor_3  + me_3  + ... + yu_2
:
40 = cc_15 + cor_15 + me_15 + ... + mu_15.


There are 21 team equations and 14 ballot equations, so that gives us 35 total. Without further restriction, you can't do it. Now, you could start making up rules like to say each higher-ranked team receives more higher-ranked votes than the teams below it, which isn't entirely realistic but not bad. After all, Cornell probably didn't get any 15th-place votes, and PC probably didn't get any second-place votes. That task is left to the reader.

Now, what pet does the German own?

P.S. Age, when you edit a post with PRE tags, it puts BRs at the start of every line, thus the double spacing. Anything you can do about that?

Keith K

Actually, you have 21 more equations, since each team can only have a up to 40 votes.  Unfortunately, these are pretty much inequalities since we don't know how many votes each team received.

You can set this up as a least squares problem and get a solution.  But since it's underconstrained, the linear LS answer doesn't necessarily give you the right answer.  In fact the chances of picking the right one are vanishingly small (infinite number of solutions that satisfy the equations).  Now, limiting the range of each variable to [0,40] would cut down the possibilities to a smaller infinity, but still wouldn't get you to the answer. this also pretty much eliminates the linear LS approach.  Limiting the answers to integers would get you closer, but I'm not sure how to "phrase" that mathematically, unless you're just searching the space.

ugarte

You are going to regret doing all of this if you aren't finished by the time the polls come out next Monday.


RedAR

Well, Cornell got no less than 13 and no greater than 28 second place votes.  Does that help narrow down the possibilities at all? ::screwy::

CowbellGuy

Pat Carr '96 wrote:
QuoteP.S. Age, when you edit a post with PRE tags, it puts BRs at the start of every line, thus the double spacing. Anything you can do about that?
I remember wrestling with that problem for a long time when I set eLF up initially and came to the conclusion that I couldn't do much about it. Don't remember what the deal was, but I don't have time to revisit it now, anyway.

"[Hugh] Jessiman turned out to be a huge specimen of something alright." --Puck Daddy

Greg Berge

PRE is a problematic tag anyway.  You'll find different browsers interpreting it in different ways.  I would not be shocked to see it defacated, er... well you know, before long.

CowbellGuy

You mean deprecated? I should think not. It works just fine as long as you don't go specify a non-monospace type in your font preferences for PRE formatting. This particular problem involves line breaks being converted to CR's in the edit window, then back for HTML displaying and the PRE tags not actually being PRE tags. It's just a stylesheet that simulates PRE formatting for reasons I don't want to get into. The edit window can't tell that part of the text in a post is PRE and part isn't without getting into some unsavory presumptions. The default font for phorum is Courier to get around that. I changed it to something more pleasant to look at and adding the PRE tag option. Anyway, it's complicated and has absolutely nothing to do with actual HTML PRE tags (which are fine in their own right).

"[Hugh] Jessiman turned out to be a huge specimen of something alright." --Puck Daddy

Keith K

Naah.  We're sharper than that (a little anyway). The point would be to put together a computer code that would solve for the votes given all the known data.  I'm not sure it's really possible and I'm not really putting anymore time into it, but that was my original thought (aside from killing some time).

Greg Berge

Naturally I meant deprecated.  That's what I wrote.  And at 1 in the morning too -- not too shabby.  :-D

Not that I remember anything else about it, but can't you solve lots of simultaneous equations with relatively simple matrix manipulation?  One row per team, 15 columns per row.  A given cell value a*x where a = 15, 14, 13, ... and x is the unknown number of votes of that weight.  The sum of each column of a's is 40.  Hmmm... 15xN unsolved values for N=number of teams receiving votes and ... ok, maybe that doesn't sound too promising.

Can you just fill the matrix all the way out with a few hundred additional rows with all cofficients = 0, or does that also contribute no useful information?

jeh25

But we need a eLynah distributed computing project! We can solve for votes and have a neato screensaver with Age's pretty pictures and everything.

Plus it is a better use of space cycles than SETI@home..... ;)

::rock::

Cornell '98 '00; Yale 01-03; UConn 03-07; Brown 07-09; Penn State faculty 09-
Work is no longer an excuse to live near an ECACHL team... :(

Keith K

Simultaneous equations are easy to solve.  Unfortunately we have far more variables (15 per team, or 14 since we know first place votes) than equations.  So it's an underconstrained problem and not directly solvable.

jtwcornell91

Greg wrote:
QuoteCan you just fill the matrix all the way out with a few hundred additional rows with all cofficients = 0, or does that also contribute no useful information?
You can't get something for nothing.  If you try to add trivial equations to an underdetermined system, the resulting matrix had zero determinant and can't be inverted.


Greg Berge

Yeah, I had that instinct, although every once in a while a cutsey little endrun would crop up in my quant methods class where you would effectively add a ton of null data to a series of regression equation and... poof... all of a sudden you got more information.

Mind you, that was going on 14 years ago now, so for all I know...  ::screwy::