Analyzing the 2016 Hugo Noms, Part 1
No use putting this off any longer. I was hoping we’d see some more leaked information/numbers, but we’re stuck with pretty minimal information this year. Here we go . . .
Where We’re At: Yesterday, the 2016 Hugo Nominations came out. Once again, the Rabid Puppies dominated the awards, grabbing over 75% of the available Hugo nomination slots.
If you’re here for the quick drive-by (Chaos Horizon is not a good website for the casual Hugo fan), I’m estimating the Rabid Puppies at 300 this year, with a broader range of 250-370. Lower than that, you can’t sweep categories. Higher, more would have been swept. Given this year’s turnout, 300 seems about the number that gets you these results. Calculations below. Be warned!
EDIT 4/28/2016: Sources are telling me that there were indeed withdrawals in several categories. This greatly muddies the upper limit of the Rabid Puppy vote. As such, I think the 250-370 should be read as the lower limit of the Rabid Puppy vote, with the upper limit somewhere around 100 higher than that. I did some quick calculations for the upper RP limit using the Best Novel category, assuming Jemisin got 10%, 15%, or 20% of the vote. We know she beat John C. Wright’s Somewhither. That gives upper limits of 335, 481, and 615. I think 481 is a good middle-of-the-road estimate. Remember that Best Novel numbers are always inflated because more people vote in this category than any other, so a big RP number in Best Novel doesn’t necessarily carry over to all categories.
So revised RP estimate: 250-480. If there were many withdrawals, push to the high end (or beyond) of that range. Fewer withdrawals, low end. Perhaps the people who withdrew will come forward in the next few days and this will allow us to be more precise. If those withdrawals are made public, please post them in the comments for me.
EDIT 4/28/2016: Over at Rocket Stack Rank, Greg has done his own Hugo analysis, using a different set of assumptions. While I assume a linear increase of “organic” voters (non-Puppy voters), he uses a “power law” distribution. Most simply put, it’s the difference between fitting a line or a curve to the available data. I go with the line because of the low amount of data we have, but Greg is certainly right that the curve is the way to go if you trust the amount of data you have.
Using his method, Greg comes up with a lower Rabid Puppy number (around 200), but that’s also accompanied by a lower number of “organic” voters than my method estimates. Go over and take a look at his estimate. It’s a great example of how different statistical assumptions can yield substantially different results. I’ll leave it up to you to decide which estimate you think is better. I personally love that we now have multiple estimates using different approaches. It really broadens our understanding of this whole process. Now we need someone to come along and do a Bayesian analysis!
The Estimate: This year, MidAmeriCon II released minimal information at this stage. They’re not obligated to release any, so I guess we should be happy with what we got. Last year, we got the range of votes, which allowed us to estimate how strong the slate effect was. This year, we only have the list of nominees and the votes per category. Is that enough to make any estimates?
Here on Chaos Horizon, I work with what I have. I think we can piece together an estimate using the following information:
- The Rabid Puppies swept some but not all of the categories. That’s a very valuable piece of information: it means the Rabid Puppies are strong, but not strong enough to dominate everything. With careful attention, we should be able to find the line (or at least the vicinity of the line).
- Zooming in more closely, the Rabid Puppies swept the following categories: Short Story, Related Work, Graphic Story, Professional Artist, and Fanzine. Because of this, we know that the Rabid Puppies had to beat whatever the #1 non-Rabid Puppy pick was in those categories.
- The Rabid Puppies took 4/5 slots in Novella, Novelette, Semiprozine, Fan Writer, Fan Artist, Fan Cast, and Campbell. This means that, in those categories, the #1 non-Rabid Puppy pick had to be larger than the Rabid Puppy slate number.
With that information, if I could figure out how many votes the #1 non-Rabid Puppy pick likely received, I could estimate the Rabid vote. Couldn’t I use the historical data (the average percentage that the #1 pick has received in past years) to come up with this estimate?
One potential wrench: what if people withdrew from nominations? There’s no way to know this, and it would skew the numbers substantially. However, with more than 10 categories to work with, we can only hope this didn’t happen in all of them. If you believe at least one person withdrew in Novelette, Semiprozine, Fan Writer, Fan Artist, Fan Cast, and Campbell, add 100 to my Rabid Puppy estimate, bringing it to 400. There’s also the question of Sad Puppy influence, which I’ll tackle in a later post.
Or, to write it out: In the swept categories, Rabid Puppy Number (x) is likely greater than the Non-Rabid voters (Total – x) * the average percentage of the #1 work from previous years.
In the 4/5 categories, the Rabid Puppy number (x) is likely less than the Non-Rabid voters (Total – x) * the average percentage of the #1 work from previous years.
While that won’t be 100% accurate, as the #1 work gets a range of numbers, it’s going to give us something to start with. Here’s the actual formula for calculating the Rabid Puppy lower limit in swept categories using this logic:
x > (Total – x) * #1%
x > #1% * Total – #1% * x
x + #1% * x > #1% * Total
(1 + #1%)x > #1% * Total
x > (#1% * Total) / (1 + #1%)
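To make the algebra concrete, here’s a quick Python sketch of that lower-bound formula (the function name and the sample numbers are mine, purely for illustration):

```python
import math

def rp_lower_bound(total, top_pct):
    """Smallest whole-number slate vote x satisfying x > (total - x) * top_pct,
    i.e. x > top_pct * total / (1 + top_pct)."""
    return math.floor(top_pct * total / (1 + top_pct)) + 1

# Invented example: 1000 ballots, #1 non-slate work historically at 15%
x = rp_lower_bound(1000, 0.15)
print(x, (1000 - x) * 0.15)  # the slate vote just edges out the projected #1
```

Plug in any category’s total votes and historical #1 percentage and you get the minimum slate size needed for a sweep there.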
So, quick chart: we need the #1%, the average percentage of the vote that the #1 work (i.e., the highest-placing non-RP work) gets, in all categories that were either swept or went 4/5. I’ll use the 4/5 Rabid categories in a second to establish an upper limit.
Off to the Hugo stats to create the chart. I used data from 2010-2013, giving me 4 years. I didn’t use 2014 and 2015 because the Sad Puppies and Rabid Puppies changed the data sets by their campaigns. I didn’t use 2009 data because the WorldCon didn’t format it conveniently that year, so it is much harder to pull the percentages off. I don’t have infinite time to work on this stuff. :). I also had to toss out Fan Cast because it’s such a new category.
Chart #1: Percentage the #1 Hugo Nominee Received 2010-2013
Notice that far-right column of “range”: that’s the difference between the high and the low in that 4-year period. This big range is going to introduce a lot of statistical noise into the calculations: if I estimate Best Related Work to get 16.6%, I could be off by as much as 5% in some years. I could try to offset this with fancier stat tools, but 4 data points will produce a garbage standard deviation, so I won’t use that. On 300 votes, this 5% error would throw a +/- halo of 15 votes. Significant but not overwhelming.
Okay, now that I have this data, let’s use it to calculate the lower limit of Rabid Puppies:
Chart 2: Calculating Min Rabid Puppy Number from 2016 Swept Categories
| Swept Category | Total Votes | #1 % | Min RP |
| --- | --- | --- | --- |
Okay, what the hell does this chart say? The Short Story category had 2451 voters this year. In past years, the #1 pick in Short Story grabbed about 14% of the vote. To beat that 14%, there needed to be at least 302 Rabid Puppy voters. With that number, you get 302 Rabid votes and (2451-302) = 2149 Non-Rabid votes; at 14%, the top Non-Rabid pick gets about 301 votes. Thus, the Rabid Puppies would beat the top Non-Rabid pick by 1 vote.
Now, surely that number isn’t 100% accurate. Maybe the top short story this year got 18% of the vote. Maybe it got 12%. But 300 seems about the line here: if the Rabid Puppies were lower than that, you wouldn’t expect them to sweep.
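The Short Story arithmetic above is easy to double-check in a few lines of Python:

```python
total = 2451                 # Short Story ballots in 2016
rp = 302                     # hypothesized Rabid Puppy slate vote
non_rp = total - rp          # remaining "organic" voters
top_organic = non_rp * 0.14  # #1 organic pick at the historical 14% rate
print(non_rp, round(top_organic))  # 2149 organic voters, top pick ~301
```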
Keep in mind, this chart just gives us a minimum. Now, let’s do the other limit, using the categories where the Puppies took 4/5. This is uglier, I’m warning you:
Chart 3: Calculating Max Rabid Puppy Number from 2016 4/5 Categories
| 4/5 Category | Total Votes | #1 % | Max RP |
| --- | --- | --- | --- |
Ugh. Disaster befalls Chaos Horizon. This number should be higher than the last one, creating a nice range. Oh, the failed dreams. This chart is full of outliers, ranging from that huge 477 in Novella to that paltry 190 in Fan Artist. Did someone withdraw from the Fan Artist category, skewing the numbers? If I take that out, it bumps the average up to 325, which fixes my problem. Of course, if I dump the low outlier, I should dump the high outlier, which puts us back in the same fix.
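For reference, the upper-limit side is the same formula with the inequality flipped. A minimal sketch, with the 2000-ballot category and its 20% figure invented purely for illustration:

```python
import math

def rp_upper_bound(total, top_pct):
    """Largest whole-number slate vote x still consistent with the #1 organic
    work making the ballot: x < (total - x) * top_pct."""
    return math.ceil(top_pct * total / (1 + top_pct)) - 1

# Invented 4/5 category: 2000 ballots, #1 organic work historically at 20%
print(rp_upper_bound(2000, 0.20))  # a slate any larger than this should have swept
```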
A couple conclusions: the fact that both calculations turned up the 300 number is actually pretty remarkable. We could conclude that this is just about the line: if the Rabid Puppies are much stronger than 300 (say 350), they should have swept more categories. If they’re much weaker (250), they shouldn’t have swept any. 300 is the sweet spot to be competitive in most of these categories, with the statistical noise of any given year pushing some works over, some works not.
It also really, really looks like Novelette and Fan Artist should have been swept. Withdrawals?
To wrap up my estimate, I took the further step of using the 4-year high % and the 4-year low % (i.e. I deliberately min/maxed to model more centralized and less centralized results). You can find that calculation in this spreadsheet: 2016 Hugo Nom Calcs. This gives us the range of 250-370 I mentioned earlier in the post. Keep in mind that the raw number of Rabid Puppies might be higher than that: this is just the slate effect they generated. It may be that some Rabid Puppies didn’t vote in all categories, didn’t vote for all the recommended works, etc.
There are lots of factors that could skew my calculation: perhaps voters spread the vote out more rather than consolidating it. Perhaps the opposite happened, with voters taking extra care to centralize their vote. Either might throw the estimate off by 50 or even 100 votes.
Does around 300 make sense? That’s a good middle ground number that could dominate much of the voting in downballot categories but would be incapable of sweeping popular categories like Novel or Dramatic Work. I took my best shot, wrong as it may be. I don’t think we’ll do much better with our limited data—got any better ideas on how to calculate this?