Estimating the 2016 Hugo Nominations, Part 4
Predicting how the “Sad Puppy” voters are going to nominate in 2016 is the most speculative part of all. The Sad Puppies drastically changed their approach, moving from a recommended slate to a crowd-sourced list. It’s an open question how that change will affect the Hugo nominations.
What we do know, though, is that last nomination season the Sad Puppies were able to drive between 100 and 200 votes to the Hugos in most categories, and their numbers likely grew in the final voting stage. I estimated 450. All those voters are eligible to nominate again; if you figured the Sad Puppies doubled from the 2015 nomination stage to now, they’d be able to bring 200-400 votes to the table. Then again, their votes might be diffused over the longer list; some Sad Puppies might abandon the list completely; some Sad Puppies might become Rabid Puppies, and so forth into confusion.
When you do predictive modelling, almost nothing good comes from showing how the sausage is made. Most modelling hides behind the mathematics (statistical methods force you to make all sorts of assumptions too; they’re just buried in the formulas, such as “I assume the responses are normally distributed”) or black-boxes the whole thing, since people only care about the results. Black-boxing is probably the smart move, as it forestalls criticism. Chaos Horizon doesn’t work that way.
So, I need some sort of decay curve for the 10 Sad Puppy recommendations to run through my model. I decided to treat the Sad Puppy list as a poll showing the relative popularity of the novels; that approach worked pretty well in predicting the Nebulas. Here’s the chart, listing how many votes each Sad Puppy recommendation received, along with its percentage relative to the top vote-getter:
| Book | Author | Votes | % of Top |
|---|---|---|---|
| Somewhither | John C. Wright | 25 | 100% |
| Honor At Stake | Declan Finn | 24 | 96% |
| The Cinder Spires: The Aeronaut’s Windlass | Jim Butcher | 21 | 84% |
| A Long Time Until Now | Michael Z. Williamson | 17 | 68% |
| Son of the Black Sword | Larry Correia | 15 | 60% |
| Strands of Sorrow | John Ringo | 15 | 60% |
| The Discworld | Terry Pratchett | 11 | 44% |
| Ancillary Mercy | Ann Leckie | 9 | 36% |
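The “% of Top” column is just each book’s raw votes divided by the top vote-getter’s total. A minimal sketch of that arithmetic, using the vote counts from the table:

```python
# Relative popularity from the Sad Puppy poll: each book's raw vote
# count divided by the top vote-getter's count (Somewhither, 25 votes).
raw_votes = {
    "Somewhither": 25,
    "Honor At Stake": 24,
    "The Aeronaut's Windlass": 21,
    "A Long Time Until Now": 17,
    "Son of the Black Sword": 15,
    "Strands of Sorrow": 15,
    "The Discworld": 11,
    "Ancillary Mercy": 9,
}

top = max(raw_votes.values())
relative = {book: votes / top for book, votes in raw_votes.items()}

for book, share in relative.items():
    print(f"{book}: {share:.0%}")
```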
What this says is that for every 100 votes the Sad Puppies generate for John C. Wright, they’ll generate 36 votes for Ann Leckie. I know that stat is suspect, because not everyone who voted on the Sad Puppy list was a Sad Puppy, and because the numbers are so small that a small group of fans could easily boost one book up the list. Still, this gives us something. What I’ll do is plug this into my chart of 40%, 60%, and 80% scenarios, using the 450 Sad Puppy estimate, to come up with:
| Book | 40% | 60% | 80% |
|---|---|---|---|
| The Fifth Season | | | |
| The Aeronaut’s Windlass | 151 | 227 | 302 |
| Agent of the Imperium | | | |
| Honor At Stake | 173 | 259 | 346 |
| A Long Time Until Now | 122 | 184 | 245 |
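The rows above follow a single formula: the 450-voter estimate, times a book’s relative poll share, times the 40%/60%/80% turnout band. A sketch of that calculation for the three rows whose shares appear in the poll table (the shares are the ones listed there, not new data):

```python
# Scenario estimates: 450 estimated Sad Puppies * relative poll share
# * turnout band (40%, 60%, 80%), rounded to whole votes.
SAD_PUPPY_ESTIMATE = 450

relative_share = {  # from the poll table above
    "Honor At Stake": 0.96,
    "The Aeronaut's Windlass": 0.84,
    "A Long Time Until Now": 0.68,
}

estimates = {
    book: [round(SAD_PUPPY_ESTIMATE * share * band) for band in (0.4, 0.6, 0.8)]
    for book, share in relative_share.items()
}

for book, row in estimates.items():
    print(book, row)
```

Rounding to whole votes reproduces the table exactly (e.g. 450 × 0.96 × 0.4 = 172.8, which rounds to 173).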
Does this make any sense? I’m sure many will answer no. But look closely: could the remnants of the Sad Puppies, however the new list affects them, generate 150-300 votes for Jim Butcher this year? I find it hard to believe they couldn’t. Remember, Butcher got 387 votes in last year’s nomination stage. Some of that came from the Rabid Puppies (maybe up to 200), but where did the rest come from? Will all the Sad Puppy votes for Butcher vanish in just a year?
How about that Somewhither number: is it too big? It could also model some Sad Puppies being swayed over to the Rabid Puppy side, as could the Seveneves number. The Novik and Leckie numbers could represent the opposite happening: Sad Puppies who joined in 2015 and are now drifting over to more mainstream picks. I think I’d go conservative with this, staying in the 40% band to model the dispersion effect.
So now I have predictions for each of the three groups. Combining them gives 27 different models. Each model may be flawed in itself (overestimating or underestimating a group), but the trends that emerge across multiple models are what this project has been building toward. In predictive modelling, you normally make the computers do this and hide all the messy assumptions behind a cool glossy surface. Then you say, “As a result of 1,000 computer simulations, we determined that the Warriors will win 57% of the time.” For the record, the Chaos Horizon model now says the Warriors will win 100% of the time and that Steph Curry will be nominated for Best Related Work.
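The “27 models” count is just the Cartesian product of three scenarios for each of three groups. A minimal sketch, assuming low/mid/high bands per group (the Puppy faction names come from this series; the third group’s label is my placeholder, and the bands are illustrative):

```python
# 27 combined models: three scenario bands for each of three voter
# groups gives 3 ** 3 = 27 combinations. "Typical Nominators" is a
# placeholder label for the non-Puppy voting bloc.
from itertools import product

groups = ["Sad Puppies", "Rabid Puppies", "Typical Nominators"]
bands = [0.4, 0.6, 0.8]

models = [dict(zip(groups, combo)) for combo in product(bands, repeat=len(groups))]
print(len(models))  # 27
```

Each entry in `models` assigns one band to every group, so scanning all 27 makes it easy to see which predictions hold up across assumptions.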
We could go on and do 100 more models based on different assumptions and see if trends keep emerging. This kind of prediction is messy, unsatisfying, and flawed, and the more you understand the nuts and bolts behind it, the more it makes you doubt predictive modelling at all. Of course, the only thing worse would be if predictive modelling were 100% (or even 90% or 80%) accurate. Then we’d know the future. Come to think of it, wouldn’t that make for a good SF series . . . Better get Isaac Asimov on the phone. Maybe I should argue that this series is eligible for the “Best SF Story of 2017” Hugo.
Tomorrow we’ll start combining the models and see if anything useful emerges.