2014 Nebula Award Prediction: Weighting
Weighting is one of the most difficult aspects of the statistical model. Our Linear Opinion Pool takes various indicators and combines them, but how do you know which indicators to trust the most?
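To make the combination concrete, here is a minimal sketch of a Linear Opinion Pool: each indicator assigns every nominee a probability of winning, and the pool is simply the weighted average of those probabilities. The indicator names, nominee names, and numbers below are hypothetical, not the model’s actual data.

```python
def linear_opinion_pool(indicator_probs, weights):
    """Combine per-indicator probability estimates with a weighted average.

    indicator_probs: dict of indicator name -> {nominee: probability}
    weights: dict of indicator name -> weight (weights should sum to 1.0)
    """
    combined = {}
    for indicator, probs in indicator_probs.items():
        w = weights[indicator]
        for nominee, p in probs.items():
            # Each indicator contributes its probability, scaled by its weight.
            combined[nominee] = combined.get(nominee, 0.0) + w * p
    return combined

# Hypothetical example: two indicators, two nominees, equal weights.
probs = {
    "awards_history": {"Book A": 0.7, "Book B": 0.3},
    "hugo_nomination": {"Book A": 0.4, "Book B": 0.6},
}
weights = {"awards_history": 0.5, "hugo_nomination": 0.5}
print(linear_opinion_pool(probs, weights))
# Book A: 0.5*0.7 + 0.5*0.4 = 0.55; Book B: 0.45
```

The appeal of this design is transparency: the final probability for each nominee is a straight weighted average, so you can see exactly how much each indicator moved the needle.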
This is the problem with any statistical model: how the model is built is as critical as the data that goes into it. Statistics often mask the human bias of the people using them. However, our model is just for fun. It’s not as if millions of dollars are on the line, the Nebula has enough data to be truly accurate, or SFWA voters are predictable enough for this to be 100% reliable. If we get 70% reliability, that would be great.
I weighted the model by measuring how accurate each indicator would have been if we had used that indicator, and only that indicator, to pick the Nebula winner. Those accuracies are then normalized against each other. Using data since 2000, this generated the following weights:
Note two disappointing facts: I had to zero out the Locus Awards column, since the Locus Awards seem to come out after the Nebula is given. There’s also a zero for the Amazon/Goodreads rating, as there wasn’t enough data to make a meaningful correlation.
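The weighting scheme described above can be sketched in a few lines: score each indicator by how often it alone would have picked the eventual winner, then normalize those hit rates so the weights sum to 1. The indicator names and the years-to-winners history below are made up for illustration; the real model uses Nebula data since 2000.

```python
def solo_accuracy(picks, winners):
    """Fraction of years in which this indicator's top pick actually won."""
    hits = sum(1 for year, pick in picks.items() if winners.get(year) == pick)
    return hits / len(winners)

def normalize(accuracies):
    """Scale raw accuracies so the resulting weights sum to 1.0."""
    total = sum(accuracies.values())
    return {name: acc / total for name, acc in accuracies.items()}

# Hypothetical history: two indicators scored over three years.
winners = {2000: "A", 2001: "B", 2002: "C"}
picks = {
    "hugo_nomination": {2000: "A", 2001: "B", 2002: "X"},  # right 2 of 3 years
    "nebula_history":  {2000: "A", 2001: "Y", 2002: "Z"},  # right 1 of 3 years
}
accuracies = {name: solo_accuracy(p, winners) for name, p in picks.items()}
weights = normalize(accuracies)
print(weights)
# hugo_nomination: (2/3) / 1 = 0.667; nebula_history: (1/3) / 1 = 0.333
```

An indicator with no predictive track record, like the Locus Awards column here, naturally gets zeroed out: a solo accuracy of 0 normalizes to a weight of 0.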
Does this model pass the eye test? Well, the formula uses three main categories:
1. Awards History: This makes sense for voters: they vote for names they are familiar and comfortable with. Unlike some other major literary awards, where winning once means you’ll never win a second time, the Nebula likes to give the same people the award over and over again. At times, I think people are voting for the name and not the book! All awards are biased, and this is one of the strongest ways the Nebula is biased.
2. Critic and Reader Response: Sometimes, though, a book is so buzzed about that it can overcome the lack of fame of the author. Conversely, a famous writer might write something that people dislike. These indicators (#6-#10) try to track how people are feeling about the nominated book this year.
3. Awards Momentum: People like to vote on the winning side of history, so the more attention a book gets in awards season, particularly from the Hugo, the more likely it is to win. I think the web has actually increased the importance of this category: same-year Hugo nominations were one of the most reliable indicators in the whole process. More nominations = more people read the book = more likely to vote for the book.
Pretty simple, huh? No model is perfect, though, and the model can’t take into account certain kinds of sentiment: “it’s this author’s time,” “this author is a jerk,” “this book is too political,” “this book isn’t SF,” and so forth.
The formula works out to be around 40% author’s history, 60% this year’s response, which seems roughly fair given the award.