The raison d’être of Chaos Horizon has always been to provide numerical predictions for the Nebula and Hugo Awards for Best Novel based on data-mining principles. I’ve always liked odds, percentages, stats, and so forth. I was surprised that no one was doing this already for the major SFF awards, so I figured I could step into this void and see where a statistical exploration would take us.
Over the past few months, I’ve been distracted trying to predict the Hugo and Nebula slates. Now that we have the Nebula slate—and the Hugo is coming shortly—I can turn my attention back to my Nebula and Hugo models. Last year, I put together my first mathematical models for the Hugos and Nebulas. Both predicted eventual winner Leckie, which is a good sign for the model. As I’ll discuss in a few posts, my current model has around 67% accuracy over the last 15 years. Of course, past accuracy doesn’t guarantee future accuracy, but at least you know where the model stands. In a complex, multi-variable problem like this, perfect accuracy is impossible.
I’m going to be rebuilding and updating the model over the next several weeks. There are a couple of tweaks I want to make, and I also want to bring Chaos Horizon readers who weren’t around last year into the process. Over the next few days, we’ll go through the following:
1. Guiding principles
2. The basics of the model
3. Model reliability
4. To-do list for 2015
Let’s get started today with the guiding principles for my Nebula and Hugo models:
1. The past predicts the future. Chaos Horizon uses a type of statistics called data-mining, which means I look for statistical patterns in past data to predict the future. There are other equally valid statistical methodologies, such as sampling. In a sampling methodology, you would ask a certain number of Nebula or Hugo voters what their award votes were going to be, and then use that sample to extrapolate the final results, usually correcting for demographic issues. This is the methodology of Presidential voting polls, for instance. A lot of people do this informally on the web, gathering up the various posted Hugo and Nebula ballots and trying to predict the awards from that.
Data-mining works differently. You take past data and comb through it to come up with trends and relationships, and then you assume (and it’s only an assumption) that such trends will continue into the future. Since there is carryover in both the SFWA and WorldCon voting pools, this makes a certain amount of logical sense. If the past 10 years of Hugo data show that an SF novel usually wins, you should predict an SF novel to win in the future. If 10 years of data show that the second novel in a series never wins, you shouldn’t predict a second novel to win.
Now, the data is usually not that precise. Instead, there is a historical bias towards SF novels, towards first or stand-alone novels, towards past winners, towards novels that do well on critical lists, towards novels that do well in other awards, etc. What I do is transform these observations into percentages (60% of the time an SF novel wins, 75% of the time the Nebula winner wins the Hugo, etc.) and then combine those percentages to come up with a final percentage. We’ll talk about how I combine all this data in the next few posts.
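To make this concrete, here’s a toy Python sketch of how several indicator percentages might be folded into a single score. The indicator names, values, and weights below are illustrative assumptions, not the actual Chaos Horizon formula:

```python
# A toy sketch of combining historical indicator percentages into one
# score via a weighted average. All values and weights below are
# illustrative assumptions, not the real Chaos Horizon model.

def combined_score(indicators, weights):
    """Weighted average of per-indicator historical percentages."""
    total_weight = sum(weights[name] for name in indicators)
    weighted_sum = sum(indicators[name] * weights[name] for name in indicators)
    return weighted_sum / total_weight

# Hypothetical percentages for one book, expressed as fractions:
indicators = {
    "sf_novel": 0.60,             # 60% of the time an SF novel wins
    "won_other_award": 0.75,      # 75% of the time the Nebula winner wins the Hugo
    "standalone_or_first": 0.80,  # standalone/first novels win 80% of the time
}
weights = {"sf_novel": 1.0, "won_other_award": 2.0, "standalone_or_first": 1.0}

print(combined_score(indicators, weights))  # 0.725
```

Choosing the weights is where the real modeling judgment lives, and that’s exactly the part discussed in the posts that follow.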
Lastly for this point, data-mining has difficulty predicting sudden and dramatic changes in data sets. Huge swings in sentiment will be missed by what Chaos Horizon does, as they aren’t reflected in past statistical trends. Understand the limitations of this approach, and proceed accordingly.
2. Simple data means simple statistics. The temptation for any statistician is to use the most high-powered, shiny statistical toys on their data sets: multi-variable regressions, computer-assisted Bayesian inferences, etc. All that has its place, and maybe in a few years we’ll try one of those out to see how far off it is from the simpler statistical modeling Chaos Horizon uses.
For the Nebulas and Hugos, though, we’re dealing with a low N (number of observations) but a high number of variables (genre, awards history, popularity, critical response, reader response, etc.). As a result, the project itself is—from a statistical reliability perspective—fatally flawed. That doesn’t mean it can’t be interesting, or that we can’t learn anything from close observation, but I never want to hide the relative lack of data by pretending my results are more solid than they are. Low data will inevitably result in unreliable predictions.
Let’s think about what the N is for the Nebula Award. The award has been given since 1966, and 219 individual novels have been nominated for the Nebula. That’s our N, the total number of observations we have. We don’t get individual voting numbers for the Nebula, so that’s not an option for a more robust N. Compare that to something like the NCAA basketball tournament (since it’s going on right now). That’s been held since 1939, and the field expanded to our familiar 64 teams in 1985. That means, in the tournament proper (the play-in round is silly), 63 games have been contested every year since 1985. So, if you’re modeling who will win an NCAA tournament game, you have 63 * (2014-1985) = 1,827 observations. Now, if we wanted to add in the number of games played in the regular season, we’d wind up with 347 teams (Division I teams) * 30 games each / 2 (they play each other, so we don’t want to count every game twice) = 5,205 more observations. That’s just one year of college basketball regular season games! Multiply that by 30 seasons, and you’re looking at an N of over 150,000 in the regular season, plus an N of nearly 2,000 for the postseason. You can do a lot with data sets that big!
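The back-of-envelope arithmetic above is easy to check in a few lines (the team and game counts are the rough figures from this paragraph, not exact NCAA records):

```python
# Rough observation counts from the paragraph above; the team and game
# numbers are approximations, not exact NCAA data.

nebula_n = 219                          # Best Novel nominees since 1966

tournament_games = 63 * (2014 - 1985)   # 63 games per tournament since 1985
regular_per_year = 347 * 30 // 2        # halve so each game is counted once
regular_total = regular_per_year * 30   # roughly 30 seasons

print(tournament_games)  # 1827
print(regular_per_year)  # 5205
print(regular_total)     # 156150 -- the "N of 150,000"
```

Even the small postseason-only basketball data set is nearly ten times the size of the entire Nebula Best Novel history.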
So our 219 Nebula Best Novel observations look pretty paltry. Let’s throw in the reality that the Nebulas have changed greatly over the last 40 years. Does 1970 data really predict what will happen in 2015? That’s before the internet, before fantasy became part of the process, etc. So, at Chaos Horizon, I primarily use the post-2000 data: new millennium, new data, new trends. That leaves us with an N of a paltry 87. From a statistical perspective, that should make everyone very sad. One option is to pack up and go home, to conclude that any trends we see in the Nebulas will be random statistical noise.
I do think, however, that the awards have some very clear trends (favoring certain kinds of novels, favoring past nominees and winners) that help settle down the variability. Chaos Horizon should be considered an experiment—perhaps a grand failed experiment, but those are the best kind—to see if statistics can get us anywhere. Who knows? Maybe in 5 years I’ll have to conclude that, no, we can’t use data-mining to predict the awards.
3. No black-boxing the math. As a corollary to point #2, I’ve decided to keep the mathematics on Chaos Horizon at roughly the high school level. I want anyone, with a little work, to be able to follow the way I’m putting my models together. As such, I’ve had to choose some simpler mathematical modeling. I think that clarity is important: if people understand the math, they can contest and argue against it. Chaos Horizon is meant to be the beginning of a conversation about the Hugos and Nebulas, not the end of one.
So I try to avoid statements of the form: given the data, we get this prediction. Notice what that sentence hides: how was the data used? What kind of mathematics was it pushed through? If you wanted to do the math yourself, could you? Instead, I want to write: given this data, and this mathematical processing of that data, we get this prediction.
4. Neutral presentation. To trust any statistical presentation, you have to trust that the statistics are presented in a fair, logical, and unbiased fashion. While a 100% lack of bias is impossible as long as humans are doing the calculating, the attempt at neutrality is very important for me on this website. Opinions are great, and have their place in the SFF realm: to get those, simply go to another site. You won’t find a shortage of them!
Chaos Horizon is trying to do something different. Whether I’m always successful or not is for you to judge. Keep in mind that neutrality does not mean completely hiding my opinions; doing so is just as artificial as putting those opinions in the forefront. If you know some of my opinions, it should allow you to critique my work better. You should question everything that is put up on Chaos Horizon, and I hope to facilitate that questioning by making the chains of my reasoning clear. What we want to avoid at all costs is saying: I like this author (or this author deserves an award), therefore I’m going to up their statistical chances. Nor do I want to punish authors because I dislike them; I try to apply the same processing and data-mining principles to every book that comes across my plate.
5. Chaos Horizon is not definitive. I hold that the statistical predictions provided on Chaos Horizon are no more than opinions. Stats like this are not a science; the past is not a 100% predictor of the future. These opinions are arrived at through a logical process, but since I am the one designing and guiding the process, they are my ideas alone. If you agree with the predictions, agree because you think the process is sound. If you disagree with the process, feel free to use my data and crunch it differently. If you really hate the process, feel free to find other types of data and process them in whatever way you see appropriate. Then post them and we can see if they make more sense!
Each of these principles is easily contestable, and different statisticians/thinkers may wish to approach the problem differently. If I make my assumptions, biases, and axioms clearly visible, this should allow you to engage with my model fully, and to understand both the strengths and weaknesses of the Chaos Horizon project.
I’ll get into the details of the model over the next few days. If you’ve got any questions, let me know.
It’s the last of the month, so time to update my popularity charts. Now that we have the Nebula slate, I’m debuting a new chart:
Nicholas Whyte over on From the Heart of Europe has been tracking similar data for several years now, although he uses LibraryThing instead of Amazon. He’s got data for a few different awards going several years back. Like me, he’s noted that popularity on these lists is not a great indicator of winning. A few weeks ago (here and here) I took a close look at how Goodreads numbers track with Amazon and BookScan. The news was disappointing: the numbers aren’t closely correlated. Goodreads tracks one audience, Amazon another, and BookScan a third. The gap between Amazon rankings and Goodreads rankings can be substantial. Goodreads tends to overtrack younger (under 40), more internet-buzzed-about books. You can see how Amazon shows McDevitt, Liu, Addison, and Leckie to be at about the same level of popularity, whereas Goodreads has Leckie 10x more popular than McDevitt. What do we trust?
The real question is not whom we trust, but how closely the Goodreads audience correlates to either the SFWA or WorldCon voters. It’s hard to imagine a book from the bottom of the chart winning over more popular texts, but McDevitt has won in the past, and I don’t think he was that much more popular in 2007 than in 2015. I think the chart is most useful when we compare like to like: if Annihilation and Ancillary Sword are selling to somewhat similar audiences, VanderMeer has gotten more books out than Leckie. Hence, VanderMeer probably has an advantage. I’m currently not using these numbers to predict the Nebulas or Hugos, although I’d like to find a way to do so.
Now, on to the big chart. Here’s popularity and reader change for Goodreads for 25+ Hugo contenders, with Gannon and McDevitt freshly added:
One fascinating thing: no one swapped positions this month. At the very least, Goodreads is showing some month-to-month consistency. Weir continues to lap the field. Mandel did great in February, but that didn’t translate to a Nebula nomination: momentum on these charts doesn’t seem to be a good indicator of Nebula success. I’ll admit I thought Mandel’s success on Goodreads was going to translate to a Nebula nomination. Instead, it was Cixin Liu, much more modestly placed on the chart, who grabbed the nomination. Likewise, City of Stairs was doing better than The Goblin Emperor, but it was Addison who got the nod. At least in this regard, critical reception seemed to matter more than this kind of popularity.
Remember, Chaos Horizon is very speculative in this first year: what tracks the award? What doesn’t? I don’t know yet, and I’ve been following different types of data to see what pans out.
Interestingly, McDevitt and Gannon debut at the dead bottom of the chart. That’s one reason I didn’t have them in my Nebula predictions. That’s my fault and my mistake; I need to better diversify my tracking by also looking at Amazon ratings. I’ll be doing that for 2015, and the balance of Amazon and Goodreads stats might give us better insight into the field.
As always, if you want to look at the full data (which goes back to October 2014), here it is: Hugo Metrics.
Now that we have this year’s Nebula nominations for Best Novel, what are we to make of them?
The Goblin Emperor, Katherine Addison (Tor)
Trial by Fire, Charles E. Gannon (Baen)
Ancillary Sword, Ann Leckie (Orbit US; Orbit UK)
The Three-Body Problem, Cixin Liu, translated by Ken Liu (Tor)
Coming Home, Jack McDevitt (Ace)
Annihilation, Jeff VanderMeer (FSG Originals)
Let’s do what Chaos Horizon does, and look at some stats. What were the most predictive elements for the 2015 Best Novel Nebula?
- 83.3% of the nominees were science fiction.
- 66.7% of the nominated authors had previously been nominated for a Nebula for Best Novel.
- 33.3% of the nominated authors had previously won a Nebula for Best Novel.
- 50.0% of the nominees were either stand-alone novels or the first novel in a series.
- 66.7% of the nominees placed in the top part of my collated SFF Critics Meta-List.
- 16.7% of the nominees were Jack McDevitt.
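For the curious, those percentages can be recomputed directly from the nominee list; the attribute flags below are drawn from the discussion in this post:

```python
# Recomputing the slate percentages from the nominee list. The per-book
# flags come from the post's own discussion of genre and award history.

nominees = {
    "The Goblin Emperor":     {"sf": False, "prior_nom": False, "prior_win": False},
    "Trial by Fire":          {"sf": True,  "prior_nom": True,  "prior_win": False},
    "Ancillary Sword":        {"sf": True,  "prior_nom": True,  "prior_win": True},
    "The Three-Body Problem": {"sf": True,  "prior_nom": False, "prior_win": False},
    "Coming Home":            {"sf": True,  "prior_nom": True,  "prior_win": True},
    "Annihilation":           {"sf": True,  "prior_nom": True,  "prior_win": False},
}

def pct(flag):
    """Percentage of the slate with a given attribute, to one decimal."""
    n = sum(1 for book in nominees.values() if book[flag])
    return round(100 * n / len(nominees), 1)

print(pct("sf"))         # 83.3
print(pct("prior_nom"))  # 66.7
print(pct("prior_win"))  # 33.3
```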
Overall, the Nebula Best Novel nominees were very traditional in 2015. After several years of being friendlier to fantasy, the Nebula snapped back to SF: we had 5 SF books and only one fantasy novel, although you may want to count Annihilation as cross-genre (weird/SF?). The Nebula had been creeping up to a 50/50 mix of fantasy and science fiction. This year, we saw none of that trend: three of the books (Leckie, Gannon, McDevitt) are far-future SF novels complete with spaceships and all the SF trimmings. The Cixin Liu, despite being a translation of a Chinese novel, may be the most traditional SF novel of the lot: an alien invasion novel along the lines of Arthur C. Clarke’s Childhood’s End. Liu even does away with more modern characterization, instead using the old 1950s technique of “characters as cameras” to drive us through the plot and the science.
The Nebulas went with 4 writers that had previously been nominated for the Best Novel Nebula (VanderMeer, McDevitt, Leckie, Gannon) and only 2 newcomers. 2 of our 6 nominees have already won the Nebula Best Novel award, with Leckie winning in 2014 and McDevitt back in 2007. The Nebula Best Novel category tends to draw heavily from past nominees and winners, and 2015 was no different. Since the SFWA voting membership doesn’t change much year-to-year, support from one year tends to carry over into the next.
Case in point: Jack McDevitt, who now has 12 (!) Best Novel Nebula nominations. The constant McDevitt nominations are the strangest thing currently happening in the Nebulas. That’s not a knock against McDevitt. I’ve read two of McDevitt’s books, The Engines of God and the Nebula-winning Seeker. They were both solid space exploration novels: fast-paced, appealing characterization, and professionally done. They didn’t stand out to me, but there’s never anything wrong with writing books people want to read. Still, I’m not sure why McDevitt deserves 12 nominations while similar authors such as Peter F. Hamilton, Alastair Reynolds, Stephen Baxter, etc., are largely ignored by the SFWA voters. To put this in context: McDevitt has more Nebula Best Novel nominations than Neal Stephenson (1), William Gibson (4), and Philip K. Dick (5) combined.
Since 2004, when the era of McDevitt domination truly began, 73 different books have received Nebula nominations. 9 of those have been McDevitt novels. So, over the last 11 years, McDevitt alone has constituted 12% of the total Nebula Best Novel field. I’m going to have to build a “McDevitt anomaly” into the model to account for the Nebula slates. Will Gannon fall into similar territory? There seems to be a bloc of SFWA voters who like a very specific kind of SF novel. This testifies to the inertia of the Nebula award: once voters start voting in one direction, they continue to do so. The McDevitt nominations are useful because they remind us how eccentric the Nebula can be: if you’re trusting the SFWA to come up with an unbiased list of the 6 best SFF novels of the year, you’re out of luck. The Nebula gives us the 6 SFF novels that the SFWA voters voted for: no more, no less.
I was pleased with how predictive my SFF Critics list was. Ancillary Sword and Annihilation placed 1-2 on that list and grabbed noms. The Goblin Emperor and The Three-Body Problem tied for third (along with 5 other novels, many of which didn’t stand a Nebula chance because of being last in a series, not being SFF-y enough, or not being published in the US). City of Stairs was a place behind those two, so that list at least predicted The Goblin Emperor over the Bennett. Neither Gannon nor McDevitt made the SFF Critics list. I’ll have to trust this list more in the future.
The demographics of the Best Novel award were also interesting, if predictable. 67% men / 33% women is a little more male-slanted than normal, although the granularity of having only 6 nominations makes that easy to throw off. Along race/ethnic lines, you’re looking at 83% white / 17% Asian; I believe Cixin Liu is the first Asian author nominated for the Best Novel Nebula. Recent trends have been a little higher than that, depending on how you want to categorize race and ethnicity. A nationality split of 83% American / 17% Chinese / 0% British is definitely a little unusual; this award has been friendlier to British authors in recent years. I’ll admit that I thought at least one British author would sneak in.
Any other statistical trends stand out to you?
The Goblin Emperor, Katherine Addison (Tor)
Trial by Fire, Charles E. Gannon (Baen)
Ancillary Sword, Ann Leckie (Orbit US; Orbit UK)
The Three-Body Problem, Cixin Liu, translated by Ken Liu (Tor)
Coming Home, Jack McDevitt (Ace)
Annihilation, Jeff VanderMeer (FSG Originals; Fourth Estate; HarperCollins Canada)
A fascinating list with a couple of surprises. Annihilation, Ancillary Sword, and The Goblin Emperor were all well-received and well-reviewed texts. Any of those three could easily win. Expect all three to grab Hugo nominations later this year. McDevitt has a huge Nebula following, and this marks his 12th Nebula nomination for Best Novel. He won back in 2007, and I don’t see him winning again. Gannon scores his second Nebula nomination in a row for this series, but it’s very hard to pick up a Nebula for the second novel in a series; I don’t see him as having much of a chance.
The Cixin Liu is the big surprise. The Nebula has never shown much openness to works in translation in the past, but this was definitely one of the most original and interesting hard SF novels of the year. As more people begin to read The Three-Body Problem, I think its chances of winning will increase. I expect this to be the biggest “novel of discussion” of the next six or so months, and that’s going to put Liu in real contention for a Hugo nomination as well.
My initial thoughts are that this category will be a showdown between The Three-Body Problem and Annihilation. SFWA voters won’t want to give the award to Leckie twice in a row, and the Nebula still—but just barely—leans SF.
Chaos Horizon only got 3 out of 6 right in my prediction: not terrible for my first year, but not great either. My formula is in definite need of refinement! Coming Home was 9th on my list and the Liu 19th. I didn’t figure the Gannon would make it because it was a sequel. The McDevitt and Gannon nominations prove the strength of the SF voting bloc in the Nebulas, and I’ll have to adjust that area up for future predictions. It’s interesting that the Nebula didn’t go with a literary SFF novel this year: I thought Mandel or Mitchell would have made it.
The rest of the ballot:
We Are All Completely Fine, Daryl Gregory (Tachyon)
Yesterday’s Kin, Nancy Kress (Tachyon)
“The Regular,” Ken Liu (Upgraded)
“The Mothers of Voorhisville,” Mary Rickert (Tor.com 4/30/14)
Calendrical Regression, Lawrence Schoen (NobleFusion)
“Grand Jeté (The Great Leap),” Rachel Swirsky (Subterranean Summer ’14)
“Sleep Walking Now and Then,” Richard Bowes (Tor.com 7/9/14)
“The Magician and Laplace’s Demon,” Tom Crosshill (Clarkesworld 12/14)
“A Guide to the Fruits of Hawai’i,” Alaya Dawn Johnson (F&SF 7-8/14)
“The Husband Stitch,” Carmen Maria Machado (Granta #129)
“We Are the Cloud,” Sam J. Miller (Lightspeed 9/14)
“The Devil in America,” Kai Ashante Wilson (Tor.com 4/2/14)
“The Breath of War,” Aliette de Bodard (Beneath Ceaseless Skies 3/6/14)
“When It Ends, He Catches Her,” Eugie Foster (Daily Science Fiction 9/26/14)
“The Meeker and the All-Seeing Eye,” Matthew Kressel (Clarkesworld 5/14)
“The Vaporization Enthalpy of a Peculiar Pakistani Family,” Usman T. Malik (Qualia Nous)
“A Stretch of Highway Two Lanes Wide,” Sarah Pinsker (F&SF 3-4/14)
“Jackalope Wives,” Ursula Vernon (Apex 1/7/14)
“The Fisher Queen,” Alyssa Wong (F&SF 5/14)
I’ll be back with some more analysis tomorrow!
The Nebula nominating period closed on February 15, 2015, and the SFWA will announce their full Nebula slate sometime soon (within a week or so?). Here are some of the trends I’m keeping a close eye on:
1. How repetitive will the slate be? Both the Hugos and the Nebulas tend to repeat the same authors over and over again. See my extensive Report on this issue. While Leckie and VanderMeer have previously grabbed Best Novel nominations, and are likely to do so again, 2015 might yield an interesting crop of Nebula rookies. Of the 5 most popular recent Nebula Best Novel authors (McDevitt, Bujold, Hopkinson, Jemisin, and Mieville), only McDevitt has a novel out this year. Add in that heavy-hitters like Willis and Gaiman didn’t publish novels in 2014, and it seems like the field is more open than usual.
2. How literary will the slate be? 2014 was a strong year for literary SFF, with major novels from authors like Emily St. John Mandel, David Mitchell, Chang-rae Lee, and many others. The Nebula has been friendly to such texts in the past. How many will make this year’s slate? 1? 2? If 3 literary novels make the slate, will the internet explode?
3. Will the Nebulas nominate self-published and indie-published works? Last year, Nagata made the Nebula slate with a self-published novel. Will this trend continue? More and more authors are bypassing traditional publishing and taking their novels directly to the reading audience. The SFWA has recently changed its rules to admit self-published authors. Are we going to see a sea change, with more self-published novels and stories making future Nebula slates? Does it start this year?
4. What about the paywall issue? This is a problem fast reaching a critical point for the Hugos and the Nebulas. Do short stories, novelettes, and novellas that are locked behind paywalls—either in print journals or in online journals/ebooks that require a subscription fee—still stand a chance? Or does the open access provided by sites like Clarkesworld, Tor.com, or Strange Horizons get those stories in front of a larger audience, thus making them more likely award nominees?
5. Will the Nebulas go international? The Nebulas and the Hugos are, in theory, international awards. For the Nebulas, any book published in the USA is eligible, no matter the country of origin or original language. In practice, both awards go to either American or British writers, with a few Canadians thrown in here and there for good measure. Cixin Liu’s The Three-Body Problem brought Chinese SF to an American audience this year, and we’re seeing an increasing number of novels and short stories published in translation. Will this have any impact? I doubt it, but we’ll see.
I’m sure plenty of other issues—and controversies—will float to the surface over the next month. Demographics is a likely point of major discussion, as are the genre questions that always pop up this time of year. What other issues are you thinking about in regard to the forthcoming Nebula slate?
Time to get technical! Break out the warm milk and sleeping pills! Earlier in the week, I took a look at Amazon and Goodreads ratings of the major Hugo candidates. An astute viewer will notice that those rankings don’t exactly line up (nor do the number of ratings, but I’ll address that in a later post). Which is more representative? Which is more accurate?
At Chaos Horizon, I strive to do two things: 1. To be neutral, and 2. Not to lie. By “not lie,” I mean I don’t try to exaggerate the importance of any one statistical measure, or to inflate the reliability of what is very unreliable data. So here’s the truth: neither the Goodreads nor the Amazon ratings are accurate. Both are samples of biased reading populations. Amazon over-samples the Amazon.com user-base, pushing us towards people who like to read e-books or who order their books online (would those disproportionately be SFF fans?). Goodreads is demographically biased towards a younger audience (again, would those disproportionately be SFF fans? Worldcon voters?). Stay tuned for more of these demographic issues in my next post.
As such, neither gives a complete or reliable picture of the public reaction to a book. If you follow Chaos Horizon, you’ll know that my methodology is often to gather multiple viewpoints/data sets and then try to balance them against each other. I’ve never believed the Hugo or Nebula solely reflects quality (which reader ratings don’t even begin to quantify). At a minimum, the awards reflect quality, reader reception, critical reception, reputation, marketing, popularity, campaigns, and past voting trends/biases. The Chaos Horizon hypothesis has always been that when a single book excels in four or five of those areas, it can be thought of as a major candidate.
Still, can we learn anything? Let’s take a look at the data and the differences in review scores for 25 Hugo contenders:
The column on the far right is the most interesting one: it represents the difference between the Amazon score and the Goodreads score. As you can see, these are all over the place. There are some general trends we can note:
1. Almost everyone does better on Amazon than Goodreads. The average Amazon boost is .19 stars, and only three books out of 25 scored worse on Amazon than on Goodreads. Amazon has a higher bar of entry to rate (you have to type a review, even if it’s one word; Goodreads lets you just enter a score), so I suspect people only bother to rate on Amazon if they love or hate a novel.
2. There doesn’t seem to be much of a pattern regarding who gets a higher ranking bump. Moving down the top of the list, you see a debut SF novel, a hard SF novel, an urban fantasy novel, a YA novel, etc. It’s a mix of genres, of men and women, of total number of ratings, and of left-leaning and right-leaning authors. I’d have trouble coming up with a cause for the bumps (or lack thereof). So, if I had to predict the size of a bump on Amazon, I don’t think I could come up with a formula to do it. I’ll note that since Amazon bought Goodreads, I think the audiences are converging; maybe in a few years there won’t be a bump.
3. If you want to use either ranking, you’d have to think long and hard about what audience each is reflecting, and what you’d want to learn from that audience’s reaction. It would take a major effort to correct the Amazon.com or Goodreads audience out to the general reading audience, and I’m not sure the effort would be worth it. Each would require substantial demographic corrections, and you’d have to make so many assumptions along the way that you’d wind up with a statistic just as unreliable as the raw Goodreads or Amazon numbers.
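As a sketch of what that far-right “difference” column computes, here’s the calculation with made-up placeholder ratings (not the actual chart data):

```python
# The "Amazon boost": Amazon average star rating minus Goodreads average
# star rating, per book. All numbers here are invented placeholders,
# not the actual chart data.

ratings = {
    "Book A": {"amazon": 4.5, "goodreads": 4.2},
    "Book B": {"amazon": 4.1, "goodreads": 4.0},
    "Book C": {"amazon": 3.9, "goodreads": 4.0},  # one of the rare negative bumps
}

diffs = {title: r["amazon"] - r["goodreads"] for title, r in ratings.items()}
average_boost = sum(diffs.values()) / len(diffs)

print({title: round(d, 2) for title, d in diffs.items()})
print(round(average_boost, 2))  # 0.1
```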
I think “Reader Ratings” are one of the most tantalizing pieces of data we have—but also one of the least predictive. I’m not sure Amazon or Goodreads tells you anything except how the users of Amazon or Goodreads rated a book. So what does this mean for Chaos Horizon, a website dedicated to building predictive models for the Hugo and Nebula Awards?
That reader ratings are not likely to be useful in predicting awards. Long and short of it: Amazon and Goodreads sample different reading populations, and, as such, neither is fully representative of:
1. The total reading public
2. SFF fans
3. Worldcon Voters
4. SFWA voters
Neither is 100% (or even 75% . . . or 50%) reliable in predicting the Hugo and Nebula awards. So is it worth collecting the data? I’m still hopeful that once we have this data and the Hugo + Nebula slates (and eventually winners), I can start combing through it more carefully to see if any correlations emerge. For now, though, we have to reach an unsatisfying statistical conclusion: we cannot interpret Amazon or Goodreads ratings as predictive of the Hugos or Nebulas.
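When slate data is in hand, one simple way to run that check is a point-biserial correlation: an ordinary Pearson correlation between each book’s rating and a 0/1 nominated flag. Everything below is invented placeholder data, purely to illustrate the calculation:

```python
# Point-biserial correlation between ratings and a 0/1 nominated flag,
# computed as a plain Pearson correlation. All data below is invented
# placeholder data for illustration, not real Goodreads numbers.

import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

goodreads_ratings = [4.3, 4.0, 3.8, 4.1, 3.9, 4.4, 3.7, 4.2]
nominated =         [1,   1,   0,   0,   0,   1,   0,   1]

r = pearson(goodreads_ratings, nominated)
print(round(r, 2))  # 0.76
```

A value near 0 would confirm the conclusion above; anything consistently far from 0 across several award years would be worth folding into the model.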
Here we are, with my last Nebula Prediction before the Nebula slate comes out in late February! In this post, I’m just going to look at my predicted slate. See my earlier predictions for other texts that might be in the mix.
Here’s what I see happening, based on the Chaos Horizon research into reviews, year-end lists, popularity, and past voting patterns. As always, I try to predict what is most likely to happen, not what should happen.
1. Annihilation, Jeff Vandermeer
2. Ancillary Sword, Ann Leckie
3. Station Eleven, Emily St. John Mandel
4. City of Stairs, Robert Jackson Bennett
5. The Goblin Emperor, Katherine Addison
6. The Bone Clocks, David Mitchell
The Nebula is much harder to predict than the Hugo, due to the smaller voting pool and the fact that the Nebula does not share final voting numbers. I thought about taking the easy way out and putting “Wildcard” in spot 6, but I figured I’d at least predict a full slate. I’ll be happy if I get 4 out of the 6 correct, and once the slate comes out, I’ll be able to improve the current Chaos Horizon model. Some thoughts:
1. Annihilation or Area X: The Southern Reach Trilogy by Jeff VanderMeer: VanderMeer’s book was probably the most raved-about SFF book of 2014 in critical circles. Weird and strange, divisive and highly debatable, the trilogy—particularly its first part—was either loved or hated. We’re talking about a novel that’s grabbed recent features in The Atlantic and The New Yorker. The Nebula is a writer’s award, so while that kind of coverage may not sway SFF fans, expect it to sway SFWA voters. On most of my metrics (sales, reviews, year-end lists), Annihilation is at or near the top. How the SFWA handles VanderMeer is an open question: will they aggregate votes to Area X or nominate only Annihilation?
2. Ancillary Sword by Ann Leckie: I think this is close to a sure thing. Since Leckie won last year, she has a ton of built-in support for this year. The Nebula tends to nominate the same authors over and over again (to the tune of 50% repeat nominees), and I don’t see any reason to expect this not to happen in Leckie’s case. Even if Ancillary Sword wasn’t quite the critical sensation Ancillary Justice was, a lot of SFWA readers will have read her sequel, and each voter can nominate up to 5 books. This is a safe, consensus pick; when people don’t know what else to vote for, they’ll vote for Leckie.
3. A literary SFF novel: The Nebula has been relatively friendly towards literary fiction as of late, and I expect that to continue this year. That probably means Station Eleven by Emily St. John Mandel, although The Bone Clocks by David Mitchell could jump in. Mitchell has a prior Nebula nom for Cloud Atlas, a movie based on that book that plenty of people saw, and a huge literary profile. Station Eleven was the buzzier novel in the last part of 2014, and I think that buzz, along with Mandel’s impressive recent sales, makes her a pretty good bet for a Nebula nom. I slotted The Bone Clocks in sixth place, although that’s little more than a placeholder: I can’t think of another novel with a better chance of grabbing that last spot.
4. One or two of the progressive/experimental fantasy novels: City of Stairs by Robert Jackson Bennett, The Goblin Emperor by Katherine Addison, or The Mirror Empire by Kameron Hurley. I lump these together because they’re sort of the same: fantasy novels from SFF writers that are self-consciously pushing the boundaries of what fantasy is. SFWA voters have liked this kind of novel in the past, and fantasy has been the hot genre for the Nebulas over the past couple years. Hurley has weaker rankings and sales compared to the other two, but she has a real chance based on how well-known she is to voters, including a prior Nebula nom and two 2014 Hugo wins for best fan writer and best related work.
5. A wild card: The Nebula loves reaching down into the broader field and pulling up a slightly more obscure text. Think of Charles Gannon or Linda Nagata from last year, or Christopher Barzak from a few years ago. I think The Girls at the Kingfisher Club has a shot. I think Lagoon (if it’s eligible, given its lack of a U.S. publication) has an outsider shot. Jack McDevitt always has to be in the mix based on his 10+ prior noms, and William Gibson may have some sentiment behind him. Don’t underestimate the old-school SF voting bloc of the SFWA. There could also be a novel I’ve never heard of; you just don’t know what will happen at the bottom of the slate.
So, that wraps up my pre-slate Nebula predictions! Click on my 2015 Nebula Prediction tab up top for even more info. I look forward to seeing the eventual slate! Any quibbles? Any other thoughts?
Here’s an update to the “Ratings” chart for the major Hugo candidates. What I’ve done is look at the Goodreads and Amazon ratings for each of 25 possible Hugo books, and sorted those out by Goodreads ratings. Here’s the data (as of January 31st); comments follow. Click on the chart for a better view.
I’ve never felt that Goodreads or Amazon ratings accurately measure the quality of the book. They probably measure something closer to “reader satisfaction.” Take some widely hailed classics of American literature: William Faulkner’s As I Lay Dying only manages a 3.73 on Goodreads (on 80,000 ratings) and 3.9 on Amazon (on 404 ratings). Toni Morrison’s Beloved scored a 3.71 on Goodreads (on 185,000 ratings) and a 3.9 on Amazon (on 900 ratings). Whether you like those books personally or not—they’re both difficult and divisive—a 3.7 rating is ridiculous. Moby-Dick does worse, grabbing a 3.41 rating on Goodreads (on 320,000 ratings). Huck Finn does a little better, at 3.78 (on 840,000 ratings). Unless you believe that the classics of American literature are awful—believe me, many of my students do—we have to take these ratings with a heavy dose of salt.
Remember, though, I’m casting a wide net to see if we can find anything that’s predictive. Maybe these will be, maybe not. Maybe they should be, maybe not. We can’t know until we try. So, the real question is this: can “reader satisfaction” tell us anything about the Hugos or a possible Hugo slate? I don’t know.
Some quick observations. You’ll note that sequels dominate the ratings. That’s a structural issue: everyone who didn’t like the first book bailed out on all future volumes, leaving only enthusiastic fans. As long as the book satisfies that audience, you’ll get great ratings.
After the sequels, The Martian does well, with a very strong 4.36/4.6 Goodreads/Amazon score. Bennett and Addison also put up good showings, with a 4.19/4.4 and 4.15/4.4; that definitely helps boost their Hugo chances over something that did more poorly, like The Mirror Empire way down at 3.66/4.0.
VanderMeer does surprisingly poorly on this metric, scraping by with a 3.67/3.8 rating. That’s probably an example of a book being “unsatisfying” to many readers. Annihilation is a strange text, and you could go in expecting one kind of novel (more traditional science fiction?) and wind up with a strange, spooky, somewhat incomprehensible book of weird fiction. That’s going to push ratings down. I don’t expect this to hurt VanderMeer (people either love or hate the book), but it’s definitely interesting to note.
Most books are clustered in a fairly narrow range, from 4.2 to 3.8. I wouldn’t make too much of an issue of a slight difference like that; you can’t claim much by saying one book was ranked 4.1 and another 3.9.
And why do people hate California so much? I haven’t read it, but I don’t think I’ve seen an Amazon score below 3.0 for a professionally published book before. A 3.26/2.9 is truly awful.
Lastly, let me note that there are some inconsistencies between Amazon and Goodreads scores. That reflects the different constituencies of those two websites. I’ll be back later this week with a post on that very issue. Which is more reliable? Can we tell? Could we correlate these to actual sales? Chaos Horizon is on the case!
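For the curious, here’s a quick sketch of how that Goodreads-vs-Amazon correlation check might start. I’m using only the six rating pairs quoted in this post, so this is a tiny, hand-picked sample; treat any number it produces as suggestive at best, not a real answer.

```python
# (Goodreads, Amazon) rating pairs quoted earlier in this post.
ratings = {
    "The Martian": (4.36, 4.6),
    "City of Stairs": (4.19, 4.4),
    "The Goblin Emperor": (4.15, 4.4),
    "Annihilation": (3.67, 3.8),
    "The Mirror Empire": (3.66, 4.0),
    "California": (3.26, 2.9),
}

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

goodreads = [g for g, a in ratings.values()]
amazon = [a for g, a in ratings.values()]
r = pearson(goodreads, amazon)
print(f"Goodreads vs. Amazon correlation: r = {r:.2f}")
```

On this hand-picked sample the two sites track each other fairly closely, but six books is nowhere near enough to settle the reliability question; that’s what the fuller post will dig into.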
For this collated list, I’ve chosen 10 SFF websites, critics, magazines, etc., that are likely to be predictive of the 2015 Hugo and Nebula awards. This contrasts with my Best of 2014 Mainstream list, which included plenty of outlets that don’t know much about SFF.
I chose my lists using the following criteria:
1. According to my research, the list is by a major website that has been predictive of the Hugo and/or Nebula in the past. (Locus Magazine, io9, Tor.com).
2. The author of the list was a well-known SFF author writing for a publication (i.e. not their blog). (Jeff VanderMeer, Adam Roberts).
3. Lists by fanzines, fan writers, semi-prozines, or podcasts that have recently been nominated for the Hugo award. I figure if they’re that much part of the process, they’re likely to be influential/predictive. (Dribble of Ink, BookSmugglers, Strange Horizons, Coode Street Review, SF Signal).
Remember, the goal of Chaos Horizon is to predict who is most likely to win the Hugo and Nebula based on past voting patterns, not which novel should win the Hugo or Nebula. Don’t let my lists impact your vote: vote for the novels you think are most worthy of the awards.
Methodology: 1 point for showing up on a list. Since some of these lists are in themselves collations of multiple critics, I toyed around with a more complicated methodology: multiple points if there were more than 3 critics, etc. In the end, I was able to discard all of that: the order of the list didn’t change no matter how I counted. That let me go with the simplest methodology: 1 point for appearing on a list. Nice, clean, simple.
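The counting scheme above is simple enough to sketch in a few lines of Python. The titles below are real, but which outlet listed which is abbreviated and illustrative, not the full data set:

```python
from collections import Counter

# Each inner list stands in for one outlet's "best of 2014" picks
# (illustrative assignments, not the actual ten lists).
outlet_lists = [
    ["Ancillary Sword", "Annihilation", "The Goblin Emperor"],
    ["Ancillary Sword", "The Bone Clocks", "Lagoon"],
    ["Ancillary Sword", "Annihilation", "The Three-Body Problem"],
]

# 1 point per list appearance, regardless of a list's internal order.
scores = Counter(title for outlet in outlet_lists for title in outlet)

# Sort by points, descending, to produce the collated ranking.
for title, points in scores.most_common():
    print(f"{title}: {points} point(s)")
```

With the real data, more elaborate weightings (extra points for multi-critic lists, and so on) never changed the order, which is why the 1-point version won out.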
So who wins?
1. Ancillary Sword, Ann Leckie, 8 points
2. Annihilation, Jeff VanderMeer, 6 points
3. The Goblin Emperor, Katherine Addison, 5 points
3. The Magician’s Land, Lev Grossman, 5 points
3. Lagoon, Nnedi Okorafor, 5 points
3. Steles of the Sky, Elizabeth Bear, 5 points
3. The Bone Clocks, David Mitchell, 5 points
3. The Girls at the Kingfisher Club, Genevieve Valentine, 5 points
3. The Three-Body Problem, Cixin Liu, 5 points
10. City of Stairs, Robert Jackson Bennett, 4 points
10. All Those Vanished Engines, Paul Park, 4 points
10. Broken Monsters, Lauren Beukes, 4 points
10. The Martian, Andy Weir, 4 points
10. The Peripheral, William Gibson, 4 points
15. A Man Lies Dreaming, Lavie Tidhar, 3 points
15. Europe in Autumn, Dave Hutchinson, 3 points
15. Half a King, Joe Abercrombie, 3 points
15. My Real Children, Jo Walton, 3 points
15. The Bees, Laline Paull, 3 points
15. The Book of Strange New Things, Michel Faber, 3 points
15. The Causal Angel, Hannu Rajaniemi, 3 points
15. The First Fifteen Lives of Harry August, Claire North, 3 points
15. The Girl in the Road, Monica Byrne, 3 points
15. Tigerman, Nick Harkaway, 3 points
15. Wolves, Simon Ings, 3 points
Even though Ancillary Sword wasn’t as hyped or well-received as Ancillary Justice, the math really worked in Leckie’s favor. Leckie has a huge “incumbent” advantage; everyone wanted to see what she did next, and since Sword wasn’t a total let-down (many thought it was the better-written book, even if less exciting and innovative than Justice), it made an impressive 80% of the lists. I expect Leckie to easily cruise to Hugo + Nebula nominations this year.
VanderMeer places a strong second. Some of those votes were for Annihilation alone, others for the whole Area X/Southern Reach trilogy. I think VanderMeer is a near certainty for a Nebula nomination at this point, and I’ve got him as the favorite to win (voters won’t want to give Leckie two awards in a row). I’ll be interested to see how the Nebulas and Hugos handle this nomination, whether for Annihilation or the whole series.
The Goblin Emperor dominated the fanzine/fan writer lists. I don’t know how much those lists will impact the Nebulas, but I can imagine Addison sneaking into that award. Depending on how crowded and contentious the Hugo becomes, she also has a solid shot there.
The list gets more complicated as you move down. The Magician’s Land and Steles of the Sky are the final volumes of well-received fantasy trilogies. In the past, both the Nebula and the Hugo have shied away from honoring books like this. It does make a certain amount of sense to honor a trilogy by nominating the final work. Will it happen this year?
Lagoon wasn’t published in the United States this year, which really complicates its award chances. The Nebula specifies US publication in its rules: “1. All works first published in English, in the United States, during the calendar year” but that’s tempered with “2. Works first published in English on the Internet or in electronic form during the calendar year shall be treated as though published in the United States.” Is a UK e-book “electronic form?” Or do they mean a form accessible to American readers? Rules technicality aside, the lack of US publication means that most US readers haven’t had a chance to read the book, and thus won’t vote for it. Except for years where the Hugo was in the UK, I don’t think we’ve ever had a non-US published book make the final slate. Can Okorafor defy the trend?
That takes us down to The Bone Clocks, Girls at the Kingfisher Club, The Three-Body Problem, and City of Stairs as the next most likely Nebula noms (the Hugo will push up fan favorites instead of these books). Are Mitchell and Valentine speculative enough for the SFWA? Can a Chinese author edge his way into an English-language award? The “A” at the end of SFWA stands for “America,” and SFWA members haven’t voted for foreign-language books in the past. That leaves City of Stairs as perhaps the most likely candidate from this part of the list.
Who’s missing? Station Eleven roared to prominence in the last few months, after many of these lists were put together. Expect Mandel to make a strong showing as more and more people read her book.
Since this is the first year of Chaos Horizon, we don’t know how predictive this list will be. Once the slates come out, I can start further refining this process. It’ll be interesting to see, though, how much the top of this list matches the eventual Nebula slate.
Here’s the raw data. The critics list is under the second tab: Hugo Metrics.
Lists included: Locus Magazine Recommended Reading List 2014, BookSmugglers, Coode Street Podcast, io9, SF Signal, Strange Horizons, Jeff VanderMeer writing for Electric Literature, Adam Roberts writing for The Guardian, Tor.com, and A Dribble of Ink.
There’s a wealth of information there, including recommendations for categories that I don’t have the time to follow, like YA Novel, Novella, Novelette, and Short Story. In the past, most of the future Hugo and Nebula nominees have shown up on these lists. Part of that is because the lists are so long (20-30 suggestions each), but also because Locus pretty closely mirrors the sentiments of the SFWA and the Nebula.
Here are their SF and Fantasy lists:
Novels – Science Fiction
•Ultima, Stephen Baxter (Gollancz; Roc 2015)
•War Dogs, Greg Bear (Orbit US; Gollancz)
•Shipstar, Gregory Benford & Larry Niven (Tor; Titan 2015)
•Chimpanzee, Darin Bradley (Underland)
•Cibola Burn, James S.A. Corey (Orbit US; Orbit UK)
•The Book of Strange New Things, Michel Faber (Hogarth; Canongate)
•The Peripheral, William Gibson (Putnam; Viking UK)
•Afterparty, Daryl Gregory (Tor; Titan)
•Work Done for Hire, Joe Haldeman (Ace)
•Tigerman, Nick Harkaway (Knopf; Heinemann 2015)
•Europe in Autumn, Dave Hutchinson (Solaris US; Solaris UK)
•Wolves, Simon Ings (Gollancz)
•Ancillary Sword, Ann Leckie (Orbit US; Orbit UK)
•Artemis Awakening, Jane Lindskold (Tor)
•The Three-Body Problem, Cixin Liu (Tor)
•The Causal Angel, Hannu Rajaniemi (Tor; Gollancz)
•The Memory of Sky, Robert Reed (Prime)
•Bête, Adam Roberts (Gollancz)
•Lock In, John Scalzi (Tor; Gollancz)
•The Blood of Angels, Johanna Sinisalo (Peter Owens)
•The Bone Clocks, David Mitchell (Random House; Sceptre)
•Lagoon, Nnedi Okorafor (Hodder; Saga 2015)
•All Those Vanished Engines, Paul Park (Tor)
•Annihilation/Authority/Acceptance, Jeff VanderMeer (FSG Originals; Fourth Estate; HarperCollins Canada)
•Dark Lightning, John Varley (Ace)
•My Real Children, Jo Walton (Tor; Corsair)
•Echopraxia, Peter Watts (Tor; Head of Zeus 2015)
•World of Trouble, Ben H. Winters (Quirk)
Novels – Fantasy
•The Widow’s House, Daniel Abraham (Orbit US; Orbit UK)
•The Goblin Emperor, Katherine Addison (Tor)
•Steles of the Sky, Elizabeth Bear (Tor)
•City of Stairs, Robert Jackson Bennett (Broadway; Jo Fletcher)
•Hawk, Steven Brust (Tor)
•The Boy Who Drew Monsters, Keith Donohue (Picador USA)
•Bathing the Lion, Jonathan Carroll (St. Martin’s)
•Full Fathom Five, Max Gladstone (Tor)
•The Winter Boy, Sally Wiener Grotta (Pixel Hall)
•The Magician’s Land, Lev Grossman (Viking; Arrow 2015)
•Truth and Fear, Peter Higgins (Orbit; Gollancz)
•The Mirror Empire, Kameron Hurley (Angry Robot US)
•Resurrections, Roz Kaveney (Plus One)
•Revival, Stephen King (Scribner; Hodder & Stoughton)
•The Dark Defiles, Richard K. Morgan (Del Rey; Gollancz)
•The Bees, Laline Paull (Ecco; Fourth Estate 2015)
•The Godless, Ben Peek (Thomas Dunne; Tor UK)
•Heirs of Grace, Tim Pratt (47North)
•Beautiful Blood, Lucius Shepard (Subterranean)
•A Man Lies Dreaming, Lavie Tidhar (Hodder & Stoughton)
•The Girls at the Kingfisher Club, Genevieve Valentine (Atria)
•California Bones, Greg van Eekhout (Tor)
Like I said, pretty comprehensive. Most of the major candidates are there, ranging from VanderMeer to Leckie to Addison to Bennett. Here are the snubs I noticed:
The Martian, Andy Weir: That’s a good indication that the “industry” doesn’t consider this a 2014 book.
Station Eleven, Emily St. John Mandel: A surprise. Maybe it caught fire too late in the year to make the list?
The First Fifteen Lives of Harry August, Claire North
Most mainstream fantasy novels: no Words of Radiance, no The Broken Eye, no Fool’s Assassin, no Prince of Fools, no The Emperor’s Blades, no The Slow Regard of Silent Things. It says something when you put together a list of 22 fantasy novels and leave out most of the fantasy best-sellers. Is Locus arguing that excellence can’t be achieved in mainstream epic fantasy? Or are they reflecting their audience’s lack of interest in epic series? Sure, there are a few fantasy series on the list—Richard Morgan, Elizabeth Bear, Lev Grossman, Kameron Hurley—but each of those is set up, on some level, as a challenge to more conventional epic fantasy.
There are several books that haven’t gotten an official US publication yet (or at least they aren’t available on Amazon): Lagoon, A Man Lies Dreaming, Bête, and Wolves. You’d think publication would be truly international in 2014, but that’s not yet the case. Lagoon, in particular, would have had a Nebula and Hugo shot if it had gotten a US publication. Without one, it’s probably not eligible for the Nebula, and thus can’t build momentum towards a Hugo.
Lastly, is The Bone Clocks really science fiction? I guess part of the novel takes place in the future, so that’s probably why they placed it in that category. It felt more like a horror/weird fiction/fantasy hybrid to me, but I guess classification doesn’t matter that much in the end.
I’ve been waiting for this list; now that we have it, I’ll update and finalize the Critics Meta-List.