It’s the last of the month, so time to update my popularity charts. Now that we have the Nebula slate, I’m debuting a new chart:
Nicholas Whyte over on From the Heart of Europe has been tracking similar data for several years now, although he uses LibraryThing instead of Amazon. He’s got data for a few different awards going several years back. Like me, he’s noted that popularity on these lists is not a great indicator of winning. A few weeks ago (here and here) I took a close look at how Goodreads numbers track with Amazon and BookScan. The news was disappointing: the numbers aren’t closely correlated. Goodreads tracks one audience, Amazon another, and BookScan a third. The gap between Amazon rankings and Goodreads rankings can be substantial. Goodreads tends to overtrack younger (under 40), more internet-buzzed-about books. You can see how Amazon shows McDevitt, Liu, Addison, and Leckie to be at about the same level of popularity, whereas Goodreads has Leckie 10x more popular than McDevitt. What do we trust?
The real question is not who we trust, but how closely the Goodreads audience correlates with either the SFWA or WorldCon voters. It’s hard to imagine a book from the bottom of the chart winning over more popular texts, but McDevitt has won in the past, and I don’t think he was that much more popular in 2007 than in 2015. I think the chart is most useful when we compare like to like: if Annihilation and Ancillary Sword are selling to somewhat similar audiences, VanderMeer has gotten more books into readers’ hands than Leckie. Hence, VanderMeer probably has an advantage. I’m currently not using these numbers to predict the Nebulas or Hugos, although I’d like to find a way to do so.
Now, on to the big chart. Here’s popularity and reader change for Goodreads for 25+ Hugo contenders, with Gannon and McDevitt freshly added:
One fascinating thing: no one swapped positions this month. At the very least, Goodreads is showing some month-to-month consistency. Weir continues to lap the field. Mandel did great in February but that didn’t translate to a Nebula nomination: momentum on these charts doesn’t seem to be a good indicator of Nebula success. I’ll admit I thought Mandel’s success on Goodreads was going to translate to a Nebula nomination. Instead, it was Cixin Liu, much more modestly placed on the chart, who grabbed the nomination. Likewise, City of Stairs was doing better than The Goblin Emperor, but it was Addison who got the nod. At least in this regard, critical reception seemed to matter more than this kind of popularity.
Remember, Chaos Horizon is very speculative in this first year: what tracks the award? What doesn’t? I don’t know yet, and I’ve been following different types of data to see what pans out.
Interestingly, McDevitt and Gannon debut at the dead bottom of the chart. That’s one reason I didn’t have them in my Nebula predictions. That’s my fault and my mistake; I need to better diversify my tracking by also looking at Amazon ratings. I’ll be doing that for 2015, and the balance of Amazon and Goodreads stats might give us better insight into the field.
As always, if you want to look at the full data (which goes back to October 2014), here it is: Hugo Metrics.
Time for a quick study on Hugo/Nebula convergence. The Nebula nominations came out about a week ago: how much will those nominations impact the Hugos?
In recent years, quite a bit. Ever since the Nebulas shifted their rules around in 2009 (moving from rolling eligibility to calendar year eligibility; see below), the Nebula Best Novel winner usually goes on to win the Hugo Best Novel. Since 2010, this has happened 4 out of 5 times (with Ancillary Justice, Among Others, Blackout/All Clear, and The Windup Girl, although Bacigalupi did tie with Mieville). That’s a whopping 80% convergence rate. Will that continue? Do the Nebulas and Hugos always converge? How much of a problem is such a tight correspondence between the two awards?
The Hugos have always influenced the Nebulas, and vice versa. The two awards have a tendency to duplicate each other, and there’s a variety of reasons for that: the voting pools aren’t mutually exclusive (many SFWA members attend WorldCon, for instance), the two voting pools are influenced by the same set of factors (reviews, critical and popular buzz, etc.), and the two voting pools have similar tastes in SFF. Think of how much attention a shortlist brings to those novels. Once a book shows up on the Nebula or Hugo slates, plenty of readers (and voters) pick it up. In the nearly 50 years when both the Hugo and Nebula have been given, the same novel has won both awards 23 out of 49 times, for a robust 47% convergence. As we’ll see below, this has varied greatly by decade: in some decades (the 1970s, the 2010s) the winners are basically identical. In other decades, such as the 1990s, there’s only a 20% overlap.
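For the record, that overall convergence figure is simple arithmetic over the counts just quoted; a minimal sketch:

```python
# Joint Hugo/Nebula Best Novel winners, per the counts cited above:
# the same novel won both awards 23 times in the 49 years both were given.
joint_wins = 23
years_both_awarded = 49

convergence = joint_wins / years_both_awarded
print(f"{convergence:.0%}")  # → 47%
```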
All of this is made more complex by which award goes first. Historically, the Hugo used to go first, often awarding books a Hugo some six months before the Nebula was awarded. Thanks to the Science Fiction Awards Database, we can find out that Paladin of Souls received its Hugo on September 4, 2004; Bujold’s novel received its Nebula on April 30, 2005. Did six months of post-Hugo hype seal the Nebula win for Bujold?
Bujold benefitted from the strange and now defunct Nebula rule of rolling eligibility. The Locus Index to SF Awards gave us some insight on how the Nebula used to be out of sync with the Hugo:
The Nebulas’ 12-month eligibility period has the effect of delaying recognition of many works until nearly 2 years after publication, and throws Nebula results out of synch with other awards (Hugo, Locus) voted in a given calendar year. (NOTE – this issue will pass with new voting rules announced in early 2009; see above.)
SFWA has announced significant rules changes for the Nebula Awards process, eliminating rolling eligibility and limiting nominations to work published during a given calendar year (i.e., only works published in 2009 will be eligible for the 2010 awards), as well as eliminating jury additions. The changes are effective as of January 2009 and “except as explicitly stated, will have no impact on works published in 2008 or the Nebula Awards process currently underway.”
Since 2009, eligibility has been straightened out: Hugo and Nebula eligibility basically follow the same rules, and now it is the Nebula that goes first. The Nebula tends to announce a slate in late February, and then gives the award in early May. The Hugo announces a slate in mid-April, and then awards in late August/early September, although those dates change every year.
Tl;dr: it used to be the Hugos that influenced the Nebulas, but, since 2010, it is the Nebulas that influence the Hugos. We know that Nebula slates tend to come out while Hugo slate voting is still going on. This means that Hugo voters have a chance to wait until the Nebulas announce their nominations, and then adjust/supplement their voting as they wish. This year, there were about 3 weeks between the Nebula announcement and the close of Hugo voting: were WorldCon voters scrambling to read Annihilation and The Three-Body Problem in that gap? Remember, even a slight influence on WorldCon voters can drastically change the final slate.
But how much? Let’s take a look at the data from 2010-2014, or the post-rule change era. That’s not a huge data set, but the results are telling.
This chart shows how many of the Nebula nominations showed up on the Hugo ballot a few weeks later. You can see that it comes out to around 40% on average. Don’t get fooled by the 2014 data: Neil Gaiman’s The Ocean at the End of the Lane made both the Nebula and Hugo slate, but Gaiman declined his Hugo nomination. If we factored him in, we’d be staring at that same 40% across the board.
40% isn’t that jarring, since that only means 2 out of the 5 Hugo nominees. If we consider the overlap between reading audiences, critical and popular acclaim, etc., that doesn’t seem too far out of line.
It’s the last column that catches my eye: 4/5 joint winners, or 80% joint winners in the last 5 years. Only John Scalzi managed to eke out a win over Kim Stanley Robinson, otherwise we’d be batting 100%. We should also keep in mind the tie between The City and the City and The Windup Girl in 2010.
Nonetheless, my research shows that the single biggest indicator of winning a Hugo from 2010-2014 is whether or not you won the Nebula that year. Is this a timeline issue: does the Nebula winner get such a signal boost on the internet in May that everyone reads it in time for the Hugo in August? Or are the Hugo/Nebula voting pools converging to the point that their tastes are almost the same? Were the four joint-winners in the 2010s so clearly the best novels of the year that all of this is moot? Or is this simply a statistical anomaly?
I’m keeping a close eye on this trend. If Annihilation sweeps the Nebulas and Hugos this year, the SFF world might need to take a step back and ask if we want the two “biggest” awards in the field to move in lockstep. This has happened in the past. Let’s take a look at the trends of Hugo/Nebula convergence by decade in the field:
That’s an odd chart for you: the 1960s (only 4 years, though) had 25% joint winners, the 1970s jumped to 80%, we declined through the 1980s (50%) and the 1990s (20%), stayed basically flat in the 2000s (30%), and then jumped back up to 80% in the 2010s. Why so much agreement in the 1970s and 2010s with so much disagreement in the 1990s and 2000s? The single biggest thing that changed from the 2000s to the 2010s was the Nebula rules: is that the sole cause of present-day convergence?
I don’t have a lot of conclusions to draw for you today. I think convergence is a very interesting (and complex) phenomenon, and I’m not sure how I feel about it. Should the Hugos and Nebulas go to different books? Should they only converge for books of unusual and universal acclaim? In terms of my own predictions, I expect the trend of convergence to continue: I think 2-3 of this year’s Nebula nominees will be on the Hugo ballot. If I had to guess, I’d bet that this year’s Nebula winner will also take the Hugo. Given this data, you’d be foolish to do anything else.
Now that we have this year’s Nebula nominations for Best Novel, what are we to make of them?
The Goblin Emperor, Katherine Addison (Tor)
Trial by Fire, Charles E. Gannon (Baen)
Ancillary Sword, Ann Leckie (Orbit US; Orbit UK)
The Three-Body Problem, Cixin Liu, translated by Ken Liu (Tor)
Coming Home, Jack McDevitt (Ace)
Annihilation, Jeff VanderMeer (FSG Originals)
Let’s do what Chaos Horizon does, and look at some stats. What were the most predictive elements for the 2015 Best Novel Nebula?
- 83.3% of the nominees were science fiction.
- 66.7% of the nominated authors had previously been nominated for a Nebula for Best Novel.
- 33.3% of the nominated authors had previously won a Nebula for Best Novel.
- 50.0% of the nominees were either stand-alone novels or the first novel in a series.
- 66.7% of the nominees placed in the top part of my collated SFF Critics Meta-List.
- 16.7% of the nominees were Jack McDevitt.
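Each of those percentages is just a count over the six-book slate; a quick sketch of the arithmetic:

```python
# Every stat above is a simple fraction of the 6-novel Nebula slate.
nominees = 6
counts = {
    "science fiction": 5,
    "previously nominated authors": 4,
    "previous winners": 2,
    "stand-alone or first in series": 3,
    "Jack McDevitt": 1,
}

for label, n in counts.items():
    print(f"{n / nominees:.1%} {label}")
# 83.3%, 66.7%, 33.3%, 50.0%, and 16.7% respectively
```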
Overall, the Nebula Best Novel nominees were very traditional in 2015. After several years of being friendlier to fantasy, the Nebula snapped back to SF: we had 5 SF books and only one fantasy novel, although you may want to count Annihilation as cross-genre (weird/SF?). The Nebula had been creeping up to a 50/50 mix of fantasy and science fiction. This year, we saw none of that trend: three of the books (Leckie, Gannon, McDevitt) are far-future SF novels complete with spaceships and all the SF trimmings. The Cixin Liu, despite being a translation of a Chinese novel, may be the most traditional SF novel of the lot: an alien invasion novel along the lines of Arthur C. Clarke’s Childhood’s End. Liu even does away with more modern characterization, instead using the old 1950s technique of “characters as cameras” to drive us through the plot and the science.
The Nebulas went with 4 writers who had previously been nominated for the Best Novel Nebula (VanderMeer, McDevitt, Leckie, Gannon) and only 2 newcomers. 2 of our 6 nominees have already won the Nebula Best Novel award, with Leckie winning in 2014 and McDevitt back in 2007. The Nebula Best Novel category tends to draw heavily from past nominees and winners, and 2015 was no different. Since the SFWA voting membership doesn’t change much year-to-year, this means support from one year tends to carry over into the next year.
Case in point: Jack McDevitt, who now has 12 (!) Best Novel Nebula nominations. The constant McDevitt nominations are the strangest thing currently happening in the Nebulas. That’s not a knock against McDevitt. I’ve read two of McDevitt’s books, The Engines of God and the Nebula-winning Seeker. They were both solid space exploration novels: fast-paced, appealing characterization, and professionally done. They didn’t stand out to me, but there’s never anything wrong with writing books people want to read. Still, I’m not sure why McDevitt deserves 12 nominations while similar authors such as Peter F. Hamilton, Alastair Reynolds, Stephen Baxter, etc., are largely ignored by the SFWA voters. To put this in context: McDevitt has more Nebula Best Novel nominations than Neal Stephenson (1), William Gibson (4), and Philip K. Dick (5) combined.
Since 2004, when the era of McDevitt domination truly began, 73 different books have received Nebula nominations. 9 of those have been McDevitt novels. So, over the last 11 years, McDevitt alone constituted 12% of the total Nebula Best Novel field. I’m going to have to build a “McDevitt anomaly” into my models to account for future Nebula slates. Will Gannon fall into similar territory? There seems to be a bloc of SFWA voters who like a very specific kind of SF novel. This testifies to the inertia of the Nebula award: once voters start voting in one direction, they continue to do so. The McDevitt nominations are useful because they remind us how eccentric the Nebula can be: if you’re trusting the SFWA to come up with an unbiased list of the 6 best SFF novels of the year, you’re out of luck. The Nebula gives us the 6 SFF novels that the SFWA voters voted for: no more, no less.
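That 12% figure is just McDevitt’s nominations over the total field; a minimal sketch using the counts above:

```python
# Nebula Best Novel field since 2004, per the counts cited above.
total_nominated_books = 73
mcdevitt_books = 9

share = mcdevitt_books / total_nominated_books
print(f"{share:.0%}")  # → 12%
```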
I was pleased with how predictive my SFF Critics list was. Ancillary Sword and Annihilation placed 1-2 on that list and grabbed noms. The Goblin Emperor and The Three-Body Problem tied for third (along with 5 other novels, many of which didn’t stand a Nebula chance because of being last in a series, not being SFF-y enough, or not being published in the US). City of Stairs was a place behind those two, so that list at least predicted The Goblin Emperor over the Bennett. Neither Gannon nor McDevitt made the SFF Critics list. I’ll have to trust this list more in the future.
The demographics of the Best Novel award were also interesting, if predictable. 67% men / 33% women is a little more male-slanted than normal, although the granularity of having only 6 nominations makes that easy to throw off. Along race/ethnic lines, you’re looking at 83% white / 17% Asian; I believe Cixin Liu is the first Asian author nominated for the Best Novel Nebula. Recent trends have been a little higher than that, depending on how you want to categorize race and ethnicity. Nationality of 83% American / 17% Chinese / 0% British is definitely a little unusual; this award has been friendlier to British authors in recent years. I’ll admit that I thought at least one British author would sneak in.
Any other statistical trends stand out to you?
The Goblin Emperor, Katherine Addison (Tor)
Trial by Fire, Charles E. Gannon (Baen)
Ancillary Sword, Ann Leckie (Orbit US; Orbit UK)
The Three-Body Problem, Cixin Liu, translated by Ken Liu (Tor)
Coming Home, Jack McDevitt (Ace)
Annihilation, Jeff VanderMeer (FSG Originals; Fourth Estate; HarperCollins Canada)
A fascinating list with a couple of surprises. Annihilation, Ancillary Sword, and The Goblin Emperor were all well-received and well-reviewed texts. Any of those three could easily win. Expect all three of those to grab Hugo nominations later this year. McDevitt has a huge Nebula following, and this marks his 12th Nebula nomination for Best Novel. He won back in 2007 and I don’t see him winning again. Gannon scores his second Nebula nomination in a row for this series, but it’s very hard to pick up a Nebula for the second novel in a series; I don’t see him as having much of a chance.
The Cixin Liu is the big surprise. The Nebula has never shown much flexibility towards works in translation, but this was definitely one of the most original and interesting hard SF novels of the year. As more people begin to read The Three-Body Problem, I think its chances of winning will increase. I expect this to be the biggest “novel of discussion” in the next six or so months, and that’s going to put Liu in real contention for a Hugo nomination as well.
My initial thoughts are that this category will be a showdown between The Three-Body Problem and Annihilation. SFWA voters won’t want to give the award to Leckie twice in a row, and the Nebula still—but just barely—leans SF.
Chaos Horizon got only 3 out of 6 right in my prediction: not terrible for my first year, but not great either. My formula is in definite need of refinement! Coming Home was 9th on my list and the Liu 19th. I didn’t figure the Gannon would make it because it was a sequel. The McDevitt and the Gannon nominations prove the strength of the SF voting bloc in the Nebulas, and I’ll have to adjust that area up for future predictions. It’s interesting that the Nebula didn’t go with a literary SFF novel this year: I thought Mandel or Mitchell would have made it.
The rest of the ballot:
We Are All Completely Fine, Daryl Gregory (Tachyon)
Yesterday’s Kin, Nancy Kress (Tachyon)
“The Regular,” Ken Liu (Upgraded)
“The Mothers of Voorhisville,” Mary Rickert (Tor.com 4/30/14)
Calendrical Regression, Lawrence Schoen (NobleFusion)
“Grand Jeté (The Great Leap),” Rachel Swirsky (Subterranean Summer ’14)
“Sleep Walking Now and Then,” Richard Bowes (Tor.com 7/9/14)
“The Magician and Laplace’s Demon,” Tom Crosshill (Clarkesworld 12/14)
“A Guide to the Fruits of Hawai’i,” Alaya Dawn Johnson (F&SF 7-8/14)
“The Husband Stitch,” Carmen Maria Machado (Granta #129)
“We Are the Cloud,” Sam J. Miller (Lightspeed 9/14)
“The Devil in America,” Kai Ashante Wilson (Tor.com 4/2/14)
“The Breath of War,” Aliette de Bodard (Beneath Ceaseless Skies 3/6/14)
“When It Ends, He Catches Her,” Eugie Foster (Daily Science Fiction 9/26/14)
“The Meeker and the All-Seeing Eye,” Matthew Kressel (Clarkesworld 5/14)
“The Vaporization Enthalpy of a Peculiar Pakistani Family,” Usman T. Malik (Qualia Nous)
“A Stretch of Highway Two Lanes Wide,” Sarah Pinsker (F&SF 3-4/14)
“Jackalope Wives,” Ursula Vernon (Apex 1/7/14)
“The Fisher Queen,” Alyssa Wong (F&SF 5/14)
I’ll be back with some more analysis tomorrow!
The Nebula nominating period closed on February 15, 2015, and the SFWA will announce their full Nebula slate sometime soon (within a week or so?). Here are some of the trends I’m keeping a close eye on:
1. How repetitive will the slate be? Both the Hugos and the Nebulas tend to repeat the same authors over and over again. See my extensive Report on this issue. While Leckie and VanderMeer have previously grabbed Best Novel nominations, and are likely to do so again, 2015 might yield an interesting crop of Nebula rookies. Of the 5 most popular recent Nebula Best Novel authors (McDevitt, Bujold, Hopkinson, Jemisin, and Mieville), only McDevitt has a novel out this year. Add in that heavy-hitters like Willis and Gaiman didn’t publish novels in 2014, and it seems like the field is more open than usual.
2. How literary will the slate be? 2014 was a strong year for literary SFF, with major novels from authors like Emily St. John Mandel, David Mitchell, Chang-rae Lee, and many others. The Nebula has been friendly to such texts in the past. How many will make this year’s slate? 1? 2? If 3 literary novels make the slate, will the internet explode?
3. Will the Nebulas nominate self-published and indie-published works? Last year, Nagata made the Nebula slate with a self-published novel. Will this trend continue? More and more authors are bypassing traditional publishing and taking their novels directly to the reading audience. The SFWA has recently changed their rules to allow self-published authors into the SFWA. Are we going to see a sea-change of more self-published novels and stories make future Nebula slates? Does it start this year?
4. What about the paywall issue? This is a problem fast reaching a critical point for the Hugos and the Nebulas. Do short stories, novelettes, and novellas that are locked behind paywalls—either in print journals or in online journals/ebooks that require a subscription fee—still stand a chance? Or does the open access provided by sites like Clarkesworld, Tor.com, or Strange Horizons get those stories in front of a larger audience, thus making them more likely award nominees?
5. Will the Nebulas go international? The Nebulas and the Hugos are, in theory, international awards. For the Nebulas, any book published in the USA is eligible, no matter the country of origin or original language. In practice, both awards go to either American or British writers, with a few Canadians thrown in here and there for good measure. Cixin Liu’s The Three-Body Problem brought Chinese SF to an American audience this year, and we’re seeing an increasing number of novels and short stories published in translation. Will this have any impact? I doubt it, but we’ll see.
I’m sure plenty of other issues—and controversies—will float to the surface over the next month. Demographics is a likely point of major discussion, as are the genre questions that always pop up this time of year. What other issues are you thinking about in regard to the forthcoming Nebula slate?
What’s Hugo season without some impassioned discussion? 2015 is shaping up to be just as vehement a year as 2014. As I’m fond of saying, Chaos Horizon is an analytics, not an opinion, website. While that line can be delicate—and I sometimes don’t do a great job staying on the analytics side—neutrality has always been my goal. I want to figure out what’s likely to happen in the Hugo/Nebulas, not what should happen. If you want to find opinions about Hugo campaigning, you’ve got plenty of options.
At Chaos Horizon, I try to use data mining to predict the Hugo and Nebula awards. The core idea here is that the best predictor of the future is the past. I make the assumption that if certain patterns have been established in the awards, those are likely to continue; thus, if we find those patterns, we can make good predictions. There are flaws with this methodology—it can’t take into account year-by-year changes in sentiment, nor shifts in voting pools, and data mining tends to slight new or emerging authors—but it gives us a different way of looking at the Hugo and Nebula awards, one that I hope people find interesting. An analytics website like Chaos Horizon is most useful when used in conjunction with other more opinion driven websites to get a full view of the field.
What I need to work on is figuring out how to model the effectiveness of a campaign like “Sad Puppies 3” on the upcoming Hugo awards. For those out of the loop, a quick history lesson: over the past several years, we’ve seen several organized Hugo “campaigns” (for lack of a better word) that have placed various works—for various reasons—onto the final Hugo slate. Larry Correia’s “Sad Puppy” slate (we’re up to Sad Puppies 3 in 2015) has been the most effective, but the campaign to place Robert Jordan’s entire Wheel of Time series on the ballot also worked remarkably well in 2014. We’ve also seen some influence from eligibility posts (such as in Mira Grant’s case) on the final Hugo slate.
At this point, I think the effectiveness of campaigning is clear. If an author (or group of authors and bloggers) decides to push for a certain text (or texts) to make a final Hugo slate, and if they have a large and passionate enough web following, they can probably do so. Whether or not that is good for the awards is another question, and one that I’m not going to get into here. A quick Google of “Hugo Award controversy” will find you plenty of more meaningful opinions than mine.
Instead, I want to focus on this question: How much influence are campaigns likely to have this year? Let’s refresh our memory on the Hugo rules, taken right from the Hugo website:
Nominations are easy. Each person gets to nominate up to five entries in each category. You don’t have to use them all, but you have the right to five. Repeating a nomination in the same category will not affect the result; for instance, if you nominate the same story five times for Best Short Story, it will count as only a single nomination for that story. When all the nominations are in, the Hugo Administrator totals the votes for each work or person. The five works/people with the highest totals (including any ties for the final position; see below) go through to the final ballot and are considered “Hugo Award nominees.”
So, from January 15th to March 10th, eligible WorldCon members are casting nominating ballots to choose the final slate for the 2015 Hugos. Who is eligible to vote?
Anyone who is or was a voting member of the 2014, 2015, or 2016 Worldcons by the end of the day (Pacific Time/GMT – 8) on January 31, 2015 is eligible to nominate. You may nominate only once, regardless of how many of those three Worldcons you are a member.
You can either be an attending member (i.e. someone who actually goes to the WorldCons) or a “supporting member,” which costs you $40 and allows you to participate in the Hugo process. That “supporting member” category has been the buzzed-about issue. While $40 may seem like a lot, that nets you the “Hugo Voting Packet,” which includes most of the Hugo-nominated works (authors/publishers decide if they’re included). If you like e-books, $40 is a decent bargain for a variety of novels, novellas, novelettes, and short stories. Supporting membership also lets you nominate for at least 2 years (the year you join and then the following year), even if you only get to vote on the final slate once. All around, that’s a pretty good deal.
EDIT (see comments): I’ve been told that the “Voting Packet” is not guaranteed by the Hugo Awards. This has been the practice for the last several WorldCons, but it depends more on rights-holders and the individual WorldCon committees as to whether that it will happen in any given year. So don’t join for the sole reason of grabbing a packet!
That’s a fair amount of info to wade through, and it shows how the Hugo nomination process is relatively complex. Nominations combine last year’s Hugo voters with a new crop of attending and supporting members. That year-to-year carry-over means you don’t start fresh, and this is why the Hugo often feels very repetitive: if voters voted for an author the previous year, they can vote for that author again. I’d go farther than that: they’re very likely to vote for that same author again. This is one of the reasons I have Leckie predicted so high for 2015.
So, preliminaries aside, let’s start looking at some data. What did it take to get onto the ballot in 2014? In this case, I’m mining the data from the 2014 Hugo Statistics, which gives us all the gritty details on what went down last year.
That chart puts into sharp relief why campaigns work: you can get a Hugo Best Novel nomination for fewer than 100 votes. The other categories are even less competitive: a mere 50 votes to make the Short Story final slate? Given the way the modern internet works, putting together a 50 vote coalition isn’t that difficult.
Now, how did the various texts from the Wheel of Time campaign and the Sad Puppy 2 campaign perform?
A couple notes: Sad Puppy 2 got their top nominees in well above the minimums, particularly in Correia’s case. You can also see that, even within the Sad Puppy 2 campaign, different authors received different numbers of votes. 184 voted for Correia, but only 91 (less than 50%) followed his suggestion and also voted for Hoyt.
What can we learn from this chart?
1. Correia grabbed 11.5% of the vote and Jordan around 10%. Correia also ran a Sad Puppy 1 campaign in 2013 that netted him 101 votes and 9% (he placed 6th, just missing the final ballot). Using that data, I could predict 2015 in two different ways: I could average those campaigns out, and argue that a vigorous Hugo campaign will average around 10% of the total vote. While a campaign brings supporters in, it also brings in an opposition party that wants to resist that vote. 10% seems a pretty reasonable estimate. The other way to model this is to note that Correia’s number of voters increased from 101 in 2013 to 184, an impressive 80% increase. If Correia matches that increase this year, he’d jump from 184 to 330 votes. In an earlier post, I estimated the total nomination ballots for this year to be around 2350 (that’s pure guesswork, sadly). 330/2350 = 14.0%. Either way, the model gets us in the same ballpark: Sad Puppies 3 is likely, at the top end, to account for between 10% and 15% of the 2015 Hugo nominating vote. For good or bad, that will be enough to put the top Sad Puppy 3 texts into the Hugo slate.
2. The data shows that the Sad Puppy 2 campaign fell off fairly fast from the most popular authors like Correia to less popular authors like Torgersen (60% of Correia’s total) and Hoyt (50% of Correia’s total) to Vox Day (33% of Correia’s total). Torgersen and Vox Day made the final slate based on the relative weakness of the Novella and Novelette categories. While I don’t track categories like Novella, Novelette, or Short Story on Chaos Horizon (there’s not enough data, and I don’t know the field well enough), I expect a similar drop-off to occur this year. If you want to assess the impact of the whole Sad Puppy 3 slate, think about which authors are as popular as Correia and which aren’t.
If we put those two pieces of data together, we get my “Hugo Campaign Model”:
1. A Hugo campaign like “Sad Puppies 3” will probably account for 10-15% of the 2015 nominating vote.
2. The “Sad Puppies 3” slate will fall off quickly based on the popularity of the involved authors.
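Both halves of that model come straight from the 2013/2014 vote totals. Here’s a minimal sketch of the arithmetic (the 2,350-ballot figure is my own rough guess from the earlier post, not an official number):

```python
# Nominating votes for Correia's novels, per the data cited above.
sp1_2013_votes = 101   # Sad Puppies 1, 2013 (placed 6th, ~9%)
sp2_2014_votes = 184   # Sad Puppies 2, 2014 (~11.5%)
est_2015_ballots = 2350  # rough guess at total 2015 nominating ballots

# Method 1: average the past campaigns' vote shares -> roughly 10%.
avg_share = (0.09 + 0.115) / 2

# Method 2: assume the ~80% year-over-year growth repeats in 2015.
growth = sp2_2014_votes / sp1_2013_votes          # ~1.82
projected_2015 = round(sp2_2014_votes * growth)   # ~335 votes
growth_share = projected_2015 / est_2015_ballots  # ~14%

print(f"~{avg_share:.0%} to ~{growth_share:.0%} of the nominating vote")
```

Either method lands in the same 10–15% band the model predicts.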
How does that apply to the 2015 Sad Puppy Novel slate? Brad Torgersen (running it this year instead of Correia) put forth 5 novels:
The Dark Between the Stars– Kevin J. Anderson – TOR
Trial by Fire – Charles E. Gannon – BAEN
Skin Game – Jim Butcher – ROC
Monster Hunter Nemesis – Larry Correia – BAEN
Lines of Departure – Marko Kloos – 47 North (Amazon)
Based on my modeling, I expect Monster Hunter Nemesis and Skin Game to make the 2015 Hugo slate. Butcher is even more popular than Correia. As such he should hold on to (or even improve upon) most of Correia’s campaign vote. The other authors are not as popular, and will probably hold on to between 60%-30% of the Sad Puppy 3 vote. They’ll probably wind up in the 8-12 spots, just like Hoyt did last year.
As for the other categories—you’ve got me there. If I had to guess, I’d pick the 2 most popular Sad Puppy 3 choices for each category (I don’t even know how to begin doing that) and predict them as making the final slate. That’s sort of how the math worked last year, with 2 “Sad Puppy” slate nominees making it into Novelette and Novella. It’s a more robust slate this year, which might actually hurt the chances of more texts making it (by dividing the vote).
Of course, there could be a major change in the nominating pool this year: more voters mean less predictability. Still, given the lack of “huge” 2014 SFF books (The Martian probably isn’t eligible, and we didn’t get a Martin/Mieville/Willis/Bujold/Vinge/Stephenson book this year), I don’t anticipate it being particularly difficult to make the 2015 slate. It’s hard to predict the “Sad Puppy” support doubling or tripling in just a year’s time. It’s equally difficult to imagine “Sad Puppy” support collapsing by 50% or 75%. We’ll find out soon enough, though.
So, that’s the model I’m going to use to handle Hugo campaigns. Satisfied? Unsatisfied?
The 2015 SFF Award season has kicked off in earnest with the shortlists of the British Science Fiction Award and the Kitschies, joining the Philip K. Dick Award. All are lightly predictive of the Nebula and Hugo, with the lists usually overlapping by one or two nominees each (winning a BSFA or Kitschie is less predictive).
Each SFF award has different rules, biases, histories, and tastes. Since both the BSFA and Kitschies are British awards, they tend to lean in a British direction; both lean heavily towards SF instead of fantasy. To predict the Hugos and Nebulas, we’re looking for multiple nominations running up through Hugo and Nebula season. So where are we at today?
The BSFA first. The BSFA is a fan-voted award chosen by the members of the British Science Fiction Association:
The BSFA awards are presented annually by the British Science Fiction Association, based on a vote of BSFA members and – in recent years – members of the British national science fiction convention Eastercon. They are fan awards that not only seek to honour the most worthy examples in each category, but to promote the genre of science fiction, and get people reading, talking about and enjoying all that contemporary science fiction has to offer.
BSFA Best Novel:
Nina Allan, for The Race, published by Newcon Press
Frances Hardinge, for Cuckoo Song, published by Macmillan
Dave Hutchinson, for Europe in Autumn, published by Solaris
Simon Ings, for Wolves, published by Gollancz
Ann Leckie, for Ancillary Sword, published by Orbit
Claire North, for The First Fifteen Lives of Harry August, published by Orbit
Nnedi Okorafor, for Lagoon, published by Hodder
Neil Williamson, for The Moon King, published by Newcon Press
That short-list follows the BSFA traditions: a heaping slice of British-focused SF with a few American novels sprinkled in. Allan, Hutchinson, Ings, and Okorafor have been very popular in the UK critical scene, but none of them have crossed over to the US. Expect that to be a theme in this year’s awards. While the internet is bringing the global SFF scene closer together than ever, publishing has yet to catch up. I’m looking at a BSFA short list where half of the books are either unavailable in the US or were published here with minimal promotion (such as Europe in Autumn or The Race). Lagoon would be a strong Nebula candidate if it had received any kind of American release. Now, it’s going to get buried (in the US at least) behind Okorafor’s 2015 release of The Book of Phoenix.
The Kitschies are a smaller, quirkier juried award (i.e. impossible to predict) based on open nominations. Here’s their description:
The Kitschies reward the year’s most progressive, intelligent and entertaining works that contain elements of the speculative or fantastic. Now in our sixth year, we are proud to be sponsored by Fallen London, the award-winning browser game of a dark and mysterious London, designed by Failbetter Games.
The Kitschies’ 2014 finalists were selected from 198 submissions, from over 40 publishers and imprints. Congratulations to all who made the shortlists, and thanks to everyone who submitted a title for consideration.
Kitschies Novel (Red Tentacle award):
Lagoon, by Nnedi Okorafor (Hodder & Stoughton)
Grasshopper Jungle, by Andrew Smith (Egmont)
The Peripheral, by William Gibson (Viking)
The Way Inn, by Will Wiles (4th Estate)
The Race, by Nina Allen (NewCon Press)
Okorafor and Allan show up again, although Allan’s last name is misspelled. Never a great sign for your chance of winning! The Peripheral is the slightly unexpected choice. If Gibson can grab 2-3 other award nominations this season, his Hugo chances will be greatly improved.
Normally, I’d move Okorafor and Allan up in my Hugo and Nebula predictions as a result of their strong showing here. Without print copies available to an American audience, I don’t think these nominations help their chances. To nominate a book you’ve got to have read it . . .
Niall Harrison over at Strange Horizons—who knows the British SF scene far, far better than I do—has some good analysis of these awards. He mentions that VanderMeer’s Annihilation missed both awards, as it did the PKD. I’m not worried (yet) about VanderMeer’s chances—the BSFA and Kitschies awards have usually been SF focused, and VanderMeer’s book is more of an “inbetween” genres book. I expect VanderMeer’s Nebula nomination to fuel his Hugo chances, just like what happened with Jo Walton’s Among Others a few years ago. A lot of Hugo voters wait until after the Nebula noms come out to nominate, and anything that shows up on the Nebula list usually gets a big Hugo boost. Sofia Samatar almost made last year’s Hugos based on that kind of Nebula updraft.
I expect Okorafor to do very well in the British awards this year. That’s going to kick off quite a conversation about the differences between US and British and world publishing, which should make for some interesting reading.
Over the past few posts, I’ve been looking at the correlation between Amazon data, Goodreads data, and the mythical “actual books sold” data that we don’t have. It would be nice if either Amazon or Goodreads correlated with that data, because then we’d be able to get a good estimate of actual sales.
Unfortunately, it appears that both the Goodreads and the Amazon data are demographically unreliable. That makes a certain amount of sense: websites cater to very specific audiences, and specific audiences don’t reflect the general reading public. Goodreads seems to lean very young (under 40) and, according to Quantcast, has around a 70/30 female/male demographic. Amazon seems more neutral in terms of gender, but leans older (over 40) and wealthier (i.e. people who have enough money to buy lots of books online).
I’ve got one more set of data to present to you: for the past 5 months, I’ve been collecting data points that compare Goodreads to BookScan numbers. BookScan is a point-of-sale recording service; instead of estimating from sampling, they actually try to count how many books are sold by different venues. They claim to cover some 80-90% of the market, although that’s probably inflated. Here’s a good article from Forbes that can serve as an introduction. I’ve heard plenty of authors say that BookScan grabs less than 25% of their sales, and I think the more you sell through nontraditional means (at cons, through small bookstores, etc.), the worse the BookScan numbers are.
Most BookScan data is locked behind a huge paywall—but Publisher’s Weekly prints weekly Hardcover bestseller lists on their website. They try to make it difficult (they only print the number of books sold this year), and they don’t include e-books. Still, if we were to take that data and compare it to Goodreads data . . . we’d have something.
This is exactly what I’ve done. Since early November, I’ve been tracking (weekly) any SFF book (broadly defined, and also including horror) that has shown up on the Top Hardcover Fiction list. I’ve then been comparing that to the number of Goodreads ratings for that week to see if there’s a sensible correlation between the two numbers.
While this isn’t perfect—some authors may sell a higher proportion of e-books than others—we’ll at least have a rough look at “actual physical” sales versus Goodreads ratings.
Only six SFF novels showed up on the Hardcover Bestseller list in the last 4 months. I put down the publication date because more people buy a book right when it comes out than read it right when it comes out; a long book like Revival might take people a month to read, so it may take a while for Goodreads ratings to catch up with sales. “Last Data” is the data for the week when the book fell off the chart; Gibson fell off quickly (two weeks), while Rothfuss stuck around longer. King is still going strong. The Hardcover column is the total amount of Hardcover sales as given by Publisher’s Weekly for that week; the Goodreads column is the number of Goodreads ratings from that same week.
Lastly, we have the interesting column: the ratio of Hardcover sales/Goodreads ratings. In an ideal world, that would be close to the same number for everyone.
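As a sketch of how that ratio column is computed (the sales and rating counts below are invented placeholders, not the actual Publisher’s Weekly or Goodreads figures):

```python
# Sketch of the ratio column: hardcover sales to date divided by
# Goodreads ratings from the same week. These numbers are invented
# placeholders, not the actual Publisher's Weekly/Goodreads figures.
books = {
    # title: (hardcover_sales, goodreads_ratings) for the same week
    "King, Revival":          (150_000, 9_000),
    "Mandel, Station Eleven": (60_000, 30_000),
}

for title, (hardcover, goodreads) in books.items():
    ratio = hardcover / goodreads
    print(f"{title}: {ratio:.1f} hardcovers sold per Goodreads rating")
```

If Goodreads sampled every audience at the same rate, those ratios would cluster around one value; the spread between them is the whole story of this post.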
It is not. Goodreads is tracking Rothfuss and Mandel in a totally different way than it is King, Rice, or Koontz. Perhaps this is because Mandel and Rothfuss are selling more to Goodreads’ specific audience (younger, female). Perhaps this is because the books are shorter. Perhaps this is because King and Rice sell more in places like Wal-Mart or Target, whose readers aren’t using Goodreads. Perhaps King and Rice are selling primarily to older readers, who are less inclined to use internet websites to record their reading habits.
In a complex statistical case like this, it probably comes down to a multitude of factors. With just 6 books, we don’t have enough data to hash that out. What we can say is that someone like Mandel is overperforming (compared to the average) on Goodreads in an enormous way. Even if we assume that a young author like Mandel has a 50/50 Hardcover/e-book split in her sales (meaning she’s sold around 120,000 copies of Station Eleven, which seems reasonable), almost 25% of her total readers rated the book on Goodreads. That is astonishingly high. In contrast, King, who has probably tripled or quadrupled Mandel’s sales, only has about 5% of his readers on Goodreads.
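Spelling out the back-of-envelope arithmetic behind that 25% figure (the hardcover count, the 50/50 split, and the ratings count are all my assumptions, not confirmed figures):

```python
# Back-of-envelope check of the Mandel estimate above. The hardcover
# count, the 50/50 hardcover/e-book split, and the ratings count are
# all assumptions, not confirmed sales data.
hardcover_sales = 60_000
total_sales = hardcover_sales * 2      # assumed 50/50 hardcover/e-book split
goodreads_ratings = 30_000

rating_rate = goodreads_ratings / total_sales
print(f"Share of readers rating on Goodreads: {rating_rate:.0%}")
# prints "Share of readers rating on Goodreads: 25%"
```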
That’s an enormous gap, and it reinforces what we learned in the last post: Goodreads is not a reliable indicator of total readers. It tracks Mandel and King in totally different fashions, and comparing Mandel to King via Goodreads makes Mandel seem more popular than she is and King less popular than he is.
That doesn’t mean Goodreads is useless: it just means that it tracks a specific demographic. Whether that demographic is more in touch with the Hugo/Nebula awards is an open question.
One last chart for true stat geeks: let’s see what’s happened to the Hardcover/Goodreads ratio over time. Not a ton of data here, as only 4 SFF books had a decent run on the Bestseller chart. Here it is:
You can see that Rice and King have reasonably shaped curves, converging to around 15 in King’s case and about 30 in Rice’s case. Mandel and Rothfuss have basically straight lines: they were popular on Goodreads to start, and haven’t changed at all. That reinforces my last point: Goodreads treats King and Rice fundamentally differently than Mandel or Rothfuss.
With enough time and data—which we don’t have—we might be able to get a better sense of why books are tracked in different ways. Perhaps it would be a simple demographic correction (authors over 40 have this kind of ratio, authors under 40 have this kind of ratio). However, since Publisher’s Weekly doesn’t share enough data, we’re stuck. So be careful when looking at Goodreads numbers; they reflect a young audience, and are misleading when making comparisons between a Mandel and a Gibson.
I won’t lie: I’m a little disappointed that the Goodreads data isn’t more reliable. Given the large sample size, I’d hoped that Goodreads would flatten out any demographic bias. It doesn’t appear to do so, so any Goodreads numbers should be approached with healthy skepticism.
Next up for Chaos Horizon: start collecting Amazon data to see if that’s a better match to Publisher’s Weekly. Check back in a couple months, and we’ll see if that data lines up any better!
Time for another of my “boring but important” mathematical posts. One thing I’d like to know—and I think many other SFF observers would as well—is how many copies SFF books actually sell. For many entertainment industries, this kind of information is readily available. There are great sites like Box Office Mojo for films or TV by the Numbers for TV. Both are free, well-designed, and easy to use.
But the book industry? They either don’t offer the information or, if they do, lock it behind paywalls. BookScan purports to track a fair portion of the field, but they have exorbitant rates (here’s an enrollment form asking $2,000 for a membership through Publishers Marketplace) and also have strict terms of service that would prevent anyone from broadly sharing that data in public. Other sites like BookStats (an annual survey of publishers) offer equally steep rates (that begin at an eye-popping $2,995 and only head up!).
What does all of this mean? That we don’t have access to free, reliable sales data for the book industry. I think this is a huge mistake on the part of the book industry; freely sharing data hasn’t hurt the movie industry, and it lets viewers hotly debate their favorite movies and the tricky relationship between sales and quality. If people are talking about your industry, they’re involved in it—and likely growing it. The more readers are locked out of conversations about books, the more likely they are to drift over to other industries that allow for fuller participation. People like numbers, and charts, and debating; they like to see how their favorite movie or TV show or book is selling.
Sadly, transparency has never been a feature of the book industry. So that means a site like Chaos Horizon is left to patch together popularity estimates through frustratingly inadequate and inaccurate techniques. Since we don’t have point-of-sale tracking, we’re left moving to a space like Amazon or Goodreads, which samples the reading public through the number of reader reviews. Since both those websites are fairly big, you can argue that both sample a large enough portion of the population to be statistically meaningful. If a book sells 10,000 copies, and 1,000 people rank it on Goodreads, that’s a pretty solid 10%. Amazon tends to sample at a lower rate, so they might only grab 1% of the total readership. Still, that’s better than nothing . . .
If there wasn’t bias built into the Goodreads and Amazon user bases. I’m using bias in a purely statistical sense here, to indicate a demographic issue that skews from the norm. So, let’s say 50% of readers are men and 50% are women. To make a good sample, you’d have to be sure to have 50% men and 50% women in your sample (or you could correct your sample after it was done, if you like fancy math). Simple enough? The same with age, income level, etc., all the basic demographic categories.
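That “fancy math” is usually called post-stratification: reweight each demographic group in the sample to match its share of the real population. A minimal sketch with invented ratings, assuming a 70/30 sample of a 50/50 population:

```python
# Minimal post-stratification sketch: correct a 70/30 female/male sample
# back to an assumed 50/50 reading population. The average ratings per
# group are invented for illustration.

# Observed sample: each group's share of the sample and its average rating.
sample = {
    "women": {"share": 0.70, "avg_rating": 4.1},
    "men":   {"share": 0.30, "avg_rating": 3.7},
}
population_share = {"women": 0.50, "men": 0.50}

# Raw (biased) average weights each group by its share of the sample.
raw = sum(g["share"] * g["avg_rating"] for g in sample.values())

# Corrected average reweights each group to its population share.
corrected = sum(population_share[k] * g["avg_rating"] for k, g in sample.items())

print(f"raw average: {raw:.2f}, corrected average: {corrected:.2f}")
```

Here the correction only moves the average by about a tenth of a star, but the bigger the demographic skew and the bigger the gap between groups, the bigger the distortion the raw number hides.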
If you take a look at the demographic information for Goodreads, you’ll see that it skews pretty substantially. If you head over to Quantcast, a web demographics site (why is this free when I can’t look at book sales numbers?), they report Goodreads as having a 71% women / 29% men visit ratio. That’ll definitely skew the data. All this info is at the bottom of the Quantcast page; if you click through it, you’ll see that Goodreads skews towards women, towards people aged 18-34, and towards people with either undergraduate or graduate educations. All of that makes a certain amount of sense: younger people are more likely to use social media, and avid book readers are probably more likely to be college educated. This means, though, that every bit of Goodreads data is going to be biased towards certain audience tastes.
Amazon’s demographic bias is harder to find. They’ve opted out of Quantcast, but I’ve read several studies that suggest Amazon is biased towards older, high-income, and highly-educated users. CBS News echoes all of that, and also reports Amazon as gender neutral. That’s from 2010 (when Quantcast data was still available for Amazon); I don’t know if it has changed or not. Still, that means the demographics of Amazon and Goodreads are hopelessly off from each other: young versus old, average income versus high income, 70/30 gender versus 50/50. At least they converge on education level!
How much does bias like that matter in practice? An enormous amount. Let’s look at a comparison of # of Goodreads ratings to # of Amazon ratings for the 2015 Hugo contenders:
Take a look at the far right column: that’s the ratio of Goodreads/Amazon ratings. That range is what we’d call in statistical terms “a hot mess.” You range from Beukes having a 60x multiplier down to Gibson having a paltry 8x. Even if you toss out Beukes as an outlier, it would appear that some writers get rated at a 4 times higher rate on Goodreads than Amazon. What’s happening?
This is demographic bias at work. Goodreads favors certain books and dislikes other books (in sampling terms). Since the Goodreads readership is younger + female, a book by Gibson (older + male) shows up much lower on the list. Books that appeal to the Goodreads demographic (presumably female-friendly books that market/cater to a slightly younger audience) do very well on Goodreads. Books catering to an older or more male audience tend to do worse, at a rate of about 3-4 times (at the most extreme; most books are a little less than that).
So, what do we conclude? That Goodreads is biased. That Amazon is biased. If you wanted to correlate the Goodreads number to the Amazon numbers, you’d have to multiply the least Goodreads friendly books by around 3 or 4 times. But why correlate one set of biased numbers to another set of biased numbers? That’s statistically pointless: if you want to know the younger, more female audience, use Goodreads. If you want the older, richer, more gender-neutral audience, use Amazon.
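As a sketch of that correction, with the ratios treated purely as illustrations drawn from the chart’s extremes:

```python
# Sketch of the Goodreads-to-Amazon correction described above.
# The ratios are illustrative stand-ins loosely based on the chart:
# number of Goodreads ratings per Amazon rating for each book.
goodreads_per_amazon = {
    "Gibson, The Peripheral": 8,    # least Goodreads-friendly
    "typical contender": 25,
    "Beukes, Broken Monsters": 60,  # outlier at the other end
}

# To put Gibson on the same Goodreads scale as a typical contender,
# his Goodreads count would need roughly a 3x boost.
correction = (goodreads_per_amazon["typical contender"]
              / goodreads_per_amazon["Gibson, The Peripheral"])
print(f"Correction for the least Goodreads-friendly book: ~{correction:.1f}x")
```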
So, we’re swirling around the question: can we correlate Amazon or Goodreads numbers to actual sales? What we’ve learned in this post is, according to Quantcast data either as openly accessible or reported by CBS:
1. In demographic terms, Goodreads is biased towards women, younger readers (18-34), and the highly educated.
2. In demographic terms, Amazon is biased towards older readers, higher income readers, and the highly educated.
3. This results in substantial differences (up to around 4 times) at the rate which they review books.
That’s all well and good. We now have a way to compare Amazon to Goodreads numbers through demographic correction (if we wanted to). But how do those two sets of biased numbers actually sync up with sales? I have some Bookscan data I’ll be sharing with you in the next post!
Time to get technical! Break out the warm milk and sleeping pills! Earlier in the week, I took a look at Amazon and Goodreads ratings of the major Hugo candidates. An astute reader will notice that those rankings don’t exactly line up (nor do the number of ratings, but I’ll address that in a later post). Which is more representative? Which is more accurate?
At Chaos Horizon, I strive to do two things: 1. To be neutral, and 2. Not to lie. By “not lie,” I mean I don’t try to exaggerate the importance of any one statistical measure, or to inflate the reliability of what is very unreliable data. So here’s the truth: neither the Goodreads nor the Amazon ratings are accurate. Both are samples of biased reading populations. Amazon over-samples the Amazon.com user-base, pushing us towards people who like to read e-books or who order their books online (would those disproportionately be SFF fans?). Goodreads is demographically biased towards a younger audience (again, would those disproportionately be SFF fans? Worldcon voters?). Stay tuned for more of these demographic issues in my next post.
As such, neither gives a complete or reliable picture of the public reaction to a book. If you follow Chaos Horizon, you’ll know that my methodology is often to gather multiple viewpoints/data sets and then to try to balance them off of each other. I’ve never believed the Hugo or Nebula solely reflects quality (which reader ratings don’t even begin to quantify). At the minimum, the awards reflect quality, reader reception, critical reception, reputation, marketing, popularity, campaigns, and past voting trends/biases. The Chaos Horizon hypothesis has always been that when a single book excels in four or five of those areas, then it can be thought of as a major candidate.
Still, can we learn anything? Let’s take a look at the data and the differences in review scores for 25 Hugo contenders:
The column on the far right is the most interesting one: it represents the difference between the Amazon score and the Goodreads score. As you can see, these are all over the place. There are some general trends we can note:
1. Almost everyone does better on Amazon than Goodreads. The average Amazon boost is .19 stars, and only three books out of 25 scored worse on Amazon than Goodreads. Amazon has a higher bar of entry to rate (you have to type a review, even if it’s one word; Goodreads lets you just enter a score), so I think more people come to Amazon if they love/hate a novel.
2. There doesn’t seem to be much of a pattern regarding who gets a higher ranking bump. Moving down the top of the list, you see a debut SF novel, a hard SF novel, an urban fantasy novel, a YA novel, etc. It’s a mix of genres, of men and women, of total number of ratings, and of left-leaning and right-leaning authors. I’d have trouble coming up with a cause for the bumps (or lack thereof). So, if I had to predict the size of a bump on Amazon, I don’t think I could come up with a formula to do it. I’ll note that since Amazon bought Goodreads, I think the audiences are converging; maybe in a few years there won’t be a bump.
3. If you want to use either ranking, you’d have to think long and hard about what audience each is reflecting, and what you’d want to learn from that audience’s reaction. It would take a major effort to correlate/correct the Amazon.com audience or Goodreads audience out to the general reading audience, and I’m not sure the effort would be worth it. Each would require substantial demographic corrections, and I’m not sure what you would gain from them. You’d have to make so many assumptions that you’d wind up with a statistic that is just as unreliable as Goodreads or Amazon.
I think “Reader Ratings” are one of the most tantalizing pieces of data we have—but also one of the least predictive. I’m not sure Amazon or Goodreads tells you anything except how the users of Amazon or Goodreads rated a book. So what does this mean for Chaos Horizon, a website dedicated to building predictive models for the Hugo and Nebula Awards?
That reader ratings are not likely to be useful in predicting awards. Long and short of it: Amazon and Goodreads sample different reading populations, and, as such, neither is fully representative of:
1. The total reading public
2. SFF fans
3. Worldcon Voters
4. SFWA voters
Neither is 100% (or even 75% . . . or 50%) reliable in predicting the Hugo and Nebula awards. So is it worth collecting the data? I’m still hopeful that once we have this data and the Hugo + Nebula slates (and eventually winners), I can start combing through it more carefully to see if it turns up any correlations. For now, though, we have to reach an unsatisfying statistical conclusion: we cannot interpret Amazon or Goodreads ratings as predictive of the Hugos or Nebulas.