Over the past two weeks, Chaos Horizon has been looking into the idea of Repeat Nominees and the Hugo and Nebula Awards for Best Novel, 2001-2014. Remember, Chaos Horizon is a website dedicated to providing predictions for the Hugo and Nebula Best Novel awards, and I want these predictions to be based on more than my opinions about what the “best” books of the year are. If you want those kinds of opinions, the internet is crawling with them.
Instead, Chaos Horizon takes the position that we can better understand the Hugo and Nebula awards by data-mining past awards to find patterns concerning nominations and winners. While this won’t allow us to know with 100% certainty how future awards will go—that’s not how statistics work—this will allow us to make informed guesses as to what the nominees and winners will be in the future.
The basic hypothesis I’m working with is that there are 7 or 8 factors that determine these awards. Roughly speaking, these are: past awards history, critical reception, reader reception, popularity/sales, marketing and web footprint, genre, demographic concerns, and reputation. Some of these are incredibly hard to quantify (reputation, for instance); others are slippery (genre); others are changing rapidly (demographics); and others are minefields of conflicting opinions (critical and reader response). Nonetheless, and perhaps foolishly, I believe we can wade into these factors and make some sense of them.
So, in regards to “repeat nominations”—one aspect of awards history—what have we learned in Parts 1 to 6, and how can this information be applied? For those who didn’t read Parts 1 to 6 (and they got pretty technical!), here’s what I think we can conclude:
Conclusion #1: The Hugo and Nebula Best Novel slates are substantially biased towards authors who have previously received a Best Novel nomination, to the tune of 65% for the Hugo and 50% for the Nebula.
Application #1: When I make a prediction for the Hugo Slate, my prediction should be 2/3 previously nominated authors, and 1/3 rookie authors. For the Nebula, I should go 1/2 previously nominated authors, 1/2 rookie authors.
Conclusion #2: The Hugo Award slate favors super-repeaters, authors who get nominated for the Best Novel award over and over again. In 2001-2014, the top 7 Hugo authors accounted for 45% of the total ballot. The Nebula award does not show the same bias towards super-repeaters.
Application #2: When putting together a prediction for the Hugo slate, I need to pay special attention to the authors who have previously received more than 4 nominations.
Conclusion #3: Winning the Nebula Award is biased towards past winners and repeat nominees, with 64% of the winners having previously appeared on the ballot and 43% having won before. Proportionally, the Hugo did not show the same bias towards past winners and repeat nominees.
Application #3: When predicting the Nebula winner, I need to strongly factor in past Nebula nominations and wins.
Conclusion #4: There is no strong evidence to suggest that Hugo or Nebula Best Novel nominees need to have been nominated in other Hugo and Nebula categories before snagging a Best Novel nomination or win.
Application #4: I need to be careful about predicting authors “jumping” from the Short Story, Novelette, or Novella categories to the top of the slate. While it happens, it’s not the advantage you would think. Or, in other words, I need to keep an open mind towards authors completely new to the Hugo and Nebula process.
Conclusion #5: Despite the Hugo and Nebula favoring “repeat nominees,” even the repeaters don’t get every novel nominated. Most repeaters only manage a 25%-50% nomination rate, no matter how popular.
Application #5: I can’t blindly put popular authors onto my watchlist, but I need to analyze how each specific novel was received, including factors such as genre and critical/reader response.
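As a rough illustration of Application #1, the split can be turned into slot counts. This is only a sketch: the slate sizes (5 for the Hugo, 6 for the Nebula) are assumptions for the example, and `split_slate` is a hypothetical helper, not something from the Chaos Horizon worksheets.

```python
def split_slate(slots, repeat_share):
    """Split a slate of `slots` novels into (repeaters, rookies).

    `repeat_share` is the fraction of the slate expected to go to
    previously nominated authors.
    """
    repeaters = round(slots * repeat_share)
    return repeaters, slots - repeaters


# Repeat-nominee shares come from Conclusion #1; slate sizes are assumed.
hugo_split = split_slate(5, 0.65)
nebula_split = split_slate(6, 0.50)

print("Hugo   (repeaters, rookies):", hugo_split)
print("Nebula (repeaters, rookies):", nebula_split)
```

With those assumed slate sizes, both predictions would carry about three previously nominated authors, with the remaining slots left for rookies.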
That’s a fairly fruitful study, yielding some specific applications that can help improve my watchlists and predictions. As I continue to do these, hopefully Chaos Horizon can become more and more useful as a resource to the SFF community.
What this information doesn’t tell us is whether this “bias” is good or bad. Maybe you believe that there are only 20-25 exceptional writers at work in SFF today, and that the centralization of the awards reflects the excellence at the top of the field. Or maybe you believe there are hundreds of interesting SFF writers, and that some are unfairly excluded from the awards because of this centralization towards past winners and nominees. Neither Chaos Horizon nor any kind of statistical analysis can answer those questions for you.
Here are my Excel worksheets with the data I used. Let me know if you have any questions about methodologies or how I came up with my numbers.
So, finally, what do you think of the results? Did you expect the Hugos and Nebulas to work in this way? Was some of the information surprising? Is there additional information we need about repeat nominees and these awards? How will this factor in predicting the Hugos and Nebulas? Has this hurt or helped your perception of some of the chances of the 2014 candidates?
Thanks for reading, and stay tuned for the next Chaos Horizon report, where I’m going to tackle the question of genre and the Hugo and Nebula awards! How biased are these awards towards science fiction? How much do they hate fantasy? We’ll find out soon . . .
Almost done with this report, I swear. In the comments, it was suggested that it would be a good idea to look at nomination %. This’ll give a great piece of context for the “repeat nominees” charts: obviously someone who publishes a novel every year can get more nominations than someone who publishes a novel every 5 years. However, the author who publishes a novel every 5 years might have a higher nomination percentage than a more prolific author. I limited this study to the repeat nominees, those authors who had multiple nominations in the 2001-2014 time period.
So how does this shake out for the Hugo and Nebula Best Novel Awards, 2001-2014? The Hugo chart first:
***Gaiman turned down two nominations. I’ve included those in his percentage, because the voters did vote Gaiman into the slate.
What this chart shows is the author’s Nomination % (number of novels nominated divided by number of novels published) in the 2001-2014 period. I did make one caveat: I only counted novels published after the author’s first nomination. I figured that this first Best Novel nomination brought the author into the spotlight, and that their later novels received more attention and had a much better chance of being nominated. When an author had their initial nomination before the year 2000, I counted all the novels they published between 2001 and 2014. I pulled all this information off of sfadb.com and isfdb.org.
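The counting rule above can be sketched in a few lines. Treat this as an illustration rather than the actual spreadsheet logic: `nomination_pct` is a hypothetical helper, the example bibliography is invented, and counting the first nominated novel itself in the total is an assumption (the post doesn’t say either way).

```python
def nomination_pct(novels, nominated_before_window=False):
    """Nomination % for one author in the 2001-2014 window.

    novels: list of (publication_year, was_nominated) tuples.
    nominated_before_window: True if the author's first Best Novel
        nomination came before the window, in which case every novel counts.
    Returns None when there is nothing to count.
    """
    novels = sorted(novels)
    if not nominated_before_window:
        # Assumption: start counting at the first nominated novel (inclusive).
        first = next((i for i, (_, nom) in enumerate(novels) if nom), None)
        if first is None:
            return None
        novels = novels[first:]
    if not novels:
        return None
    hits = sum(1 for _, nom in novels if nom)
    return 100.0 * hits / len(novels)


# Invented bibliography, for illustration only.
example_author = [(2002, False), (2004, True), (2006, False),
                  (2009, True), (2012, False)]

print(nomination_pct(example_author))        # counts from the 2004 novel onward
print(nomination_pct(example_author, True))  # counts all five novels
```

The same bibliography gives a different percentage depending on when the first nomination happened, which is why the caveat matters when comparing authors.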
What does this chart show us? That individual authors have greatly different publishing habits, from the incredibly prolific Stross to the rather unprolific Willis. There is some good information here: when Willis or Martin publishes another novel, they’re very likely to be nominated again. Kim Stanley Robinson, on the other hand, doesn’t stand quite as good of a chance. This kind of information is very relevant when putting together a Hugo Watchlist.
It would be possible to get even deeper into this chart. Mieville, for instance, published two YA novels that are on the chart; YA novels don’t do that well in the Hugos. Likewise, a bunch of Stross’s novels are urban fantasy, and urban fantasy books don’t do as well as science fiction in the Hugos. Any credible watchlist or prediction has to take all those complexities into consideration.
A similarly interesting chart. The Nebula repeaters have a slightly worse nomination %, which is in line with the Nebula not being as friendly towards repeat nominations. Once again, we can use this chart to improve any Nebula Watchlists that Chaos Horizon puts together.
This report on Repeat Nominees has gotten more complicated by the post: there’s a lot of information to sift through here, and a number of different ways to slice the statistical pie. In Parts 5 and 6, I’m going to provide some additional detail that commenters asked for. If you’d like to know anything more (and provided I can come up with a decent way of finding/presenting the data), just ask.
In Part 1, we identified a certain number of “rookies,” nominees for the Best Hugo or Best Nebula Novel Award (2001-2014) who had previously not been nominated for the award. If you recall, there were 25 “rookies” for the Hugo and 44 for the Nebula, or roughly 35% for the Hugo and 50% for the Nebula. In the comments, Niall asked whether or not these rookies had prior success on other parts of the Hugo and Nebula ballot, thus making them familiar to voters.
This is a very solid hypothesis, one we could call the “moving on up” idea: writers would first get nominated for Best Short Story (or Novelette, or Novella), and then would eventually “graduate” to the Best Novel slate. Data shows that this isn’t necessarily the case: you don’t need to have previously been nominated for any Hugo or Nebula categories to make the slate. Here are some charts:
The numbers are relatively straightforward: of the 25 Hugo rookies, only 6.5 had received downballot Hugo nominations (I counted Brandon Sanderson as the .5; he shared his nomination with Robert Jordan for Wheel of Time). For the 44 Nebula Best Novel rookies, 15 had received a prior Nebula nomination in another category. I was generous in my counting, including such categories as “Best Related Work.” If this were limited to only fiction categories, that would cut the numbers down by a little more.
So, what do we learn? That being downballot on the awards doesn’t necessarily help you get up into the Best Novel category. Hugo and Nebula voters don’t necessarily insist that you’ve had success in the Short Story, Novelette, or Novella categories before you make it to the top of the ballot. Lots of pure rookies make the ballot—good news if you’re a rookie, but maybe disappointing if you’ve got some wins in the other categories.
This is a case where the raw statistics might be a little misleading. While they show that 75% of the Hugo rookies had never been nominated before, we must acknowledge that those rookies come from a much larger pool of writers than those who have previously been nominated for Hugos. So, if we estimate the pool of writers with no Hugo nominations at around 500 (I just made that number up, but it’s probably in the ballpark of the novels that could be considered “credible” Hugo contenders), and the pool of writers who have been nominated for downballot Hugos at around 100, you can see that there is a statistical advantage to being on other parts of the Hugo ballot. I’m still surprised; I was expecting a greater advantage. I thought the awards would be more hospitable to downballot success, as appearing on other parts of the ballot would make you more familiar to Hugo and Nebula voters. While it helps, it doesn’t seem to help that much.
Let’s think about a couple of examples: Elizabeth Bear has had good success on other parts of the Hugo ballot, with 4 nominations and 4 wins, 2 for Best Fancast, one for Best Novelette, and one for Best Short Story. She hasn’t had any luck, however, at cracking the top of the ballot. Ken Liu is going to be a great author to keep your eye on for 2016 awards. His debut novel, The Grace of Kings, is due out in April 2015. Liu has been enormously successful in the other Hugo and Nebula fiction categories: 3 Hugo noms, 2 wins, and 6 Nebula noms, 1 win. He’d be a prime example of someone you would expect to “graduate” to the Best Novel slates: but will he? Before this study, I might have put Liu down as a “good” bet. Now, I’m not so sure.
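To make the pool argument concrete, here is the arithmetic as a small sketch. The pool sizes (100 and 500) are the made-up estimates from the paragraph above, not measured values, so the output is only as good as those guesses.

```python
hugo_rookies = 25          # first-time Best Novel nominees, 2001-2014
with_downballot = 6.5      # rookies with a prior downballot Hugo nomination
without_downballot = hugo_rookies - with_downballot

# Rough pool sizes: guesses, not data.
pool_downballot = 100
pool_no_history = 500

rate_downballot = with_downballot / pool_downballot
rate_no_history = without_downballot / pool_no_history
advantage = rate_downballot / rate_no_history

print(f"Downballot pool: {rate_downballot:.1%} reached the Best Novel slate")
print(f"No-history pool: {rate_no_history:.1%} reached the Best Novel slate")
print(f"Relative advantage: {advantage:.2f}x")
```

Under these guesses a downballot nominee is a bit less than twice as likely to reach the slate over the 14 years: an advantage, but a modest one, and the result shifts quickly if the pool estimates change.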
So, in conclusion: while being on other parts of the Hugo and Nebula ballot helps, it’s not an enormous help, and we’ll have to be cautious predicting Best Novel nominations based solely on short fiction nominations or wins.
We trundle along: now we’re up to looking at the way “repeaters” impact winning the Nebula Award. Parts 1 and 2 discussed how the Nebula ballot is centralized: not as much as the Hugo, but still in a substantial way. Roughly speaking, 50% of the Nebula nominees have already received a Nebula nomination for Best Novel. Do these “repeaters” stand a better chance of winning?
In the case of the Nebula, the answer is a resounding yes. In the 2001-2014 period, 6 winners had already won the Nebula Award for Best Novel: Bear, Bujold, Haldeman, Le Guin, Willis, and Robinson. That’s a robust 43%; in contrast, the Hugo had 27% repeat winners from 2001-2014.
An additional 3 winners had previously been on the Best Novel ballot: Asaro, McDevitt, and Walton. So, all told, 9 out of 14 (64%) of the winners had previously appeared on the Nebula Best Novel ballot. When we consider the 50%/50% split of the slate, that means there is a substantial bias towards past winners and nominees.
2 more winners, Gaiman and Bacigalupi, had previously appeared on other parts of the Nebula ballot. There were only 3 “rookies” who won in their first time appearing on the ballot: Moon, Chabon, and Leckie. Let’s look at this visually:
Compared to the Hugo chart from Part 3, this shows more centralization towards repeaters. That’s an interesting reversal: it’s easier to get on the Nebula slate as a rookie than the Hugo, but it’s easier to win the Hugo as a rookie than the Nebula. Perhaps this reflects the different make-up of the voting groups: the SFWA is made up of authors, and probably more likely to vote for “one of their own.” Remember, we’re talking about statistical bias here, not absolute numbers. Even if only 5% of the voting pool is swayed by familiarity, that can have a substantial impact on winning.
One thing you see in the Nebula that you don’t see as often in the Hugo is the “lifetime achievement win.” Let’s isolate Ursula K. Le Guin’s 2009 win for Powers. Le Guin’s credentials are untouchable; she’s likely one of the 10 most influential SFF writers of all time, and has made essential contributions to both Science Fiction (the Hainish cycle) and Fantasy (Earthsea). Before Powers, Le Guin had won an impressive 5 Hugos and 5 Nebulas. All of that said, Powers doesn’t make much sense as a Nebula winner. It’s the third volume of a Young Adult series; historically, sequels and Young Adult books haven’t done very well in the Nebulas. I don’t think many readers would identify Powers as a “central” or “essential” Le Guin novel; if I was recommending Le Guin books to read, this might end up near the bottom. If, for some god-forsaken reason, you haven’t read any Le Guin, start with The Left Hand of Darkness. So why did Powers win?
We can’t know for sure, but I suspect this was a way of honoring Le Guin’s whole career. Even though she had already won 3 Best Novel Nebulas, the SFWA voters figured she needed another. A number of literary awards work this way. The Pulitzer Prize has actually done this quite a bit, periodically giving an author an award at the end of their career as an acknowledgement of all the great writing they’ve done. William Faulkner (my favorite author, for the record) has two Pulitzer Prizes for lesser novels, A Fable and The Reivers, while his best novels went unrecognized.
Although the SFWA gives a Grandmaster award, which Le Guin won in 2003, the Powers Nebula might be just another way of honoring a long and distinguished career. I think Haldeman’s win for The Accidental Time Machine also falls into this “lifetime achievement award” category. While Bear, Robinson, and Willis all won for stronger novels, I think there was also a little “sentimental” bump to their wins. Even a small bump can greatly change the outcome.
Part of these Chaos Reports is simply gathering information for future predictions. While I don’t think the Nebula always goes to an author for their career, it does happen sometimes. I wouldn’t be stunned to see this happen again in the next 5 or 10 years. Who would be a likely lifetime sentimental winner? George R. R. Martin? No matter the actual quality of A Dream of Spring, I think it’ll stand a good chance of winning. Could VanderMeer get a “career” boost this year for Annihilation?
That speculation aside, let’s look at whether getting lots of nominations correlates to wins:
Not much to be learned here. There are several writers—Jemisin, Hopkinson, Mieville—who got 3 nominations but haven’t managed a win yet. Writers with only 2 nominations actually did better. Thus, you can’t necessarily correlate the number of nominations to number of wins. McDevitt is the true testimony to this: 9 nominations, 1 win. That works out to an 11.1% win percentage. The baseline chance of winning a Nebula, assuming no bias—i.e. just drawing one of the six names from a hat—is 16.7%, so McDevitt actually did worse than that.
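One way to gauge how unusual McDevitt’s record is: under the names-from-a-hat baseline, how often would nine nominations produce one win or fewer? The sketch below assumes every slate has exactly six nominees and that the years are independent, which is a simplification (actual slate sizes varied).

```python
from math import comb


def prob_at_most(k, n, p):
    """Binomial probability of k or fewer successes in n independent tries."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


nominations, wins = 9, 1
baseline = 1 / 6                      # assumed: six nominees, equal chances

observed_rate = wins / nominations
expected_wins = nominations * baseline
p_one_or_fewer = prob_at_most(wins, nominations, baseline)

print(f"Observed win rate: {observed_rate:.1%}")
print(f"Expected wins at baseline: {expected_wins:.1f}")
print(f"Chance of {wins} win or fewer by luck alone: {p_one_or_fewer:.1%}")
```

Under those assumptions, one win or fewer in nine tries comes up a little over half the time, so his record is below the baseline average without being surprising for a sample this small.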
So, to sum up: the Nebula slate is less centralized than the Hugo slate, with more of a tendency to nominate a wider array of authors. Still, that doesn’t translate to those “rookie” authors having a solid chance of winning. Instead, the Nebula award more often goes to the “repeaters,” particularly past winners. To think about that numerically, the Nebula slate is roughly 50%/50% prior nominees/new nominees. However, the breakdown for winning the Nebula is 65%/35%. In contrast, the Hugo was roughly 67%/33% for nominees, and 67%/33% for winners. One of those is proportional (the Hugo), the other is not. The plot thickens!
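The proportionality comparison can be expressed as a single ratio: the repeaters’ share of wins divided by their share of the slate, where 1.0 means perfectly proportional. This sketch uses the raw counts from the individual parts of the report, counting only prior Best Novel history (9 of 15 Hugo winners, 9 of 14 Nebula winners), so the Hugo figure lands a little below the rounded 67% quoted above.

```python
def representation_ratio(repeat_wins, total_wins, repeat_slots, total_slots):
    """Share of wins held by repeaters divided by their share of the slate."""
    win_share = repeat_wins / total_wins
    slate_share = repeat_slots / total_slots
    return win_share / slate_share


# Counts taken from the report: winners with prior Best Novel history,
# and slate slots that went to previously nominated authors.
hugo_ratio = representation_ratio(9, 15, 47, 72)
nebula_ratio = representation_ratio(9, 14, 43, 87)

print(f"Hugo:   {hugo_ratio:.2f}")
print(f"Nebula: {nebula_ratio:.2f}")
```

A ratio near 1 (the Hugo) means repeaters win about as often as their slate presence would suggest; a ratio well above 1 (the Nebula) means they win more often than that.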
Today, we’ll be continuing our look at repeat nominees and the Hugo and Nebula Award for Best Novel, 2001-2014. Part 1 looked at how often the Hugo and Nebula voters nominate writers who have already received Best Novel nominations, to the tune of 65% for the Hugo and 50% for the Nebula. In Part 2, we looked at how “repeat nominees” dominate the slates, particularly for the Hugo. The 7 most popular Hugo writers received 45% of all the possible nominations between 2001-2014. The Nebula was far more evenly distributed, with the most popular writers taking home 24% of the slots, and that number was greatly inflated by Jack McDevitt’s 9 Best Novel nominations.
So far, I’ve only looked at nominations, and not chances of winning. Do these “repeaters” dominate the Hugo and Nebula wins? Or do rookies and one-time nominees stand a chance?
Hugos: Between 2001-2014, there were 15 Hugo winners (Bacigalupi and Mieville tied in 2010). At the time of their win, 4 of those authors had previously won the Hugo for Best Novel: Willis, Gaiman, Vinge, and Bujold. 5 winners had previously been nominated for Best Novel: Scalzi, Mieville, Wilson, Sawyer, and Rowling. So, all told, 60% of the Best Novel Hugo winners had prior history in the Best Novel category.
4 of the winners were pure Hugo rookies at the time of their win: Leckie, Walton, Chabon, and Clarke. The other 2 winners—Bacigalupi and Gaiman (for his initial American Gods win; by the time he wins for The Graveyard Book, he’s a “repeater”)—had success “downballot.” Bacigalupi had several prior nominations for his stories, and Gaiman had a Best Related Book nomination for one of the Sandman volumes. To look at that visually:
All in all, that seems a fairly reasonable distribution: 66% of the total winners have Hugo history, and 33% are rookies. If we correlate that to the stats from Part 1 of the report (65% of the ballot is repeaters), we’ll see that there isn’t much of a bias once you make it into the slate. While it’s harder to get into the slate as a rookie, a rookie has just as good a chance, proportionally, of winning as a repeater. Hugo voters are fairer at picking a winner than picking the slate.
How does this correlate, though, to Part 2 of our report, the pool of “repeaters” who got 2 or more Hugo nominations in the 2001-2014 period? On the surface, you’d think that this group would dominate the winner’s list: after all, they grabbed most of the slate spots. Here’s the data:
8 of the 15 winners were from this list of repeaters; that means 7 of the winners came from people who were only nominated once in the 2001-2014 period. That’s a 53%/47% split, or basically a coin flip. Even though these repeaters dominated the slate, they didn’t dominate the winner’s circle. This may suggest an interesting hypothesis: getting nominated is more biased towards reputation/award history, while winning depends more on the quality of the individual novel. Toss Gaiman out, and these “repeaters” actually have a worse chance of winning—proportionally, of course—than the non-repeaters.
The top of the repeaters list (those 7 authors who dominated 45% of the slate) only managed to win 33% of the Hugos, so dominating the slate doesn’t necessarily lead to winning the Hugo for Best Novel. Stross is perhaps the best example of this: he is the most nominated author between 2001-2014, but has no Best Novel Hugos to show for it. Don’t feel too sad for him, though; he picked up 3 Hugo Awards for Best Novella in that time period.
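Here is the slate-versus-wins comparison for the repeaters as a quick calculation. The slot and win counts come from this report; the figure of 5 wins for the top 7 is inferred from the 33% quoted above (33% of 15 winners), so treat it as an assumption.

```python
SLATE_SLOTS = 72   # Hugo Best Novel slots, 2001-2014
WINNERS = 15       # includes the 2010 tie

groups = {
    "all 13 repeaters": {"slots": 48, "wins": 8},
    "top 7 repeaters": {"slots": 33, "wins": 5},  # 5 wins is inferred from 33%
}

ratios = {}
for name, g in groups.items():
    slate_share = g["slots"] / SLATE_SLOTS
    win_share = g["wins"] / WINNERS
    # Below 1.0 means the group wins less often than its slate share suggests.
    ratios[name] = win_share / slate_share
    print(f"{name}: {slate_share:.1%} of slate, "
          f"{win_share:.1%} of wins, ratio {ratios[name]:.2f}")
```

Both ratios come out below 1, which matches the point that heavy presence on the Hugo slate did not translate into a matching share of wins.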
Remember, the pool of winners (15) is a very small sample size, and we shouldn’t put too much stock in specific numbers. Instead, it’s the trends that matter. The main trend for the Hugo seems to be this: past Hugo Best Novel history doesn’t seem to matter much when it comes to actually winning the award. Given that making the slate seems biased towards past Best Novel nominees, this is an interesting result.
I’ll take a look at the winners of the Nebula in the next post.
In Part 1 of this report, we discussed how the Hugo and Nebula Award for Best Novel are heavily weighted towards writers who have previously been nominated for those awards, to the tune of 65% for the Hugo and 50% for the Nebula. While those numbers are interesting (and perhaps eye-opening), they don’t tell us how centralized these awards are. Is this a bunch of different writers receiving 2 nominations each, or a few select writers receiving 6, 7, or more nominations?
Today, we’ll look at how frequently the most popular writers were nominated in the 2001-2014 time period for the Hugo and Nebula award. The methodology here is simple: I took the lists of the Hugo and Nebula Best Novel nominees from 2001-2014 and counted the number of nominations each author received. Here are the results.
Hugo Awards: From 2001-2014, 37 unique authors (counting Jordan/Sanderson for Wheel of Time as one author) were nominated for a total of 72 Hugo Award Best Novel slots. 24 of those authors received only one nomination, and the 13 other authors shared the remaining 48 nominations. Here’s the list of the “repeaters”:
Table 1: Number of Nominations for Best Novel Hugo Award for Repeat Nominees, 2001-2014
This list would have been even more pronounced if Neil Gaiman hadn’t turned down two Hugo nominations, one for Anansi Boys and one for The Ocean at the End of the Lane. Even without that, there is still a very definite centralization in the Hugo Award for Best Novel. The top 7 candidates (Stross, Mieville, Sawyer, Bujold, Grant, Scalzi, and Wilson, all of whom have at least 4 nominations in the past 14 years) racked up an impressive 33 nominations (out of 72 total), for 45.8% of the Hugo award slate. The rest of the SFF publishing world received 39 nominations, for 54.2% of the slate.
Nebula Awards: The Nebula is rather more balanced. In the 2001-2014 period, 61 unique authors were nominated for a total of 87 Best Nebula novel slots. 46 authors received only one nomination each, with the remaining 15 “repeaters” sharing 41 nominations. Here’s the list:
Table 2: Number of Nominations for Best Novel Nebula Award for Repeat Nominees, 2001-2014
With the exception of Jack McDevitt’s world-crushing domination of the Nebula nominations, that’s a pretty evenly distributed list: a fair number of authors getting 2 or 3 nominations, but no one (but McDevitt) getting 4 or 5 nominations. The top 5 “repeater nominees” (McDevitt, Bujold, Hopkinson, Jemisin, and Mieville, each of whom has at least 3 nominations) managed 21 nominations between them, accounting for 24.1% of the total nominations. That number is a little misleading since McDevitt alone accounted for 10% of the 2001-2014 field. As a side note, I have no idea why McDevitt has done so well in the Nebulas. In any statistical analysis of the Nebulas, his domination distorts the numbers, and certainly makes the Chaos Horizon predictions more difficult. If anyone has insight into his success, please share.
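For anyone who wants to check the concentration figures, here is the arithmetic in one place. The counts are the ones quoted in this post; `share` is just a small helper for the example.

```python
def share(part, total):
    """Percentage of `total` slots taken by `part` nominations."""
    return 100.0 * part / total


HUGO_SLOTS, NEBULA_SLOTS = 72, 87

hugo_top7 = share(33, HUGO_SLOTS)            # Stross, Mieville, Sawyer, ...
nebula_top5 = share(21, NEBULA_SLOTS)        # McDevitt plus four others
mcdevitt = share(9, NEBULA_SLOTS)            # McDevitt on his own
nebula_other4 = share(21 - 9, NEBULA_SLOTS)  # top 5 with McDevitt removed

print(f"Hugo top 7:              {hugo_top7:.1f}%")
print(f"Nebula top 5:            {nebula_top5:.1f}%")
print(f"McDevitt alone:          {mcdevitt:.1f}%")
print(f"Nebula top 5 minus him:  {nebula_other4:.1f}%")
```

Splitting McDevitt out shows how much of the Nebula’s apparent concentration comes from a single author.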
So, in conclusion: the Hugo is heavily centralized around a small number of repeat nominees. The Nebula, with the exception of Jack McDevitt, is spread out over a much greater number of authors, and demonstrates only mild centralization.
In the next part of this report, we’ll look at what impact repeat nominations have on the chances of actually winning the Hugo or Nebula for Best Novel.
One of the basic hypotheses of Chaos Horizon is that the Hugo and Nebula Awards for Best Novel are predictable. That doesn’t mean we can predict them with 100% accuracy, but that by studying past patterns in the Hugo and Nebula awards, we can make statistical estimates as to what is likely to happen in the future.
One of the things that makes Chaos Horizon possible is that both the Hugo and the Nebula are very repetitive: they tend to nominate the same authors over and over again. In this report, we’ll look at how extensive repeat nominations are in the 2001-2014 time period. Like my last report on Dates and the Hugo award, I’ll break this up into several posts:
1. Number of repeat nominees in the Hugo and Nebula Award for Best Novel, 2001-2014 (this post)
2. Percentage of Hugo and Nebula Best Novel nominations going to the same authors, 2001-2014
3. The impact of repeat nominees on winning the Hugo Best Novel award, 2001-2014
4. The impact of repeat nominees on winning the Nebula Best Novel Award, 2001-2014
5. Conclusions and discussion
Methodology: Since this is a simpler study than the Hugo Awards date study, the methodology is simpler. I cross-referenced the list of Hugo and Nebula nominees with the excellent Science Fiction Awards Database (sfadb.com) to see if each of the nominees from 2001-2014 had received a prior nomination for Best Novel. I kept the Hugo and Nebula award lists separate; i.e. I didn’t count someone as a repeat nominee for the Hugo if they had received a prior Nebula nomination, and vice versa.
With regard to dates, I’m using the 2001-2014 period so we can get a picture of how the Hugos and Nebulas operate in the 21st century. I believe that the awards have changed substantially since the 1980s and 1990s, and including data from previous eras (before there were e-books or widespread internet usage) would skew the data.
Results: The first thing we’ll look at is the number of repeat nominees in both the Hugo and Nebula awards.
For the Hugo Awards for Best Novel, there were 72 nominated books between 2001-2014. For 47 of those 72, the authors had received at least one prior nomination for the Best Hugo Novel Award. This left only 25 “first timer” nominees in that 14 year period, or less than 2 per year. The Hugo is a difficult slate to crack into, but once you make it into the club, you’re very likely to get nominated again.
The Nebula is a little friendlier to first timers. In the 2001-2014 time period, there were 87 total books nominated. 43 of those were by repeat nominees, and 44 by first timers. Let’s take a look at that visually:
We can conclude that the Hugo Award is heavily weighted towards past nominees, and the Nebula Award is roughly weighted 50/50. This is a very strong bias towards past nominees, and it is certainly one of the defining features of these awards.
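The percentages behind that conclusion follow directly from the counts above. A minimal sketch of the calculation, using only the numbers given in this post:

```python
YEARS = 14  # 2001-2014 inclusive

awards = {
    "Hugo": {"repeat": 47, "first_timer": 25},
    "Nebula": {"repeat": 43, "first_timer": 44},
}

summary = {}
for name, counts in awards.items():
    total = counts["repeat"] + counts["first_timer"]
    summary[name] = {
        "total": total,
        "repeat_pct": 100.0 * counts["repeat"] / total,
        "first_timers_per_year": counts["first_timer"] / YEARS,
    }
    s = summary[name]
    print(f"{name}: {s['total']} nominees, {s['repeat_pct']:.1f}% repeat, "
          f"{s['first_timers_per_year']:.1f} first timers per year")
```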
The best indicator—by a long shot—of future award success is past award success. I’ll leave it up to you to decide whether or not this level of repeat nomination is healthy for the awards.
In our next post, we’ll look at how centralized these awards are: exactly how many of these repeaters are there, and are there “super repeaters” that dominate the Hugo and Nebula slates?
One of the more common critiques (that’s probably too strong a word, let’s use “comments,” because no one is trying to be hostile) about Chaos Horizon is that it’s too early to start thinking about the 2015 Hugo and Nebula awards. So why am I predicting Hugo and Nebula slates so early?
As a reader, I’m interested in the Hugo and Nebula awards because they allow me to keep track of the trends—and controversies—going on in the SFF world. Over the past 4 or 5 years, I’ve tried to read each of the Hugo and Nebula nominees, but I noticed I was falling farther and farther behind. When the Nebula nominees would come out in late February, I’d find myself having to buy 4 or 5 books, and then having to buy another 2 or 3 books when the Hugo slate got announced. By then, I’d be so far behind I wouldn’t have time to read all of the nominees before awards season. Because of this, I ended up missing out on the conversations and arguments that go along with choosing the Hugos and Nebulas.
Last year, I decided to get ahead of the curve, and read the major SFF books as they came out. I went looking for resources on potential Hugo and Nebula nominees, and there wasn’t much out there. Many SFF review sites are very enthusiastic about the genre (as they should be), and end up recommending lots and lots of books. I don’t have the time (or money) to read 30-40 new SFF novels a year; I need to contain my SFF reading to about one book a month, both for my pocketbook and sanity.
Thus Chaos Horizon. By looking for past trends in the Nebula and Hugos, I figured I could come up with the most likely nominees myself. That way, I’ll save myself a little bit of time and money by getting a jump on the award season. I’m not going to be 100% accurate (that’s an impossibility), but if I can predict (and read) a good chunk of the eventual nominees by the end of the year, I’ll only have to buy a few books when the slates come out. I also think readers need time to process all these books. It’s not easy to zip through The Bone Clocks and Echopraxia, so if I’m going to read them, I want to read them over Christmas and other vacations, when I have some actual time to dedicate to them. Better to have a good list by October than to have to wait until February.
Is there a downside to thinking about the awards early? Some could argue that this is going to slight books that come out later in the year, but I’m not so sure. Won’t drawing attention to contenders as they come out leave plenty of space for contenders from the end of the year? Anyways, most Hugo noms are published between May and October, so that prime season is almost over. Another objection could be that predicting early ends up utilizing things like reputation, marketing, and past awards history, rather than the actual content of the novel. I think that objection is 100% true—but that’s also what nets award nominations. The Hugos and Nebulas are stuffed with repeat nominees, and that statistical consistency is what makes Chaos Horizon possible.
So tl;dr: I started Chaos Horizon because I don’t have enough money or time to buy and read all the Nebula/Hugo nominated books when the slates are announced. I wanted to get a start on selecting and reading these books earlier, and thus the site. Questions? Objections?
Over the past several days, Chaos Horizon has been looking at the correlation between US publication dates and the frequency of being nominated for or winning a Hugo Award for Best Novel, 2001-2014. Today, we’ll wrap up that report and open the floor for discussion and questions. Here are the previous posts (with charts and data!): Part 1, Part 2, Part 3.
Based on the previous posts, I believe the conclusion we can reach is simple: there is a definite “publication window” that extends from May to October. About 75% of Hugo nominees come from this window, as do 85% of the winners. May and September were the best Hugo-winning months, perhaps correlating to the start of the Summer and Christmas book-buying seasons.
1. How does this “window” correlate with the number of SFF books published per month? That’s not an easy statistic to find, although we can make a rough estimate based on Locus Magazine’s list of SFF books per month. I trust LocusMag (they’ve been making this list for a long time, so their methodology is likely consistent), but this estimate is going to be very rough. We should only pay attention to the trends in this chart, not precise numbers:
This is what we might expect: there is a definite spike in books published right before the Christmas book-buying season, a drop-off in December and January, and a slight spike during the Summer book-buying season. Since more books are published in May, September, and October, it should come as no surprise that more Hugo nominees and winners come from that time period.
From a publisher’s perspective, it might be that the Summer season is being neglected; it looks like everyone wants to publish in September and October. If I were an author, I might prefer to be published in May: there’s a softer market (fewer titles to compete with), and maybe more of a chance for publicity and for being read.
2. Are we looking at a self-fulfilling prophecy? Do publishers believe that May-October are the best months for publishing potential Hugo books? In other words, do publishers hold their Hugo books until this window, thus biasing the stats as a result? Would publishers be better off trying other months, in an attempt to break through to an audience that needs books to read?
3. Is the internet changing the importance of publication dates? If so, how? Do e-books provide more immediate access than print books, and would that alter the publication window? Could publishers extend the window by dropping e-book prices later in the year?
4. How much stock can we place in this study, given the relatively small amount of data: 68 nominees and 14 winners? Is this too small a data set to draw reliable conclusions from?
5. Is it fair to only think about US publication dates? How would UK (or international) publication dates factor in?
Lastly, are there any concerns or issues you’d like to raise about this study? Statistics can be incredibly misleading, as they depend enormously both on the data set and the statistical model being set up by the analyst (in this case, me). Chaos Horizon is committed to transparency in all reports. How else could the study be set up? How could we provide a more complete picture of publication dates and the Hugo Award?
This methodology post is unlikely to be of much interest to the casual reader, but I’m recording this information in case anyone wants to double-check the data, or to call into question the kind of data I used. It is very easy to mislead the public using statistics, and Chaos Horizon is trying to avoid that by providing maximum transparency on all studies and reports. If you have questions, ask in the comments or e-mail me at email@example.com.
Date Range: Why 2001-2014? I used this date range because 2001 marks a substantial shift in the Hugo awards. Prior to 2001, the Hugo award for Best Novel was basically a SF award, with all prior awards having gone to Science Fiction novels. J.K. Rowling won for Harry Potter and the Goblet of Fire in 2001, which opened up the Hugos to all sorts of different genres and types of books, and can be thought of as starting the “modern” era of the award. There is also undeniable convenience to starting studies with the new millennium. It’s also hard to believe that the book market back in something like 1994 was the same as now: no internet, no e-books, vastly different audience and buying habits. The farther we go back in time, the more we cloud the statistics.
September 2014 is when the study was made, which sets the upper limit of the date range.
Limitations: I limited myself to US publication dates in this study, although the Hugo encompasses American, British, and international authors and voters. No novel in translation was nominated for the Hugo Award from 2001-2014, so the exclusion of international publication dates seems justified.
British publication dates were trickier, and I initially explored them in some detail. That data is present on the third page of the Excel spreadsheet. British dates were not as readily accessible, and even when I could find them I had no real way of double-checking them. Furthermore, some texts were published simultaneously in the UK; in the case of British authors, some texts were published earlier; and in the case of American authors, some texts were published later. Those discrepancies introduced a great deal of uncertainty into the project, as it wasn’t clear which date should be used. British publication dates likely greatly impacted the years the WorldCon was in the UK, and had less impact when the WorldCon was in the US. If anyone can think of a clever way to find and handle British publication dates, I’m all ears.
Sources: To find the publication dates, I utilized three main sources. First, I used the Internet Speculative Fiction Database, found at www.isfdb.org, to come up with an initial publication date. It is probably the most in-depth resource for finding information about different SFF book editions. I used the first available date for US print editions in this study, excluding limited-availability special editions.
Second: I cross-checked that isfdb date with Amazon. While we can debate some of Amazon’s sales practices, there is no doubt about the wide variety of book-related information their site offers. Since they are a professional book-seller, they have a huge stake in providing accurate data. Again, I tried to find the earliest published print edition, and, whenever possible, to match the ISBN of that edition against the isfdb.org info.
Interestingly—and frustratingly—the isfdb.org and amazon.com information often disagreed. Of the 68 dates provided, there were discrepancies in 20 of them. However, these were often very minor: isfdb.org reporting a March publication date, and amazon.com reporting a late February date. In general, amazon.com usually reported earlier publication dates by a few weeks.
Third: If the isfdb.org date and the amazon.com date disagreed, I went to the Barnes and Noble website to resolve the issue. Like amazon.com, this provides a wealth of information, and I trust their database because that’s how they make their money. In almost all instances, the amazon.com date agreed with the bn.com date, so I went with the amazon/bn publication date. All disagreements are marked in the Excel spreadsheet.
Any discrepancies were only a matter of weeks (pushing a book from June to July), and are unlikely to cause major changes in the analysis. Still, you might want to avoid placing too much stock in any individual month; I believe the ranges of the seasons are more reliable.
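The three-source check described above can be sketched as a small Python function. Treat this as an illustration rather than the exact procedure: the name `resolve_pub_date` is hypothetical, comparing dates at the month level is my assumption (the study works in months), and the fallback for the case where all three sources differ is also an assumption, since the write-up only says the Amazon and Barnes and Noble dates agreed in almost all instances.

```python
from datetime import date
from typing import Optional, Tuple

def month_key(d: date) -> Tuple[int, int]:
    """Reduce a date to (year, month); month-level comparison is an assumption."""
    return (d.year, d.month)

def resolve_pub_date(isfdb: date, amazon: date,
                     bn: Optional[date] = None) -> Tuple[date, bool]:
    """Return (chosen_date, disagreement_flag) for one book."""
    # Steps 1 and 2: start from the isfdb date and cross-check it with Amazon.
    if month_key(isfdb) == month_key(amazon):
        return isfdb, False
    # Step 3: the sources disagree, so a Barnes and Noble date is the tie-breaker.
    if bn is None:
        raise ValueError("sources disagree; a Barnes and Noble date is needed")
    if month_key(bn) == month_key(amazon):
        return amazon, True   # the amazon/bn date wins
    if month_key(bn) == month_key(isfdb):
        return isfdb, True
    # ASSUMPTION: all three differ. The post does not cover this case, so this
    # sketch falls back to the earliest date; flag it for a manual look.
    return min(isfdb, amazon, bn), True

# Hypothetical example: isfdb says March, Amazon and B&N say late February.
chosen, flagged = resolve_pub_date(date(2009, 3, 1), date(2009, 2, 24), date(2009, 2, 24))
print(chosen, flagged)
```

The boolean flag mirrors the way disagreements are marked in the spreadsheet, so a flagged row is easy to revisit later.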
Other possible sources: I tried out several other possible sources for publication data before discarding them. Both WorldCat and the Library of Congress, two major sources for cataloging books, only provided publication month, and I wanted information that was as precise as possible.
Notes: Four nominated texts were excluded from the study. Robert Jordan and Brandon Sanderson’s The Wheel of Time is a series of 14 novels published over decades. Connie Willis won for Blackout/All Clear, two novels published during the same year. I could have used both dates, but I decided to go with neither to keep the data clear. Two books, both from the 2005 Hugos held in Glasgow, did not receive American releases prior to their year of nomination; those were River of Gods by Ian McDonald and The Algebraist by Iain M. Banks.
Weakness of the Study: With only 68 pieces of data, we’re falling far short of a substantial data set. As a result, small changes in the data (an individual author publishing in October rather than September, say) may affect the final results unduly. Since each individual novel accounts for around 1.5% of the total data, take everything with a grain of salt. While I feel it likely the broader conclusions are accurate, the specifics of months, particularly for the winners, probably need to be de-emphasized. We shouldn’t place all that much stock in the fact that Jo Walton published Among Others in January rather than February, for instance.
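To put a rough number on that uncertainty, here is a hedged Python sketch using a Wilson score interval for a proportion. This is a technique I am swapping in for illustration, not something used in the study itself. The counts are also assumptions: 51 of 68 nominees and 12 of 14 winners are simply the whole numbers closest to the "about 75%" and "85%" figures, not counts taken from the spreadsheet.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Share of the data carried by a single novel: 1 / 68, or roughly 1.5%.
print(f"one novel = {1 / 68:.1%} of the data")

# ASSUMED counts, reconstructed from the rounded percentages in the report.
for label, hits, total in [("nominees", 51, 68), ("winners", 12, 14)]:
    low, high = wilson_interval(hits, total)
    print(f"{label}: {hits}/{total} in window, interval {low:.0%} to {high:.0%}")
```

With only 14 winners, the winners’ interval comes out much wider than the nominees’, which is another way of saying the winner percentages deserve the most caution.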
While I could expand the data back another decade, and likely pick up 50+ more dates, I’ve decided not to go that route. I feel that the publishing market in the 1990s was substantially different than the publishing market in the 2000s, and that this additional data would not contribute much to the study. If someone else feels otherwise, and would like to chart that data, feel free. Send me a link if you do the analysis.
Here’s a link to the Excel spreadsheet that gathers all the data: Hugo Dates Study.
I think that sums up methodology questions. Let me know if you need any other information.