Periodically, Chaos Horizon publishes extensive reports on various issues relating to SFF awards. One important context for this year’s Hugo controversy is the question of nomination numbers. New readers who are coming into the discussion may be unaware of how strange (for lack of a better word) the process is, and how few votes it has historically taken to get a Hugo nomination, particularly in categories other than Best Novel. As a little teaser of the data we’ll be looking at, consider this number: in 2006, it only took 14 votes to make the Best Short Story Hugo final ballot.
While those numbers have risen steadily over the past decade, they’re still shockingly low: in 2012, it took 36 votes in the Short Story category; in 2013, it took 34 votes; in 2014, we jumped all the way to 43. This year, with the Sad Puppy/Rabid Puppy influence, the number tripled to 132. That huge increase causes an incredible amount of statistical instability, to the point that this year’s data is “garbage” data (i.e. confusing) when compared to other years.
Without a good grasp of these numbers and trends, many of the proposed “fixes”—if a fix is needed at all, and if this isn’t something that will work itself out over 2-3 years via the democratic process—might exacerbate some of the oddities already present within the Hugo. The Hugo has often been criticized for being an “insider” award, prone to log-rolling, informal cliques, and the like. While I don’t have strong opinions on any of those charges, I think a solid grounding in the numbers helps us better understand what’s going on this year.
Chaos Horizon is an analytics, not an opinion, website. I’m interested in looking at all the pieces that go into the Hugo and other SFF awards, ranging from past patterns, biases, and oddities, to making future predictions as to what will happen. I see this as a classic multi-variable problem: a lot of different factors go into the yearly awards, and I’ve set myself the task of trying to sort through some (and only some!) of them. Low nominating numbers are one of the defining features of the Hugo award; that’s just how the award has worked in the past. That’s not a criticism, just an observation.
I’ve been waiting to launch this report for a little while, hoping that the conversation around this year’s Hugos would cool off a little. It doesn’t look like that’s going to happen. The sheer flood of posts about this year’s Hugos reveals the desire that various SFF communities have for the Hugo to be the “definitive” SFF award, “the award of record.” File 770 has been the best hub for collecting all these posts; check them out if you want to get caught up on the broader conversation.
I don’t think any award can be definitive. That’s not how an award works, whether it’s the Hugo, the Pulitzer, or the Nobel prize. There are simply too many books published, in too many different sub-genres, to too many different types of fans, for one award to sort through and “objectively” say this is the best book. Personally, I don’t rely on the Hugo or Nebula to tell me what’s going on in the SFF field. I’ve been collating an Awards Meta-List that looks at 15 different SFF awards. That kind of broad view is invaluable if you want to know what’s happening across the whole field, not only in a narrow part of it. Lastly, no one’s tastes are going to be a perfect match for any specific award. Stanislaw Lem, one of my favorite SF authors, was never even nominated for a Hugo or Nebula. That makes those awards worse, not Lem.
Finally, I don’t mean this report to be a critique of the Worldcon committees who run the Hugo award. They have an incredibly difficult (and thankless) job. Wrestling with an award that has evolved over 50 years must be a titanic task. I’d like to personally thank them for everything they do. Every award has oddities; they can’t help but have oddities. Fantasizing about some Cloud-Cuckoo-Land “perfect” SFF award isn’t going to get the field anywhere. This is where we’re at, this is what we have, so let’s understand it.
So, enough preamble: in this report we’ll be looking at the last 10 years of Hugo nomination data, to see what it takes to get onto the final Hugo ballot.
Background: If you already know this information, by all means skip ahead.
TheHugoAwards.org themselves provide an intro to the Hugos:
The Hugo Awards, presented annually since 1955, are science fiction’s most prestigious award. The Hugo Awards are voted on by members of the World Science Fiction Convention (“Worldcon”), which is also responsible for administering them.
Every year, the attending or supporting members of the Worldcon go through a process to nominate and then vote on the Hugo awards. There are a great many categories (it’s changed over the years; we’re at 16 Hugo categories + the Campbell Award, which isn’t a Hugo but is voted on at the same time by the same people) ranging from Best Novel down to more obscure things like Best Semiprozine and Best Fancast.
If you’re unfamiliar with the basics of the award, I suggest you consult the Hugo FAQs page for basic info. The important bits for us to know here are how the nomination process works: every supporting and attending member can vote for up to 5 things in each category, and each of those votes counts equally. This means that someone who votes for 5 different Best Novels has 5 times as much influence as a voter who only votes for 1. Keep that wrinkle in mind as we move forward.
The final Hugo ballot is made up of the 5 highest vote-getters in each category, provided that they receive at least 5% of the total votes cast in that category. This 5% rule has come into play several times in the last few years, particularly in the Short Story category.
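To make those mechanics concrete, here is a minimal sketch of a nominating tally with the top-5 and 5% rules. The ballots and the `final_ballot` function are hypothetical illustrations, not the administrators' actual software.

```python
from collections import Counter

def final_ballot(ballots, slots=5, threshold=0.05):
    """Tally nominating ballots into a final ballot.

    ballots: one set of up to 5 nominated works per voter.
    A work makes the final ballot if it is among the top `slots`
    vote-getters AND appears on at least `threshold` of all ballots.
    """
    votes = Counter()
    for ballot in ballots:
        votes.update(ballot)  # every listed work gets one equal vote
    cutoff = threshold * len(ballots)
    return [work for work, n in votes.most_common(slots) if n >= cutoff]

# Hypothetical ballots: a full 5-work ballot counts 5 times as much
# as a single-work ballot, the "wrinkle" noted above.
ballots = [
    {"A", "B", "C", "D", "E"},
    {"A"},
    {"A", "B"},
    {"B", "F"},
]
finalists = final_ballot(ballots)
print(finalists)  # "A" and "B" lead; ties below them fall where they may
```

The 5% cutoff only bites when one work's support is tiny relative to the total ballot count, which is exactly what has happened in the Short Story category in recent years.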
Methodology: I looked through the Hugo Award nominating stats, archived at TheHugoAwards.org, and manually entered the highest nominee, the lowest nominee, and the total number of ballots (when available) for each Hugo category. Worldcon voting packets are not particularly compatible with data processing software, and it’s an absolute pain to pull the info out. Hugo committees, if you’re listening, create comma separated value files!
I chose 10 years as a range for two reasons. First, the data is easily available for that time range, and it gets harder to find for earlier years. The Hugo website doesn’t have the 2004 data readily linked, for instance. While I assume I could find it if I hunted hard enough, it was already tedious enough to enter 10 years of data. Second, my fingers get sore after so much data entry!
Since the Worldcon location and organizing committees change every year, the kind of data included in the voting results packet varies from year to year as well. Most of the time, they tell us the number of nominating ballots per category; some years they don’t. Some have gone into great detail (number of unique works nominated, for instance), but usually they don’t.
Two methodological details: I treated the Campbell as a Hugo for the purposes of this report: the data is very similar to the rest of the Hugo categories, and they show up on the same ballot. That may irk some people. Second, there have been a number of Hugo awards declined or withdrawn (for eligibility reasons). I marked all of those on the Excel spreadsheet, but I didn’t go back and correct those by hand. I was actually surprised at how little those changes mattered: most of the time when someone withdrew, it affected the data by only a few votes (the next nominee down had 20 instead of 22 votes, for instance). The biggest substantive change was a result of Gaiman’s withdrawal last year, which resulted in a 22 vote swing. If you want to go back and factor those in, feel free.
Thanks to all the Chaos Horizon readers who helped pull some of the data for me!
Here’s the data file as of 5/5/2015: Nominating Stats Data. I’ll be adding more data throughout, and updating my file as I go. Currently, I’ve got 450 data points entered, with more to come. All data on Chaos Horizon is open; if you want to run your own analyses, feel free to do so. Dump a link into the comments so I can check it out!
Results: Let’s look at a few charts before I wrap up for today. I think the best way to get a holistic overview of the Hugo Award nominating numbers is to look at averages. Across all the Hugo categories and the Campbell, what was the average number of ballots per category, the average votes for the top nominee (i.e. the work that took #1 in the nominations), and the average votes for the low nominee (the work that placed #5 in the nominations)? That sets down a broad view and lets us see what exactly it takes (on average) to get a Hugo nom.
Of course, every category works differently, and I’ll be more closely looking at the fiction categories moving forward. The Hugo is actually many different awards, each with slightly different statistical patterns. This makes “fixing” the Hugos by one change very unlikely: anything done to smooth the Best Novel category, for instance, is likely to destabilize the Best Short Story category, and vice versa.
On to some data:
This table gives us a broad holistic view of the Hugo Award nominating data. What I’ve done is taken all the Hugo categories and averaged them. We have three pieces of data for each year: average ballots per category (how many people voted), average number of votes for the high nominee, and average votes for the low nominee. So, in 2010, an average of 362 people voted in each category, and the top nominee grabbed 88 votes, the low nominee 47.
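The averaging itself is nothing fancy; here is a sketch with made-up numbers for three categories (the real figures for every category live in the spreadsheet linked above):

```python
# Hypothetical per-category nominating stats for a single year;
# these numbers are illustrative, not from the actual Hugo packets.
stats = {
    "Best Novel":       {"ballots": 708, "high_nom": 120, "low_nom": 69},
    "Best Short Story": {"ballots": 345, "high_nom": 43,  "low_nom": 23},
    "Best Fan Writer":  {"ballots": 199, "high_nom": 61,  "low_nom": 28},
}

def yearly_averages(stats):
    """Average each stat across all categories for one year."""
    n = len(stats)
    return {key: sum(cat[key] for cat in stats.values()) / n
            for key in ("ballots", "high_nom", "low_nom")}

print(yearly_averages(stats))
```

Averaging across categories smooths out the big gap between Best Novel and the smaller categories, which is exactly why it's useful for a first broad view.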
Don’t worry: we’ll get into specific categories over the next few days. Today, I want the broad view. Let’s look at this visually:
2007 didn’t include the number of ballots per category, thus the missing data in the graph. You can see in this graph that the total number of ballots is fairly robust, but that the number of votes for our nominated works are pretty low. Think about the space between the bottom two lines as the “sweet spot”: that’s how many votes you need to score a Hugo nomination in any given year. If you want to sweep the Hugos, as the Puppies did this year in several categories, you’d want to be above the Average High Nom line. For most years, that’s meant fewer than 100 votes. In fact, let’s zoom in on the High and Low Nom lines:
These graphs let us spot mathematical patterns that are hard to see when just looking at numbers. Take your hand and cover up everything after 2012 on Chart #2: you’ll see a steady linear increase in the high and low ranges over those 8 years, rising from about 60 to 100 for the high nominee and 40 to 50 for the low nominee. Nothing too unusual there. Take your hand off, and you’ll see an exponential increase from 2012-2015: the numbers shoot straight up. That’s a convergence of many factors: the popularity of LonCon, the Puppies, and the increased scrutiny the internet has brought to the Hugos.
What does all this mean? I encourage you to think about and analyze this data yourself, and certainly use the comments to discuss the charts. Don’t get too heated; we’re a stats site, not a yell-at-each-other site. There are plenty of those out there. :)
Lastly, this report is only getting started. Over the next few days—it takes me a little bit of time to put together such data-heavy posts—I’ll be drilling more deeply into various categories, and looking at things like:
1. How do the fiction categories work?
2. What’s the drop off between categories?
3. How spread out (i.e. how many different works are nominated) are the categories?
What information would be helpful for you to have about the Hugos? Are you surprised by these low average nomination numbers, or are they what you’d expect? Is there a discrepancy between the “prestige” of the Hugo and the nomination numbers?
A sub-category of my broader genre study, this post addresses the increasing influence of “literary fiction” on the contemporary Hugo and Nebula Awards for Best Novel, 2001-2014. I think the general perception is that the awards, particularly the Nebula, have begun nominating novels that include minimal speculative elements. Rather than simply trust the general perception, let’s look to see if this assumption lines up with the data.
Methodology: I looked at the Hugo and Nebula nominees from 2001-2014 and classified each book as either primarily “speculative” or primarily “literary.” Simple enough, right?
Defining “literary” is a substantial and significant problem. While most readers would likely acknowledge that Cloud Atlas is a fundamentally different book than Rendezvous with Rama, articulating that difference in a consistent manner is complicated. The Hugos and Nebulas offer no help themselves. Their by-laws are written in an incredibly vague fashion that does not define what “Science Fiction or Fantasy” actually means. Here’s the Hugo’s definition:
Unless otherwise specified, Hugo Awards are given for work in the field of science fiction or fantasy appearing for the first time during the previous calendar year.
Without a clear definition of “science fiction or fantasy,” it’s left up to WorldCon or SFWA voters to set genre parameters, and they are free to do so in any way they wish.
All well and interesting, but that doesn’t help me categorize texts. I see three types of literary fiction entering into the awards:
1. Books by literary fiction authors (defined as having achieved fame before their Hugo/Nebula nominated book in the literary fiction space) that use speculative elements. Examples: Cloud Atlas, The Yiddish Policeman’s Union.
2. Books by authors in SFF-adjacent fields (primarily horror and weird fiction) that have moved into the Hugo/Nebulas. These books often allow readers to see the “horror” elements as either being real or imagined. Examples: The Drowning Girl, Perfect Circle, The Girl in the Glass.
3. Books by already well-known SFF authors who are utilizing techniques/styles more commonplace in literary fiction. Examples: We Are All Completely Beside Ourselves, Among Others.
That’s a broad set of different texts. To cover all those texts—remember, at any point you may push back against my methodology—I came up with a broad definition:
I will classify a book as “literary” if a reader could pick the book up, read a random 50 page section, and not notice any clear “speculative” (i.e. non-realistic) elements.
That’s not perfect, but there’s no authority we can call on to make these classifications for us. Let’s see how it works:
Try applying this to Cloud Atlas. Mitchell’s novel consists of a series of entirely realistic novellas set throughout various ages of history and one speculative novella set in the future. If you just picked the book up and started reading, chances are you’d land in one of the realistic sections, and you wouldn’t know it could be considered a SFF book.
Consider We Are All Completely Beside Ourselves, Karen Joy Fowler’s rich meditation on science, childhood, and memory. Told in realistic fashion, it follows the story of a young woman whose parents raised a chimpanzee alongside her, and how this early childhood relationship shapes her college years. While this isn’t the place to decide if Fowler deserved a Nebula nomination—she won the PEN/Faulkner Award and was shortlisted for the Booker for this same book, so quality isn’t much of a question—the styles, techniques, and focus of Fowler’s book are intensely realistic. Unless you’re told it could be considered a SF novel, you’d likely consider it plain old realistic fiction.
With this admittedly imperfect definition in place, I went through the nominees. For the Nebula, I counted 13 out of 87 nominees (15%) that met my definition of “literary.” While a different statistician would classify books differently, I imagine most of us would be in the same ballpark. I struggled with The City & The City, which takes place in a fictional dual city and utilizes a noir plot; I eventually saw it as being more Pynchonesque than speculative, so I counted it as “literary.” I placed The Yiddish Policeman’s Union as literary fiction because of Chabon’s earlier fame as a literary author. After he establishes the “Jews in Alaska” premise, large portions of the book are straightforwardly realistic. Other books could be read either as speculative or not, such as The Drowning Girl. Borderline cases all went into the “literary” category for this study.
Given that I like the Chabon and Mieville novels a great deal, I’ll emphasize I don’t think being “literary” is a problem. Since these kinds of books are not forbidden by the Hugo/Nebula by-laws, they are fair game to nominate. These books certainly change the nature of the award, and there are real inconsistencies—no Haruki Murakami nominations, no The Road nomination—in which literary SFF books get nominated.
As for the Hugos, only 4 out of 72 nominees met my “literary” definition. Since the list is small, let me name them here: The Years of Rice and Salt (Robinson’s realistically told alternative history), The Yiddish Policeman’s Union, The City & The City, and Among Others. Each of those pushes the genre definitions of speculative fiction. Two are flat-out alternative histories, a category traditionally considered SFF, although I think the techniques used by Robinson and Chabon are very reminiscent of literary fiction. The Mieville is an experimental book, and the Walton is as much a book “about SFF” as an SFF book. I’d note that 3 of those 4 (all but the Robinson) received Nebula nominations first, and that Nebula noms have a huge influence on Hugo noms.
Let’s look at this visually:
Even with my relatively generous definition of “literary,” that’s not a huge encroachment. Roughly 1 in 6 of the Nebula noms have been from the literary borderlands, which is lower than what I’d expected. While 2014 had 3 such novels (the Fowler, Hild, and The Golem and the Jinni), the rest of the 2010s had about 1 borderline novel a year.
The Hugos have been much less receptive to these borderline texts, usually only nominating them once the Nebula has done so. We should note that both the Chabon and the Walton won, once again reflecting the results of the Nebula.
So what can we make of this? The Nebula nominates “literary” books about 1 time in 6, or roughly once per year. The Hugo does this much more infrequently, and usually only when a book catches fire in the Nebula process. While this represents a change in the awards, particularly the Nebula, it is nowhere near as rapid or significant as the changes regarding fantasy (which are around 50% Nebula and 30% Hugo). I know some readers think “literary” stories are creeping into the short story categories; I’m not an expert on those categories, so I can’t meaningfully comment.
I’m going to use the 15% Nebula and 5% Hugo “literary” number to help shape my predictions. I may have been overestimating the receptiveness of the Nebula to literary fiction; this study suggests we’d see either Mitchell or Mandel in 2015, not both. Here’s the full list of categorizations. I placed a 1 by a text if it met the “literary” definition: Lit Fic Study.
Yesterday, we looked at the Nebula slate; today, we’ll look at the Nebula winners. I show seven fantasy novels (out of 50 winners total; there was a tie in 1967) as having won the Nebula Award:
1982: Claw of the Conciliator, Gene Wolfe
1988: Falling Woman, Pat Murphy
1991: Tehanu, Ursula K. Le Guin
2003: American Gods, Neil Gaiman
2005: Paladin of Souls, Lois McMaster Bujold
2009: Powers, Ursula K. Le Guin
2012: Among Others, Jo Walton
Interestingly, the 1980s were better for winning than the 1990s (we’ll see that also reflected in the Hugo in the upcoming days), and things have picked up a great deal in the last 15 years for fantasy. This is a pretty broad slice of fantasy: we have secondary world novels with Bujold and Le Guin, contemporary fantasy with Walton and Gaiman, and Wolfe’s nearly unclassifiable Dying Earth style book. Here’s the data and charts:
The chart is pretty zig-zaggy because we’re dealing with such small numbers (10 per decade), although you do see a gradual increase over time in the direction of fantasy wins. Still, the “win” chart is nowhere near as dramatic as the “nominee” chart, suggesting that it’s easier to get nominated as a fantasy novel than to win as one.
We can conclude that fantasy novels tend to underperform once they reach the slate: since 1980, fantasy novels have made up 32% of the slate but only account for 20% of the wins. That’s a statistically significant bias against fantasy novels winning, something I need to take into account for my future predictions.
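One way to sanity-check that claim is a one-sided binomial test: if fantasy novels won in proportion to their 32% slate share, how surprising would 7 or fewer fantasy wins since 1980 be? Here is a standard-library sketch; the counts (roughly 35 awards since 1980, 7 fantasy wins) are approximations from the figures above, not exact tallies.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Approximate counts since 1980: ~35 Nebula Best Novel awards,
# 7 fantasy wins, 32% fantasy share of the nomination slate.
p_value = binom_cdf(7, 35, 0.32)
print(f"P(7 or fewer fantasy wins | 32% slate share) = {p_value:.3f}")
```

Whether that p-value clears a conventional significance threshold depends on the exact counts, but it shows how the slate-versus-wins gap can be tested rather than eyeballed.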
In an odd way, the more fantasy novels get nominated, the harder it can be for a fantasy novel to win, as the fantasy vote ends up getting split across the slate. 2013 is a perfect example of this: one SF novel faced off against 5 fantasy novels. 2312 ended up winning, because I imagine all the “the Nebula should go to a SF novel” SFWA voters voted for Robinson, and the fantasy votes were spread out across the other 5. If we’re considering genre alone, fantasy books are at a disadvantage in winning. Of course, genre alone does not determine the winner, as many other factors—familiarity, reception, popularity, demographics, etc.—also come into play.
In a statistical study like this, you have to think about what the “baseline” might be, i.e. what the stats would be like without bias. Is the Nebula an award moving toward a 50/50 split between fantasy and science fiction? Why should/would 50/50 be the baseline? Isn’t fantasy more popular than science fiction, at least in terms of readers in 2014? What about critical prestige? What about the nebulous and nearly impossible to define idea of “tradition”? How about the bias towards well-known authors? How about potential biases regarding gender? What about the bias against books in a series or sequels?
All of these factor into the eventual fantasy/science fiction split the Nebula arrives at, and all of these factors change over time. Trying to cross-correlate all those variables before we have a basic understanding would only result in mass confusion. As Chaos Horizon slowly builds up its data sets, the best we can do is think about the statistical moment we’re at right now, as predicting even the next 5 years is very difficult. So, to sum up the situation for the Nebula:
1. The Nebula slate breaks into three eras: a 15 year period (1966-1980) where fantasy was largely excluded, a 30 year period (1980-2010) where fantasy was around 25%-30% of the slate, and a more recent era (2010-2014) where fantasy has overtaken SF on the slate. Are the last 5 years a statistical aberration or something that is likely to continue?
2. The Nebula winners have been more consistent since 1980, accounting for around 20% of the wins, with a general increase in % of winners over time. Be aware that these conclusions are shakier because the numbers are smaller. Nonetheless, fantasy novels have underperformed on the slate, winning at a smaller proportion than their SF peers.
Tomorrow, we’ll look at how the Hugo Award nominees have shaped up! Any questions so far?
It’s that time again: today we launch the most ambitious Chaos Horizon report yet, a look at genre and the Hugo and Nebula Awards for Best Novel. Note the lack of dates: we’re doing the whole thing, from the beginning of both awards to the present!
When the Hugo Awards started in 1953 and the Nebula Awards in 1966, they were exclusively science fiction awards, and this SF bias is still part of the Hugo/Nebula DNA. Consider their names or the statues they give: I imagine it’s odd to win a rocket ship for a fantasy novel.
Over time, both awards have changed, moving beyond SF to other speculative genres: fantasy, literary speculative fiction, slipstream, urban fantasy, horror, etc. In this report, we’ll try to quantify and chart out that change. Like always, I’m going to break this into several parts:
Part 1: Introduction and Methodology (this post)
Part 2: Genre and the Nebula Award Best Novel nominees
Part 3: Genre and the Nebula Award Best Novel winners
Part 4: Genre and the Hugo Award Best Novel nominees
Part 5: Genre and the Hugo Award Best Novel winners
Part 6: A Closer Look: Fantasy Sub-Genres in the Nebulas
Part 7: A Closer Look: Fantasy Sub-Genres in the Hugos
Part 8: A Closer Look: Literary Fiction in the Nebulas and Hugos
Whew, that’s a lot of upcoming posts! And, of course, I’ll entertain any requests for clarification/further data that my readers might have; it’s very helpful to have different eyes on a statistical project, as this can help me see my own blind spots and biases.
Basic Methodology: For this report, I assigned a genre for each of the almost 600 nominees for the Hugo or Nebula award for Best Novel. That breaks down to 311 Nebula nominees and 288 Hugo nominees, with some obvious overlap between the two awards. For the first pass—we’ll look deeper at some of the categories—I used one of three categories: Science Fiction, Fantasy, and the dreaded Other. So that brings us to our most basic methodological question: how do you define genre?
Defining Genre: For a term we use ubiquitously in our day to day lives, genre is a surprisingly slippery concept. The first temptation is to try to define a genre structurally. So, for instance:
Science Fiction: Takes place in the future, involves advanced technology and/or aliens.
Fantasy: Takes place in the past with no advanced technology, involves magic and/or dragons and other non-real races or creatures.
Okay, we’re good to go, aren’t we? Nice, basic definitions. Classify away! Once you do this, though, exceptions start popping up all over the place. What about Alternate History novels? Those are traditionally classified as Science Fiction, but they don’t fit my definition. I’d have to go back and expand that. Same thing happens with Fantasy: Harry Potter doesn’t take place in the past, but in the present moment. Okay, maybe magic is the defining feature of Fantasy. Then again . . . does every Conan story involve magic? There are also plenty of stories where Conan just gets revenge by hacking up people, no magic involved. What about Steampunk? Is that Fantasy or Science Fiction?
What you’ll find is that any structural definition of Science Fiction or Fantasy blurs at the edges. This is because these are living genres, changing over time and with the different ideas/aesthetics of writers and readers. What “Fantasy” means today is different than what “Fantasy” meant 50 years ago, and it will continue to change in the future.
This is further complicated by the marketplace. From a branding perspective, fantasy was not well-regarded in the 1960s, and many fantasy writers threw a thin gloss of science fiction onto their fantasy books to get around the problem. Anne McCaffrey, Tanith Lee, Jack Vance, Marion Zimmer Bradley, etc., all did this at times, and you wound up with something like the “sword and planet” sub-genre that can be difficult and deeply unsatisfying to classify.
Maybe we should switch to some notion of authorial intent, and focus on the emotion the book tries to evoke: if the book tries to evoke horror, it’s horror. If heroism, it’s fantasy. If a sense of wonder at a technological future, science fiction. Once again, you’ll very quickly run into problems with this: you’ll end up playing “Genre Police,” trying to reinforce borders that are constantly being overwritten, and saying things like, “This isn’t real fantasy!” I find that one of the more uninteresting observations a critic can make about a work of literature.
All is not lost. Just because blue blurs into green at the edges doesn’t mean that blue and green aren’t distinct colors, and this is also true for genres like fantasy and science fiction. While it might be hard to classify something like The Dying Earth, the difference between Ancillary Sword and A Game of Thrones is very obvious.
I’ve talked in circles, though, and haven’t resolved the basic problem: how do we define genre?
I ultimately went with a “reader-reception” theory of genre: if the majority of readers at the time of publication thought a book was fantasy or science fiction, that’s what I classified it as. There are several benefits to this approach:
1. It removes my bias from the equation; we don’t want my opinion as to whether or not Claw of the Conciliator is SF or F determining the data.
2. For the most part, this is easy to ascertain (see below).
3. It provides a historical look at genre, rather than reclassifying novels based on our present-day definitions of genre.
To measure reader-reception of genre, I primarily used the Locus Awards classifications. The Locus Awards, a source I rely on heavily, are an annual vote by the readers of Locus Magazine regarding the best SFF novels (and stories, for that matter) of the year. Beginning in 1978, they broke their vote into two categories: Best Fantasy Novel and Best Science Fiction Novel. This information is readily available at the Science Fiction Awards Database.
So, my main form of classification was to look up each Hugo/Nebula nominee on this list, and just go with the Locus voters. I figure these were the most informed SFF fans of their era, and if they believed Claw of the Conciliator was Fantasy, who am I to doubt them?
This leaves me with two problems: what to do before 1978, and what to do if a nominee didn’t make the Locus list? In those cases, I did the following:
1. If I was familiar with the novel (i.e., if I’d read it) and the classification was obvious (in my opinion, sadly), I assigned a classification. For the Hugos, this was easy, as 99% of the books are unquestionably SF. For the Nebula, it’s a little more difficult, but I only had to deal with 1966-1977.
2. If I was unfamiliar with the novel, I went to Amazon.com and read the book description and reader reviews. If I felt it was obvious (spaceship on the cover, description talking about magic), I went ahead and assigned a genre.
3. If the genre still seemed unclear, I marked it as “Other.” Better to have uncertainty in the data than to pretend it’s 100% accurate. I’ll note any “borderline” cases in upcoming posts.
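The fallback chain above can be written out explicitly. In this sketch the inputs (the Locus category, my own call, an Amazon-based guess) are stand-ins for the manual lookups described above, not an actual scraper:

```python
def classify(locus_category=None, my_obvious_call=None, amazon_guess=None):
    """Assign a genre via the fallback chain described above.

    locus_category: "SF" or "F" if the book made the corresponding
        Locus ballot (the split categories exist from 1978 on).
    my_obvious_call: my own classification, used only when I'd read
        the book and the answer was obvious.
    amazon_guess: genre inferred from cover, description, and reviews.
    """
    if locus_category in ("SF", "F"):
        return locus_category        # 1. trust the Locus voters
    if my_obvious_call:
        return my_obvious_call       # 2. obvious cases I'd read
    if amazon_guess:
        return amazon_guess          # 3. cover/description check
    return "Other"                   # 4. admit the uncertainty

print(classify(locus_category="F"))  # → "F"
print(classify())                    # → "Other"
```

The ordering matters: the Locus voters always win when they have an opinion, and my own judgment only enters for pre-1978 books or books that missed the Locus list.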
Is this classification of genre perfect? Absolutely not, but I don’t think any classification of genre would be. Out of the 600 classifications I had to make, I’d say about 5% of them were difficult. That’s actually not too bad for a data set. While a different researcher might classify books differently, these slight variations won’t throw the results off that badly.
So, I’ll be back tomorrow with the Nebula Award nominees! But today’s question is the question of genre: what do you think of genre, and what do you think the best way to classify it is?
One of the basic hypotheses of Chaos Horizon is that the Hugo and Nebula Awards for Best Novel are predictable. That doesn’t mean we can predict them with 100% accuracy, but that by studying past patterns in the Hugo and Nebula awards, we can make statistical estimates as to what is likely to happen in the future.
One of the things that makes Chaos Horizon possible is that both the Hugo and the Nebula are very repetitive: they tend to nominate the same authors over and over again. In this report, we’ll look at how extensive repeat nominations are in the 2001-2014 time period. Like my last report on Dates and the Hugo award, I’ll break this up into several posts:
1. Number of repeat nominees in the Hugo and Nebula Award for Best Novel, 2001-2014 (this post)
2. Percentage of Hugo and Nebula Best Novel nominations going to the same authors, 2001-2014
3. The impact of repeat nominees on winning the Hugo Best Novel award, 2001-2014
4. The impact of repeat nominees on winning the Nebula Best Novel Award, 2001-2014
5. Conclusions and discussion
Methodology: Since this is a simpler study than the Hugo Awards date study, the methodology is simpler. I cross-referenced the list of Hugo and Nebula nominees with the excellent Science Fiction Awards Database (sfadb.com) to see if each of the nominees from 2001-2014 had received a prior nomination for Best Novel. I kept the Hugo and Nebula award lists separate; i.e. I didn’t count someone as a repeat nominee for the Hugo if they had received a prior Nebula nomination, and vice versa.
With regard to dates, I’m using the 2001-2014 period so we can get a picture of how the Hugos and Nebulas operate in the 21st century. I believe that the awards have changed substantially since the 1980s and 1990s, and including data from previous eras (before there were e-books or widespread internet usage) would skew the data.
Results: The first thing we’ll look at is the number of repeat nominees in both the Hugo and Nebula awards.
For the Hugo Awards for Best Novel, there were 72 nominated books between 2001-2014. For 47 of those 72, the authors had received at least one prior nomination for the Hugo Award for Best Novel. This left only 25 “first timer” nominees in that 14 year period, or fewer than 2 per year. The Hugo is a difficult slate to crack into, but once you make it into the club, you’re very likely to get nominated again.
The Nebula is a little friendlier to first timers. In the 2001-2014 time period, there were 87 total books nominated. 43 of those were by repeat nominees, and 44 by first timers. Let’s take a look at that visually:
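The cross-referencing step described in the methodology is mechanical enough to sketch in code. This is a minimal illustration only (the actual study was done by hand against sfadb.com), assuming one award’s nomination history is available as chronological (year, author) pairs:

```python
# Count repeat vs. first-time nominees within a study window, given one
# award's full nomination history as (year, author) pairs. Earlier years
# outside the window still count as "prior nominations."

def count_repeats(nominations, start_year, end_year):
    seen = set()                    # authors nominated in any earlier year
    repeats = first_timers = 0
    for year, author in sorted(nominations):
        if start_year <= year <= end_year:
            if author in seen:
                repeats += 1
            else:
                first_timers += 1
        seen.add(author)
    return repeats, first_timers

# Toy data: Author A's 2000 nomination makes A a repeat nominee in 2002/2003.
hugo = [(2000, "A"), (2001, "B"), (2002, "A"), (2003, "A"), (2003, "C")]
print(count_repeats(hugo, 2001, 2014))  # → (2, 2)
```

Note that the Hugo and Nebula histories would each get their own `seen` set, matching the study’s decision to keep the two award lists separate.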
We can conclude that the Hugo Award is heavily weighted towards past nominees, while the Nebula Award is roughly weighted 50/50. This is a very strong bias towards past nominees, and it is certainly one of the defining features of these awards.
The best indicator—by a long shot—of future award success is past award success. I’ll leave it up to you to decide whether or not this level of repeat nomination is healthy for the awards.
In our next post, we’ll look at how centralized these awards are: exactly how many of these repeaters are there, and are there “super repeaters” that dominate the Hugo and Nebula slates?
Over the past several days, Chaos Horizon has been looking at the correlation between US publication dates and the frequency of being nominated for or winning a Hugo Award for Best Novel, 2001-2014. Today, we’ll wrap up that report and open the floor for discussion and questions. Here are the previous posts (with charts and data!): Part 1, Part 2, Part 3.
Based on the previous posts, I believe the conclusion we can reach is simple: there is a definite “publication window” that extends from May to October. About 75% of Hugo nominees come from this window, as do 85% of the winners. May and September were the best Hugo-winning months, perhaps correlating to the start of the Summer and Christmas book-buying seasons.
1. How does this “window” correlate with the number of SFF books published per month? That’s not an easy statistic to find, although we can make a rough estimate based on Locus Magazine’s list of SFF books per month. I trust LocusMag—they’ve been making this list for a long time, so their methodology is likely consistent—but this estimate is going to be very rough. We should only pay attention to the trends in this chart, not precise numbers:
This is what we might expect: there is a definite spike in books published right before the Christmas book-buying season, a drop-off in December and January, and a slight spike during the Summer book-buying season. Since more books are published in May, September, and October, it should come as no surprise that more Hugo nominations and winners come from that time period.
From a publisher’s perspective, it might be that the Summer season is being neglected—it looks like everyone wants to publish in September and October. If I were an author, I might prefer to be published in May: there’s a softer market (fewer titles to compete with), and maybe more of a chance for publicity/to be read.
2. Are we looking at a self-fulfilling prophecy? Do publishers believe that May-October are the best months for publishing potential Hugo books? In other words, do publishers hold their Hugo books until this window, thus biasing the stats as a result? Would publishers be better off trying other months, in an attempt to break through to an audience that needs books to read?
3. Is the internet changing the importance of publication dates? If so, how? Do e-books provide more immediate access than print books, and would that alter the publication window? Could publishers extend the window by dropping e-book prices later in the year?
4. How much stock can we place in this study, given the relatively small amount of data: 68 nominees and 14 winners? Is this too small of a data set to draw reliable conclusions from?
5. Is it fair to only think about US publication dates? How would UK (or international) publication dates factor in?
Lastly, are there any concerns or issues you’d like to raise about this study? Statistics can be incredibly misleading, as they depend enormously both on the data set and the statistical model being set up by the analyst (in this case, me). Chaos Horizon is committed to transparency in all reports. How else could the study be set up? How could we provide a more complete picture of publication dates and the Hugo Award?
This methodology post is unlikely to be much of interest to the casual reader, but I’m recording this information in case anyone wants to double check the data, or to call into question the kind of data I used. It is very easy to mislead the public using statistics, and Chaos Horizon is trying to avoid that by providing maximum transparency on all studies and reports. If you have questions, ask in the comments or e-mail me at firstname.lastname@example.org.
Date Range: Why 2001-2014? I used this date range because 2001 marks a substantial shift in the Hugo awards. Prior to 2001, the Hugo award for Best Novel was basically a SF award, with all prior awards having gone to Science Fiction novels. J.K. Rowling won for Harry Potter and the Goblet of Fire in 2001, and this opened up the Hugos to all sorts of different genres and types of books; it can be thought of as starting the “modern” era of the award. There is also undeniable convenience to starting studies with the new millennium. It’s also hard to believe that the book market back in something like 1994 was the same as now: no internet, no e-books, vastly different audience and buying habits. The farther we go back in time, the more we cloud the statistics.
The study was conducted in September 2014, which marks the upper limit of the date range.
Limitations: I limited myself to US publication dates in this study, although the Hugo encompasses American, British, and other international authors and voters. No novel in translation was nominated for the Hugo Award from 2001-2014, so the exclusion of international publication dates seems justified.
British publication dates were trickier, and I initially explored them in some detail. That data is present on the third page of the Excel spreadsheet. British dates were not as readily accessible, and even when I could find them I had no real way of double-checking them. Furthermore, some texts were published simultaneously in the UK; in the case of British authors, some texts were published earlier; and in the case of American authors, some texts were published later. Those discrepancies introduced a great deal of uncertainty into the project, as it wasn’t clear which date should be used. British publication dates likely mattered most in the years the WorldCon was held in the UK, and less when the WorldCon was in the US. If anyone can think of a clever way to find and handle British publication dates, I’m all ears.
Sources: To find the publication dates, I utilized three main sources. First, I used the Internet Speculative Fiction Database, found at www.isfdb.org, to come up with an initial publication date. It is probably the most in-depth resource for finding information about different SFF book editions; I utilized the first available date for US print editions in this study, excluding limited-availability special editions.
Second: I cross-checked that isfdb date with Amazon. While we can debate some of Amazon’s sale practices, there is no doubt about the wide variety of book-related information their site offers. Since they are a professional book-seller, they have a huge stake in providing accurate data. Again, I tried to find the earliest published print edition, and, whenever possible, to match the ISBN of that edition against the isfdb.org info.
Interestingly—and frustratingly—the isfdb.org and amazon.com information often disagreed. Of the 68 dates provided, there were discrepancies in 20 of them. However, these were often very minor: isfdb.org reporting a March publication date, and amazon.com reporting a late February date. In general, amazon.com usually reported earlier publication dates by a few weeks.
Third: If the isfdb.org date and the amazon.com date disagreed, I went to the Barnes and Noble website to resolve the issue. Like amazon.com, the B&N site provides a wealth of information, and I trust their database because that’s how they make their money. In almost all instances, the amazon.com date agreed with the bn.com date, so I went with the amazon/bn publication date. All disagreements are marked in the Excel spreadsheet.
Any discrepancies were only a matter of weeks (pushing a book from June to July), and are unlikely to cause major changes in the analysis. Still, you might want to avoid placing too much stock in any individual month; I believe the ranges of the seasons are more reliable.
Other possible sources: I tried out several other possible sources for publication data before discarding them. Both WorldCat and the Library of Congress, two major sources for cataloging books, only provided publication month, and I wanted as precise information as possible.
Notes: Four nominated texts were excluded from the study. Robert Jordan and Brandon Sanderson’s The Wheel of Time is a series of 14 novels published over decades. Connie Willis won for Blackout/All Clear, two novels published during the same year. I could have used both dates, but I decided to go with neither to keep the data clear. Two books, both from the 2005 Hugos held in Glasgow, did not receive American releases prior to their year of nomination; those were River of Gods by Ian McDonald and The Algebraist by Iain M. Banks.
Weakness of the Study: With only 68 pieces of data, we’re falling far short of a substantial data set. As a result, small changes in the data—an individual author publishing in October rather than September—may affect the final results unduly. Since each individual novel accounts for around 1.5% of the total data, take everything with a grain of salt. While I feel it likely the broader conclusions are accurate, the specifics of months, particularly for the winners, probably need to be de-emphasized. We shouldn’t place all that much stock in the fact that Jo Walton published Among Others in January rather than February, for instance.
While I could expand the data back another decade, and likely pick up 50+ more dates, I’ve decided not to go that route. I feel that the publishing market in the 1990s was substantially different than the publishing market in the 2000s, and that this additional data would not contribute much to the study. If someone else feels otherwise, and would like to chart that data, feel free. Send me a link if you do the analysis.
Here’s a link to the Excel spreadsheet that gathers all the data: Hugo Dates Study.
I think that sums up methodology questions. Let me know if you need any other information.
In Part 1 of this Chaos Horizon report, we looked at the relationship between US publication dates and Hugo Best Novel nominations from 2001-2014. Now, we can turn our eyes to actually winning the Hugo Best Novel for that same date range. Here’s a breakdown of winners by month for 2001-2014:
A couple of notes: I didn’t include the 2011 Hugo winner, Connie Willis’s Blackout/All Clear, because it was published as two separate volumes, one in February and one in October. I felt that this dual publication was an exceptional case; including it would muddy the analysis. That still leaves us with 14 winners, because Paolo Bacigalupi and China Mieville tied in 2010 for The Windup Girl and The City & The City.
It appears that May and September are far and away the best months for Hugo winners, at least for the 2001-2014 time period. With only 14 winners, we shouldn’t put a huge amount of stock in this chart, but May and September make a certain amount of sense. May is the beginning of the summer book buying season, and September the beginning of the Fall/Christmas book buying season: having your book published early in those cycles might maximize exposure and sales. The more people know about your book, the better a chance to win.
So—going back to Part 1—even though May (10), June (9), July (9), and October (8) yielded the most nominations, only May yielded a good number of winners. In terms of ratio, September was by far the best, with 4 out of the 6 September nominees going on to win.
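That ratio is just winners divided by nominees, month by month. As a quick sketch—only September’s figures (4 winners out of 6 nominees) are stated in the post; the other winner counts here are placeholders, not data:

```python
# Win ratio per month: winners / nominees. Months with zero nominees are
# skipped to avoid division by zero.

def win_ratio(winners, nominees):
    return {m: winners.get(m, 0) / n for m, n in nominees.items() if n}

# September's 6 nominees / 4 winners are from the post; the nomination
# counts are from Part 1. Winner counts for other months are omitted.
nominees = {"May": 10, "June": 9, "July": 9, "September": 6, "October": 8}
winners = {"September": 4}

ratios = win_ratio(winners, nominees)
print(round(ratios["September"], 2))  # → 0.67
```

This is why a month with few nominations (September) can still dominate on ratio: the denominator matters as much as the raw winner count.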
Let’s look at the number of winners per season, 2001-2014:
Except for the dismal winter, that’s a pretty even bar graph. Summer does dip a little, but that dip is exaggerated due to the small number of winners (14). Essentially, I’d estimate a novel has roughly the same chance of winning from the Spring, Summer, or Fall.
The window is still in full effect, though. Almost all our winners come from that May-October period, 2001-2014:
The only two novels to win from outside that window were by authors with already established reputations: Jo Walton for Among Others, published January 2011, and Robert Charles Wilson for Spin, published in April 2005.
So, tl;dr: if I were publishing a novel and wanting to win the Hugo, I’d request a release date of either May or September.
Tomorrow, we’ve got the boring Part 3: Methodology and Data.
As part of its continued statistical analysis of the Hugo and Nebula Awards, Chaos Horizon is happy to present its first ever report. Today, we’ll be looking at the impact of publication date on the chances of being nominated for and winning the Hugo Award for Best Novel. Since this is going to be detailed, I’ll break the report down into four posts:
1. Analysis of publication date and the chances of being nominated for the Hugo, 2001-2014 (Monday 9/22)
2. Analysis of publication date and the chances of winning the Hugo, 2001-2014 (Tuesday 9/23)
3. Methodology (the boring part!) (Wednesday 9/24)
4. Conclusions and discussion (Thursday 9/25)
Introduction: The Hugo Award for Best Novel is a Science Fiction and Fantasy (SFF) award given annually at the World Science Fiction Convention (WorldCon). Voted on by the members of WorldCon, the Hugo has been awarded since 1953. For more details, see the Hugo awards website.
In general, all SFF books published in the previous year are eligible for the Hugo award. Nominations are due early the following year, often by March, and voting takes place at the actual WorldCon, usually in August, although the exact timeline can vary slightly. For this report, we’ll be considering initial US publication dates—the date a book is first released in print—and the chances of getting nominated for or winning the Hugo award.
Today’s research question is simple: are some publication dates better than others for Hugo nominations and/or wins? Some believe that January is too early to be published, as voters will forget about the novel when nomination season rolls around. Likewise, December may be too late, as readers won’t have enough time to read and process the book before nominations are due. Does a statistical analysis confirm these expectations?
Findings: When it comes to receiving a Hugo nomination, Chaos Horizon’s statistical analysis suggests that there is a “publication window” that extends from May to October. Let’s take a look at the data, which I generated by looking up the initial print US publication dates for 68 nominated novels between 2001 and 2014:
As you can see, there is a definite peak during the middle of the year. May (10 nominations), July (9 nominations), and June (9 nominations) were the best months, with October (8 nominations) also providing a solid option. November (2 nominations) and December (a sad 0 nominations) were the worst months. February was surprising, with 6 nominations, showing that all months—except December—have some promise.
When we break this down by season, the trend is even clearer.
We have a nice bell-shaped curve, with nominations peaking in the summer months and falling off on either side. I think the conclusion is pretty obvious: Summer is the best time to be published if you want a Hugo nom, with late Spring and early Fall being your other viable alternatives.
The window for maximum Hugo nomination chances extends from May to October, and the difference is pretty stark:
Nearly 75% of the nominees come from that May-October window, and only roughly 25% come from outside of it. While there may be other reasons to publish early in the year—a less competitive marketplace, for instance—when it comes to getting nominated for the Hugo, your best chances lie in publishing between May and October. Still, that remaining 25% is nothing to sneeze at. Life exists outside the “publication window,” and SFF readers are capable of finding good novels whenever they are published.
Tomorrow, we’ll consider what effect publication date has on winning the Hugo.