The Hugo is a strange award. One Hugo matters a great deal—the Best Novel. It sells copies of books, and defines for the casual SFF fan the “best” of the field. The Novella, Novelette, and Short Story also carry significant weight in the SFF field at large, helping to define rising stars and major works. Some of the other categories feel more like insider awards: Editor, Semiprozine. Others feel like fun ways to nod at the SFF fandom (Fanzine). All of them work slightly differently, and there’s a huge drop off between categories. That’s our point of scrutiny today, so let’s get to some charts.
First, let’s get some baseline data out there: the total number of nominating ballots per year. I also included the final voting ballots. Data gets spotty on the Hugo website, thus the blank spots. If anyone has that data, point me in that direction!
I pulled that data off the TheHugoAwards.org site, save for the flagged 895, which I grabbed from this File 770 post.
Now, how popular is each category? How many of those total nominators nominate in each category? First up, the averages for 2006-2015:
I included two averages for you: the 2006-2015 average, and then the 2006-2013 average. This shows how much the mix of Loncon, the Puppy vote, and increased Hugo scrutiny has driven up these numbers.
What this table also shows is how some categories are far more popular than others. Several hundred more people vote in the Novel category than in the next most popular category of Dramatic Long, and major categories like Novella and Novelette only manage around 50% of the Novel nominating vote. That’s a surprising result, and may show that the problem with the Hugo lies not in the total number of voters, but in the difficulty those voters have in voting in all categories. I’ve heard it mentioned that a major problem for the Hugo is “discovery”: it’s difficult to have a good sense of the huge range of novellas, novelettes, short stories, etc., and many people simply don’t vote in the categories they don’t know. It’d be interesting to have a poll: how many SFF readers actually read more than 5 new novels a year? 5 new novellas? I often don’t know if what I’m reading is a novella or a novelette; does the lack of clarity in these categories hurt turnout?
Let’s look at this visually:
Poor Fan Artist category. That drop off is pretty dramatic across the award. Are there too many categories for people to vote in?
Let’s focus in on 2015, as that’s where all the controversy is this year. I’m interested in the percentage of people who voted for each category, and the number of people who sat out in each category.
Table 4: Percentage of Voters and “Missing Votes” per Hugo Category, 2015 Only
The chart at the top tells us a total of 2122 nominated in the Hugos, but no category managed more than 87% of that total. The missing votes column is 2122 minus the number of people who actually nominated. I was surprised at how many people sat out each category. Remember, each of those people who didn’t vote in Best Novel, Best Short Story, etc., could have voted up to 5 times! In the Novella category alone, 5000 nominations were left on the table. If everyone who nominated in the Hugos had nominated in every category, the Puppy sweeps most likely wouldn’t have happened.
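The arithmetic behind the “missing votes” column can be sketched in a few lines of Python. The 2122 total comes from the chart above; the per-category count passed in at the bottom is a made-up placeholder, not a figure from the 2015 results, so swap in the real numbers from the data file.

```python
TOTAL_NOMINATORS_2015 = 2122       # total nominating ballots, from the chart above
MAX_NOMINATIONS_PER_CATEGORY = 5   # each nominator may list up to 5 works


def missing_votes(category_ballots, total=TOTAL_NOMINATORS_2015):
    """Return (share who nominated, nominators who sat out, unused nomination slots)."""
    sat_out = total - category_ballots
    share = category_ballots / total
    unused_slots = sat_out * MAX_NOMINATIONS_PER_CATEGORY
    return share, sat_out, unused_slots


# Hypothetical category count, for illustration only
share, sat_out, unused = missing_votes(1000)
print(f"{share:.0%} nominated, {sat_out} sat out, up to {unused} nominations unused")
```

The unused figure is an upper bound: it assumes every absent nominator would have filled all 5 slots.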
Again, let’s take a visual look:
That chart reinforces the issue in the awards: less than 50% turnout in major categories like Novella, Short Story, and Novelette.
What to conclude from all of this? Total number of ballots isn’t as important as who actually nominates in each category. Why aren’t people nominating in things like Short Story? Do the nominations happen too early in the year? Are readers overwhelmed by the sheer variety of works published? Do readers not have strong feelings about these works? Given the furor on the internet over the past few weeks, that seems unlikely. If these percentages could be brought up (I have no real idea how you’d do that), the award would immediately look very different.
Tomorrow, we’ll drill more deeply into the Fiction categories, and look at just how small the nominating numbers have been over the past decade.
Periodically, Chaos Horizon publishes extensive reports on various issues relating to SFF awards. One important context for this year’s Hugo controversy is the question of nomination numbers. New readers who are coming into the discussion may be unaware of how strange (for lack of a better word) the process is, and how few votes it has historically taken to get a Hugo nomination, particularly in categories other than Best Novel. As a little teaser of the data we’ll be looking at, consider this number: in 2006, it only took 14 votes to make the Best Short Story Hugo final ballot.
While those numbers have risen steadily over the past decade, they’re still shockingly low: in 2012, it took 36 votes in the Short Story category; in 2013, it took 34 votes; in 2014, we jumped all the way to 43. This year, with the Sad Puppy/Rabid Puppy influence, the number tripled to 132. That huge increase causes an incredible amount of statistical instability, to the point that this year’s data is “garbage” data (i.e. confusing) when compared to other years.
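To put a number on that jump, here is a minimal Python sketch that computes year-over-year ratios from the Short Story figures quoted above. Only the values mentioned in the text are included; the function name is my own, and years without an adjacent quoted year (2006 here) are skipped rather than guessed at.

```python
# Votes needed to make the Best Short Story final ballot, as quoted in the text
short_story_cutoff = {2006: 14, 2012: 36, 2013: 34, 2014: 43, 2015: 132}


def year_over_year(cutoffs):
    """Return {year: ratio to the previous year} for consecutive years only."""
    years = sorted(cutoffs)
    return {
        year: cutoffs[year] / cutoffs[prev]
        for prev, year in zip(years, years[1:])
        if year - prev == 1  # skip gaps such as 2006 -> 2012
    }


for year, ratio in year_over_year(short_story_cutoff).items():
    print(f"{year}: {ratio:.2f}x the previous year")
```

A ratio near 1 means a quiet year; the 2015 ratio is the one that stands apart from the rest.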
Without having a good grasp of these numbers and trends, many of the proposed “fixes” (if a fix is needed at all, and this isn’t something that will work itself out over 2-3 years via the democratic process) might exacerbate some of the oddities already present within the Hugo. The Hugo has often been criticized for being an “insider” award, prone to log-rolling, informal cliques, and the like. While I don’t have strong opinions on any of those charges, I think it’s important to have a good understanding of the numbers to better understand what’s going on this year.
Chaos Horizon is an analytics, not an opinion, website. I’m interested in looking at all the pieces that go into the Hugo and other SFF awards, ranging from past patterns, biases, and oddities, to making future predictions as to what will happen. I see this as a classic multi-variable problem: a lot of different factors go into the yearly awards, and I’ve been setting myself the task of trying to sort through some (and only some!) of them. Low nominating numbers are one of the defining features of the Hugo award; that’s just how the award has worked in the past. That’s not a criticism, just an observation.
I’ve been waiting to launch this report for a little while, hoping that the conversation around this year’s Hugo would cool off a little. It doesn’t look like that’s going to happen. The sheer flood of posts about this year’s Hugos reveals the desire that various SFF communities have for the Hugo to be the “definitive” SFF award, “the award of record.” File 770 has been the best hub for collecting all these posts; check them out if you want to get caught up on the broader conversation.
I don’t think any award can be definitive. That’s not how an award works, whether it’s the Hugo, the Pulitzer, or the Nobel Prize. There are simply too many books published, in too many different sub-genres, for too many different types of fans, for one award to sort through and “objectively” say this is the best book. Personally, I don’t rely on the Hugo or Nebula to tell me what’s going on in the SFF field. I’ve been collating an Awards Meta-List that looks at 15 different SFF awards. That kind of broad view is invaluable if you want to know what’s happening across the whole field, not only in a narrow part of it. Lastly, no one’s tastes are going to be a perfect match for any specific award. Stanislaw Lem, one of my favorite SF authors, was never even nominated for a Hugo or Nebula. That makes those awards worse, not Lem.
Finally, I don’t mean this report to be a critique of the Worldcon committees who run the Hugo award. They have an incredibly difficult (and thankless) job. Wrestling with an award that has evolved over 50 years must be a titanic task. I’d like to personally thank them for everything they do. Every award has oddities; they can’t help but have oddities. Fantasizing about some Cloud-Cuckoo-Land “perfect” SFF award isn’t going to get the field anywhere. This is where we’re at, this is what we have, so let’s understand it.
So, enough preamble: in this report we’ll be looking at the last 10 years of Hugo nomination data, to see what it takes to get onto the final Hugo ballot.
Background: If you already know this information, by all means skip ahead.
TheHugoAwards.org themselves provide an intro to the Hugos:
The Hugo Awards, presented annually since 1955, are science fiction’s most prestigious award. The Hugo Awards are voted on by members of the World Science Fiction Convention (“Worldcon”), which is also responsible for administering them.
Every year, the attending or supporting members of the Worldcon go through a process to nominate and then vote on the Hugo awards. There are a great many categories (it’s changed over the years; we’re at 16 Hugo categories + the Campbell Award, which isn’t a Hugo but is voted on at the same time by the same people) ranging from Best Novel down to more obscure things like Best Semiprozine and Best Fancast.
If you’re unfamiliar with the basics of the award, I suggest you consult the Hugo FAQs page for basic info. The important bits for us to know here are how the nomination process works: every supporting and attending member can vote for up to 5 things in each category, and each of those votes counts equally. This means that someone who votes for 5 different Best Novels has 5 times as much influence as a voter who only votes for 1. Keep that wrinkle in mind as we move forward.
The final Hugo Ballot is made up of the 5 highest vote getters in each category, provided that they reach at least 5% total votes. This 5% rule has come into play several times in the last few years, particularly in the Short Story category.
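The nominating rules described above can be sketched as a small Python tally. This is a simplified illustration, not the administrators’ actual procedure: I’m assuming the 5% floor is measured against the number of ballots cast in that category, ties are broken alphabetically purely to keep the output stable, and the story titles are invented.

```python
from collections import Counter


def finalists(ballots, slots=5, threshold=0.05):
    """Tally one category's nominating ballots and apply the 5% rule.

    Each ballot lists up to 5 works, and every listed work gets one vote.
    Assumption: the floor is 5% of the ballots cast in this category.
    Ties for the last slot are not given any special treatment here.
    """
    cast = [set(b) for b in ballots if b]  # drop blank ballots, de-duplicate within a ballot
    tally = Counter(work for ballot in cast for work in ballot)
    floor = threshold * len(cast)
    ranked = sorted(tally.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(work, votes) for work, votes in ranked[:slots] if votes >= floor]


# Invented ballots, for illustration only
ballots = [
    ["Story A", "Story B", "Story C", "Story D", "Story E"],  # one nominator, five votes cast
    ["Story A"],                                              # one nominator, one vote cast
    ["Story A", "Story B"],
    [],                                                       # sat this category out
]
print(finalists(ballots))
```

Note how the first nominator contributes five votes to the tally while the second contributes one; that is the “5 times as much influence” wrinkle in practice.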
Methodology: I looked through the Hugo Award nominating stats, archived at TheHugoAwards.org, and manually entered the highest nominee, the lowest nominee, and the total number of ballots (when available) for each Hugo category. Worldcon voting packets are not particularly compatible with data processing software, and it’s an absolute pain to pull the info out. Hugo committees, if you’re listening, create comma separated value files!
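For anyone wondering what a machine-readable version of these stats might look like, here is a hedged Python sketch. The column layout is my own invention, not an official format; the low-nominee values are the Short Story figures quoted earlier in this report, and the blank cells are values not quoted in the text.

```python
import csv
import io

# Hypothetical layout. Blank cells mean "not quoted in this report".
SAMPLE = """year,category,ballots,high_nominee,low_nominee
2006,Short Story,,,14
2012,Short Story,,,36
2013,Short Story,,,34
2014,Short Story,,,43
2015,Short Story,,,132
"""


def to_int(cell):
    """Blank cells become None so missing data is not mistaken for zero."""
    cell = cell.strip()
    return int(cell) if cell else None


def load_stats(text):
    """Parse the CSV text into a list of dicts, one per year and category."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "year": int(row["year"]),
            "category": row["category"],
            "ballots": to_int(row["ballots"]),
            "high": to_int(row["high_nominee"]),
            "low": to_int(row["low_nominee"]),
        })
    return rows


stats = load_stats(SAMPLE)
print(stats[-1])
```

Keeping missing values as None rather than 0 matters later: a year that didn’t report ballots per category should drop out of an average, not drag it down.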
I chose 10 years as a range for two reasons. First, the data is easily available for that time range, and it gets harder to find for earlier years. The Hugo website doesn’t have the 2004 data readily linked, for instance. While I assume I could find it if I hunted hard enough, it was already tedious enough to enter 10 years of data. Second, my fingers get sore after so much data entry!
Since the Worldcon location and organizing committees change every year, the kind of data included in the voting results packet varies from year to year as well. Most of the time, they tell us the number of nominating ballots per category; some years they don’t. Some have gone into great detail (number of unique works nominated, for instance), but usually they don’t.
Two methodological details. First, I treated the Campbell as a Hugo for the purposes of this report: the data is very similar to the rest of the Hugo categories, and they show up on the same ballot. That may irk some people. Second, there have been a number of Hugo awards declined or withdrawn (for eligibility reasons). I marked all of those on the Excel spreadsheet, but I didn’t go back and correct those by hand. I was actually surprised at how little those changes mattered: most of the time when someone withdrew, it affected the data by only a few votes (the next nominee down had 20 instead of 22 votes, for instance). The biggest substantive change was a result of Gaiman’s withdrawal last year, which resulted in a 22-vote swing. If you want to go back and factor those in, feel free.
Thanks to all the Chaos Horizon readers who helped pull some of the data for me!
Here’s the data file as of 5/5/2015: Nominating Stats Data. I’ll be adding more data throughout, and updating my file as I go. Currently, I’ve got 450 data points entered, with more to come. All data on Chaos Horizon is open; if you want to run your own analyses, feel free to do so. Dump a link into the comments so I can check it out!
Results: Let’s look at a few charts before I wrap up for today. I think the best way to get a holistic overview of the Hugo Award nominating numbers is to look at averages. Across all the Hugo categories and the Campbell, what was the average number of ballots per category, the average votes for the top nominee (i.e. the work that took #1 in the nominations), and the average votes for the low nominee (the work that placed #5 in the nominations)? That’s going to set down a broad view and allow us to see what exactly it takes (on average) to get a Hugo nom.
Of course, every category works differently, and I’ll be more closely looking at the fiction categories moving forward. The Hugo is actually many different awards, each with slightly different statistical patterns. This makes “fixing” the Hugos by one change very unlikely: anything done to smooth the Best Novel category, for instance, is likely to destabilize the Best Short Story category, and vice versa.
On to some data:
This table gives us a broad holistic view of the Hugo Award nominating data. What I’ve done is take all the Hugo categories and average them. We have three pieces of data for each year: average ballots per category (how many people voted), average number of votes for the high nominee, and average votes for the low nominee. So, in 2010, an average of 362 people voted in each category, and the top nominee grabbed 88 votes, the low nominee 47.
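If you want to reproduce this kind of averaging from the data file, a minimal Python sketch might look like the following. The row layout (year, ballots, high, low) is an assumption on my part, and the sample rows are invented placeholders rather than real Hugo figures.

```python
from statistics import mean


def yearly_averages(rows):
    """Average ballots, high-nominee votes, and low-nominee votes across categories, per year.

    rows: iterable of dicts with keys year, ballots, high, low (None = not reported).
    """
    by_year = {}
    for row in rows:
        by_year.setdefault(row["year"], []).append(row)

    def avg(items, key):
        # Skip unreported values instead of counting them as zero
        values = [r[key] for r in items if r[key] is not None]
        return round(mean(values)) if values else None

    return {
        year: {key: avg(items, key) for key in ("ballots", "high", "low")}
        for year, items in sorted(by_year.items())
    }


# Invented rows, for illustration only
rows = [
    {"year": 2010, "ballots": 400, "high": 100, "low": 50},
    {"year": 2010, "ballots": 300, "high": 80, "low": 40},
    {"year": 2007, "ballots": None, "high": 60, "low": 30},  # ballots not reported
]
print(yearly_averages(rows))
```

A year with no reported ballots comes back as None for that column, which is how a gap like 2007 would show up.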
Don’t worry: we’ll get into specific categories over the next few days. Today, I want the broad view. Let’s look at this visually:
2007 didn’t include the number of ballots per category, thus the missing data in the graph. You can see in this graph that the total number of ballots is fairly robust, but that the number of votes for our nominated works is pretty low. Think about the space between the bottom two lines as the “sweet spot”: that’s how many votes you need to score a Hugo nomination in any given year. If you want to sweep the Hugos, as the Puppies did this year in several categories, you’d want to be above the Average High Nom line. For most years, that’s meant fewer than 100 votes. In fact, let’s zoom in on the High and Low Nom lines:
These graphs let us see mathematical patterns that are hard to see when just looking at numbers. Take your hand and cover up everything after 2012 on Chart #2: you’ll see a steady linear increase in the high and low ranges over those 8 years, rising from about 60 to 100 for the high nominee and 40 to 50 for the low nominee. Nothing too unusual there. If you take your hand off, you’ll see an exponential increase from 2012-2015: the numbers shoot straight up. That’s a convergence of many factors: the popularity of Loncon, the Puppies, and the increased scrutiny on the Hugos brought about by the internet.
What does all this mean? I encourage you to think and analyze this data yourself, and certainly use the comments to discuss the charts. Don’t get too heated; we’re a stats site, not a yell-at-each-other site. There are plenty of those out there. :)
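One rough way to check the “cover it with your hand” test is to extend the pre-2013 straight line and see where it would have landed. The Python sketch below does that using only the approximate endpoints mentioned above (about 60 to 100 for the high nominee, 40 to 50 for the low, from 2006 to 2012). Those are eyeballed values, not exact chart data, and a two-point line is a simplification rather than a proper regression.

```python
def linear_trend(start_year, start_value, end_year, end_value):
    """Return (slope per year, function projecting the straight line to any year)."""
    slope = (end_value - start_value) / (end_year - start_year)

    def project(year):
        return start_value + slope * (year - start_year)

    return slope, project


# Rough endpoints read from the paragraph above, not exact chart values
high_slope, high_line = linear_trend(2006, 60, 2012, 100)
low_slope, low_line = linear_trend(2006, 40, 2012, 50)

print(f"high nominee: about {high_slope:.1f} more votes per year, "
      f"straight line reaches {high_line(2015):.0f} by 2015")
print(f"low nominee: about {low_slope:.1f} more votes per year, "
      f"straight line reaches {low_line(2015):.0f} by 2015")
```

Comparing those projections with the actual 2013-2015 averages in the data file gives a sense of how far the recent numbers sit above the old trend.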
Lastly, this report is only getting started. Over the next few days—it takes me a little bit of time to put together such data-heavy posts—I’ll be drilling more deeply into various categories, and looking at things like:
1. How do the fiction categories work?
2. What’s the drop off between categories?
3. How spread out (i.e. how many different works are nominated) are the categories?
What information would be helpful for you to have about the Hugos? Are you surprised by these low average nomination numbers, or are they what you’d expect? Is there a discrepancy between the “prestige” of the Hugo and the nomination numbers?