# Building the Nebula Model, Part 5

This post continues my discussion on building my 2015 Nebula Best Novel prediction. See Part 1 for an introduction, Part 2 for a discussion of my Indicators, Part 3 for a discussion of my methodology/math, and Part 4 for a discussion of accuracy.

Taken together, those posts should help explain to anyone new to Chaos Horizon how Chaos Horizon works. This post wraps things up with a to-do list for 2015. To update my model for 2015, this is what I need to do:

1. Update the data sets with the results of the 2014 Nebulas, Hugos, and everything else that happened last year.

2. Rethink the indicators, and possibly replace/refine some of them.

3. Reweight the indicators.

4. Test model reliability using the reweighted indicators.

5. Use the indicators to build probability tables for the 2015 Nebulas.

6. Run the probability tables through the weights to come up with the final results.

I won’t be able to get all of this done until mid-April. It doesn’t make any sense to run the numbers until the Hugo noms come out, and those will be coming out this Saturday (April 4th).

# Building the Nebula Model, Part 4

This post continues my discussion on building my 2015 Nebula Best Novel prediction. See Part 1 for an introduction, Part 2 for a discussion of my Indicators, and Part 3 for a discussion of my methodology/math.

Now to the only thing anyone cares about: how reliable is the model?

Here’s my final Nebula prediction from 2014:

1. Ann Leckie, *Ancillary Justice* (25.8%) (winner)

2. Neil Gaiman, *The Ocean at the End of the Lane* (20.7%)

3. Nicola Griffith, *Hild* (11.2%)

4. Helene Wecker, *The Golem and the Jinni* (10.6%)

5. Karen Joy Fowler, *We Are All Completely Beside Ourselves* (9.8%)

6. Linda Nagata, *The Red: First Light* (8.2%)

7. Sofia Samatar, *A Stranger in Olondria* (7.7%)

8. Charles E. Gannon, *Fire with Fire* (6.0%)

As you can see, my model attaches % chances to each nominee, deliberately avoiding the certainty of proclaiming one work the “sure” winner. This reflects how random the Nebula has been at times. There have been some true left-field winners (*The Quantum Rose*, for instance) that should remind us statistical certainty is not a possibility in this case.

Broadly speaking, I’m seeking to improve our sense of the odds from a coin-flip/random model to something more nuanced. For 2014, a “coin-flip” (i.e. randomly picking a winner) would have given a 12.5% chance to each of these 8 nominees. My prediction doubled those odds for Leckie/Gaiman, and lessened them for everyone else. While that lacks the crisp assurance of “this person will definitely win,” I think it correctly reflects how variable and unpredictable the Nebula really is.
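To make the coin-flip comparison concrete, here is a minimal sketch that sets the 2014 percentages listed above against an even split. The variable names (`prediction`, `baseline`, `lift`) are mine, chosen for illustration; only the percentages come from the post.

```python
# 2014 model output, copied from the list above (percent chance per nominee).
prediction = {
    "Leckie": 25.8,
    "Gaiman": 20.7,
    "Griffith": 11.2,
    "Wecker": 10.6,
    "Fowler": 9.8,
    "Nagata": 8.2,
    "Samatar": 7.7,
    "Gannon": 6.0,
}

# "Coin-flip" baseline: every nominee gets an equal share of 100%.
baseline = 100.0 / len(prediction)

# Ratio of the model's odds to the baseline; above 1.0 means the model
# likes that nominee more than a random pick would.
lift = {name: pct / baseline for name, pct in prediction.items()}

for name, ratio in lift.items():
    print(f"{name:10s} {prediction[name]:5.1f}%  {ratio:.2f}x baseline")
```

Run this way, Leckie comes out at a bit over twice the baseline and Gaiman at roughly 1.7 times, while everyone else falls below 1.0.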

A fundamental weakness of my model is that it *does not* take into account the specific excellence/content of a book. I’ve done that deliberately. My thought process is that if you want analysis/forecasts based on the content/excellence of a book, you can find that elsewhere on the web. I want Chaos Horizon to do something different, not necessarily imitate what’s already being done. I don’t know how, in the more abstract/numerical terms that Chaos Horizon uses, to measure the relative quality of a Leckie versus a Gaiman. I don’t think Amazon or Goodreads measures quality in a compelling enough fashion to be useful for Chaos Horizon, although I’m happy to listen to counter-arguments.

Even if we could come up with an objective measure of quality, how would we correlate that measurement to the Nebulas? Some of my indicators do (either directly or indirectly) mirror excellence/content, but they do so at several removes. If a book gets lots of nominations, I’m accepting that SFF readers (and voters) probably like it. If SFF readers like it, it’s more likely to win awards. Those are pretty tepid statements. I’m not, at least for the purposes of Chaos Horizon, analyzing the books for excellence/content myself. I believe that an interesting model could be built up by doing that—anyone want to start a sister website for Chaos Horizon?

Lastly, I’ve tried to avoid inserting too much of my opinion into the process. That’s not because I don’t value opinion; I really like opinion-driven websites on all sides of the SFF discussion. Opinion is a different model of prediction than what I use. I think the Nebula/Hugo conversation is best served by having a number of different analyses from different methodological and ideological perspectives. Use Chaos Horizon in conjunction with other predictions, not as a substitute for them.

I posted last year about how closely my model predicted the past 14 years of the Nebulas. The formula was 70% successful at predicting the winner. Not terrible, but picking things that have already happened doesn’t really count for much.

I’ll wrap up this series of posts with my “To-Do” list for the 2015 Nebula Model.

# Building the Nebula Model, Part 3

This post continues my discussion on building my 2015 Nebula Best Novel prediction. See Part 1 for an introduction, and Part 2 for a discussion of my Indicators.

Now that we have 12 different indicators, how do I combine them together? This is where the theory gets sticky: how solidly do I want to treat each indicator? Am I trying to find correlations between them? Do I want to pick one as the “main” indicator as my base, and then refine that through some recursive statistical process? Do I treat each indicator as independent, or are some dependent on each other? Do I treat them as opinions or facts? How complicated do I allow the math to be, given the low N we have concerning the Nebulas?

I thought about this, and read some math articles and scoured the internet, and I decided to use an interesting statistical tool: the Linear Opinion Pool. Under this model, I treat my data mining results as opinions, and combine them together using a Reliability Factor, to get a weighted combined percentage score. This keeps us from taking the data mining results too seriously, and it allows us to weigh a great number of factors without letting one of them dominate.

Remember, one of my goals on Chaos Horizon is to keep the math transparent (at a high school level). I want everyone who follows Chaos Horizon to be able to understand and explain how the math works; if they can’t, the model becomes a mysterious black box that lends the statistics an air of credibility and mystery that I don’t want.

Here’s a basic definition of a Linear Opinion Pool:

a weighted arithmetic average of the experts’ probability distributions. If we let $F_i(x)$ denote expert $i$’s probability distribution for an uncertain variable of interest ($X$), then the linear opinion pool $F_c(x)$ that results from combining $k$ experts is:

$$F_c(x) = \sum_{i=1}^{k} w_i F_i(x)$$

where the weight assigned to $F_i(x)$ is $w_i$, and $\sum_{i=1}^{k} w_i = 1$. Although the linear opinion pool is a popular and intuitive combination method with many useful properties, there is no method for assigning weights that is derived entirely from first principles. One can, however, interpret the weights in a variety of ways, and each interpretation lends itself to a particular way to calculate the weights.

This is a model often used in risk analysis, where you have a number of competing opinions about what is risky, and you want to combine those opinions to find any possible overlap (while also covering your ass from any liability). There’s plenty of literature on the subject; just google “Linear Opinion Pool” for more reading.

We have the probability distributions from my data mining. What weights do I use? That’s always the challenge in a Linear Opinion Pool. For Chaos Horizon, I’ve been weighting by how often that Indicator has actually chosen the Nebula in the past. So, if you used that Indicator and that Indicator alone to guess, how often would you actually be right? Not every Indicator comes into play every year, and sometimes an Indicator doesn’t help (like if all the nominated novels previously had Nebula nominations). We’ll be looking at all that data late in April.
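Here is one way the “how often would that Indicator alone have been right” idea could be turned into weights. This is only a sketch under my own assumptions: the helper names `hit_rate` and `normalize`, the record format, and the sample history are all made up for illustration, and the real data doesn’t get crunched until April.

```python
def hit_rate(history):
    """Fraction of years in which an indicator, used alone, picked the winner.

    `history` is a list of (indicator_pick, actual_winner) pairs. A pick of
    None means the indicator did not apply that year, so the year is skipped.
    """
    usable = [(pick, winner) for pick, winner in history if pick is not None]
    if not usable:
        return 0.0
    hits = sum(1 for pick, winner in usable if pick == winner)
    return hits / len(usable)


def normalize(raw_weights):
    """Scale raw reliability scores so they sum to 1."""
    total = sum(raw_weights.values())
    if total <= 0:
        raise ValueError("at least one indicator needs a positive hit rate")
    return {name: w / total for name, w in raw_weights.items()}


# Made-up records for two hypothetical indicators over four years.
history = {
    "indicator_a": [("X", "X"), ("Y", "X"), (None, "Z"), ("Z", "Z")],
    "indicator_b": [("X", "X"), ("X", "X"), ("Y", "Z"), ("Y", "Z")],
}

raw = {name: hit_rate(records) for name, records in history.items()}
weights = normalize(raw)
print(raw)
print(weights)
```

Skipping the years where an indicator doesn’t apply keeps it from being punished for staying silent; whether that is the right call is debatable, and it is exactly the kind of human choice the weighting step introduces.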

Now, on to my mathematical challenge: can I explain this in easy to understand terms?

A Linear Opinion Pool works this way: you walk into a bar and everyone is talking about the Nebula awards. You wander around, and people shout out various odds at you: “3 out of 4 times a prior Nebula nominee wins” or “70% of the time a science fiction novel wins” and so forth. Your head is spinning from so much information; you don’t know who to trust. Maybe some of those guesses overlap, maybe some of those don’t. All of them seem like experts, but how expert?

Instead of getting drunk and giving up, you decide to sum up all the opinions. You figure, “Hell, I’ll just add all those probabilities up, and then divide by the total number of suggestions.” Then you begin to have second thoughts: that guy over in the corner is really wasted and doesn’t seem to know what he’s talking about. You sidle over and ask his friend: how often has that guy been right in the past? He says 5% of the time, but that guy over there, the one drinking gin and tonics, is right 50% of the time. So you figure you’d better weight each opinion based on how correct it has been in the past. You add things up using those weights, and voilà!, you’ve got your prediction.
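The bar story boils down to a few lines of code. What follows is a rough sketch of a Linear Opinion Pool, not the exact spreadsheet behind Chaos Horizon: the function name `linear_opinion_pool`, the two patrons, and the three “Book” nominees are invented for illustration, and only the 5% and 50% track records come from the story.

```python
def linear_opinion_pool(opinions, weights):
    """Combine probability distributions as a weighted arithmetic average.

    opinions: {source: {nominee: probability}}, each distribution summing to 1
    weights:  {source: reliability score}; rescaled here so they sum to 1
    """
    total = sum(weights[source] for source in opinions)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    pooled = {}
    for source, distribution in opinions.items():
        w = weights[source] / total
        for nominee, p in distribution.items():
            pooled[nominee] = pooled.get(nominee, 0.0) + w * p
    return pooled


# Two made-up bar patrons and three made-up nominees.
opinions = {
    "guy_in_corner": {"Book A": 0.10, "Book B": 0.10, "Book C": 0.80},
    "gin_and_tonic": {"Book A": 0.60, "Book B": 0.30, "Book C": 0.10},
}
# Track records from the story: right 5% and 50% of the time.
weights = {"guy_in_corner": 0.05, "gin_and_tonic": 0.50}

pooled = linear_opinion_pool(opinions, weights)
for nominee, p in sorted(pooled.items(), key=lambda kv: -kv[1]):
    print(f"{nominee}: {p:.1%}")
```

Because the weights are rescaled to sum to 1 and each opinion already sums to 1, the pooled numbers also sum to 1, so they can be read directly as percentage chances.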

Advantages:

It’s mathematically easy to calculate; no fancy software needed.

This allows me to add more indicators (opinions) very easily.

This treats my data mining work as an “opinion,” not a fact, which I think is closer to reality.

The weighting component allows me to dial up or dial down indicators easily.

The simple mathematics reflects the relatively low amount of data.

The methodology is easy for readers to follow.

Disadvantages:

It’s not as mathematically rigorous as other statistical models.

The weighting component introduces a human element into the model which may be unreliable.

Because this treats my data mining results as “opinions,” not “facts,” it may compromise the reliability of the model for some readers.

Because it is simple, it lacks the flashiness and impressiveness of grander statistical models.

When we’re dealing with statistical modeling, the true test is the results. A rigorous model that is wrong all the time is worse than a problematic model that is right all the time. In my next post, we’ll talk about past accuracy. Here are my older posts on the Linear Opinion Pool and weighting if you want some more info.

As a last note, let me say that following the way the model is constructed is probably more interesting and valuable than the final results. It’s the act of thinking through how different factors might fit together that is truly valuable. Process, not results.

# Building the Nebula Model, Part 2

This post continues my discussion on building my 2015 Nebula Best Novel prediction. See Part 1 for an introduction.

My model combines a number of factors (which I’m calling indicators) of past Nebula Best Novel success to come up with an overall percentage.

In 2014, I used 12 different indicators of Nebula success based on Nebula Data from 2001-2014. They were as follows:

Indicator #1: Nominee has previously been nominated for a Nebula. (84.6%)

Indicator #2: Nominee has previously been nominated for a Hugo. (76.9%)

Indicator #3: Nominee has previously won a Nebula award for best novel. (46.1%)

Indicator #4: Nominee was the year’s most honored nominee (Nebula Wins + Nominations + Hugo Wins + Nominations). (53.9%)

Indicator #5: Nominated novel was a science fiction novel. (69.2%)

Indicator #6: Nominated novel places in the Locus Awards. (92.3%)

Indicator #7: Nominated novel places in the Goodreads Choice Awards. (100%)

Indicator #8: Nominated novel appears on the Locus Magazine Recommended Reading List. (92.3%)

Indicator #9: Nominated novel appears on the Tor.com or io9.com Year-End Critics’ list. (100%)

Indicator #10: Nominated novel is frequently reviewed and highly scored on Goodreads and Amazon. (unknown%)

Indicator #11: Nominated novel is also nominated for a Hugo in the same year. (73.3%)

Indicator #12: Nominated novel is nominated for at least one other major SF/F award that same year. (69.2%)

NOTE: These percentages have not yet been updated with the 2014 results. Leckie’s win in 2014 will lower the % value of Indicators #1-4 and raise the % value of Indicators #5-12. That’s on my to-do list over the next few weeks.

To come up with those percentages, I looked up the various measurables about Nebula nominees (past wins, placement on lists, etc.) using things like the Science Fiction Award Database. I then looked for patterns in that data (strong correlations to winning the Nebula), and then turned those patterns into the percentage statements you see above.

Using those statements, I calculate the probability for each of the 2015 nominees for each Indicator. So, for example, take Indicator #1: Nominee has Previously Been Nominated for a Nebula. Such novels win the Nebula a robust 84.6% of the time. Of this year’s 6 nominees, 4 have previously been nominated for a Nebula (Leckie, VanderMeer, McDevitt, Gannon). If I considered no other factors, each would wind up with a (84.6% / 4) = 21.2% chance to win the Nebula. Our two first-timers (Liu and Addison) have to split the paltry remnants: (100% – 84.6%) / 2 = 7.7% each.
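The Indicator #1 arithmetic can be written out as a small function. Treat this as a sketch: `indicator_distribution` is a name I made up for illustration, and falling back to an even split when every nominee (or no nominee) matches is my assumption about how to handle an Indicator that doesn’t help in a given year.

```python
def indicator_distribution(nominees, matching, hit_rate):
    """Split probability for one indicator across this year's nominees.

    Nominees that match the indicator share `hit_rate` equally; the rest
    share the remainder (1 - hit_rate) equally.
    """
    matching = set(matching)
    others = [n for n in nominees if n not in matching]
    # Assumption: if the indicator cannot separate the field, it says
    # nothing, so fall back to an even split.
    if not matching or not others:
        return {n: 1.0 / len(nominees) for n in nominees}
    dist = {}
    for n in nominees:
        if n in matching:
            dist[n] = hit_rate / len(matching)
        else:
            dist[n] = (1.0 - hit_rate) / len(others)
    return dist


nominees = ["Leckie", "VanderMeer", "McDevitt", "Gannon", "Liu", "Addison"]
prior_nominees = ["Leckie", "VanderMeer", "McDevitt", "Gannon"]

dist = indicator_distribution(nominees, prior_nominees, 0.846)
for name in nominees:
    print(f"{name:12s} {dist[name]:.1%}")
```

Each prior nominee lands at 84.6% / 4 and each first-timer at 15.4% / 2, matching the figures above, and the six shares add back up to 100%.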

I like it when my indicators make some logical sense: a prior Nebula nominee is more familiar to the SFWA voting audience, and thus has an easier time grabbing votes. That bias is reflected in the roughly 13% advantage prior nominees gain in a category. That is a significant bump, but not an overwhelming one. It would be pretty unsatisfying to end there. Past Nebula noms are just one possible indicator: by doing the same kind of calculation for all 12 of my indicators, and then combining them together, we get a more robust picture. Leckie had never been nominated for a Nebula before last year, but she won anyway; she dominated many of the other indicators, and that’s what pushed her to the top of my prediction.

So, that’s the basic methodology: I find past patterns, translate those into percentage statements, and then use those percentages to come up with a probability distribution for the current year. I then combine those predictions together to come up with my final prediction.

I’ve got to make a couple tweaks to my Indicators for 2015. First off, I was never able to get Indicator #10 to work properly. Finding a correlation between Amazon/Goodreads ratings or scores and Nebula/Hugo wins has so far, at least for me, proved elusive. I also think I need to add an Indicator about “Not being a sequel”; that should help clarify this year, where the Leckie, McDevitt, and Gannon novels are all later books in a series. I’m tossing around adding a “Didn’t win a Best Novel Nebula the previous year” concept, but I’ll see how things work out. EDIT: This would be there to reflect how rare back to back Nebula wins are. That has only happened 3 times (Delany, Pohl, Card), and hasn’t happened in 30 years. This’ll factor in quite a bit this year: is Leckie looking at back to back wins, or will voters want to spread the Nebula around?

I’m always looking for more indicators, particularly if they can yield high % patterns. Let me know if you think anything should be added to the list. The more Indicators we have, the more balanced the final results, as any one indicator has less of an impact on the overall prediction.

You’ll notice that my Indicators break into four main parts: Past Awards History, Genre, Current Year Critical/Reader Reception, and Current Year Awards. Those four seem to be the big categories that determine (in this kind of measure) whether or not you’re a viable Nebula candidate.

In the next post, we’ll talk about how this data gets weighted and combined together.

# Building the Nebula Model, Part 1

The raison d’être of Chaos Horizon has always been to provide numerical predictions for the Nebula and Hugo Awards for Best Novel based on data-mining principles. I’ve always liked odds, percentages, stats, and so forth. I was surprised that no one was doing this already for the major SFF awards, so I figured I could step into this void and see where a statistical exploration would take us.

Over the past few months, I’ve been distracted trying to predict the Hugo and Nebula slates. Now that we have the Nebula slate (and the Hugo is coming shortly), I can turn my attention back to my Nebula and Hugo models. Last year, I put together my first mathematical models for the Hugos and Nebulas. They both predicted eventual winner Leckie, which is good for the model. As I’ll discuss in a few posts, my current model has around 67% accuracy over the last 15 years. Of course, past accuracy does not guarantee future accuracy, but at least you know where the model stands. In a complex, multi-variable problem like this, perfect accuracy is impossible.

I’m going to be rebuilding and updating the model over the next several weeks. There are a couple of tweaks I want to make, and I also want to bring Chaos Horizon readers who weren’t around last year into the process. Over the next few days, we’ll go through the following:

1. Guiding principles

2. The basics of the model

3. Model reliability

4. To-do list for 2015

Let’s get started today with the guiding principles for my Nebula and Hugo models:

1. **The past predicts the future.** Chaos Horizon uses a type of statistics called data-mining, which means I look for statistical patterns in past data to predict the future. There are other equally valid statistical models such as sampling. In a sampling methodology, you would ask a certain number of Nebula or Hugo voters what their award votes were going to be, and then use that sample to extrapolate the final results, usually correcting for demographic issues. This is the methodology of Presidential voting polls, for instance. A lot of people do this informally on the web, gathering up the various posted Hugo and Nebula ballots and trying to predict the awards from that.

Data-mining works differently. You take past data and comb through it to come up with trends and relationships, and then you assume (and it’s only an assumption) that such trends will continue into the future. Since there is carryover in both the SFWA and WorldCon voting pools, this makes a certain amount of logical sense. If the past 10 years of Hugo data show that a SF novel always wins, you should predict a SF novel to win in the future. If 10 years of data show that the second novel in a series never wins, you shouldn’t predict a second novel to win.

Now, the data is usually not that precise. Instead, there is a historical bias towards SF novels, and first or stand alone novels, and past winners, and novels that do well on critical lists, and novels that do well in other awards, etc. What I do is I transform these observations into percentages (60% of the time a SF novel wins, 75% of the time the Nebula winner wins the Hugo, etc) and then combine those percentages to come up with a final percent. We’ll talk about how I combine all this data in the next few posts.

Lastly for this point, data-mining has difficulty predicting sudden and dramatic changes in data sets. Huge changes in sentiment will be missed in what Chaos Horizon does, as those aren’t reflected in past statistical trends. Understand the limitations of this approach, and proceed accordingly.

2. **Simple data means simple statistics.** The temptation for any statistician is to use the most high-powered, shiny statistical toys on their data sets: multi-variable regressions, computer-assisted Bayesian inferences, etc. All that has its place, and maybe in a few years we’ll try one of those out to see how far off it is from the simpler statistical modeling Chaos Horizon uses.

For the Nebulas and Hugos, though, we’re dealing with a low N (number of observations) but a high number of variables (genre, awards history, popularity, critical response, reader response, etc.). As a result, the project itself is—from a statistical reliability perspective—fatally flawed. That doesn’t mean it can’t be interesting, or that we can’t learn anything from close observation, but I never want to hide the relative lack of data by pretending my results are more solid than they seem. Low data will inevitably result in unreliable predictions.

Let’s think about what the N is for the Nebula Award. The award has been held since 1966, and 219 individual novels have been nominated for it. That’s our N, the total number of observations we have. We don’t get individual voting numbers for the Nebula, so that’s not an option for a more robust N. Compare that to something like the NCAA basketball tournament (since it’s going on right now). That’s been held since 1939. The field expanded to our familiar 64 teams in 1985. That means, in the tournament proper (the play-in round is silly), 63 games have been contested every year since 1985. So, if you’re modeling who will win an NCAA tournament game, you have 63 * (2014-1985) = 1,827 observations. Now, if we wanted to add in the number of games played in the regular season, we’d wind up with 347 teams (Division I teams) * 30 games each / 2 (they play each other, so we don’t want to use every game twice) = 5,205 more observations. That’s just one year of college basketball regular season games! Multiply that by 30 seasons, and you’re looking at an N of 150,000 in the regular season, plus an N of 2,000 for the postseason. You can do a lot with data sets that big!
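If you want to check the back-of-the-envelope numbers yourself, here is the same arithmetic in a few lines. The variable names are mine; the inputs (219 nominees, 63 games, 347 teams, 30 games each, 30 seasons) are the ones quoted above.

```python
# Nebula Best Novel: one observation per nominated novel.
nebula_n = 219

# NCAA tournament: 63 games per year in the 64-team era.
tournament_games_per_year = 63
# Year count as used in the text; counting both 1985 and 2014 would give 30.
tournament_years = 2014 - 1985
tournament_n = tournament_games_per_year * tournament_years

# Regular season: each game involves two teams, so halve the total.
teams = 347
games_per_team = 30
regular_season_per_year = teams * games_per_team // 2
regular_season_n = regular_season_per_year * 30

print(f"Nebula N:            {nebula_n:>8,}")
print(f"Tournament N:        {tournament_n:>8,}")
print(f"Regular season/year: {regular_season_per_year:>8,}")
print(f"Regular season N:    {regular_season_n:>8,}")
print(f"Basketball games per Nebula nominee: "
      f"{(tournament_n + regular_season_n) / nebula_n:.0f}")
```

The exact regular-season total is a little over the rounded 150,000 figure, which doesn’t change the point: the basketball modeler has hundreds of observations for every one we get from the Nebulas.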

So our 219 Nebula Best Novel observations look pretty paltry. Let’s throw in the reality that the Nebulas have changed greatly over the last 40 years. Does 1970 data really predict what will happen in 2015? That’s before the internet, before fantasy became part of the process, etc. So, at Chaos Horizon, I primarily use the post-2000 data: new millennium, new data, new trends. That leaves us with an N of a paltry 87. From a statistical perspective, that should make everyone very sad. One option is to pack up and go home, to conclude that any trends we see in the Nebulas will be random statistical noise.

I do think, however, that the awards have some very clear trends (favoring certain kinds of novels, favoring past nominees and winners) that help settle down the variability. Chaos Horizon should be considered an experiment (perhaps a grand failed experiment, but those are the best kind) to see if statistics can get us anywhere. Who knows but that in 5 years I’ll have to conclude that no, we can’t use data-mining to predict the awards?

3. **No black boxing the math.** As a corollary to point #2, I’ve decided to keep the mathematics on Chaos Horizon at roughly the high school level. I want anyone, with a little work, to be able to follow the way I’m putting my models together. As such, I’ve had to choose some simpler mathematical modeling. I think that clarity is important: if people understand the math, they can contest and argue against it. Chaos Horizon is meant to be the beginning of a conversation about the Hugos and Nebulas, not the end of one.

So I try to avoid the following statement: given the data, we get this prediction. Notice how that sentence isn’t logically constructed: how was the data used? What kind of mathematics was it pushed through? If you wanted to do the math yourself, could you? I want to write: given this data, and this mathematical processing of that data, we get this prediction.

4. **Neutral presentation.** To trust any statistical presentation, you have to trust that the statistics are presented in a fair, logical, and unbiased fashion. While 100% lack of bias is impossible as long as humans are doing the calculating, the attempt for neutrality is very important for me on this website. Opinions are great, and have their place in the SFF realm: to get those, simply go to another site. You won’t find a shortage of those!

Chaos Horizon is trying to do something different. Whether I’m always successful or not is for you to judge. Keep in mind that neutrality does not mean completely hiding my opinions; doing so is just as artificial as putting those opinions in the forefront. If you know some of my opinions, it should allow you to critique my work better. You should question everything that is put up on Chaos Horizon, and I hope to facilitate that questioning by making the chains of my reasoning clear. What we want to avoid at all costs is saying: I like this author (or this author deserves an award), therefore I’m going to up their statistical chances. Nor do I want to punish authors because I dislike them; I try and apply the same processing and data-mining principles to everyone who comes across my plate.

5. **Chaos Horizon is not definitive**. I hold that the statistical predictions provided on Chaos Horizon are no more than opinions. Stats like this are not a science; the past is not a 100% predictor of the future. These opinions are arrived at through a logical process, but since I am the one designing and guiding the process, they are my ideas alone. If you agree with the predictions, agree because you think the process is sound. If you disagree with the process, feel free to use my data and crunch it differently. If you really hate the process, feel free to find other types of data and process them in whatever way you see appropriate. Then post them and we can see if they make more sense!

Each of these principles is easily contestable, and different statisticians/thinkers may wish to approach the problem differently. If I make my assumptions, biases, and axioms clearly visible, this should allow you to engage with my model fully, and to understand both the strengths and weaknesses of the Chaos Horizon project.

I’ll get into the details of the model over the next few days. If you’ve got any questions, let me know.

# Terry Pratchett Dies at 66

A sad day for SFF fandom, with the passing of Terry Pratchett at 66. Pratchett had been dealing with Alzheimer’s disease. *The Telegraph* has a nice page of various tributes.

Pratchett was one of our great—if not the greatest—fantasy humorists. His long-running Discworld series is a wealth of humor, satire, and imagination; it has never gotten the credit it deserves as one of the best and most inventive fantasy series of the 1980s and 1990s.

I first read *The Light Fantastic* when I was in middle school. I bought the book at a bookstore on pure speculation, knowing nothing about it. Given that this is a direct sequel to *The Colour of Magic*, I read Pratchett’s book in a state of amazed confusion. I remember having no idea what was going on, but I was overwhelmed by the sheer scope of creation on display in the book: wizards, tourists, homicidal luggage. I told all my friends about Pratchett, and we proceeded to tear through his books over the next several weeks: *The Colour of Magic*, *Mort* (still a favorite of mine), and *Equal Rites*. Pratchett’s ability to continually generate new plotlines and new characters for his world, his ability to use fantasy not as a space of repetition but of innovation, his skill at poking fun at our society through the lens of another society, all placed him in the first class of fantasy writers.

I’ve read Pratchett my whole life. When I finished my dissertation—about the role of the Post Office in American literature—I celebrated by reading *Going Postal*. Just this summer I wrote an essay about Pratchett, Albert Camus, and the Luggage, set to appear in the forthcoming *Discworld and Philosophy*. Pratchett was one of the authors of my life, who touched me in my teens, twenties, and thirties. The world is lesser for not having him in it.

Pratchett never won a Hugo or Nebula award. Neither award has ever known what to do with humorous/satirical SFF. Both awards failed to live up to the imagination that Pratchett showed in his best work: it’s easier to celebrate the serious and prestigious than the fantastic. Our field should have done better. Pratchett did receive Nebula nominations late in his career, in 2006 (*Going Postal*) and 2009 (*Making Money*). Neither is among his best books. *Mort*, *Guards! Guards!*, and *Small Gods* all would have been worthy winners, but I’d draw your attention to 2003, the year that Robert Sawyer won the Hugo for *Hominids*. Pratchett published *Night Watch* in 2002, a twisty time-travel caper that would have been an outstanding winner for that year.

None of that matters: Pratchett’s books matter. His legacy will stand, and I have no doubt that young SFF fans will be picking up Pratchett books for decades to come, and discovering the same fantastic worlds that I did when I was a child.

Thank you, Terry, for everything you’ve done for me.

# Celebrating One Year of Chaos Horizon!

Exactly one year ago I launched Chaos Horizon with my first post:

Chaos Horizon is a blog with a simple purpose: to predict the winners of the Nebula and Hugo Awards for best novel. To do so, I’ll be examining past trends in the Nebula and Hugo awards. By closely data mining this information, I’ll develop a predictive model that will allow us to make some educated guesses as to the eventual winner. Given the prestige of these two awards, they receive remarkably little analysis or prediction on the web. Hopefully Chaos Horizon can close that gap.

Little did I know what I was getting into! It’s been an interesting year, and Chaos Horizon has certainly changed a great deal over the last 12 months. I started the website for two main reasons:

1. I’m a little older (turning 40 in November), and I felt pretty disconnected from social media and the discussions going on in those places. While I’ve always been a SFF fan, I was so busy through the 2000s going to graduate school (i.e. I was locked up in a library) and then starting out as a college professor (i.e. I was locked in my office) that I didn’t get much of a chance to keep up with the changing way that SFF fandom communicates. I thought starting a blog would be an interesting way to jump back into those conversations.

2. I love predictions, statistics, and lists, and I like disagreeing with those just as much as I like agreeing with them. For the Pulitzer Prize, there’s a great prediction website. For things like the Nobel Prize and the Booker, several websites publish betting odds. When I tried to look for these same things for the Hugo and Nebula, I couldn’t find them. Since I didn’t want my website to be 100% my opinions, I figured I’d take a shot at filling that gap.

And thus the history of Chaos Horizon! I’m trying to use data-mining techniques to come up with odds for the Hugo and Nebula awards. It’s been an interesting—and at times frustrating—problem to work on. The Hugos and Nebulas are awfully erratic, and they are full of all sorts of quirks and twists about what gets nominated and why. I feel like the various Reports I’ve issued have provided some clarity, but there’s still plenty to do. The awards are also changing very rapidly, and that makes any predictive work difficult.

**Traffic**: Always an intriguing question for websites. I think most bloggers are shy about sharing their stats, as if web traffic reflects the worth or meaning of a website. I didn’t have a lot of traffic expectations for Chaos Horizon: it seems to me that this kind of stat-work is a touch on the dull side, and the online SFF community is fairly small. Since I keep a fairly neutral tone (i.e. I don’t try to click-bait by weighing in on the various SFF tempests and controversies), I figured that would also hurt traffic.

But, good grief, was traffic slow in the first six months of Chaos Horizon! I had some naïve idea that if I started a blog, people would just pop out of the air to see it. In all of May 2014, for instance, I had a grand total of 23 views! Most of that is my fault, as I hadn’t the vaguest idea of how anything worked when it came to blogging.

Traffic began to pick up in August 2014, when I leapt from 41 views (total, not daily!) in July to 800 in August. This was when I published my first Nebula prediction, which caught some traction in the wider SFF world. That’s also when I started doing Review Round-Ups and actually linking to other blogs, which helped make me part of the community rather than an outsider to it.

Chaos Horizon has grown steadily since then. In 2015, I’m currently averaging about 150 views/day and 1000 views/week. That seems like an enormous increase in just a year, but, then again, I have no idea what other people’s blogs average. In the long run, it doesn’t matter: I enjoy what I do on Chaos Horizon, and I’m not doing it for the clicks.

I want to thank some of the early supporters of Chaos Horizon, including bloggers like From Couch to Moon, Reading SFF, Violin in a Void, Books, Brain and Beer, Far Beyond Reality, Lady Business, Nerds of a Feather, and the many others who have linked and discussed Chaos Horizon. I also want to thank the many people who have commented on Chaos Horizon; I appreciate your discussions, questions, and objections, and I look forward to more!

**The Future**: More of the same! There can never be enough stats, charts, and analysis. Through March and into April, I’ll be building up my Hugo and Nebula predictions for 2015. These will be a mathematical model of 10-12 different factors that contribute to the award, and we’ll end up with a % chance for each of the nominees to win. I also have plans to continue my reports: next up is Sequels, and then I’ll be tackling the issues of Gender, Book Length, and Age. Then I’ll shift my attention to 2016.

Well, once again, thank you to everyone who has contributed to the success of the first year of Chaos Horizon! Happy reading!

# The New Yorker Publishes Essay on Cixin Liu

*The New Yorker* ran a very complimentary essay about Cixin Liu’s *The Three-Body Problem* and his other stories, positioning him as China’s Arthur C. Clarke. Check it out here; it’s an interesting read.

This comes on the heels of Liu’s Nebula nomination for *The Three-Body Problem*, and will accelerate Liu being thought of as a “major” SF author. Essays like this are very important in establishing an author’s short and long term reputation. Much of the mainstream press follows the lead of *The New Yorker* and *The New York Times*; this means other newspapers and places like *NPR*, *Entertainment Weekly*, and others are going to start paying attention to Cixin Liu. While the influence of these venues on the smaller SFF community (and the Hugos and Nebulas) isn’t as significant, mainstream coverage does bleed over into how bookstores buy books, how publishers acquire and position novels, etc.

*The Dark Forest*, Liu’s sequel to *The Three-Body Problem*, comes out on July 7th. Expect that to get major coverage and to be a leading candidate for the 2016 Nebula and Hugo. I currently have Cixin Liu’s *The Three-Body Problem* at #6 in my Hugo prediction, and that may be too low. All this great coverage and exposure does come very late in the game: Hugo nominations are due March 10th. Liu’s novel came out on November 11th, and that’s not a lot of time to build up a Hugo readership. It does appear that most people who read *The Three-Body Problem* are embracing it . . . but will it be enough for a Hugo nomination?

As a Hugo prediction site, the hardest thing I have to account for is sentiment: how much do people like an individual novel? How does that enthusiasm carry over to voting? How many individual readers grabbed other readers and said “you’ve got to read this”? We can measure this a little by the force of reviews and the positivity of blogging, but this is a weakness in my data-mining techniques. I can’t account for the community falling in love with a book. Keep in mind, initial Liu reviews were a little measured (check out this one from Tor.com, for instance, that calls the main character “uninspiring”), but then there was a wave of far more positive reviews in December, such as this one from The Book Smugglers. My Review Round-Up gathers some more of these.

Has the wheel turned, and are most readers now seeing *The Three-Body Problem* as the best SF book of 2014? For the record, that’s my opinion as well, and I did read some 20+ speculative works published in 2014. Liu’s novel has a combination of really interesting science, very bold (and sometimes absurd) speculation, and a fascinating engagement with Chinese history in the form of the Cultural Revolution. In a head-to-head competition with *Annihilation*, I think *The Three-Body Problem* wins. You’d think that would be enough to score a Hugo nomination, and maybe it will be. We’ll find out within the month.

# Debuting the Awards Meta-List

I like collations on Chaos Horizon: I think gathering a wide range of opinions on SFF gives us a better overall feel than relying on one source. I’ve done this for the SFF Critics Best of 2014, for the Mainstream Best of 2014, and now I’m launching a new meta-list: Awards 2015. What this will do is collate award nominations from the 2015 SFF awards season. I’m going with widely known novel awards that are specific to the SFF field (so no Stoker, because that’s horror, no Sturgeon because that’s short fiction, no Aurealis because that’s only Australian, etc.). For now, I’m planning on tracking the 15 awards listed below. That’s a lot of awards for a relatively small community to give! I’m only counting the “novel” category, so things like “debut novel” aren’t making the cut. Check out the excellent Science Fiction Award database for descriptions of the awards and a history of them.

Since each award approaches the field differently (some SF only, some fantasy only, some juried, some voted, some American, some British, etc.) the totality of them should give us a good idea of the state of the field as a whole. Here’s the list:

Arthur C. Clarke

British Fantasy

British SF

Campbell

Compton Crook

Crawford

Gemmell

Hugo

Kitschies

Locus (EDIT: I’ll count the Locus Science Fiction and the Locus Fantasy as separate awards)

Nebula

Philip K. Dick

Prometheus

Tiptree

World Fantasy

EDIT (3/5/15): Doug in the comments usefully reminded me that the Locus has both Science Fiction and Fantasy categories. I think the Locus is fairly predictive of the Hugos, as it is a voted award by SFF fans. Because of that, I’m going to count them both as separate awards, which brings us up from that tidy 15 awards to an ugly 16 awards. So it goes.

Methodology is simple: I’ve created an Excel matrix to track each award’s nominees and winners. You get 1 point for getting nominated, and then I sum up the points. I’m marking winners with green. Not many awards have announced their nominees yet: only the Philip K. Dick, the Kitschies, the BSFA, and the Nebulas.

Because we’re so early, we don’t have a lot of useful info yet. 22 different authors have been nominated for those 4 awards, showing there isn’t a ton of agreement across the field as to what the “major” books were in 2014. Only two novels have shown up in more than one award:

*Lagoon*, Nnedi Okorafor: 2 points (Kitschies, BSFA)

*Ancillary Sword*, Ann Leckie: 2 points (BSFA, Nebula)

Remember, *Lagoon* didn’t get a 2014 US publication, so it’s pretty unlikely to show up in the more US-centric awards. Leckie had a dominating run last year with *Ancillary Justice*, showing up in 7 of these 15 awards and winning 4 (Hugo, Nebula, BSFA, Clarke, as well as some lesser awards like British Fantasy Best Newcomer and Locus Magazine’s First Novel). A nomination rate of roughly 50% seems to be about the high-water mark: if anyone can manage that this year, they’ll have a good chance to win the Hugo and Nebula.

Over time, we should start to see more patterns. Congratulations to *Grasshopper Jungle*, which just won the Kitschie. Here’s the in-progress matrix: 2015 Awards Meta-List.
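For anyone who wants to replicate the tally outside of Excel, here is a minimal Python sketch of the scoring described above: 1 point per award that nominates a novel, summed across awards. The dictionary only holds the nominations mentioned in this post, not the full matrix, and the names `nominations`, `winners`, `tally_points`, and `nomination_rate` are my own illustrative choices rather than anything from the actual spreadsheet.

```python
from collections import Counter

# Partial data: only the nominations named in this post, not the full matrix.
nominations = {
    "Kitschies": ["Lagoon", "Grasshopper Jungle"],
    "BSFA": ["Lagoon", "Ancillary Sword"],
    "Nebula": ["Ancillary Sword"],
}

# Winners announced so far (marked green in the spreadsheet).
winners = {"Kitschies": "Grasshopper Jungle"}


def tally_points(nominations):
    """Give each novel 1 point per award that nominated it."""
    points = Counter()
    for novels in nominations.values():
        for novel in set(novels):  # guard against duplicate entries
            points[novel] += 1
    return points


def nomination_rate(nominated_in, awards_tracked):
    """Share of tracked awards in which a novel was nominated."""
    return nominated_in / awards_tracked


points = tally_points(nominations)

# Novels that have shown up in more than one award so far.
multi_award = sorted(novel for novel, pts in points.items() if pts >= 2)

# Leckie's run last year: 7 nominations across the original 15 awards.
leckie_2014_rate = nomination_rate(7, 15)

if __name__ == "__main__":
    for novel, pts in points.most_common():
        flag = " (winner)" if novel in winners.values() else ""
        print(f"{novel}: {pts} point(s){flag}")
    print(f"Ancillary Justice nomination rate last year: {leckie_2014_rate:.1%}")
```

Adding a new award is just one more key in `nominations`, so the tally can be re-run each time a slate is announced.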

# 2015 Hugo Prediction, Version 4.0

A lot has happened in the past month that will shape the 2015 Hugo Best Novel nominations. These are usually announced around Easter weekend, which has the unfortunate tendency of burying the nominations in the holiday. The deadline for nominations this year is March 10, 2015, so WorldCon voters still have time to get their nominations in.

In this post, I’ll focus on my final prediction: which 5 books I think will make the 2015 slate. Since the Nebula nominations just came out, these are likely to influence the Hugos in a substantial way. Over the past several years, about 40% of the eventual Hugo slate has overlapped with the Nebula slate. The Nebula slate is widely seen and discussed within the SFF community, and even if it only influences 4-5% of WorldCon voters, that’s enough to push a book from “borderline” to “nominated.”

Speaking of widely seen and widely discussed, the “Sad Puppies 3” slate is also likely to have a substantial influence on this year’s Hugo. The campaign is helmed by Brad Torgersen this year (and was helmed by Larry Correia in the past). Last year’s Sad Puppy 2 group of suggested nominees had a definite impact on the 2014 Hugos (placing 1 book into the Best Novel category, and several other nominees into other fiction categories), and there’s not a lot of evidence to suggest this campaign won’t be equally (or slightly more) successful this year. See my “Modeling Hugo Voting Campaigns” post for more discussion.

So where does that leave us? Here’s my top 5, based on awards history, critical acclaim, reviews, and popularity. Remember that *The Martian* by Andy Weir isn’t up here because of eligibility issues. Otherwise I’d have Weir at #3.

Reminder: Chaos Horizon is dedicated to predicting what is likely to happen in the 2015 awards, not what “should” happen. So, long story short, I’m not advocating any of these books for the Hugo, but simply predicting, based on past Hugo patterns, who is most likely to get a nomination.

1. ***Annihilation*, Jeff VanderMeer**: VanderMeer’s short book, the first in the *Southern Reach* trilogy that all came out this year, was one of the most critically acclaimed SF/weird fiction novels of recent years. It sold well, received a Nebula nomination, and provoked plenty of debate and praise, including high-profile features in The New Yorker and The Atlantic. While the Hugos aren’t as susceptible to literary acclaim as the Nebulas, this is either a “love it” or “hate it” kind of book. Readers are either fascinated by VanderMeer’s weirdness and fungal-based conspiracies or completely alienated by them. Since you can’t vote against a book in the nominating process, the “loves” will outweigh the “hates.” I have VanderMeer as my early Hugo favorite: I think he’ll win the Nebula, and that win will drive him to the Hugo.

2. ***Ancillary Sword*, Ann Leckie**: The Hugo tends to be very repetitive, nominating the same authors over and over again. Given how dominant Leckie’s 2014 Hugo win (and overall award season) was, it’s hard to see her not getting another nomination. Even if *Ancillary Sword* is slightly less acclaimed than *Ancillary Justice*, it still placed first in my SFF critics collation list, and it has already garnered Nebula and BSFA noms. While I think it’s unlikely Leckie will win two Hugos in a row, the VanderMeer may prove too divisive for the Hugo audience. In that case, Leckie might emerge as the compromise pick. The Hugo preferential voting system can easily allow for something like that to happen.

3. ***Monster Hunter Nemesis*, Larry Correia**: Correia finished 3rd in the 2014 Hugo nominations, with only Leckie and Gaiman placing above him (Gaiman declined the nom). That put him very safely in the field, and the mathematics are in Correia’s favor for this year. While *Monster Hunter Nemesis* is a slightly odd choice for the Hugos, being 5th in a series and urban fantasy to boot, it’s hard to imagine Correia’s supporters abandoning him en masse in just one year. Despite the vigor of his campaign, Correia doesn’t have the broad support necessary to win a Hugo.

4. ***The Goblin Emperor*, Katherine Addison**: There are a number of edgier fantasy novels that could work their way into the Hugo. I’ve had the race down as between Robert Jackson Bennett’s *City of Stairs* and this book for a while. With Addison grabbing the Nebula nomination, that probably boosts her into the Hugo field. This was well-liked in certain circles and placed very high on the SFF critics list. It’s also fantasy, which has a definite block of support behind it: not every WorldCon voter reads SF.

Now things get interesting. I expect there to be an all-out war for the fifth spot, given that there are 4-5 viable contenders. This’ll come down to who gets the vote out, not necessarily which novel is “better” than the other novels.

5. ***Skin Game*, Jim Butcher**: *Skin Game* was part of the “Sad Puppy 3” slate, but Butcher’s appeal extends well beyond that block of voters. While Butcher has never gotten much Hugo love in the past, he is one of the most popular writers working in the urban fantasy field, and his

Just missing:

6. ***The Three-Body Problem*, Cixin Liu**: Liu is a best-selling Chinese science fiction author, and this is his first novel translated into English. Liu’s chances have been greatly boosted by his Nebula nomination: this is going to put *Three-Body* front and center in SF fandom discussions. But is this a case of too little, too late? Are people rushing out to buy the Liu, and will they have time to read it before the Hugo voting closes? Liu’s novel will be very appealing to certain groups of SF WorldCon voters since it has throwback elements to hard SF writers like Arthur C. Clarke. I think it’ll be very close between Butcher and Liu (and maybe even Addison), and we’re dealing with guesswork here, not solid facts. There’s simply not enough data to model how a Chinese novel might do against an urban fantasy novel supported by a voting campaign.

7. ***Lock In*, John Scalzi**: Although Scalzi isn’t getting a ton of buzz right now, he does have 4 recent Best Novel nominations and a 2013 win for *Redshirts*. That indicates a broad pool of support in WorldCon voters; Scalzi is an author they’re comfortable with. While he might not be #1 on a lot of ballots, is he #4 or #5 on a plurality? We saw an old standby in Jack McDevitt grab a Nebula nomination this year. Could Scalzi play the same role in the 2015 Hugos? You can never assume that the Hugos or Nebulas won’t be repetitive.

So, there’s my field. I’m going to drop *City of Stairs* down to 8th place: no Nebula nom really hurts it. I’m leaving McDevitt off the Hugos; he’s never had much chance there. Charles Gannon received both a Nebula nomination and an endorsement on the Sad Puppy 3 slate. Gannon isn’t as popular as Correia or Butcher, so I don’t think as highly of his chance. I’m slotting him in at #10. That gives us:

8. *City of Stairs*, Robert Jackson Bennett

9. *Words of Radiance*, Brandon Sanderson

10. *Trial By Fire*, Charles Gannon

11. *Symbiont*, Mira Grant

12. *The Mirror Empire*, Kameron Hurley

13. *The Peripheral*, William Gibson

14. *My Real Children*, Jo Walton

15. *Echopraxia*, Peter Watts

So, that’s how Chaos Horizon thinks it’ll play out. What do you think? Who is likely to grab a nomination in 2015?