Tag Archive | Methodology

Building the Nebula Model, Part 5

This post continues my discussion on building my 2015 Nebula Best Novel prediction. See Part 1 for an introduction, Part 2 for a discussion of my Indicators, Part 3 for a discussion of my methodology/math, and Part 4 for a discussion of accuracy.

Taken together, those posts should explain to anyone new to the site how Chaos Horizon works. This post wraps things up with a to-do list for 2015. To update my model for 2015, here's what I need to do:

1. Update the data sets with the results of the 2014 Nebulas, Hugos, and everything else that happened last year.
2. Rethink the indicators, and possibly replace/refine some of them.
3. Reweight the indicators.
4. Test model reliability using the reweighted indicators.
5. Use the indicators to build probability tables for the 2015 Nebulas.
6. Run the probability tables through the weights to come up with the final results.

I won’t be able to get all of this done until mid-April. It doesn’t make any sense to run the numbers until the Hugo noms come out, and those will be coming out this Saturday (April 4th).

Building the Nebula Model, Part 3

This post continues my discussion on building my 2015 Nebula Best Novel prediction. See Part 1 for an introduction, and Part 2 for a discussion of my Indicators.

Now that we have 12 different indicators, how do I combine them? This is where the theory gets sticky: how solidly do I want to treat each indicator? Am I trying to find correlations between them? Do I want to pick one as the "main" indicator as my base, and then refine that through some recursive statistical process? Do I treat each indicator as independent, or are some dependent on each other? Do I treat them as opinions or facts? How complicated do I allow the math to be, given the low N we have concerning the Nebulas?

I thought about this, and read some math articles and scoured the internet, and I decided to use an interesting statistical tool: the Linear Opinion Pool. Under this model, I treat my data mining results as opinions, and combine them together using a Reliability Factor, to get a weighted combined percentage score. This keeps us from taking the data mining results too seriously, and it allows us to weigh a great number of factors without letting one of them dominate.

Remember, one of my goals on Chaos Horizon is to keep the math transparent (at a high school level). I want everyone who follows Chaos Horizon to be able to understand and explain how the math works; otherwise, the statistics become a mysterious black box with an unearned air of credibility and mystery that I don't want.

Here’s a basic definition of a Linear Opinion Pool:

a weighted arithmetic average of the experts’ probability distributions. If we let Fi(x) denote expert i’s probability distribution for an uncertain variable of interest (X), then the linear opinion pool Fc(x) that results from combining k experts is:
Fc(x) = w1F1(x) + w2F2(x) + … + wkFk(x) = Σ wiFi(x)
where the weight assigned to Fi(x) is wi, and Σwi = 1.

Although the linear opinion pool is a popular and intuitive combination method with many useful properties, there is no method for assigning weights that is derived entirely from first principles. One can, however, interpret the weights in a variety of ways, and each interpretation lends itself to a particular way to calculate the weights.

This is a model often used in risk analysis, where you have a number of competing opinions about what is risky, and you want to combine those opinions to find any possible overlap (while also covering your ass from any liability). There’s plenty of literature on the subject; just google “Linear Opinion Pool” for more reading.

We have the probability distributions from my data mining. What weights do I use? That’s always the challenge in a Linear Opinion Pool. For Chaos Horizon, I’ve been weighting by how often that Indicator has actually chosen the Nebula in the past. So, if you used that Indicator and that Indicator alone to guess, how often would you actually be right? Not every Indicator comes into play every year, and sometimes an Indicator doesn’t help (like if all the nominated novels previously had Nebula nominations). We’ll be looking at all that data late in April.

Now, on to my mathematical challenge: can I explain this in easy-to-understand terms?

A Linear Opinion Pool works this way: you walk into a bar and everyone is talking about the Nebula awards. You wander around, and people shout out various odds at you: "3 out of 4 times a prior Nebula nominee wins" or "70% of the time a science fiction novel wins" and so forth. Your head is spinning from so much information; you don't know who to trust. Maybe some of those guesses overlap, maybe some of them don't. All of them seem like experts—but how expert?

Instead of getting drunk and giving up, you decide to sum up all the opinions. You figure, "Hell, I'll just add all those probabilities up, and then divide by the total number of suggestions." Then you begin to have second thoughts: that guy over in the corner is really wasted and doesn't seem to know what he's talking about. You sidle over and ask his friend: how often has that guy been right in the past? He says 5% of the time, but that guy over there—the one drinking gin and tonics—is right 50% of the time. So you figure you'd better weight each opinion based on how correct it's been in the past. You add things up using those weights, and voilà!, you've got your prediction.
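The bar-room version above can be sketched in a few lines of code. This is only an illustration: the probability estimates and track records below are invented, not taken from the actual model.

```python
# A minimal sketch of a Linear Opinion Pool.
# The probabilities and track records below are invented for illustration.

def linear_opinion_pool(opinions, accuracies):
    """Weighted arithmetic average of probability estimates.

    opinions:   each source's probability estimate for the same outcome
    accuracies: how often each source has been right in the past;
                normalized here so the weights sum to 1
    """
    total = sum(accuracies)
    weights = [a / total for a in accuracies]
    return sum(w * p for w, p in zip(weights, opinions))

# The wasted guy (right 5% of the time) says 80%; the gin-and-tonic
# drinker (right 50% of the time) says 30%.
estimate = linear_opinion_pool([0.80, 0.30], [0.05, 0.50])
print(f"{estimate:.1%}")  # 34.5% -- pulled toward the more reliable source
```

Note how the final estimate sits much closer to 30% than 80%: normalizing the accuracies into weights is exactly what keeps the unreliable opinion from dominating.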

Advantages:
It’s mathematically easy to calculate; no fancy software needed.
This allows me to add more indicators (opinions) very easily.
This treats my data mining work as an “opinion,” not a fact, which I think is closer to reality.
The weighting component allows me to dial up or dial down indicators easily.
The simple mathematics reflects the relatively low amount of data.
The methodology is easy for readers to follow.

Disadvantages:
It’s not as mathematically rigorous as other statistical models.
The weighting component introduces a human element into the model which may be unreliable.
Because this treats my data mining results as “opinions,” not “facts,” it may compromise the reliability of the model for some readers.
Because it is simple, it lacks the flashiness and impressiveness of grander statistical models.

When we're dealing with statistical modeling, the true test is the results. A rigorous model that is wrong all the time is worse than a problematic model that is right all the time. In my next post, we'll talk about past accuracy. Here are my older posts on the Linear Opinion Pool and weighting if you want some more info.

As a last note, let me say that following the way the model is constructed is probably more interesting and valuable than the final results. It's the act of thinking through how different factors might fit together that is truly valuable. Process, not results.

Building the Nebula Model, Part 1

The raison d'être of Chaos Horizon has always been to provide numerical predictions for the Nebula and Hugo Awards for Best Novel based on data-mining principles. I've always liked odds, percentages, stats, and so forth. I was surprised that no one was doing this already for the major SFF awards, so I figured I could step into this void and see where a statistical exploration would take us.

Over the past few months, I've been distracted trying to predict the Hugo and Nebula slates. Now that we have the Nebula slate—and the Hugo is coming shortly—I can turn my attention back to my Nebula and Hugo models. Last year, I put together my first mathematical models for the Hugos and Nebulas. They both predicted eventual winner Leckie, which is good for the model. As I'll discuss in a few posts, my current model has around 67% accuracy over the last 15 years. Of course, past accuracy doesn't guarantee future accuracy, but at least you know where the model stands. In a complex, multi-variable problem like this, perfect accuracy is impossible.

I'm going to rebuild and update the model over the next several weeks. There are a couple of tweaks I want to make, and I also wanted to bring Chaos Horizon readers who weren't around last year into the process. Over the next few days, we'll go through the following:
1. Guiding principles
2. The basics of the model
3. Model reliability
4. To-do list for 2015

Let’s get started today with the guiding principles for my Nebula and Hugo models:

1. The past predicts the future. Chaos Horizon uses a type of statistics called data-mining, which means I look for statistical patterns in past data to predict the future. There are other equally valid statistical models, such as sampling. In a sampling methodology, you would ask a certain number of Nebula or Hugo voters what their award votes were going to be, and then use that sample to extrapolate the final results, usually correcting for demographic issues. This is the methodology of Presidential voting polls, for instance. A lot of people do this informally on the web, gathering up the various posted Hugo and Nebula ballots and trying to predict the awards from that.

Data-mining works differently. You take past data and comb through it to come up with trends and relationships, and then you assume (and it's only an assumption) that such trends will continue into the future. Since there is carryover in both the SFWA and WorldCon voting pools, this makes a certain amount of logical sense. If the past 10 years of Hugo data show that most of the time an SF novel wins, you should predict an SF novel to win in the future. If 10 years of data show that the second novel in a series never wins, you shouldn't predict a second novel to win.

Now, the data is usually not that precise. Instead, there is a historical bias towards SF novels, and first or stand alone novels, and past winners, and novels that do well on critical lists, and novels that do well in other awards, etc. What I do is I transform these observations into percentages (60% of the time a SF novel wins, 75% of the time the Nebula winner wins the Hugo, etc) and then combine those percentages to come up with a final percent. We’ll talk about how I combine all this data in the next few posts.
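As a rough sketch of that transformation, here's how an observation becomes a percentage. The winner records below are invented for illustration; the real model mines the actual award data from 2000 on.

```python
# Hypothetical sketch: turning past-winner observations into indicator
# percentages. These four records are invented, not real award history.

past_winners = [
    {"year": 2011, "genre": "SF", "prior_nebula_nom": True},
    {"year": 2012, "genre": "Fantasy", "prior_nebula_nom": True},
    {"year": 2013, "genre": "SF", "prior_nebula_nom": False},
    {"year": 2014, "genre": "SF", "prior_nebula_nom": True},
]

def indicator_rate(records, predicate):
    """Fraction of past winners that satisfy an indicator."""
    return sum(1 for r in records if predicate(r)) / len(records)

sf_rate = indicator_rate(past_winners, lambda r: r["genre"] == "SF")
nom_rate = indicator_rate(past_winners, lambda r: r["prior_nebula_nom"])
print(f"SF novel wins {sf_rate:.0%} of the time")
print(f"Prior Nebula nominee wins {nom_rate:.0%} of the time")
```

Each indicator is just such a historical rate; combining the rates into a final prediction is the Linear Opinion Pool step covered in Part 3.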

Lastly for this point, data-mining has difficulty predicting sudden and dramatic changes in data sets. Huge changes in sentiment will be missed by what Chaos Horizon does, as they aren't reflected in past statistical trends. Understand the limitations of this approach, and proceed accordingly.

2. Simple data means simple statistics. The temptation for any statistician is to use the most high-powered, shiny statistical toys on their data sets: multi-variable regressions, computer-assisted Bayesian inferences, etc. All that has its place, and maybe in a few years we'll try one of those out to see how far off it is from the simpler statistical modeling Chaos Horizon uses.

For the Nebulas and Hugos, though, we’re dealing with a low N (number of observations) but a high number of variables (genre, awards history, popularity, critical response, reader response, etc.). As a result, the project itself is—from a statistical reliability perspective—fatally flawed. That doesn’t mean it can’t be interesting, or that we can’t learn anything from close observation, but I never want to hide the relative lack of data by pretending my results are more solid than they seem. Low data will inevitably result in unreliable predictions.

Let's think about what the N is for the Nebula Award. Held since 1966, 219 individual novels have been nominated for the Nebula. That's our N, the total number of observations we have. We don't get individual voting numbers for the Nebula, so that's not an option for a more robust N. Compare that to something like the NCAA basketball tournament (since it's going on right now). That's been held since 1939. The field expanded to our familiar 64 teams in 1985. That means, in the tournament proper (the play-in round is silly), 63 games have been contested every year since 1985. So, if you're modeling who will win an NCAA tournament game, you have 63 * (2014-1985) = 1,827 data points. Now, if we wanted to add in the number of games played in the regular season, we'd wind up with 347 Division I teams * 30 games each / 2 (they play each other, so we don't want to count every game twice) = 5,205 more observations. That's just one year of college basketball regular season games! Multiply that by 30 seasons, and you're looking at an N of over 150,000 in the regular season, plus an N of nearly 2,000 for the postseason. You can do a lot with data sets that big!
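For anyone who wants to check the back-of-envelope arithmetic, here it is spelled out (all numbers are the ones given in the paragraph above):

```python
# The N comparison from the paragraph above, step by step.
tournament_games_per_year = 63           # 64-team single-elimination field
tournament_seasons = 2014 - 1985         # 29 seasons with the 64-team field
ncaa_postseason_n = tournament_games_per_year * tournament_seasons
print(ncaa_postseason_n)                 # 1827

regular_season_per_year = 347 * 30 // 2  # each game counted once, not twice
print(regular_season_per_year)           # 5205
print(regular_season_per_year * 30)      # 156150, i.e. the ~150,000 figure

nebula_n = 219                           # Best Novel nominees since 1966
print(ncaa_postseason_n / nebula_n)      # the postseason alone is ~8x our N
```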

So our 219 Nebula Best Novel observations look pretty paltry. Let's throw in the reality that the Nebulas have changed greatly over the last 40 years. Does 1970 data really predict what will happen in 2015? That's before the internet, before fantasy became part of the process, etc. So, at Chaos Horizon, I primarily use the post-2000 data: new millennium, new data, new trends. That leaves us with a paltry N of 87. From a statistical perspective, that should make everyone very sad. One option is to pack up and go home, to conclude that any trends we see in the Nebulas will be random statistical noise.

I do think, however, that the awards have some very clear trends (favoring certain kinds of novels, favoring past nominees and winners) that help settle down the variability. Chaos Horizon should be considered an experiment—perhaps a grand failed experiment, but those are the best kind—to see if statistics can get us anywhere. Who knows? Maybe in 5 years I'll have to conclude that, no, we can't use data-mining to predict the awards.

3. No black boxing the math. A corollary to point #2: I've decided to keep the mathematics on Chaos Horizon at roughly the high school level. I want anyone, with a little work, to be able to follow the way I'm putting my models together. As such, I've had to choose some simpler mathematical modeling. I think that clarity is important: if people understand the math, they can contest and argue against it. Chaos Horizon is meant to be the beginning of a conversation about the Hugos and Nebulas, not the end of one.

So I try to avoid the following statement: given the data, we get this prediction. Notice how that sentence isn’t logically constructed: how was the data used? What kind of mathematics was it pushed through? If you wanted to do the math yourself, could you? I want to write: given this data, and this mathematical processing of that data, we get this prediction.

4. Neutral presentation. To trust any statistical presentation, you have to trust that the statistics are presented in a fair, logical, and unbiased fashion. While 100% lack of bias is impossible as long as humans are doing the calculating, the attempt for neutrality is very important for me on this website. Opinions are great, and have their place in the SFF realm: to get those, simply go to another site. You won’t find a shortage of those!

Chaos Horizon is trying to do something different. Whether I’m always successful or not is for you to judge. Keep in mind that neutrality does not mean completely hiding my opinions; doing so is just as artificial as putting those opinions in the forefront. If you know some of my opinions, it should allow you to critique my work better. You should question everything that is put up on Chaos Horizon, and I hope to facilitate that questioning by making the chains of my reasoning clear. What we want to avoid at all costs is saying: I like this author (or this author deserves an award), therefore I’m going to up their statistical chances. Nor do I want to punish authors because I dislike them; I try and apply the same processing and data-mining principles to everyone who comes across my plate.

5. Chaos Horizon is not definitive. I hold that the statistical predictions provided on Chaos Horizon are no more than opinions. Stats like this are not a science; the past is not a 100% predictor of the future. These opinions are arrived at through a logical process, but since I am the one designing and guiding the process, they are my ideas alone. If you agree with the predictions, agree because you think the process is sound. If you disagree with the process, feel free to use my data and crunch it differently. If you really hate the process, feel free to find other types of data and process them in whatever way you see appropriate. Then post them and we can see if they make more sense!

Each of these principles is easily contestable, and different statisticians/thinkers may wish to approach the problem differently. If I make my assumptions, biases, and axioms clearly visible, this should allow you to engage with my model fully, and to understand both the strengths and weaknesses of the Chaos Horizon project.

I’ll get into the details of the model over the next few days. If you’ve got any questions, let me know.

Hugo Prediction: The Indicators

The main purpose of my blog Chaos Horizon is to use mathematical modeling to predict the winners of the Hugo and Nebula awards. To do this, I use a Linear Opinion Pool constructed by data mining the last 15 years (since 2000) of award-winning data, as provided by excellent websites like SFADB.

The Hugo Formula (see the 2014 prediction here) uses 8 Indicators of Hugo success, each of which is weighted in turn. The percentage afterwards gives the basic reliability of the Indicator, with links to a fuller explanation of each indicator:

Indicator #1: Nominee has previously been nominated for a Hugo award. (78.6%)
Indicator #2: Nominee has previously been nominated for a Nebula award (prior to this year). (78.6%)
Indicator #3: Nominated novel is in the fantasy genre. (50%)
Indicator #4: The nominated novel wins one of the main Locus Awards categories. (57.1%)
Indicator #5: The nominated novel receives the most votes in the Goodreads Awards. (33%)
Indicator #6: Novel was the most reviewed on Amazon.com at the time of the Hugo nomination. (75%)
Indicator #7: Novel won a same year Nebula award. (85.6%)
Indicator #8: Novel received a same year Campbell nomination. (50%)

To generate these, I went through many possible interpretations of the available data. The Indicators are not perfect, nor are they intended to be. For them to be perfect, this would imply that the Hugo award is perfectly predictable—it is not. The pool of voters is too small, and too many outside factors can influence the awards.

Instead, building a model with multiple indicators like this allows us not to overstress any one factor, but rather to look at a fuller range of issues. Since the point of this model is to generate discussion and have fun, we want the math to be a little elastic to encompass the human element of prediction.
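A toy run of the whole pipeline might look like this. Everything about the two nominees is invented; only the three reliability figures come from the indicator list above.

```python
# Toy combination of three of the indicator reliabilities listed above
# for two invented nominees. All per-nominee probabilities are made up.

reliability = {
    "prior_hugo_nom": 0.786,    # Indicator #1
    "fantasy_genre": 0.50,      # Indicator #3
    "same_year_nebula": 0.856,  # Indicator #7
}

# Each indicator's "opinion": a probability distribution over the nominees.
opinions = {
    "prior_hugo_nom": {"Nominee A": 0.6, "Nominee B": 0.4},
    "fantasy_genre": {"Nominee A": 0.3, "Nominee B": 0.7},
    "same_year_nebula": {"Nominee A": 0.8, "Nominee B": 0.2},
}

total = sum(reliability.values())
weights = {name: r / total for name, r in reliability.items()}  # sum to 1

final = {
    nominee: sum(weights[name] * dist[nominee] for name, dist in opinions.items())
    for nominee in ("Nominee A", "Nominee B")
}

for nominee, p in final.items():
    print(f"{nominee}: {p:.1%}")  # the two percentages sum to 100%
```

Because each indicator's opinion is itself a distribution over the slate, the weighted combination is guaranteed to produce final percentages that also sum to 100%.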

2014 Hugo Prediction: Storm Clouds

Loncon (the convention awarding this year's Hugos) has announced a huge increase in Hugo voters for this year (from here):

London, 7 August 2014 – Loncon 3, the 72nd World Science Fiction Convention being held at London ExCeL from 14-18 August, is proud to announce that it received 3,587 valid ballots for the 2014 Hugo Awards. 3,571 ballots were submitted online through the Loncon 3 website and 16 paper ballots were received. This total eclipses the previous record participation of 2,100 ballots (set by Renovation in 2011) by over 50%. Participation in the 1939 Retro Hugo Award process was strong as well with 1,307 valid ballots being received: 1,295 submitted electronically and 12 by postal mail.

This substantial increase—by at least 50% over any previous Hugo—is going to severely compromise any statistical analysis of the Hugos. Remember, anyone can vote for the Hugo, as long as you register and pay the fee (somewhere in the range of $40). More voters = more passion = more unpredictable results.

So what is causing this surge in voters? There are a number of factors, and we won't know which is the most prominent until after the results come in:
1. There was a highly organized and vigorous campaign to nominate Robert Jordan’s The Wheel of Time. Jordan had never received a nomination before, and this time his whole series was nominated. How many people have joined just to vote for Wheel of Time?
2. Larry Correia ran a somewhat less organized and less vigorous campaign (a few posts on his blog) to nominate some more socially conservative SFF texts to the Hugo slate. While the “controversy” is complex, this campaign undeniably pushed some nominees onto the Hugo slate (including Correia himself), and at least some of those additional voters are coming solely to vote for said texts. How many?
3. In response to the Correia campaign, there have been clamors of outrage on the SFF left, who see such interventions in the slate as problematic. Why Correia would be faulted but the Wheel of Time fans praised is beyond me, but that isn't the point of this blog. The reaction to point #2 is going to cause some more liberal voters to register when they otherwise wouldn't have.

How will this affect the outcomes? No one knows.

2014 Hugo Prediction: Indicators #7 and #8

Back from a restful summer vacation–and the Hugo awards are just around the corner. The final two indicators are as follows:

Indicator #7: Novel won a same year Nebula award. (85.6%)
Indicator #8: Novel received a same year Campbell nomination. (50%)

The Nebula is the huge indicator here: the Nebula award is high profile and, most importantly, awarded well before the Hugo. This gives Hugo voters a great chance to read the Nebula winner and then vote for it. We've had quite a few novels in the last 15 years sweep their way to both awards. Although some years are open–the Nebula winner is not nominated for the Hugo–the Nebula winner has won 6 out of the last 7 times it's been eligible, with only last year's upset of Kim Stanley Robinson by John Scalzi to mess up the data.

The Campbell isn't as reliable, but it is a good indicator that the novel is on readers' radar. Expect very heavy weighting for the Nebula, with modest weighting for the Campbell.

2014 Hugo Prediction: Indicator #6

The Hugo is all about the book that’s most popular—so we have to find good indicators that reflect popularity. Unfortunately, winning a Hugo greatly increases the popularity of a book, so it’s hard to go back in time and find out how popular the book was before it won the award.

Right now, we can establish some more speculative indicators, based on Amazon ratings, and see if they become more reliable over time. The more people who have reviewed a book, the more people have read it, and thus the more people who can vote for it in the Hugo. Seems pretty straightforward. There's not much history here, so this category will be weighted relatively lightly.

This leaves us with:

Indicator #6: Novel was the most reviewed on Amazon.com at the time of the Hugo nomination. (75%)

So how about this year?

Wheel of Time 3,124 reviews
Ancillary Justice 232 reviews
Parasite 160 reviews
Neptune’s Brood 110 reviews
Warbound 109 reviews

This order echoes the Goodreads vote, except Correia and Stross swapped positions (by one vote, though). Once again, this shows the huge advantage Jordan has in terms of sales relative to the rest of the nominees.

2014 Hugo Prediction: Indicators #4 and #5

For the next part of our model, and just like the Nebula model, we’ll move on from awards history to critical and reader response.

Unlike with the Nebula, critical response isn't that important: fans vote for the novels they like, not the most "esteemed" novels. Some critical response is worked into same-year awards performance, but for Indicators #4 and #5 we'll focus on reader votes.

There are two reliable reader votes currently taking place: the Locus Awards and the Goodreads Choice Awards. The Locus Awards is the more established of the two. The readers of Locus Magazine vote in a variety of major categories (Science Fiction, Fantasy, Young Adult, First Novel). The statistics are pretty good here: 57.1% of the time, the eventual Hugo winner won one of the major categories. Often, both the first-place Fantasy novel and the first-place Science Fiction novel make the final slate, so you have a face-off among different category winners.

The Goodreads Choice Awards, voted on by the readers at the Goodreads website, hasn’t been around as long, and it isn’t showing the same reliability statistically. My hope is that as time passes, this becomes a more reliable Indicator. As of now, 33% of the time the book that received the most votes wins the Hugo, but that only gives us 3 years of data. As a consequence, this Indicator will be lightly weighted in the final rankings.

That leaves us with:

Indicator #4: The nominated novel wins one of the main Locus Awards categories. (57.1%)
Indicator #5: The nominated novel receives the most votes in the Goodreads Awards. (33%)

Where are we at this year? Well, the Locus Awards are usually given in late June. Finalists have been announced, and this year's nominees haven't done too well. Only Neptune's Brood is a finalist for Best SF Novel, and Ancillary Justice was nominated for First Novel.

The Goodreads vote has been held, and here's how our novels fared. I counted the votes for A Memory of Light (the last volume of The Wheel of Time) for Jordan/Sanderson.

Wheel of Time 28,470 votes
Ancillary Justice 3,815 votes
Parasite 3,431 votes
Warbound 1,509 votes
Neptune’s Brood 1,144 votes

This is the first category where the popularity of Wheel of Time shines through. The gap between A Memory of Light and Ancillary Justice is enormous, and may factor into the final Hugo vote.

2014 Hugo Prediction: Indicator #3

While the Nebula award showed a clear bias towards science fiction novels, the Hugo actually shows the opposite. While almost 70% of the nominees are science fiction novels, fantasy novels win 50% of the time. While 50% may not seem like much of a statistical advantage, it’s the 70%/30% nominee split that gives fantasy novels a statistical boost.

On a practical level, this makes sense: there are dedicated fantasy and science fiction voting blocs within the Hugo voters. Few readers are equally passionate about both genres, and since the fantasy nominee pool is smaller, fantasy voters tend to boost those nominees.

So, this works out to:
Indicator #3: Nominated novel is in the fantasy genre. (50%)
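The arithmetic behind that boost can be sketched quickly. This is only a rough illustration, assuming the 70/30 nominee split holds as a long-run average:

```python
# Why a 50% fantasy win rate is a real boost when only ~30% of
# nominees are fantasy: compare wins per unit of nominee share.
sf_share, fantasy_share = 0.70, 0.30
sf_win_rate, fantasy_win_rate = 0.50, 0.50

per_sf = sf_win_rate / sf_share          # wins per unit of SF nominee share
per_fantasy = fantasy_win_rate / fantasy_share
print(round(per_fantasy / per_sf, 2))    # fantasy punches ~2.33x above its weight
```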

3 of this year's nominees are best described as science fiction: Ancillary Justice, Parasite, and Neptune's Brood. Warbound looks like a detective/fantasy hybrid, leaving Jordan's and Sanderson's The Wheel of Time as the go-to choice for fantasy readers.

2014 Hugo Prediction: Indicators #1 and #2

Like the Nebula Award prediction model, the Hugo Award prediction uses data from previous Hugo winners and nominees since 2000 to find mathematical trends. Most of this data is mined from the excellent Science Fiction Awards Database, as well as other sources like Amazon and Goodreads.

Much like the Nebula, the Hugo Award shows a bias towards previous award winners, although this bias is much less pronounced than the Nebula's. While the Nebula constantly goes to past winners and the most honored nominees, the Hugo is a very different award. Past winners don't show any statistical advantage, nor does a handful of prior nominations seem to help much. Charles Stross, for instance, has been nominated for 7 Best Novel Hugos (and 15 total Hugos, with 2 wins for short fiction), and has never won for Best Novel. The Hugos, unlike the Nebulas, are also not prone to giving lifetime achievement awards (well, unless Jordan gets one this year). Past winners of the Hugo award are just as often passed over as not.

What does seem statistically valid, though, is being known in the field. The Hugo rarely goes to a brand new nominee, with this only happening in 3 of the previous 13 years for both the Hugo and the Nebula. Note: this does not factor in a same year Nebula nomination or win; that’ll be factored in later. So this leads to our first two indicators:
Indicator #1: Nominee has previously been nominated for a Hugo award. (78.6%)
Indicator #2: Nominee has previously been nominated for a Nebula award (prior to this year). (78.6%)

So how do this year’s nominees fare?

[Table: Hugo Indicators #1 and #2 for this year's nominees]

As you can see, this isn’t a group that has received a lot of prior awards consideration. Leckie’s profile has certainly improved in the last 6 months, winning a Nebula this year. That’s going to give her a huge boost in a later indicator. Jordan’s lack of award nominations may be surprising, and this says something negative about the support for Jordan’s work in the Hugo/Nebula realm. Given this indicator alone, Stross would leap to the front, although these two Indicators are going to be given relatively little weight in the final formula.
