Building the Nebula Model, Part 3
Now that we have 12 different indicators, how do I combine them? This is where the theory gets sticky: how seriously do I want to treat each indicator? Am I trying to find correlations between them? Do I want to pick one “main” indicator as my base, and then refine that through some recursive statistical process? Do I treat each indicator as independent, or are some dependent on each other? Do I treat them as opinions or facts? How complicated do I allow the math to be, given the low N we have concerning the Nebulas?
I thought about this, and read some math articles and scoured the internet, and I decided to use an interesting statistical tool: the Linear Opinion Pool. Under this model, I treat my data mining results as opinions, and combine them together using a Reliability Factor, to get a weighted combined percentage score. This keeps us from taking the data mining results too seriously, and it allows us to weigh a great number of factors without letting one of them dominate.
Remember, one of my goals on Chaos Horizon is to keep the math transparent (at a high school level). I want everyone who follows Chaos Horizon to be able to understand and explain how the math works; if they can’t, the model becomes a sort of mysterious black box that lends the statistics an air of credibility and mystery that I don’t want.
The linear opinion pool is a weighted arithmetic average of the experts’ probability distributions. If we let $F_i(x)$ denote expert $i$’s probability distribution for an uncertain variable of interest ($X$), then the linear opinion pool $F_c(x)$ that results from combining $k$ experts is:
$$F_c(x) = \sum_{i=1}^{k} w_i F_i(x)$$
where the weight assigned to $F_i(x)$ is $w_i$, and $\sum_{i=1}^{k} w_i = 1$.
Although the linear opinion pool is a popular and intuitive combination method with many useful properties, there is no method for assigning weights that is derived entirely from first principles. One can, however, interpret the weights in a variety of ways, and each interpretation lends itself to a particular way to calculate the weights.
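For anyone who prefers code to notation, here is a minimal Python sketch of that definition. It is only an illustration: the function name `linear_opinion_pool` and the one-dictionary-per-indicator layout are assumptions made for this example, and it assumes every indicator gives a probability for the same set of candidates.

```python
def linear_opinion_pool(opinions, weights):
    """Weighted average of several probability distributions.

    opinions: list of dicts, one per indicator, mapping candidate -> probability.
    weights:  list of non-negative numbers, one per indicator.
    (Both the name and the data layout are hypothetical, for illustration only.)
    """
    if len(opinions) != len(weights):
        raise ValueError("need exactly one weight per opinion")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Rescale so the weights sum to 1, as the definition requires.
    scaled = [w / total for w in weights]
    candidates = opinions[0].keys()
    # Fc(x) = sum over i of wi * Fi(x), evaluated for each candidate x.
    return {
        c: sum(w * opinion[c] for w, opinion in zip(scaled, opinions))
        for c in candidates
    }
```

Because the weights are rescaled to sum to 1, the pooled numbers still add up to 100% whenever each individual opinion does.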
This is a model often used in risk analysis, where you have a number of competing opinions about what is risky, and you want to combine those opinions to find any possible overlap (while also covering your ass from any liability). There’s plenty of literature on the subject; just google “Linear Opinion Pool” for more reading.
We have the probability distributions from my data mining. What weights do I use? That’s always the challenge in a Linear Opinion Pool. For Chaos Horizon, I’ve been weighting by how often that Indicator has actually chosen the Nebula in the past. So, if you used that Indicator and that Indicator alone to guess, how often would you actually be right? Not every Indicator comes into play every year, and sometimes an Indicator doesn’t help (like if all the nominated novels previously had Nebula nominations). We’ll be looking at all that data late in April.
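One way to turn that kind of track record into weights can be sketched in a few lines of Python. This is a sketch under stated assumptions, not necessarily the exact calculation used on Chaos Horizon: the function name `weights_from_track_record` and the `(years_correct, years_applicable)` layout are hypothetical, and an indicator that has never come into play is simply left out.

```python
def weights_from_track_record(record):
    """Turn past hit rates into weights that sum to 1.

    record maps indicator name -> (years_correct, years_applicable).
    (Hypothetical layout; the real bookkeeping may differ.)
    """
    hit_rates = {
        name: correct / applicable
        for name, (correct, applicable) in record.items()
        if applicable > 0  # skip indicators that never came into play
    }
    total = sum(hit_rates.values())
    if total == 0:
        raise ValueError("no indicator has ever been right; cannot build weights")
    # Divide each hit rate by the total so the weights sum to 1.
    return {name: rate / total for name, rate in hit_rates.items()}
```

An indicator that was right 6 times out of 8 would get three times the weight of one that was right 3 times out of 12, since their hit rates are 75% and 25%.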
Now, on to my mathematical challenge: can I explain this in easy-to-understand terms?
A Linear Opinion Pool works this way: you walk into a bar and everyone is talking about the Nebula awards. You wander around, and people shout out various odds at you: “3 out of 4 times a prior Nebula nominee wins” or “70% of the time a science fiction novel wins” and so forth. Your head is spinning from so much information; you don’t know who to trust. Maybe some of those guesses overlap, maybe some don’t. All of them seem like experts, but how expert?
Instead of getting drunk and giving up, you decide to sum up all the opinions. You figure, “Hell, I’ll just add all those probabilities up, and then divide by the total number of suggestions.” Then you begin to have second thoughts: that guy over in the corner is really wasted and doesn’t seem to know what he’s talking about. You sidle over and ask his friend: how often has that guy been right in the past? He says 5% of the time, but that guy over there (the one drinking gin and tonics) is right 50% of the time. So you figure you’d better weight each opinion by how often its source has been right in the past. You multiply each opinion by its weight, add them up, divide by the sum of the weights, and voilà, you’ve got your prediction.
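Here is the bar story as a few lines of Python, to show how much the weighting can matter. The two reliabilities (5% and 50%) come from the story; the odds each patron quotes are invented purely for illustration.

```python
# Hypothetical numbers: the two reliabilities (5% and 50%) come from the story;
# the odds each patron quotes for one book are invented for illustration.
quoted_odds = {"guy_in_corner": 0.90, "gin_and_tonic_drinker": 0.30}
past_accuracy = {"guy_in_corner": 0.05, "gin_and_tonic_drinker": 0.50}

# First instinct: add everything up and divide by the number of opinions.
simple_average = sum(quoted_odds.values()) / len(quoted_odds)

# Second thought: weight each opinion by how often its source has been right,
# dividing by the total so the weights sum to 1.
total_accuracy = sum(past_accuracy.values())
weighted_average = sum(
    quoted_odds[name] * past_accuracy[name] / total_accuracy
    for name in quoted_odds
)

print(f"simple average:   {simple_average:.2f}")
print(f"weighted average: {weighted_average:.2f}")
```

With these made-up numbers the simple average comes out around 0.60, while the weighted version drops to roughly 0.35, because the wasted guy’s confident 90% barely counts.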
Advantages of the Linear Opinion Pool:
It’s mathematically easy to calculate; no fancy software needed.
This allows me to add more indicators (opinions) very easily.
This treats my data mining work as an “opinion,” not a fact, which I think is closer to reality.
The weighting component allows me to dial up or dial down indicators easily.
The simple mathematics reflects the relatively low amount of data.
The methodology is easy for readers to follow.
Disadvantages of the Linear Opinion Pool:
It’s not as mathematically rigorous as other statistical models.
The weighting component introduces a human element into the model which may be unreliable.
Because this treats my data mining results as “opinions,” not “facts,” it may compromise the reliability of the model for some readers.
Because it is simple, it lacks the flashiness and impressiveness of grander statistical models.
When we’re dealing with statistical modeling, the true test is the results. A rigorous model that is wrong all the time is worse than a problematic model that is right all the time. In my next post, we’ll talk about past accuracy. Here are my older posts on the Linear Opinion Pool and weighting if you want some more info.
As a last note, let me say that following the way the model is constructed is probably more interesting and valuable than the final results. It’s the act of thinking through how different factors might fit together that is truly valuable. Process, not results.