Amazon vs. Goodreads Reviews
Time to get technical! Break out the warm milk and sleeping pills! Earlier in the week, I took a look at Amazon and Goodreads ratings of the major Hugo candidates. An astute reader will notice that those rankings don’t exactly line up (nor do the number of ratings, but I’ll address that in a later post). Which is more representative? Which is more accurate?
At Chaos Horizon, I strive to do two things: 1. To be neutral, and 2. Not to lie. By “not lie,” I mean I don’t try to exaggerate the importance of any one statistical measure, or to inflate the reliability of what is very unreliable data. So here’s the truth: neither the Goodreads nor the Amazon ratings are accurate. Both are samples of biased reading populations. Amazon over-samples the Amazon.com user base, pushing us towards people who like to read e-books or who order their books online (would those disproportionately be SFF fans?). Goodreads is demographically biased towards a younger audience (again, would those disproportionately be SFF fans? Worldcon voters?). Stay tuned for more of these demographic issues in my next post.
As such, neither gives a complete or reliable picture of the public reaction to a book. If you follow Chaos Horizon, you’ll know that my methodology is often to gather multiple viewpoints/data sets and then to try to balance them against each other. I’ve never believed the Hugo or Nebula solely reflects quality (which reader ratings don’t even begin to quantify). At the minimum, the awards reflect quality, reader reception, critical reception, reputation, marketing, popularity, campaigns, and past voting trends/biases. The Chaos Horizon hypothesis has always been that when a single book excels in four or five of those areas, it can be thought of as a major candidate.
Still, can we learn anything? Let’s take a look at the data and the differences in review scores for 25 Hugo contenders:
The column on the far right is the most interesting one: it represents the difference between the Amazon score and the Goodreads score. As you can see, these are all over the place. There are some general trends we can note:
1. Almost everyone does better on Amazon than Goodreads. The average Amazon boost is .19 stars, and only three books out of 25 scored worse on Amazon than on Goodreads. Amazon has a higher barrier to entry for rating (you have to type a review, even if it’s one word; Goodreads lets you just enter a score), so I think more people come to Amazon only if they love/hate a novel.
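The arithmetic behind that “Amazon boost” is simple, but for the curious, here’s a minimal sketch of how I compute it. The titles and scores below are invented for illustration; they are not the actual numbers from my table:

```python
# Hypothetical example: computing the per-book "Amazon boost" and its
# average. All scores here are invented, not the real 2015 data.
books = {
    "Book A": {"amazon": 4.5, "goodreads": 4.2},
    "Book B": {"amazon": 4.1, "goodreads": 4.0},
    "Book C": {"amazon": 3.8, "goodreads": 3.9},  # one of the rare negative cases
}

# Amazon score minus Goodreads score for each title.
diffs = {title: s["amazon"] - s["goodreads"] for title, s in books.items()}
avg_boost = sum(diffs.values()) / len(diffs)

print(diffs)
print(f"Average Amazon boost: {avg_boost:+.2f} stars")
```

With the real 25-book table, the same calculation yields the +.19-star average mentioned above.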
2. There doesn’t seem to be much of a pattern regarding who gets a bigger bump. Moving down the list, you see a debut SF novel, a hard SF novel, an urban fantasy novel, a YA novel, etc. It’s a mix of genres, of men and women, of total number of ratings, and of left-leaning and right-leaning authors. I’d have trouble coming up with a cause for the bumps (or lack thereof), and I don’t think I could come up with a formula to predict the size of a given book’s bump. I’ll note that since Amazon bought Goodreads, the audiences may be converging; maybe in a few years there won’t be a bump at all.
3. If you want to use either ranking, you’d have to think long and hard about what audience each is reflecting, and what you’d want to learn from that audience’s reaction. It would take a major effort to correct the Amazon.com or Goodreads audience out to the general reading audience, and I’m not sure that effort would be worth it. Each would require substantial demographic corrections, built on so many assumptions that you’d wind up with a statistic just as unreliable as the raw Goodreads or Amazon numbers.
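To make point #3 concrete, here’s a sketch of what such a demographic correction (post-stratification) would look like. Every number below is invented: we’d need to know each platform’s demographic breakdown, the general reading public’s breakdown, and each group’s average rating, and none of that data is actually available, which is exactly the problem:

```python
# Hypothetical post-stratification sketch. All shares and means are
# invented assumptions, purely to show the mechanics of the correction.
platform_share = {"under_30": 0.50, "30_to_50": 0.35, "over_50": 0.15}
public_share   = {"under_30": 0.25, "30_to_50": 0.40, "over_50": 0.35}

# Assumed average star rating each age group gives on the platform.
group_mean = {"under_30": 4.3, "30_to_50": 4.0, "over_50": 3.8}

# Raw platform average: weighted by who actually uses the platform.
raw = sum(platform_share[g] * group_mean[g] for g in group_mean)

# "Corrected" average: reweighted to mirror the general reading public.
corrected = sum(public_share[g] * group_mean[g] for g in group_mean)

print(f"raw platform average: {raw:.2f}")
print(f"demographically corrected: {corrected:.2f}")
```

Notice that the correction only moves the number if you trust the assumed shares and group means; garbage assumptions in, garbage statistic out.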
I think “Reader Ratings” are one of the most tantalizing pieces of data we have—but also one of the least predictive. I’m not sure Amazon or Goodreads tells you anything except how the users of Amazon or Goodreads rated a book. So what does this mean for Chaos Horizon, a website dedicated to building predictive models for the Hugo and Nebula Awards?
That reader ratings are not likely to be useful in predicting awards. Long and short of it, Amazon and Goodreads sample different reading populations, and, as such, neither is fully representative of:
1. The total reading public
2. SFF fans
3. Worldcon Voters
4. SFWA voters
Neither is 100% (or even 75% . . . or 50%) reliable in predicting the Hugo and Nebula awards. So is it worth collecting the data? I’m still hopeful that once we have this data and the Hugo + Nebula slates (and eventually winners), I can start combing through it more carefully to see whether any correlations emerge. For now, though, we have to reach an unsatisfying statistical conclusion: we cannot interpret Amazon or Goodreads ratings as predictive of the Hugos or Nebulas.
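Once the winners are known, the correlation check I have in mind could be as simple as comparing each nominee’s rating against a 0/1 “won the award” outcome. A minimal sketch, with invented nominees and an invented outcome:

```python
# Hypothetical sketch of the future correlation check. The five
# ratings and the winner flag are invented, not real award data.
ratings = [4.3, 4.1, 3.9, 4.4, 4.0]   # e.g. Goodreads scores of five nominees
won     = [0,   0,   0,   1,   0]     # 1 = won the award

def pearson(xs, ys):
    """Plain Pearson correlation; with a 0/1 variable this is the
    point-biserial correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson(ratings, won)
print(f"correlation between rating and winning: {r:.2f}")
```

Of course, with only five or six nominees per year, any single-year correlation will be statistically meaningless; it would take several years of slates and winners before a pattern (or the lack of one) became credible.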