It’s been a busy month, but Chaos Horizon is slowly returning to its normal work: tracking various data sets regarding the Nebula and Hugo awards. Today, let’s take a look at where the 5 Hugo Best Novel nominees stand in terms of # of Goodreads ratings, # of Amazon ratings, and average rating score.
So far, I’ve not been able to find a clear (or really any) correlation between this data and the eventual winner of the Hugo award. In my investigations of this data—see here, here, and here—I’ve been frustrated with how differently Amazon, Goodreads, and Bookscan treat individual books. It’s also worth noting that I don’t think Amazon or Goodreads measure some abstract idea of “quality,” but rather a more nebulous and subjective concept of “reader satisfaction.” You definitely see that in something like the Butcher book: since it’s #15 in a series, everyone who doesn’t like Butcher gave up long ago. All you have left are fans, who are prone to ranking Butcher highly.
As a final note, Jason Sanford leaked the Bookscan numbers for the Hugo nominees in early April. Check those out to see how Bookscan reports this data.
On to the data! Remember, these are the 2015 Hugo Best Novel nominees:
Skin Game, Jim Butcher
Ancillary Sword, Ann Leckie
The Goblin Emperor, Katherine Addison
The Three-Body Problem, Cixin Liu
The Dark Between the Stars, Kevin J. Anderson
Number of Goodreads Ratings for the Best Novel Hugo Nominees, May 2015
This chart gives you how many readers on Goodreads have rated each book; that’s a rough measure of popularity, at least for the self-selected Goodreads audience.
Goodreads shows Skin Game as having a massive advantage in popularity, with almost 5 times as many ratings as Leckie’s book. Given that Skin Game is #15 in the series, that’s an impressive retention of readers. Of course, any popularity advantage for Butcher has to be weighed against the pro- and anti-Sad/Rabid Puppy effect. Also don’t neglect the difficulty that Hugo voters will have in jumping into #15 of a series.
While Liu is still running behind Addison and Leckie, keep in mind that Liu’s book came out a full seven months after Addison’s book and a month after Leckie’s. Still, the Hugo doesn’t adjust for things like that: your total number of readers is your total number of readers. That’s why releasing your book in November can put you at a disadvantage in these awards. Still, Liu picked up a huge % of readers this month; if that momentum keeps up, that speaks well for his chances. Anderson’s number is very low compared to the others; that’s probably a mix of Anderson selling fewer copies and Anderson’s readers not using Goodreads.
Switching to Amazon numbers:
Number of Amazon Ratings for the Best Novel Hugo Nominees, May 2015
I don’t have as much data here because I haven’t been collecting it as long. I foolishly hoped that Goodreads data would work all by itself . . . it didn’t. Butcher’s Amazon advantage is even larger than his Goodreads advantage, and Liu leaps all the way from 4th place in Goodreads data to second place in Amazon data. This shows the different ways that Goodreads and Amazon track the field: Goodreads tracks a younger, more female audience (check the QuantCast data), while Amazon has historically slanted older and more gender-neutral. Your guess is as good as mine as to which audience is more predictive of the eventual Hugo outcome.
Lastly, the rankings themselves:
Let me emphasize again that these scores have never been predictive for the Hugo or Nebula: getting rated higher on Amazon or Goodreads has not equated to winning the Hugo. It’s interesting that the Puppy picks are the outliers on Goodreads, scoring both higher and lower, with Leckie/Addison/Liu all within .05 points of each other. Amazon tends to be more generous with scoring, although Butcher’s 4.8 is very high.
The 2015 Hugo year is going to be largely useless when it comes to data: the unusual circumstances that led to this ballot (the Sad and Rabid Puppy campaigns, then various authors declining Best Novel nominations, and now the massive surge in voting numbers) mean that this data is going to be inconsistent with previous years. I think it’s still interesting to look at, but take all of this with four or five teaspoons of salt. Still, I’ll be checking in on these numbers every month until the awards are given, and it’ll be interesting to see what changes happen.
It’s the last of the month, so time to update my popularity charts. Now that we have the Nebula slate, I’m debuting a new chart:
Nicholas Whyte over on From the Heart of Europe has been tracking similar data for several years now, although he uses LibraryThing instead of Amazon. He’s got data for a few different awards going several years back. Like me, he’s noted that popularity on these lists is not a great indicator of winning. A few weeks ago (here and here) I took a close look at how Goodreads numbers track with Amazon and BookScan. The news was disappointing: the numbers aren’t closely correlated. Goodreads tracks one audience, Amazon another, and BookScan a third. The ratio between Amazon ratings and Goodreads ratings can be substantial. Goodreads tends to overtrack younger-skewing (under 40), more internet-buzzed-about books. You can see how Amazon shows McDevitt, Liu, Addison, and Leckie to be at about the same level of popularity, whereas Goodreads has Leckie 10x more popular than McDevitt. Whom do we trust?
The real question is not whom we trust, but how closely the Goodreads audience correlates to either the SFWA or WorldCon voters. It’s hard to imagine a book from the bottom of the chart winning over more popular texts, but McDevitt has won in the past, and I don’t think he was that much more popular in 2007 than in 2015. I think the chart is most useful when we compare like to like: if Annihilation and Ancillary Sword are selling to somewhat similar audiences, VanderMeer has gotten more books out than Leckie. Hence, VanderMeer probably has an advantage. I’m currently not using these numbers to predict the Nebulas or Hugos, although I’d like to find a way to do so.
Now, on to the big chart. Here’s popularity and reader change for Goodreads for 25+ Hugo contenders, with Gannon and McDevitt freshly added:
One fascinating thing: no one swapped positions this month. At the very least, Goodreads is showing some month-to-month consistency. Weir continues to lap the field. Mandel did great in February, but that didn’t translate to a Nebula nomination: momentum on these charts doesn’t seem to be a good indicator of Nebula success. I’ll admit I thought Mandel’s success on Goodreads was going to translate to a Nebula nomination. Instead, it was Cixin Liu, much more modestly placed on the chart, who grabbed the nomination. Likewise, City of Stairs was doing better than The Goblin Emperor, but it was Addison who got the nod. At least in this regard, critical reception seemed to matter more than this kind of popularity.
Remember, Chaos Horizon is very speculative in this first year: what tracks the award? What doesn’t? I don’t know yet, and I’ve been following different types of data to see what pans out.
Interestingly, McDevitt and Gannon debut at the dead bottom of the chart. That’s one reason I didn’t have them in my Nebula predictions. That’s my mistake; I need to diversify my tracking better by also looking at Amazon ratings. I’ll be doing that for 2015, and the balance of Amazon and Goodreads stats might give us better insight into the field.
As always, if you want to look at the full data (which goes back to October 2014), here it is: Hugo Metrics.
Over the past few posts, I’ve been looking at the correlation between Amazon data, Goodreads data, and the mythical “actual books sold” data that we don’t have. It would be nice if either Amazon or Goodreads correlated with that data, because then we’d be able to get a good estimate of actual sales.
Unfortunately, it appears that both the Goodreads and the Amazon data are demographically unreliable. That makes a certain amount of sense: websites cater to very specific audiences, and specific audiences don’t reflect the general reading public. Goodreads seems to lean very young (under 40) and, according to Quantcast, has around a 70/30 female/male demographic. Amazon seems more neutral in terms of gender, but leans older (over 40) and wealthier (i.e. people who have enough money to buy lots of books online).
I’ve got one more set of data to present to you: for the past 5 months, I’ve been collecting data points that would compare Goodreads to BookScan numbers. BookScan is a point-of-sale recording service; instead of estimating from sampling, they actually try to count how many books are sold by different venues. They claim to cover some 80-90% of the market, although that’s probably inflated. Here’s a good article from Forbes that can serve as an introduction. I’ve heard plenty of authors say that BookScan grabs less than 25% of their sales, and I think the more you sell through untraditional means (at cons, through small bookstores, etc.), the worse the BookScan numbers are.
Most BookScan data is locked behind a huge paywall—but Publishers Weekly prints weekly hardcover bestseller lists on their website. They try to make it difficult (they only print the # of books sold this year), and they don’t include e-books. Still, if we were to take that data and compare it to Goodreads data . . . we’d have something.
This is exactly what I’ve done. Since early November, I’ve been tracking (weekly) any SFF book (broadly defined, and also including horror) that has shown up on the Top Hardcover Fiction list. I’ve then been comparing that to the number of Goodreads ratings for that week to see if there’s a sensible correlation between the two numbers.
While this isn’t perfect—some authors may sell a higher proportion of e-books than others—we’ll at least have a rough look at “actual physical” sales versus Goodreads ratings.
Only six SFF novels showed up on the Hardcover Bestseller list in the last 4 months. I put down the publication date because more people buy a book right when it comes out than read it right when it comes out; a long book like Revival might take people a month to read, so it may take a while for Goodreads ratings to catch up with sales. “Last Data” is the data for the week when the book fell off the chart; Gibson fell off quickly (two weeks), while Rothfuss stuck around longer. King is still going strong. The Hardcover column is the total number of hardcover sales as given by Publishers Weekly for that week; the Goodreads column is the number of Goodreads ratings from that same week.
Lastly, we have the interesting column: the ratio of Hardcover sales/Goodreads ratings. In an ideal world, that would be close to the same number for everyone.
It is not. Goodreads is tracking Rothfuss and Mandel in a totally different way than it is King, Rice, or Koontz. Perhaps this is because Mandel and Rothfuss are selling more to Goodreads’ specific audience (younger, female). Perhaps this is because the books are shorter. Perhaps this is because King and Rice sell more in places like Wal-Mart or Target, whose shoppers aren’t using Goodreads. Perhaps King and Rice are selling primarily to older readers, who are less inclined to use internet websites to record their reading habits.
In a complex statistical case like this, it probably comes down to a multitude of factors. With just 6 books, we don’t have enough data to hash that out. What we can say is that someone like Mandel is overperforming (compared to the average) on Goodreads in an enormous way. Even if we assume that a young author like Mandel might have a 50/50 hardcover/e-book split in her sales (thus meaning she’s sold around 120,000 copies of Station Eleven, which seems reasonable), almost 25% of her total readers rated the book on Goodreads. That is astonishingly high. In contrast, King (who has probably tripled or quadrupled Mandel’s sales) only has about 5% of his readers rating on Goodreads.
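The back-of-envelope math above can be sketched out in a few lines. All the numbers here are illustrative estimates from the post (the assumed 50/50 hardcover/e-book split, the rough 3x sales multiplier for King), not measured data:

```python
# Rough check of the rating-rate gap described above.
# All figures are illustrative estimates, not measured data.

def rating_rate(goodreads_ratings, estimated_total_sales):
    """Fraction of estimated total readers who rated the book on Goodreads."""
    return goodreads_ratings / estimated_total_sales

# Mandel: ~60,000 hardcovers per Publishers Weekly; assume a 50/50
# hardcover/e-book split -> ~120,000 total copies sold.
mandel_sales = 60_000 * 2
mandel_goodreads = 30_000          # roughly 25% of 120,000, per the post

# King: assume roughly 3x Mandel's sales, with ~5% rating on Goodreads.
king_sales = mandel_sales * 3
king_goodreads = int(king_sales * 0.05)

mandel_rate = rating_rate(mandel_goodreads, mandel_sales)   # 0.25
king_rate = rating_rate(king_goodreads, king_sales)         # 0.05

print(f"Mandel: {mandel_rate:.0%}, King: {king_rate:.0%}")
# prints "Mandel: 25%, King: 5%"
```

Even under generous assumptions for King’s total sales, the gap in rating rates is about 5x, which is the core of the problem.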
That’s an enormous gap, and it reinforces what we learned in the last post: Goodreads is not a reliable indicator of total readers. It’s tracking Mandel and King in totally different fashions, and comparing Mandel to King via Goodreads makes Mandel seem more popular than she is and King less popular.
That doesn’t mean Goodreads is useless: it just means that it tracks a specific demographic. Whether that demographic is more in touch with the Hugo/Nebula awards is an open question.
One last chart for true stat geeks: let’s see what’s happened to the Hardcover/Goodreads ratio over time. Not a ton of data here, as only 4 SFF books had a decent run on the Bestseller chart. Here it is:
You can see that Rice and King have reasonably shaped curves, converging to around 15 in King’s case and about 30 in Rice’s. Mandel and Rothfuss have basically straight lines: they were popular on Goodreads to start, and that hasn’t changed at all. That reinforces my last point: Goodreads treats King and Rice fundamentally differently than Mandel or Rothfuss.
With enough time and data—which we don’t have—we might be able to get a better sense of why books are tracked in different ways. Perhaps it would be a simple demographic correction (authors over 40 have this kind of ratio, authors under 40 have this kind of ratio). However, since Publisher’s Weekly doesn’t share enough data, we’re stuck. So be careful when looking at Goodreads numbers; they reflect a young audience, and are misleading when making comparisons between a Mandel and a Gibson.
I won’t lie: I’m a little disappointed that the Goodreads data isn’t more reliable. Given the large sample size, I’d hoped that Goodreads would flatten out any demographic bias. It doesn’t appear to, so any Goodreads numbers should be approached with healthy skepticism.
Next up for Chaos Horizon: start collecting Amazon data to see if that’s a better match to Publisher’s Weekly. Check back in a couple months, and we’ll see if that data lines up any better!
Time for another of my “boring but important” mathematical posts. One thing I’d like to know—and I think many other SFF observers would as well—is how many copies SFF books actually sell. For many entertainment industries, this kind of information is readily available. There are great sites like Box Office Mojo for films or TV by the Numbers for TV. Both are free, well-designed, and easy to use.
But the book industry? They either don’t offer the information or, if they do, lock it behind paywalls. BookScan purports to track a fair portion of the field, but they have exorbitant rates (here’s an enrollment form asking $2,000 for a membership through Publishers Marketplace) and also have strict terms of service that would prevent anyone from broadly sharing that data in public. Other sites like BookStats (an annual survey of publishers) offer equally steep rates (which begin at an eye-popping $2,995 and only go up!).
What does all of this mean? That we don’t have access to free, reliable sales data for the book industry. I think this is a huge mistake on the part of the book industry; freely sharing data hasn’t hurt the movie industry, and it lets viewers hotly debate their favorite movies and the tricky relationship between sales and quality. If people are talking about your industry, they’re involved in it—and likely growing it. The more readers are locked out of conversations about books, the more likely they are to drift over to other industries that allow for fuller participation. People like numbers, and charts, and debating; they like to see how their favorite movie or TV show or book is selling.
Sadly, transparency has never been a feature of the book industry. That means a site like Chaos Horizon is left to patch together popularity estimates through frustratingly inadequate and inaccurate techniques. Since we don’t have point-of-sale tracking, we’re left turning to a site like Amazon or Goodreads, which samples the reading public through the number of reader reviews. Since both those websites are fairly big, you can argue that they sample a large enough portion of the population to be statistically meaningful. If a book sells 10,000 copies, and 1,000 people rate it on Goodreads, that’s a pretty solid 10%. Amazon tends to sample at a lower rate, so they might only grab 1% of the total readership. Still, that’s better than nothing . . .
If there wasn’t bias built into the Goodreads and Amazon user bases, that is. I’m using bias in a purely statistical sense here, to indicate a demographic skew away from the norm. So, let’s say 50% of readers are men and 50% are women. To make a good sample, you’d have to be sure to have 50% men and 50% women in your sample (or you could correct your sample after the fact, if you like fancy math). Simple enough? The same goes for age, income level, etc., all the basic demographic categories.
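The “correct your sample after the fact” idea can be sketched with post-stratification weights: each group gets weighted by population share divided by sample share. The 71/29 split is the Quantcast figure cited below; the 50/50 target and the per-group averages are made-up numbers just to show the mechanics:

```python
# Minimal sketch of post-stratification: reweight a 71/29 female/male
# sample toward an assumed 50/50 reading population.
# The group averages below are hypothetical, chosen for illustration.

population = {"female": 0.50, "male": 0.50}   # assumed true split
sample     = {"female": 0.71, "male": 0.29}   # Goodreads, per Quantcast

# Each group's weight = population share / sample share.
weights = {g: population[g] / sample[g] for g in population}

# Hypothetical example: women rate a book 4.0 on average, men 3.6.
group_means = {"female": 4.0, "male": 3.6}

raw_mean = sum(sample[g] * group_means[g] for g in sample)
adjusted_mean = sum(sample[g] * weights[g] * group_means[g] for g in sample)

print(f"raw {raw_mean:.3f} -> adjusted {adjusted_mean:.3f}")
# prints "raw 3.884 -> adjusted 3.800"
```

With these made-up inputs, the raw average overweights the group the site oversamples; the corrected average lands at the simple 50/50 blend. Of course, this only works if you know each rater’s demographic group, which neither site exposes.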
If you take a look at the demographic information for Goodreads, you’ll see that it skews pretty substantially. If you head over to Quantcast, a web demographics site (why is this free when I can’t look at book sales numbers?), they report Goodreads as having a 71% women / 29% men visitor ratio. That’ll definitely skew the data. All this info is at the bottom of the Quantcast page; if you click through it, you’ll see that Goodreads skews towards women, towards people aged 18-34, and towards people with either undergraduate or graduate education. All of that makes a certain amount of sense: younger people are more likely to use social media, and avid book readers are probably more likely to be college educated. This means, though, that every bit of Goodreads data is going to be biased towards certain audience tastes.
Amazon’s demographic bias is harder to find. They’ve opted out of Quantcast, but I’ve read several studies suggesting Amazon is biased towards older, high-income, and highly educated users. CBS News echoes all of that, and also reports Amazon as gender-neutral. That’s from 2010 (when Quantcast data was still available for Amazon); I don’t know if it has changed since. Still, that means the demographics of Amazon and Goodreads are hopelessly mismatched: young versus old, average income versus high income, 70/30 gender versus 50/50. At least they converge on education level!
How much does bias like that matter in practice? An enormous amount. Let’s look at a comparison of # of Goodreads ratings to # of Amazon ratings for the 2015 Hugo contenders:
Take a look at the far right column: that’s the ratio of Goodreads/Amazon ratings. That range is what we’d call in statistical terms “a hot mess.” You range from Beukes with a 60x multiplier down to Gibson with a paltry 8x. Even if you toss out Beukes as an outlier, it would appear that some writers get rated on Goodreads (relative to Amazon) at a rate 4 times higher than others. What’s happening?
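The ratio column is simple division, but it’s worth seeing how quickly the spread appears. The rating counts below are made-up stand-ins (not the actual figures from my table), chosen only to reproduce the 60x and 8x extremes described above:

```python
# Illustrative Goodreads/Amazon ratio calculation.
# Rating counts are hypothetical stand-ins that match the post's
# described extremes (~60x for Beukes, ~8x for Gibson).

books = {
    "Beukes": {"goodreads": 30_000, "amazon": 500},
    "Gibson": {"goodreads": 8_000,  "amazon": 1_000},
}

ratios = {name: d["goodreads"] / d["amazon"] for name, d in books.items()}

for name, r in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {r:.0f}x")
# prints:
# Beukes: 60x
# Gibson: 8x
```

A spread like this means the two sites can’t be compared head-to-head without some kind of correction factor.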
This is demographic bias at work. Goodreads favors certain books and disfavors others (in sampling terms). Since the Goodreads readership is younger and more female, a book by Gibson (older, male) shows up much lower on the list. Books that appeal to the Goodreads demographic (presumably female-friendly books that market/cater to a slightly younger audience) do very well on Goodreads. Books catering to an older or more male audience tend to do worse, at a rate of about 3-4 times (at the most extreme; most books are a little less than that).
So, what do we conclude? That Goodreads is biased. That Amazon is biased. If you wanted to correlate the Goodreads numbers to the Amazon numbers, you’d have to multiply the least Goodreads-friendly books by around 3 or 4 times. But why correlate one set of biased numbers to another set of biased numbers? That’s statistically pointless: if you want to know the younger, more female audience, use Goodreads. If you want the older, richer, more gender-neutral audience, use Amazon.
So, we’re swirling around the question: can we correlate Amazon or Goodreads numbers to actual sales? What we’ve learned in this post, according to Quantcast data (either openly accessible or as reported by CBS), is:
1. In demographic terms, Goodreads is biased towards women, younger readers (18-34), and the highly educated.
2. In demographic terms, Amazon is biased towards older readers, higher income readers, and the highly educated.
3. This results in substantial differences (up to around 4 times) in the rate at which their users review books.
That’s all well and good. We now have a way to compare Amazon to Goodreads numbers through demographic correction (if we wanted to). But how do those two sets of biased numbers actually sync up with sales? I have some Bookscan data I’ll be sharing with you in the next post!