Pay to Play: Why Book Sales Data is Inaccessible

I love books, and I love data. I thought it would be fun to combine those loves on this blog by creating a list of the children’s books that sold the most copies over the last fifty years. Not only would it scratch that organizational itch in my brain, but it would also provide a useful map of the path that children’s publishing has taken since 1976. Knowing the industry’s trends would help book publishing professionals do their jobs better.

Creating such a list is not impossible, but unfortunately, doing so would cost Hellebore thousands of dollars. And even if we paid that exorbitant price, I would still not be able to share exact numbers. 

The data can only be found in one place: Circana BookScan, a data provider for the book publishing industry. It tracks about 85 percent of point-of-sale data for trade print books. Circana also has digital arms that cover e-books (Circana PubTrack Digital) and audiobooks (Circana PubTrack Audio). The data is collected via bookseller and publisher self-reporting.

This aggregation of data is incredibly useful for the publishers, agencies, and interested third parties who can pay. Rather than polling every bookseller for sales data, subscribers can access all of it in one place and compare any title against every other book on the market. Unfortunately, the cost is prohibitive for small businesses like Hellebore. With our Publishers Marketplace account, a subscription for just one user would cost $2,950 per year.

And while having access to all that data would be nifty for us as a literary agency (and supremely satisfying for me as a data/chart/spreadsheet enthusiast), I wouldn’t be able to share any of what I learned unless my license allowed it. I learned this while completing my master’s degree in book publishing, but I can’t link a source because Circana doesn’t make its licensing information publicly available. BookScan data does appear in news articles, which suggests that some licenses allow public sharing, that reporters are permitted to share data, or both.

In research, there are broadly two types of data: qualitative (data points that indicate quality and are often subjective) and quantitative (numerical data points that come from measurable quantities). Circana collects quantitative data. Many parts of the book industry operate on qualitative data: which books people liked and disliked, how those books made them feel, and what’s getting a lot of buzz. Those books are what booksellers stock, what book influencers review, and what the industry widely uses to propel progress. Hellebore will continue to use the qualitative data (and the quantitative, when it is reported) to help guide our decisions until the day comes when we can afford an expensive BookScan license.

In the meantime, I have found some lists of bestselling books (that have varying degrees of legitimacy, but are fun nonetheless) for you to peruse to get an idea of popular children’s books across the last fifty years. Check them out below.
