Your Book Recommendations Suck
No, not yours specifically. It’s about book recommendations in general. Actually, about anything that can be recommended.
This year I’ve been actively seeking out a certain kind of book. The books I read were mostly connected to other media I’d been consuming at a given moment, which is why I picked up Gibson after playing the Cyberpunk RED tabletop game, G. K. Chesterton after playing Deus Ex, and Miéville after being sucked into the world of Disco Elysium. I look for books that would extend the emotions I got from consuming other media. That’s an easy way to find books: usually searching the web for “disco elysium-like books” is more than enough to get valuable results. The problem arises, however, when you try to look further than that.
Recommendation done wrong
So I open Goodreads, navigate to a book I’ve read recently, and look for the “Similar Books” section. What I find there is a typical collaborative-filtering1 based ranking, which lists books that were also “liked” by people who “liked” the initial book. I put “liked” in quotes, because reducing the enjoyment of a book (or any media, really) to a single value simplifies the problem so much that it’s difficult to say what “liking” even means. I’ll get to that later. So the list contains books connected to each other only by the fact that somebody “liked” them both. Most likely the recommendation method ignores the metadata and the content of the books altogether. This is not good at all. What’s worse, the same recommender system can be used in two ways: to recommend books matching individual tastes (personalized recommendations), and to recommend books “similar” to each other. The typical issues with this approach are already well known:
- Introducing new books (without any ratings) defeats the method completely
- The semantics of book ratings are unclear, yet the ratings play a major role in the recommendation process.
- The method ignores the metadata of both readers and books
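To make the first point concrete, here is a minimal sketch of item-item collaborative filtering over a tiny, made-up ratings table (all reader names and numbers are invented). Books are ranked by the cosine similarity of their rating vectors across readers; notice that nothing about the books’ content or metadata is ever consulted.

```python
import math

# Made-up ratings: reader -> {book: rating}. Absence means "not rated".
ratings = {
    "alice": {"Neuromancer": 5, "Perdido Street Station": 4},
    "bob":   {"Neuromancer": 4, "The Man Who Was Thursday": 5},
    "carol": {"Perdido Street Station": 5, "The Man Who Was Thursday": 2},
}

def item_vector(book):
    # A book's rating vector across all readers (0 where unrated).
    return [ratings[u].get(book, 0) for u in sorted(ratings)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def similar_books(book):
    # Rank every other book by how similarly readers rated it.
    books = {b for r in ratings.values() for b in r}
    scores = {other: cosine(item_vector(book), item_vector(other))
              for other in books if other != book}
    return sorted(scores, key=scores.get, reverse=True)

print(similar_books("Neuromancer"))
```

A book with no ratings at all would have a zero vector, a similarity of 0.0 to everything, and thus no meaningful place in this ranking, which is exactly the cold start problem discussed below.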
Cold Start Problem
The first issue is usually referred to as the cold start problem. It basically means that if there’s no interaction data (no humans rating books), we can’t recommend anything meaningful. This problem is twofold and has the following consequences:
- Users who signed up recently must rate some books first, in order to receive recommendations.
- Newly added (published) books, that were not rated yet, won’t be recommended to anyone.
The first consequence is reasonable and not surprising. We don’t know anything about newcomers, so they must state their taste first. This is most likely common to all recommender systems. The second consequence is more problematic, though, because we usually know at least something about new books. We know the author, their previous works (if any), the publisher-provided description of the book, and other metadata. That’s something we could and should leverage from the beginning, without waiting for people to finally rate the book. There’s also another, hidden problem with low-data scenarios. Looking at Goodreads’ case, if we navigate to the similar books of a newly published work, we’ll most likely receive useless, sometimes surprising and unexpected recommendations. That’s something not many people are aware of when starting with recommender systems, but it happens, and when it does, it hits hard.
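The metadata fallback can be sketched like this (the catalogue, tags, and Jaccard overlap are illustrative assumptions, not how Goodreads works): a brand-new book with zero ratings can still be ranked against the existing catalogue by tag overlap alone.

```python
# Content-based fallback for a cold-start book (hypothetical metadata).
# Each book is described by a set of tags: author, genre, mood, etc.
catalog = {
    "Neuromancer":        {"gibson", "cyberpunk", "noir"},
    "Count Zero":         {"gibson", "cyberpunk"},
    "The City & the City": {"mieville", "weird", "noir"},
}

def jaccard(a, b):
    # Tag-set overlap: |A ∩ B| / |A ∪ B|.
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend_for_new_book(tags, k=2):
    # Rank existing books by metadata similarity alone -- no ratings needed.
    ranked = sorted(catalog, key=lambda b: jaccard(tags, catalog[b]), reverse=True)
    return ranked[:k]

# A freshly published book: zero ratings, but known author and genre.
print(recommend_for_new_book({"gibson", "cyberpunk", "noir"}))
```

This is crude, but unlike pure collaborative filtering it returns something sensible on day one of a book’s existence.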
Readers’ feedback
Coming back to the definition of a book’s rating: what is it exactly? What does a person mean when they say they “like” a book? Do they enjoy the plot? The characters? The style of writing? And, what matters the most: do these factors mean that somebody else would enjoy the same book? It goes without saying that collapsing those factors into a single value, which is then simplified even further during the collaborative filtering process2, is a huge simplification. It’s correct sometimes, and for such a simple method it seems to kinda work, so many people use it when building recommender systems.
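A toy illustration of that collapse (the aspect names and scores are invented): two readers can hand out the same overall rating for opposite reasons, and the averaged number is all a collaborative filter ever sees.

```python
# Hypothetical per-aspect scores behind a single "star" rating.
alice = {"plot": 5, "characters": 1, "style": 3}  # loved the plot
bob   = {"plot": 1, "characters": 5, "style": 3}  # loved the characters

def overall(aspects):
    # The collapse: a multi-dimensional impression becomes one number.
    return sum(aspects.values()) / len(aspects)

# Both come out as 3.0, so the filter treats Alice and Bob as agreeing
# about this book, even though they enjoyed entirely different things.
print(overall(alice), overall(bob))
```

Any recommendation built on top of that scalar inherits this ambiguity.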
Hybrid methods
The methods I described most likely don’t incorporate any additional metadata about either the users or the books in the database. The next step would be to add such data and build a hybrid recommender system. This way we could mitigate the problems arising from the lack of rating data.
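One common shape for such a hybrid is a plain weighted blend of the two signals. The sketch below (weights and scores are made up) shows the idea: when the collaborative signal is absent, the metadata signal keeps a cold-start book from scoring zero.

```python
def hybrid_score(cf_score, meta_score, alpha=0.7):
    # alpha balances the collaborative-filtering signal against the
    # metadata-similarity signal; for a new book cf_score may be 0.0,
    # but the metadata term still produces a usable score.
    return alpha * cf_score + (1 - alpha) * meta_score

# An established book: both signals available.
print(hybrid_score(0.8, 0.6))
# A freshly published book: no ratings yet, metadata carries it.
print(hybrid_score(0.0, 0.9))
```

Real hybrid systems are more elaborate (feature-based models, switching strategies), but even this blend already softens the cold start problem.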
What the problem really is
I dare say that people usually rate books based on the overall impression the book made on them. So where does this impact on some people come from, and why don’t the same books work on everyone? For me, the problem is cursed by dimensionality. The function of impression depends on so many variables that we cannot really grasp it. Maybe someday we’ll be able to frame the problem in a way that takes the burden of complexity off us (that’s what machine learning proves useful for, right?). Given the current state of the art, I think we cannot really understand what the problem is, and our solutions only approximate and simplify it by introducing often incorrect and harmful assumptions.
Next steps
Some time ago (before transformers happened) I thought that if we were able to make a machine read and comprehend a book, we could build a recommender system that suggests books based on their content. Now I doubt that this is the final answer; to build a truly superhuman recommender system, we would need to somehow grasp the complexity of the problem, or find a solution that handles that complexity for us. When that will happen, or where we should look for it, I have no idea.