Ever wonder why Google+ and Facebook don't let you dislike something? The Facebook Like button and the Google +1 button are nearly ubiquitous around the web at this point, but neither provides the symmetric action for disliking or -1'ing anything.
This is in sharp contrast to the rest of the rating systems around the web, which mostly fall into two camps: 5 star scales like Amazon's and Netflix's, and binary up/down votes like Reddit's.
Google+ and Facebook are unusual in that they only offer the ability to give positive feedback. Part of the reason is undoubtedly that they don't want people going around disliking the pages of the advertisers who actually pay them. However, I think the bigger reason is that even if they did offer a dislike button, people just wouldn't use it often enough to justify the precious screen real estate it would take up.
One of the common misconceptions around the web is that ratings distributions are bimodal, with people rating things they love and hate in roughly equal numbers. Zed Shaw argued this point of view in some blog posts he wrote on rating systems and their distributions, and it makes an intuitive kind of sense.
Looking at the data, however, this just isn't the case. 5 star rating distributions are roughly normally distributed, with the average rating being quite positive. Ratings on sites that only let you thumb something up or down are overwhelmingly positive.
Of course, don't take my word for it. Ratings datasets are figuratively just lying around the web these days, begging for someone to take notice and analyze them. I've analyzed datasets from some of the more popular sites below.
Here are the datasets I looked at:

- Yelp reviews (5 star scale)
- Amazon reviews (5 star scale)
- The Netflix Prize dataset (5 star scale)
- Reddit votes (up/down)
For the 5 star datasets, we'll assume that a 3 star review is slightly positive, since that's usually how I rate things. However, I realize that not everyone agrees with me on this, so I'll leave 3 star reviews separated out here. Note that the Amazon dataset itself doesn't include 3 star reviews, precisely because of this ambiguity.
The percentage of negative reviews here is fairly consistent, hovering around 15% across all the datasets (17%, 12.8%, 14.7% and 15.8% respectively, for those of you who aren't adept at counting pixels). This is despite the ratings coming from different web sites and covering different types of products. Even more telling is that one star reviews are rarer still: only 4% of Yelp reviews, 6.9% of Amazon reviews and 4.8% of Netflix reviews are 1 or 1.5 stars.
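If you want to verify these percentages yourself, the tallying is straightforward. Here's a minimal sketch, assuming the ratings have already been parsed into a flat list of star values - the sample numbers are made up, not taken from any of the datasets above:

```python
from collections import Counter

# hypothetical star ratings on a 1-5 scale (halves allowed)
ratings = [4.0, 5.0, 3.0, 1.0, 4.5, 2.0, 5.0, 4.0, 3.5, 5.0]

counts = Counter(ratings)
total = len(ratings)

negative = sum(n for stars, n in counts.items() if stars < 3)    # 1-2.5 stars
one_star = sum(n for stars, n in counts.items() if stars <= 1.5)

print(f"negative (< 3 stars): {100 * negative / total:.1f}%")
print(f"1 or 1.5 stars:       {100 * one_star / total:.1f}%")
```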
For news, I've observed the same effect across multiple datasets, like the Digg dataset and internal Zite data. I didn't bother including these because the Reddit dataset is the largest publicly available news ratings dataset I could find, and the results for all news datasets are basically the same: Digg had 88% diggs versus Reddit's 85% upvotes.
There are a whole bunch of reasons people don't usually leave negative reviews. Here are some of the bigger factors, in my opinion.
Most of the time we know what we're getting before we buy a book, rent a movie, or go to a restaurant. If we know we won't enjoy something, we tend to avoid it, and people certainly don't go seeking out all the things they dislike just to rate them. The default action for disliking something on the internet is not clicking on it - not taking time out to write a negative review.
... and popular things get more ratings - a whole lot more ratings.
All ratings datasets follow a power law distribution in the number of ratings per item: a handful of items get tons of ratings, while the majority get just a few. To put this in context, in the Netflix prize dataset the top 10 movies have as many ratings as the bottom 50% of movies (8713 titles) put together - and all of the really negatively rated movies are in that bottom 50%.
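To get a feel for just how lopsided a power law is, here's a sketch that draws synthetic ratings-per-item counts from a Zipf distribution and compares the head against the bottom half. The catalogue size and counts are made up; only the shape of the distribution matters:

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic ratings-per-item counts drawn from a Zipf-like power law;
# 10,000 items is arbitrary - real catalogues vary
counts = np.sort(rng.zipf(a=1.4, size=10_000))[::-1]

top_10 = counts[:10].sum()
bottom_half = counts[len(counts) // 2:].sum()

print(f"ratings on the top 10 items:        {top_10:,}")
print(f"ratings on the bottom 50% of items: {bottom_half:,}")
```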
You can see what I'm talking about in this scatter plot of the number of reviews for each movie versus its average rating. Notice how the entire left-hand side of the plot is blank, aside from a thin strip of terrible movies with very few ratings: there are no really popular movies with an average rating below 2.5 stars.
Another way to look at this is to compare the worst movie in the dataset with the best. The worst, 'Avia Vampire Hunter', has an average rating of 1.28 stars - but only 132 reviews. The highest rated, 'Lord of the Rings: The Return of the King: Extended Edition', has an average rating of 4.72 stars - with 73,335 reviews, over 500x as many as the worst rated movie. Overall, 30% of movies have an average rating below 3 stars, even though only 15% of all ratings are below 3 stars.
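If you want to reproduce that kind of plot, something along these lines works. This is a sketch assuming the ratings are available as (movie_id, stars) pairs - the sample data is a stand-in, not the Netflix file format:

```python
from collections import defaultdict
import matplotlib.pyplot as plt

# hypothetical (movie_id, stars) pairs - swap in a real dataset's parser here
ratings = [(1, 5.0), (1, 4.0), (1, 4.5), (2, 1.0), (2, 1.5), (3, 3.0)]

totals = defaultdict(float)
counts = defaultdict(int)
for movie, stars in ratings:
    totals[movie] += stars
    counts[movie] += 1

avg = [totals[m] / counts[m] for m in counts]   # average rating per movie
num = [counts[m] for m in counts]               # number of reviews per movie

plt.scatter(avg, num, s=5, alpha=0.3)
plt.yscale("log")           # review counts follow a power law
plt.xlabel("average rating")
plt.ylabel("number of reviews")
plt.show()
```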
This is really just a variation on the previous point: people avoid things they expect to dislike, so universally disliked things remain unpopular. It's important enough that I thought I'd sneak it in twice in different forms though.
You don't typically leave a review for something you know nothing about - but the very act of being familiar with something predisposes you to like it. Psychologists call this phenomenon the mere exposure effect, and it has been shown to affect decision making in a wide variety of contexts.
Nobody sets out to write a terrible novel, open a crappy restaurant or create a horrible movie, and people mostly succeed in not being abysmal.
Going back to the Netflix data again: 70% of movies have an average rating of 3 stars or more. Since 85% of all reviews are 3 stars or higher, popularity effects are skewing the averages higher - but even after normalizing away the reviews-per-movie skew, most movies are still positively rated.
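One way to see this normalization is to compare the mean over all ratings with the mean of per-movie averages, which weights every movie equally. A sketch, under the same assumed (movie_id, stars) representation as above:

```python
from collections import defaultdict

# hypothetical ratings, heavily skewed toward one popular movie
ratings = [(1, 5.0), (1, 4.5), (1, 5.0), (1, 4.0), (2, 2.0), (3, 2.5)]

# mean over all ratings: popular movies dominate the average
weighted = sum(stars for _, stars in ratings) / len(ratings)

# mean of per-movie averages: every movie counts exactly once
by_movie = defaultdict(list)
for movie, stars in ratings:
    by_movie[movie].append(stars)
unweighted = sum(sum(v) / len(v) for v in by_movie.values()) / len(by_movie)

print(f"rating-weighted mean: {weighted:.2f}")    # pulled up by the popular title
print(f"per-movie mean:       {unweighted:.2f}")
```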
For those of us out there building recommender systems, this means we can't rely on explicit negative feedback alone. On the web, an order of magnitude more people will click on an item than vote on it - and of those who do vote, thumbs up outnumber thumbs down by almost another order of magnitude. While more data isn't always better than better models, it almost never hurts.
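To make that concrete: one toy way to fold the implicit signal in is to treat clicks as weak positives alongside the rarer explicit votes. The function and weights below are invented purely for illustration, not tuned values from any real system:

```python
# toy preference signal mixing implicit and explicit feedback;
# the weights are made up for illustration only
def preference(clicks: int, thumbs_up: int, thumbs_down: int) -> float:
    # clicks are weak positives: plentiful but noisy
    # explicit votes are strong but roughly 10x rarer, so weight them up
    return 0.1 * clicks + 1.0 * thumbs_up - 1.0 * thumbs_down

print(preference(clicks=25, thumbs_up=1, thumbs_down=0))
```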
Published on 03 June 2013