The Story of the Mr. Booster Recommendation System

One of the most important events in the development of recommender systems was the Netflix Prize, a $1M data science competition held a decade ago. The goal was to predict a viewer’s ratings of films based on previous ratings, without any other information about the customers or films. It was within the framework of this competition that the basic principles of building effective recommendation systems were laid out. For many years afterwards, the development of recommender systems consisted of minor modifications and improvements to the ideas expressed by the contestants. On the one hand, tremendous progress was made in the development of such systems; on the other hand, for many years the experience of one narrow area of application was transferred to other fields without taking their specifics into account. Several paradigms are common to the existing commercial implementations of recommender systems:

  • The system needs to recommend something very similar to what the customer liked (or bought) earlier. For example, if the customer liked the first installment of “Star Wars”, then it should recommend the sequel;
  • It is necessary to find a very similar customer (or customer group) by the type of purchases (preferences) and to transfer their preferences to the customer of interest. For example, if one chooses a hamburger and french fries, then the system looks for another buyer with these two items on the receipt, finds an additional item there, e.g. Coke, and recommends it to the customer;
  • It is necessary to find popular combinations and, when one item of a combination is purchased, to offer the others. For example, if laptops are frequently bought with mice, then the system needs to recommend a mouse to laptop buyers;
  • If the customer browsed a specific product, then they are likely interested in purchasing it, and the system should recommend it repeatedly.

The last paradigm is used even by the world’s largest online platforms despite the large number of negative customer reviews. If one is planning a vacation and looking for, e.g., sunscreen and flippers, then recommendations of these goods will haunt the customer for months, even if they have already been purchased. The first three paradigms generally work, but in a number of narrow fields their efficiency is extremely low.

During the research and development of the Mr. Booster recommender system, several interesting and valuable “artifacts” were produced that gave nontrivial solutions to the current challenges. In many ways, these solutions have become the key to the effectiveness of the Mr. Booster recommendation system.

 

Challenges and solutions

 

How to determine the time period over which customer data is united by a single purchasing intent?

 

The first hypothesis was that there is a time interval within which most of the purchases grouped by a specific intent are completed (remember the vacation example) while new ones have not yet started. We started by aggregating purchases over the past 10 days. However, tests showed that this approach is inconsistent: if the time interval is too small, many purchasing goals are not adequately identified. Yes, many people can buy everything needed for a vacation within a week, but planning often takes from a few weeks to several months. If we take a larger time interval instead, say 50 days, many customers have time to make plans of a totally different nature (e.g. to get apparel for the gym as well) and the original recommendations become irrelevant. We therefore proposed searching for time intervals during which the client did not buy anything: as long as purchases keep following one another after such a pause, the purchasing goal is considered unfinished. For example, if during the last month purchases came at intervals of 2-3 days, and before that there was a month without purchases, then we aggregate everything purchased since that pause. Such aggregation turned out to be more efficient than the previous version; however, there were problems. For some customers, purchases with clearly different intents were aggregated together. An analysis showed that in most cases the problem was that such clients had recurring purchases (for instance, socks). For such clients, purchases were aggregated over a long period of time.
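The pause-based aggregation described above can be sketched roughly as follows. This is only an illustration, not the Mr. Booster implementation: the function name `purchases_since_last_pause`, the 30-day `min_pause_days` threshold, and the sample history are all assumptions made for the example.

```python
from datetime import date


def purchases_since_last_pause(purchases, min_pause_days=30):
    """Return the items bought since the most recent long pause.

    `purchases` is a list of (date, item) pairs. A gap of at least
    `min_pause_days` between two consecutive purchases is treated as
    the boundary between purchasing intents (threshold is an assumption).
    """
    ordered = sorted(purchases, key=lambda p: p[0])
    start = 0
    for i in range(1, len(ordered)):
        gap_days = (ordered[i][0] - ordered[i - 1][0]).days
        if gap_days >= min_pause_days:
            start = i  # a long pause: a new intent starts here
    return [item for _, item in ordered[start:]]


# Hypothetical purchase history: two purchases, a pause, then a burst.
history = [
    (date(2024, 1, 2), "socks"),
    (date(2024, 1, 5), "notebook"),
    (date(2024, 2, 10), "sunscreen"),
    (date(2024, 2, 12), "flippers"),
    (date(2024, 2, 15), "swimsuit"),
]
current_intent = purchases_since_last_pause(history)
```

With this history only the three purchases after the long pause are aggregated. The sketch also makes the weakness noted above visible: a customer who buys socks every couple of weeks never produces a long enough pause, so the window keeps growing.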

 

How to identify the joint purchases of a group of products?

 

Most recommender systems do not identify frequently purchased groups of products, and their recommendations follow the approach “if goods X1 and X2 were bought, recommend goods that complement item X1, separately recommend goods that complement item X2, and then merge these two lists according to some rules”. In Mr. Booster we use features of the joint group purchase. A surprisingly good solution came from a different area of Data Science: Natural Language Processing. We used the concept of an N-gram, a sequence of multiple words that occurs frequently in a text. We took into account that the order of goods in the receipt does not affect the identification of an N-gram (in contrast, in text analysis the order of words is important). We found that N-grams were identified relatively rarely in purchases, not because they were absent from sales but because many items have a large number of analogues, and even items with the same name come from a number of manufacturers. Therefore, in addition to N-grams by product we built N-grams by product category (at the level of aggregation by the most popular international classification, S4). This allowed us to identify statistically significant product combinations (e.g. skis, poles, helmet). The absence of a product from a single category (e.g. helmets) in the combination could now be identified virtually unambiguously, and Mr. Booster gave a relevant recommendation.
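A minimal sketch of order-independent N-grams over product categories might look like the following. Everything named here is hypothetical: the `CATEGORY_OF` mapping stands in for a real classification, the receipts are made up, and a plain `min_support` count is used in place of a proper statistical significance test.

```python
from collections import Counter
from itertools import combinations

# Hypothetical product-to-category mapping (stand-in for a real classifier).
CATEGORY_OF = {
    "skis, maker A": "skis",
    "skis, maker B": "skis",
    "poles, maker A": "poles",
    "poles, maker C": "poles",
    "helmet, maker B": "helmet",
    "gloves, maker C": "gloves",
}


def category_ngrams(receipts, n):
    """Count unordered category combinations of size n across receipts."""
    counts = Counter()
    for receipt in receipts:
        # Sorting the category set makes the N-gram independent of item order.
        categories = sorted({CATEGORY_OF[p] for p in receipt})
        counts.update(combinations(categories, n))
    return counts


def missing_categories(basket, counts, min_support):
    """Suggest categories that would complete a frequent combination."""
    have = {CATEGORY_OF[p] for p in basket}
    suggestions = Counter()
    for combo, support in counts.items():
        missing = set(combo) - have
        if support >= min_support and len(missing) == 1:
            suggestions[missing.pop()] += support
    return suggestions.most_common()


receipts = [
    ["skis, maker A", "poles, maker A", "helmet, maker B"],
    ["helmet, maker B", "poles, maker C", "skis, maker B"],
    ["poles, maker A", "gloves, maker C", "skis, maker A", "helmet, maker B"],
    ["skis, maker B", "gloves, maker C"],
]
trigrams = category_ngrams(receipts, 3)
suggested = missing_categories(["skis, maker B", "poles, maker A"], trigrams, min_support=2)
```

Here the three receipts containing skis, poles and a helmet count towards the same category trigram even though the products come from different makers and appear in different orders, so a basket with only skis and poles is matched to the missing helmet category.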

 

How to solve a general problem of recommender systems – trivial recommendations?

 

For example, a recommendation to watch Star Trek on Netflix would be trivial. Even if there is no information that a specific customer has purchased this movie, but there is information about their interest in movies from this category, the customer has most likely already watched Star Trek or heard about it and would not be interested in seeing this recommendation. It would be much more beneficial to recommend a relevant but less advertised movie. We studied several different approaches. The most effective direction turned out to be the analysis of how unique the recommendation of the product is in general. If the product we recommend is highly specialized and rarely occurs in other recommendations, it is an excellent candidate. If the product is recommended frequently, it is better to limit its recommendations to the instances when the system's rating of its recommendation effectiveness is high. For this we developed a solution for the numerical evaluation of recommendation effectiveness.
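The filtering rule above can be illustrated with a short sketch. It assumes that each candidate recommendation already carries an effectiveness score between 0 and 1; how Mr. Booster computes that score is not described here, so the scores, the `max_share` and `min_score` thresholds, and the function name `filter_trivial` are all placeholders.

```python
from collections import Counter


def filter_trivial(recs_by_customer, max_share=0.5, min_score=0.8):
    """Drop frequently recommended products unless their score is high.

    `recs_by_customer` maps a customer id to a list of (product, score)
    pairs, with each product listed at most once per customer. A product
    recommended to more than `max_share` of customers is kept only where
    its score is at least `min_score` (both thresholds are assumptions).
    """
    n_customers = len(recs_by_customer)
    frequency = Counter(
        product for recs in recs_by_customer.values() for product, _ in recs
    )
    filtered = {}
    for customer, recs in recs_by_customer.items():
        kept = []
        for product, score in recs:
            share = frequency[product] / n_customers
            # Rare products always pass; common ones need a high score.
            if share <= max_share or score >= min_score:
                kept.append((product, score))
        filtered[customer] = kept
    return filtered


# Hypothetical candidate recommendations with placeholder scores.
candidates = {
    "a": [("blockbuster", 0.9), ("niche_documentary", 0.6)],
    "b": [("blockbuster", 0.5), ("indie_film", 0.7)],
    "c": [("blockbuster", 0.4)],
}
result = filter_trivial(candidates)
```

In this example the widely recommended title survives only for customer "a", where its score is high, while the rarely recommended titles pass through unchanged.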