  • Print publication year: 2016
  • Online publication date: February 2016

7 - Personalization through Feature-Based Regression



It is important to recommend items that are tailored to the personal interests and information needs of each individual user. Recommending items based on global popularity is often not sufficient. Some personalization may be achieved through an easy extension to most-popular recommendation: create user segments based on attributes such as age, gender, and geolocation (e.g., male users between twenty and forty years old living on the East Coast may form a segment) and serve the most-popular items within each segment. However, such a strategy has limitations. When the number of user segments is large, obtaining a reliable estimate of the popularity of items for each segment is difficult because of data sparsity. Also, user visits tend to follow a power-law distribution: a small fraction of active users visits the site very often, while the rest visit only sporadically. It is desirable to build custom models for active users who have visited the site several times in the past. For instance, if Mary visited the Yahoo! front page one hundred times in the last week and clearly indicated her preference for baseball news over everything else, a baseball article in our content pool should get precedence for Mary's next visit. For users who visit sporadically, personalization is usually accomplished by pooling data across users who are similar. Defining similarity is the crux of the problem, and it is often done by looking at user features such as demographics, behavioral attributes, and social network information, among other signals. Combining these signals accurately is challenging. In addition, many users fall in the gray area between highly active and sporadic visitors. We need a methodology that can automatically determine how much weight to give to a user's own past interactions relative to those of similar users.
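The segment-based extension of most-popular recommendation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the chapter's method: the event format `(user, item, clicked)`, the bucketing in `segment_of`, and the `min_views` threshold are all hypothetical choices made for the example.

```python
from collections import defaultdict


def segment_of(user):
    """Map a user to a coarse segment (hypothetical bucketing by
    gender, age range, and region)."""
    age_bucket = "20-40" if 20 <= user["age"] <= 40 else "other"
    return (user["gender"], age_bucket, user["region"])


def segment_stats(events):
    """Count clicks and views per (segment, item).

    events: iterable of (user, item, clicked) tuples.
    Returns {segment: {item: [clicks, views]}}.
    """
    stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for user, item, clicked in events:
        cell = stats[segment_of(user)][item]
        cell[0] += int(clicked)
        cell[1] += 1
    return stats


def most_popular(stats, segment, min_views=1):
    """Return the item with the highest click rate in a segment,
    ignoring items with fewer than min_views views. Returns None
    when no item in the segment has enough data."""
    items = stats.get(segment, {})
    rates = {
        item: clicks / views
        for item, (clicks, views) in items.items()
        if views >= min_views
    }
    return max(rates, key=rates.get) if rates else None
```

With many segments, most (segment, item) cells hold only a handful of views, so either the click rates are noisy (small `min_views`) or no recommendation can be made at all (large `min_views`). That is the data-sparsity limitation noted above.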

For recommender systems in which the users, the items, and user behavior do not change over time, offline models trained on a one-time dump of historical data may be sufficient. See Chapter 2 for such models. However, in many application settings, new items are frequently added, and user behaviors and interests also change over time. It is important to update recommendation models frequently to capture such nonstationarity.

In this chapter, we focus on feature-based regression, which personalizes recommendations of time-sensitive items by capturing similarities through a regression approach.