Preface

Deepak K. Agarwal; Bee-Chung Chen

doi:10.1017/CBO9781139565868.001

What This Book Is About

Recommender systems are automated computer programs that match items to users in different contexts. Such systems are ubiquitous and have become an integral part of our daily lives. Examples include recommending products to users on a site like Amazon, recommending content to users visiting a website like Yahoo!, recommending movies to users on a site like Netflix, recommending jobs to users on a site like LinkedIn, and so on. The matching algorithms are constructed using large amounts of high-frequency data obtained from past user interactions with items. The algorithms are statistical in nature and involve challenges in areas like sequential decision processes, modeling interactions with very high-dimensional categorical data, and developing scalable statistical methods. New methodologies in this area require close collaboration among computer scientists, machine learners, statisticians, optimization experts, system experts, and, of course, domain experts. It is one of the most exciting applications of big data.

Why We Wrote This Book

Although much has been written about recommender systems in various fields, such as computer science, machine learning, and statistics, focusing on specific aspects of the problem, a comprehensive treatment of all statistical issues and how they are interrelated is lacking. We came to this realization while deploying such systems at Yahoo! and LinkedIn. For instance, much of the focus in statistics and machine learning is on building models that minimize out-of-sample predictive error. However, this does not address all aspects of practical importance. Statistically, a recommender system is a high-dimensional sequential process, and it is equally important to study issues like design of experiments as it is to develop sophisticated statistical models. In fact, the two are closely related – efficient design needs models to tame the curse of dimensionality. Also, most existing work in the literature tends to build models for univariate response, such as movie ratings, purchases, and click rates.With the advent of social media outlets like Facebook, LinkedIn, andTwitter, multiple responses are available. For instance, one may want to model click rates, share rates, and tweet rates simultaneously for a news recommender application. Such multivariate response models are challenging to build. Finally, given the machinery to obtain such multivariate predictions, how does one construct utility functions to make recommendations? Is it more important to optimize share rates relative to click rates?

Book contents

Preface

Summary

Access options

Book contents

Preface

Summary

Access options

Save book to Kindle

Save book to Dropbox

Save book to Google Drive