
8 - Large-Scale Learning to Rank Using Boosted Decision Trees

from Part Two - Supervised and Unsupervised Learning Algorithms

Published online by Cambridge University Press: 05 February 2012

Authors:
Krysta M. Svore, Microsoft Research, Redmond, Washington, USA
Christopher J. C. Burges, Microsoft Research, Redmond, Washington, USA

Book editors:
Ron Bekkerman, LinkedIn Corporation, Mountain View, California
Mikhail Bilenko, Microsoft Research, Redmond, Washington
John Langford, Yahoo! Research, New York

Summary

The web search ranking task has become increasingly important because of the rapid growth of the Internet. With the growth of the web and the number of web search users, the amount of available training data for learning web ranking models has also increased. We investigate the problem of learning to rank on a cluster using web search data composed of 140,000 queries and approximately 14 million URLs. For datasets much larger than this, distributed computing becomes essential, because of both speed and memory constraints. To evaluate the loss or gain incurred by the distributed algorithms we consider, we compare against a baseline algorithm that has been carefully engineered to allow training on the full dataset on a single machine. The underlying algorithm is a boosted-tree ranking algorithm called LambdaMART, in which a split at a given vertex of each decision tree is determined by the split criterion for a particular feature. Our contributions are twofold. First, we implement a method that speeds up training when the training data fits in main memory on a single machine by distributing the vertex split computations of the decision trees; the resulting model is identical to the one produced by centralized training, but it is trained faster. Second, we develop a training method for the case where the training data exceeds the main memory of a single machine. This second approach is based on data distribution and scales easily to far larger datasets, on the order of billions of examples.
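To make the feature-distributed idea concrete, the following Python sketch (not from the chapter) simulates workers that each hold the full training set but scan only their own slice of the feature columns for the best split at a tree vertex; a final reduce step keeps the highest-gain candidate, so the chosen split matches what a single-machine scan over all features would find. It uses a simple variance-reduction criterion as a stand-in for LambdaMART's gradient-based split criterion; the function names and toy data are illustrative assumptions.

```python
# Minimal sketch of feature-distributed split selection (illustrative only):
# variance reduction stands in for LambdaMART's gradient-based split criterion.
import numpy as np

def best_split_for_features(X, y, feature_ids):
    """Best (gain, feature, threshold) over one worker's feature subset."""
    best = (-np.inf, None, None)
    parent_sse = y.var() * len(y)  # sum of squared errors at the parent vertex
    for f in feature_ids:
        order = np.argsort(X[:, f])
        xs, ys = X[order, f], y[order]
        for i in range(1, len(ys)):
            if xs[i] == xs[i - 1]:
                continue  # no threshold separates equal feature values
            left, right = ys[:i], ys[i:]
            gain = parent_sse - (left.var() * len(left) + right.var() * len(right))
            if gain > best[0]:
                best = (gain, f, (xs[i - 1] + xs[i]) / 2.0)
    return best

def distributed_best_split(X, y, n_workers=4):
    """Each simulated worker scans only its own feature slice; the reduce step
    keeps the overall winner, matching a centralized scan over all features."""
    partitions = np.array_split(np.arange(X.shape[1]), n_workers)
    candidates = [best_split_for_features(X, y, part) for part in partitions]
    return max(candidates, key=lambda c: c[0])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((200, 16))          # 200 documents, 16 features (toy data)
    y = (X[:, 3] > 0.5).astype(float)  # relevance label driven by feature 3
    gain, feature, threshold = distributed_best_split(X, y)
    print(f"best split: feature {feature} at {threshold:.3f} (gain {gain:.2f})")
```

Because the reduce step selects the same split that an exhaustive single-machine scan would, the trees, and hence the final model, are unchanged; only the per-vertex work is divided across workers.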

Type: Chapter
In: Scaling Up Machine Learning: Parallel and Distributed Approaches, pp. 148–169
Publisher: Cambridge University Press
Print publication year: 2011
