This is a practical guide to P-splines, a simple, flexible and powerful tool for smoothing. P-splines combine regression on B-splines with simple, discrete, roughness penalties. They were introduced by the authors in 1996 and have been used in many diverse applications. The regression basis makes it straightforward to handle non-normal data, as in generalized linear models. The authors demonstrate optimal smoothing, using mixed model technology and Bayesian estimation, in addition to classical tools like cross-validation and AIC, covering theory and applications with code in R. Going far beyond simple smoothing, they also show how to use P-splines for regression on signals, varying-coefficient models, quantile and expectile smoothing, and composite links for grouped data. Penalties are the crucial elements of P-splines; with proper modifications they can handle periodic and circular data as well as shape constraints. Combining penalties with tensor products of B-splines extends these attractive properties to multiple dimensions. An appendix offers a systematic comparison to other smoothers.
Transfer learning deals with how systems can quickly adapt themselves to new situations, tasks and environments. It gives machine learning systems the ability to leverage auxiliary data and models to help solve target problems when there is only a small amount of data available. This makes such systems more reliable and robust, so that a machine learning model faced with unforeseeable changes does not deviate too much from expected performance. At an enterprise level, transfer learning allows knowledge to be reused so experience gained once can be repeatedly applied to the real world. For example, a pre-trained model that takes account of user privacy can be downloaded and adapted at the edge of a computer network. This self-contained, comprehensive reference text describes the standard algorithms and demonstrates how these are used in different transfer learning paradigms. It offers a solid grounding for newcomers as well as new insights for seasoned researchers and developers.
Variational Bayesian learning is one of the most popular methods in machine learning. Designed for researchers and graduate students in machine learning, this book summarizes recent developments in the non-asymptotic and asymptotic theory of variational Bayesian learning and suggests how this theory can be applied in practice. The authors begin by developing a basic framework with a focus on conjugacy, which enables the reader to derive tractable algorithms. Next, the text summarizes non-asymptotic theory, which, although limited in application to bilinear models, precisely describes the behavior of the variational Bayesian solution and reveals its sparsity-inducing mechanism. Finally, the text summarizes asymptotic theory, which reveals phase transition phenomena depending on the prior setting, thus providing suggestions on how to set hyperparameters for particular purposes. Detailed derivations allow readers to follow along without prior knowledge of the mathematical techniques specific to Bayesian learning.
Meaningful use of advanced Bayesian methods requires a good understanding of the fundamentals. This engaging book explains the ideas that underpin the construction and analysis of Bayesian models, with particular focus on computational methods and schemes. The unique features of the text are the extensive discussion of available software packages combined with a brief but complete and mathematically rigorous introduction to Bayesian inference. The text introduces Monte Carlo methods, Markov chain Monte Carlo methods, and Bayesian software, with additional material on model validation and comparison, transdimensional MCMC, and conditionally Gaussian models. The inclusion of problems makes the book suitable as a textbook for a first graduate-level course in Bayesian computation with a focus on Monte Carlo methods. The extensive discussion of Bayesian software - R/R-INLA, OpenBUGS, JAGS, STAN, and BayesX - makes it useful also for researchers and graduate students from beyond statistics.
This book bridges theoretical computer science and machine learning by exploring what the two sides can teach each other. It emphasizes the need for flexible, tractable models that better capture not what makes machine learning hard, but what makes it easy. Theoretical computer scientists will be introduced to important models in machine learning and to the main questions within the field. Machine learning researchers will be introduced to cutting-edge research in an accessible format, and gain familiarity with a modern, algorithmic toolkit, including the method of moments, tensor decompositions and convex programming relaxations. The treatment of beyond-worst-case analysis aims to build a rigorous understanding of the approaches used in practice and to facilitate the discovery of exciting new ways to solve important long-standing problems.
Although computation and the science of physical systems would appear to be unrelated, there are a number of ways in which computational and physical concepts can be brought together in ways that illuminate both. This volume examines fundamental questions which connect scholars from both disciplines: is the universe a computer? Can a universal computing machine simulate every physical process? What is the source of the computational power of quantum computers? Are computational approaches to solving physical problems and paradoxes always fruitful? Contributors from multiple perspectives reflecting the diversity of thought regarding these interconnections address many of the most important developments and debates within this exciting area of research. Both a reference to the state of the art and a valuable and accessible entry to interdisciplinary work, the volume will interest researchers and students working in physics, computer science, and philosophy of science and mathematics.
Network data are produced automatically by everyday interactions - social networks, power grids, and links between data sets are a few examples. Such data capture social and economic behavior in a form that can be analyzed using powerful computational tools. This book is a guide to both basic and advanced techniques and algorithms for extracting useful information from network data. The content is organized around 'tasks', grouping the algorithms needed to gather specific types of information and thus answer specific types of questions. Examples include similarity between nodes in a network, prestige or centrality of individual nodes, and dense regions or communities in a network. Algorithms are derived in detail and summarized in pseudo-code. The book is intended primarily for computer scientists, engineers, statisticians and physicists, but it is also accessible to network scientists based in the social sciences. MATLAB®/Octave code illustrating some of the algorithms will be available at: http://www.cambridge.org/9781107125773.
This new color edition of Braun and Murdoch's bestselling textbook integrates use of the RStudio platform and adds discussion of newer graphics systems, extensive exploration of Markov chain Monte Carlo, expert advice on common error messages, motivating applications of matrix decompositions, and numerous new examples and exercises. This is the only introduction needed to start programming in R, the computing standard for analyzing data. Co-written by an R core team member and an established R author, this book comes with real R code that complies with the standards of the language. Unlike other introductory books on the R system, this book emphasizes programming, including the principles that apply to most computing languages, and techniques used to develop more complex projects. Solutions, datasets, and any errata are available from the book's website. The many examples, all from real applications, make it particularly useful for anyone working in practical data analysis.
SAS programming is a creative and iterative process designed to empower you to make the most of your organization's data. This friendly guide provides you with a repertoire of essential SAS tools for data management, whether you are a new or an infrequent user. Most useful to students and programmers with little or no SAS experience, it takes a no-frills, hands-on tutorial approach to getting started with the software. You will find immediate guidance in navigating, exploring, visualizing, cleaning, formatting, and reporting on data using SAS and JMP. Step-by-step demonstrations, screenshots, handy tips, and practical exercises with solutions equip you to explore, interpret, process and summarize data independently, efficiently and effectively.
Designing algorithms to recommend items such as news articles and movies to users is a challenging task in numerous web applications. The crux of the problem is to rank items based on users' responses to different items to optimize for multiple objectives. Major technical challenges are high dimensional prediction with sparse data and constructing high dimensional sequential designs to collect data for user modeling and system design. This comprehensive treatment of the statistical issues that arise in recommender systems includes detailed, in-depth discussions of current state-of-the-art methods such as adaptive sequential designs (multi-armed bandit methods), bilinear random-effects models (matrix factorization) and scalable model fitting using modern computing paradigms like MapReduce. The authors draw upon their vast experience working with such large-scale systems at Yahoo! and LinkedIn, and bridge the gap between theory and practice by illustrating complex concepts with examples from applications they are directly involved with.
This thoroughly updated new edition presents state-of-the-art sparse and multiscale image and signal processing. It covers linear multiscale geometric transforms, such as wavelet, ridgelet, or curvelet transforms, and non-linear multiscale transforms based on the median and mathematical morphology operators. Along with an up-to-the-minute description of required computation, it covers the latest results in inverse problem solving and regularization, sparse signal decomposition, blind source separation, in-painting, and compressed sensing. New chapters and sections cover multiscale geometric transforms for three-dimensional data (data cubes), data on the sphere (geo-located data), dictionary learning, and nonnegative matrix factorization. The authors wed theory and practice in examining applications in areas such as astronomy, including recent results from the European Space Agency's Herschel mission, biology, fusion physics, cold dark matter simulation, medical MRI, digital media, and forensics. MATLAB® and IDL code, available online at www.SparseSignalRecipes.info, accompany these methods and all applications.
How do you distinguish a cat from a dog by their DNA? Did Shakespeare really write all of his plays? Pattern matching techniques can offer answers to these questions and to many others, from molecular biology, to telecommunications, to classifying Twitter content. This book for researchers and graduate students demonstrates the probabilistic approach to pattern matching, which predicts the performance of pattern matching algorithms with very high precision using analytic combinatorics and analytic information theory. Part I compiles known results of pattern matching problems via analytic methods. Part II focuses on applications to various data structures on words, such as digital trees, suffix trees, string complexity and string-based data compression. The authors use results and techniques from Part I and also introduce new methodology such as the Mellin transform and analytic depoissonization. More than 100 end-of-chapter problems help the reader to make the link between theory and practice.
In this book the authors describe the principles and methods behind probabilistic forecasting and Bayesian data assimilation. Instead of focusing on particular application areas, the authors adopt a general dynamical systems approach, with a profusion of low-dimensional, discrete-time numerical examples designed to build intuition about the subject. Part I explains the mathematical framework of ensemble-based probabilistic forecasting and uncertainty quantification. Part II is devoted to Bayesian filtering algorithms, from classical data assimilation algorithms such as the Kalman filter, variational techniques, and sequential Monte Carlo methods, through to more recent developments such as the ensemble Kalman filter and ensemble transform filters. The McKean approach to sequential filtering in combination with coupling of measures serves as a unifying mathematical framework throughout Part II. Assuming only some basic familiarity with probability, this book is an ideal introduction for graduate students in applied mathematics, computer science, engineering, geoscience and other emerging application areas.
Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, the problems of finding frequent itemsets and clustering. This second edition includes new and extended coverage on social networks, machine learning and dimensionality reduction.
Providing a novel approach to sparsity, this comprehensive book presents the theory of stochastic processes that are ruled by linear stochastic differential equations, and that admit a parsimonious representation in a matched wavelet-like basis. Two key themes are the statistical property of infinite divisibility, which leads to two distinct types of behaviour - Gaussian and sparse - and the structural link between linear stochastic processes and spline functions, which is exploited to simplify the mathematical analysis. The core of the book is devoted to investigating sparse processes, including a complete description of their transform-domain statistics. The final part develops practical signal-processing algorithms that are based on these models, with special emphasis on biomedical image reconstruction. This is an ideal reference for graduate students and researchers with an interest in signal/image processing, compressed sensing, approximation theory, machine learning, or statistics.
This rigorous, self-contained book describes mathematical and, in particular, stochastic and graph theoretic methods to assess the performance of complex networks and systems. It comprises three parts: the first is a review of probability theory; Part II covers the classical theory of stochastic processes (Poisson, Markov and queueing theory), which are considered to be the basic building blocks for performance evaluation studies; Part III focuses on the rapidly expanding new field of network science. This part deals with the recently obtained insight that many very different large complex networks – such as the Internet, World Wide Web, metabolic and human brain networks, utility infrastructures, social networks – evolve and behave according to general common scaling laws. This understanding is useful when assessing the end-to-end quality of Internet services and when designing robust and secure networks. Containing problems and worked solutions, the book is ideal for graduate students taking courses in performance analysis.
Machine learning methods extract value from vast data sets quickly and with modest resources. They are established tools in a wide range of industrial applications, including search engines, DNA sequencing, stock market analysis, and robot locomotion, and their use is spreading rapidly. People who know the methods have their choice of rewarding jobs. This hands-on text opens these opportunities to computer science students with modest mathematical backgrounds. It is designed for final-year undergraduates and master's students with limited background in linear algebra and calculus. Comprehensive and coherent, it develops everything from basic reasoning to advanced techniques within the framework of graphical models. Students learn more than a menu of techniques; they develop analytical and problem-solving skills that equip them for the real world. Numerous examples and exercises, both computer based and theoretical, are included in every chapter. Resources for students and instructors, including a MATLAB toolbox, are available online.
The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. The PageRank idea and related tricks for organizing the Web are covered next. Other chapters cover the problems of finding frequent itemsets and clustering. The final chapters cover two applications: recommendation systems and Web advertising, each vital in e-commerce. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike.
'What's going to happen next?' Time series data hold the answers, and Bayesian methods represent the cutting edge in learning what they have to say. This ambitious book is the first unified treatment of the emerging knowledge-base in Bayesian time series techniques. Exploiting the unifying framework of probabilistic graphical models, the book covers approximation schemes, both Monte Carlo and deterministic, and introduces switching, multi-object, non-parametric and agent-based models in a variety of application environments. It demonstrates that the basic framework supports the rapid creation of models tailored to specific applications and gives insight into the computational complexity of their implementation. The authors span traditional disciplines such as statistics and engineering and the more recently established areas of machine learning and pattern recognition. Readers with a basic understanding of applied probability, but no experience with time series analysis, are guided from fundamental concepts to the state-of-the-art in research and practice.
This book explains how computer software is designed to perform the tasks required for sophisticated statistical analysis. For statisticians, it examines the nitty-gritty computational problems behind statistical methods. For mathematicians and computer scientists, it looks at the application of mathematical tools to statistical problems. The first half of the book offers a basic background in numerical analysis that emphasizes issues important to statisticians. The next several chapters cover a broad array of statistical tools, such as maximum likelihood and nonlinear regression. The author also treats the application of numerical tools; numerical integration and random number generation are explained in a unified manner reflecting complementary views of Monte Carlo methods. Each chapter contains exercises that range from simple questions to research problems. Most of the examples are accompanied by demonstration and source code available from the author's website. New in this second edition are demonstrations coded in R, as well as new sections on linear programming and the Nelder–Mead search algorithm.