This final chapter summarizes the material in this book in a series of concluding statements, presented as lessons learned. These lessons can be viewed as guidelines for research practice.
As the field of migration studies evolves in the digital age, big data analytics emerge as a potential game-changer, promising unprecedented granularity, timeliness, and dynamism in understanding migration patterns. However, the epistemic value added by this data explosion remains an open question. This paper critically appraises the claim, investigating the extent to which big data augments, rather than merely replicates, traditional data insights in migration studies. Through a rigorous literature review of empirical research, complemented by a conceptual analysis, we aim to map out the methodological shifts and intellectual advancements brought forth by big data. The potential scientific impact of this study extends into the heart of the discipline, providing critical illumination on the actual knowledge contribution of big data to migration studies. This, in turn, delivers a clarified roadmap for navigating the intersections of data science, migration research, and policymaking.
This chapter is dedicated to the memory of Sue Atkins, the Grande Dame of lexicography, who passed away in 2021. In a prologue we argue that she must be seen on a par with other visionaries and their visions, such as Paul Dirac in mathematics or Beethoven in music. We review the last half century through the eyes of Sue Atkins. In the process, insights of other luminaries come into the picture, including those of Patrick Hanks, Michael Rundell, Adam Kilgarriff, John Sinclair, and Charles Fillmore. This material serves as background to start thinking out of the box about the future of dictionaries. About fifty oppositions are presented, in which the past is contrasted with the future, divided into five subsections: the dictionary-making process, supporting tools and concepts, the appearance of the dictionary, facts about the dictionary, and the image of the dictionary. Moving from the future of dictionaries to the future of lexicographers, the argument is made that dictionary makers need to join forces with the Big Data companies, a move that, by its nature, brings us to the US and thus Americans, including Gregory Grefenstette, Erin McKean, Laurence Urdang, and Sidney I. Landau. In an epilogue, the presentation’s methodology is defined as being “a fact-based extrapolation of the future” and includes good advice from Steve Jobs.
This is the first of a two-part paper. We formulate a data-driven method for constructing finite-volume discretizations of an arbitrary dynamical system's underlying Liouville/Fokker–Planck equation. The method allows for flexibility in partitioning state space, generalizes to function spaces, applies to arbitrarily long sequences of time-series data, is robust to noise, and quantifies uncertainty with respect to finite-sample effects. After applying the method, one is left with Markov states (cell centres) and a random matrix approximation to the generator. When used in tandem, they emulate the statistics of the underlying system. We illustrate the method on the Lorenz equations (a three-dimensional ordinary differential equation), saving a fluid dynamical application for Part 2 (Souza, J. Fluid Mech., vol. 997, 2024, A2).
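The core idea of cell-partition (Ulam-type) methods like this can be sketched briefly: assign each time-series sample to its nearest cell centre (Markov state), count one-step transitions between cells to obtain a row-stochastic matrix, and approximate the generator by a finite difference. The sketch below is illustrative only, not the authors' implementation; the function names and the crude generator formula Q ≈ (P − I)/Δt are assumptions.

```python
import numpy as np

def estimate_transition_matrix(series, centers):
    """Assign each sample to its nearest cell centre (Markov state),
    then count one-step transitions to estimate a row-stochastic matrix."""
    # labels[t] = index of the cell containing series[t]
    d = np.linalg.norm(series[:, None, :] - centers[None, :, :], axis=-1)
    labels = d.argmin(axis=1)
    n = len(centers)
    counts = np.zeros((n, n))
    for a, b in zip(labels[:-1], labels[1:]):
        counts[a, b] += 1.0
    # normalise rows; cells never visited fall back to self-transitions
    rows = counts.sum(axis=1, keepdims=True)
    P = np.where(rows > 0, counts / np.where(rows > 0, rows, 1.0), np.eye(n))
    return P

def approximate_generator(P, dt):
    """Crude finite-difference approximation Q ~ (P - I) / dt."""
    return (P - np.eye(len(P))) / dt
```

Each row of the estimated matrix sums to one (a probability distribution over next cells), and correspondingly each row of the approximate generator sums to zero.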
This is the second part of a two-part paper. We apply the methodology of the first paper (Souza, J. Fluid Mech., vol. 997, 2024, A1) to construct a data-driven finite-volume discretization of the Liouville/Fokker–Planck equation of a high-dimensional dynamical system, i.e. the compressible Euler equations with gravity and rotation evolved on a thin spherical shell. We show that the method recovers a subset of the statistical properties of the underlying system, steady-state distributions of observables and autocorrelations of particular observables, as well as revealing the global Koopman modes of the system. We employ two different strategies for the partitioning of a high-dimensional state space, and explore their consequences.
This paper explores the dynamic interplay between advanced technological developments in AI and Big Data and the sustained relevance of theoretical frameworks in scientific inquiry. It questions whether the abundance of data in the AI era reduces the necessity for theory or, conversely, enhances its importance. Arguing for a synergistic approach, the paper emphasizes the need for integrating computational capabilities with theoretical insight to uncover deeper truths within extensive datasets. The discussion extends into computational social science, where elements from sociology, psychology, and economics converge. The application of these interdisciplinary theories in the context of AI is critically examined, highlighting the need for methodological diversity and addressing the ethical implications of AI-driven research. The paper concludes by identifying future trends and challenges in AI and computational social science, offering a call to action for the scientific community, policymakers, and society. Positioned at the intersection of AI, data science, and social theory, this paper illuminates the complexities of our digital era and inspires a re-evaluation of the methodologies and ethics guiding our pursuit of knowledge.
The Conclusion chapter reiterates the book’s approach, focus and main points. It reminds the reader that the book has concentrated on local, provincial, peripatetic and otherwise relatively marginal sites of scientific activity and shown how a wide variety of spaces were constituted and reconfigured as meteorological observatories. The conclusion reiterates the point that nineteenth-century meteorological observatories, and indeed the very idea of observatory meteorology, were under constant scrutiny. The conclusion interrogates four crucial conditions of these observatory experiments: the significance of geographical particularity in justifications of observatory operations; the sustainability of coordinated observatory networks at a distance; the ability to manage, manipulate and interpret large datasets; and the potential public value of meteorology as it was prosecuted in observatory settings. Finally, the chapter considers the use of historic weather data in recent attempts by climate scientists to reconstruct past climates and extreme weather events.
Second language (L2) learners need to acquire large vocabularies to approach native-like proficiency. Many controlled experiments have investigated the factors facilitating and hindering word learning; however, few studies have validated these findings in real-world learning scenarios. We use data from the language learning app Lingvist to explore how L2 word learning is affected by valence (positivity/negativity) and concreteness of target words and their linguistic contexts. We found that valence, but not concreteness, affects learning. Users learned positive and negative words better than neutral ones. Moreover, positive words are learned best in positive contexts and negative words in more negative contexts. Word and context valence effects are strongest on the learner’s second encounter with the target word and diminish across subsequent encounters. These findings provide support for theories of embodied cognition and the lexical quality hypothesis and point to the linguistic factors that make learning words, and by extension languages, faster.
Personal independence payment (PIP) is a benefit that covers additional daily living costs people may incur from a long-term health condition or disability. Little is known about PIP receipt among people who access mental health services, the factors associated with it, or how it has changed over time. Individual-level data linking healthcare records with administrative records on benefits receipt have been non-existent in the UK.
Aims
To explore how PIP receipt varies over time, including PIP type, and its association with sociodemographic and diagnostic patient characteristics among people who access mental health services.
Method
A data-set was established by linking electronic mental health records from the South London and Maudsley NHS Foundation Trust with administrative records from the Department for Work and Pensions.
Results
Of 143 714 working-age patients, 37 120 (25.8%) had received PIP between 2013 and 2019, with PIP receipt steadily increasing over time. Two in three patients (63.2%) had received both the daily living and mobility component. PIP receipt increased with age. Those in more deprived areas were more likely to receive PIP. The likelihood of PIP receipt varied by ethnicity. Patients diagnosed with a severe mental illness had 1.48 odds (95% CI 1.42–1.53) of having received PIP, compared with those with a different psychiatric diagnosis.
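Results like the "1.48 odds (95% CI 1.42–1.53)" above come from regression models, but the underlying arithmetic of an odds ratio and its Wald confidence interval can be shown on a simple 2×2 table. The sketch below is a generic illustration with made-up counts, not the study's data or model.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio for a 2x2 table:
        exposed:   outcome = a, no outcome = b
        unexposed: outcome = c, no outcome = d
    with a 95% Wald confidence interval computed on the log scale."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Hypothetical counts for illustration only (not from the paper):
or_, lo, hi = odds_ratio_with_ci(400, 600, 250, 550)
```

Because the interval is symmetric on the log scale, the point estimate always sits between the two bounds, and an interval excluding 1.0 corresponds to a conventionally significant association.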
Conclusions
One in four people who accessed mental health services had received PIP, with higher levels seen among those most likely in need, as indicated by a severe mental illness diagnosis. Future research using this data-set could explore the average duration of PIP receipt in people who access mental health services, and re-assessment patterns by psychiatric diagnosis.
High-dimensional dynamical systems projected onto a lower-dimensional manifold cease to be deterministic and are best described by probability distributions in the projected state space. Their equations of motion map onto an evolution operator with a deterministic component, describing the projected dynamics, and a stochastic one representing the neglected dimensions. This is illustrated with data-driven models for a moderate-Reynolds-number turbulent channel. It is shown that, for projections in which the deterministic component is dominant, relatively ‘physics-free’ stochastic Markovian models can be constructed that mimic many of the statistics of the real flow, even for fairly crude operator approximations, and this is related to general properties of Markov chains. Deterministic models converge to steady states, but the simplified stochastic models can be used to suggest what is essential to the flow and what is not.
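The claim that even crude Markovian operator approximations reproduce long-run statistics rests on a basic property of ergodic Markov chains: any initial distribution converges to a unique stationary distribution, from which steady-state averages of observables follow. A minimal illustration, with a made-up 3-state transition matrix (not derived from the channel-flow data):

```python
import numpy as np

# A made-up 3-state row-stochastic transition matrix.
P = np.array([[0.80, 0.15, 0.05],
              [0.10, 0.80, 0.10],
              [0.05, 0.25, 0.70]])

# Power iteration: repeated application of P drives any initial
# distribution to the stationary distribution pi, with pi = pi @ P.
pi = np.array([1.0, 0.0, 0.0])
for _ in range(500):
    pi = pi @ P

# Long-run average of an observable f defined on the states.
f = np.array([0.0, 1.0, 2.0])
mean_f = pi @ f
```

This is why "physics-free" models can still mimic statistics: once the transition probabilities are roughly right, the stationary averages are largely insensitive to the details of the operator approximation.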
The serotonin 4 receptor (5-HT4R) is a promising target for the treatment of depression. Highly selective 5-HT4R agonists, such as prucalopride, have antidepressant-like and procognitive effects in preclinical models, but their clinical effects are not yet established.
Aims
To determine whether prucalopride (a 5-HT4R agonist and licensed treatment for constipation) is associated with reduced incidence of depression in individuals with no past history of mental illness, compared with anti-constipation agents with no effect on the central nervous system.
Method
Using anonymised routinely collected data from a large-scale USA electronic health records network, we conducted an emulated target trial comparing depression incidence over 1 year in individuals without prior diagnoses of major mental illness, who initiated treatment with prucalopride versus two alternative anti-constipation agents that act by different mechanisms (linaclotide and lubiprostone). Cohorts were matched for 121 covariates capturing sociodemographic factors, and historical and/or concurrent comorbidities and medications. The primary outcome was a first diagnosis of major depressive disorder (ICD-10 code F32) within 1 year of the index date. Robustness of the results to changes in model and population specification was tested. Secondary outcomes included a first diagnosis of six other neuropsychiatric disorders.
Results
Treatment with prucalopride was associated with significantly lower incidence of depression in the following year compared with linaclotide (hazard ratio 0.87, 95% CI 0.76–0.99; P = 0.038; n = 8572 in each matched cohort) and lubiprostone (hazard ratio 0.79, 95% CI 0.69–0.91; P < 0.001; n = 8281). Significantly lower risks of all mood disorders and psychosis were also observed. Results were similar across robustness analyses.
Conclusions
These findings support preclinical data and suggest a role for 5-HT4R agonists as novel agents in the prevention of major depression. These findings should stimulate randomised controlled trials to confirm whether these agents can serve as a novel class of antidepressant in a clinical setting.
Public health data available for research are booming with the expansion of Big Data. This reshapes the data sources for DOHaD enquiries while offering ample opportunities to advance epidemiological modelling within the DOHaD framework. However, Big Data also raises a plethora of methodological challenges related to accurately characterising population health trajectories and biological mechanisms, within heterogeneous and dynamic sociodemographic contexts, and a fast-moving technological landscape. In this chapter, we explore the methodological challenges of research into the causal mechanisms of the transgenerational transfer of disease risks that characterise the DOHaD research landscape and consider these challenges in the light of novel technologies within artificial intelligence (AI) and Big Data. Such technologies could push further the collating of multidimensional data, including electronic health records and tissue banks, to offer new insights. While such methodological and technological innovations may drive clearer and reproducible evidence within DOHaD research, as we argue, many challenges remain, including data quality, interpretability, generalisability, and ethics.
This chapter deals with how public policy can steer AI by shaping the use of big data, one of the key inputs required for AI. Essentially, public policy can steer AI by putting conditions and limitations on data. But data itself can help improve public policy – also in the area of economic policymaking. Hence, this chapter touches on the future potential of economic policy improvements through AI. More specifically, we discuss under what conditions the availability of large data sets can support and enhance public policy effectiveness – including in the use of AI – along two main directions. First, we analyze how big data can help existing policy measures to improve their effectiveness; second, we discuss how the availability of big data can suggest new, not yet implemented, policy solutions that can improve upon existing ones. The key message of this chapter is that the desirability of big data and AI to enhance policymaking depends on the goal of public authorities, and on aspects such as the cost of data collection and storage and the complexity and importance of the policy issue.
In this chapter, we describe the development of AI since World War II, noting various AI “winters” and tracing the current boom in AI back to around 2006/2007. We provide various metrics describing the nature of this AI boom. We then provide a summary and discussion of the salient research relevant to the economics of AI and outline some recent theoretical advances.
In this chapter, we take the production function enriched with AI abilities from Chapter 4 and apply it to study the implications of progress in AI for growth and inequality. The crucial finding we discuss in this chapter is that understanding the nature of AI as narrow ML and its effect on key macroeconomic outcomes depends on having appropriate assumptions in growth models. In particular, we discuss the appropriateness of assuming, as most standard endogenous growth models today do, that economies are supply driven. If they are not supply driven, then demand constraints, which can arise from the diffusion of AI, may restrict growth. Through this, we show why expectations that AI will lead to “explosive” economic growth are unlikely to materialize. We show that by considering the nature of AI as specific (and not general) AI and making appropriate assumptions that reflect the digital AI economy better, economic outcomes may be characterized by slow growth, rising inequality, and fairly full employment – conditions that rather well describe economies in the West.
The relationship between opioid use and the incidence of psychiatric disorders remains unclear.
Aims
This study examined the association between the incidence of psychiatric disorders and opioid use.
Method
Data for this population-based cohort study were obtained from the National Health Insurance Service of South Korea. The study included all adult patients who received opioids in 2016. The control group comprised individuals who did not receive opioids in 2016, and were selected using a 1:1 stratified random sampling procedure. Patients with a history of psychiatric disorders diagnosed in 2016 were excluded. The primary end-point was the diagnosis of psychiatric disorders, evaluated from 1 January 2017 to 31 December 2021. Psychiatric disorders included schizophrenia, mood disorders, anxiety and others.
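The 1:1 stratified random sampling used to build the control group has a simple shape: pool the control candidates by stratum (e.g. age band and sex), then draw, without replacement, one control per case from the case's stratum. The sketch below is a generic illustration of that procedure; the function and variable names are hypothetical, not from the study.

```python
import random
from collections import defaultdict

def stratified_1to1(cases, candidates, stratum_of, seed=0):
    """For each case, sample one control without replacement from the
    candidates sharing that case's stratum (1:1 stratified sampling)."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for person in candidates:
        by_stratum[stratum_of(person)].append(person)
    controls = []
    for person in cases:
        pool = by_stratum[stratum_of(person)]
        # skip strata whose candidate pool has been exhausted
        if pool:
            controls.append(pool.pop(rng.randrange(len(pool))))
    return controls
```

Sampling within strata guarantees that the control group matches the case group's distribution over the stratifying variables, provided each stratum has enough candidates.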
Results
The analysis included 3 505 982 participants. Opioids were prescribed to 1 455 829 (41.5%) of these participants in 2016. Specifically, 1 187 453 (33.9%) individuals received opioids for 1–89 days, whereas 268 376 (7.7%) received opioids for ≥90 days. In the multivariable Cox regression model, those who received opioids had a 13% higher incidence of psychiatric disorder than those who did not (hazard ratio 1.13; 95% CI 1.13–1.14). Furthermore, both those prescribed opioids for 1–89 days and for ≥90 days had 13% (hazard ratio 1.13, 95% CI 1.12–1.14) and 17% (hazard ratio 1.17, 95% CI 1.16–1.18) higher incidences of psychiatric disorders, respectively, compared with those who did not receive opioids.
Conclusions
This study revealed that opioid use was associated with an increased incidence of psychiatric disorders. The association was significant for both short- and long-term opioid use.
With the emergence of big data and artificial intelligence, the feasibility of central planning has again become a popular topic. The essential feature of the planned economy is the use of systematic and institutional force to negate entrepreneurship and deprive individuals of the freedom to choose, especially the freedom to start a business and innovate. Can big data revive the planned economy? The essence of this question is: Can big data displace entrepreneurship? The impossibility of a big data-based planned economy is demonstrated from five perspectives (i.e. the nature of knowledge, the nature of entrepreneurial decisions, the distinction between risk and uncertainty, the importance of ideas, and the evolutionary view). In other words, big data cannot replace entrepreneurship. The false belief that central planning is possible with big data is extremely naïve.
The question of how to balance free data flows and national policy objectives, especially data privacy and security, is key to advancing the benefits of the digital economy. After establishing that new digital technologies have further integrated physical and digital activities, and thus, more and more of our social interactions are being sensed and datafied, Chapter 6 argues that innovative regulatory approaches are needed to respond to the impact of big data analytics on existing privacy and cybersecurity regimes. At the crossroads, where multistakeholderism meets multilateralism, the roles of the public and private sectors should be reconfigured for a datafied world. Looking to the future, rapid technological developments and market changes call for further public–private convergence in data governance, allowing both public authorities and private actors to jointly reshape the norms of cross-border data flows. Under such an umbrella, the appropriate role of multilateral, state-based norm-setting in Internet governance includes the oversight of the balance between the free flow of data and other legitimate public policies, as well as engagement in the coordination of international standards.
The socioeconomic role of guanxi networks among individuals has been widely documented, yet macro-level analysis has been sparse in empirical research. This research fills that gap by presenting the first nationally representative evidence illustrating the connection between regional guanxi culture and population mobility among cities in China, with a particular focus on instrumental guanxi culture. To quantify guanxi culture, we employ online search indices related to gift giving, a measure which is challenging to capture through traditional survey data. Using matched prefecture-level data spanning from 2011 to 2019, the panel model reveals a strong negative correlation between a city's instrumental guanxi culture and inbound migration, while sentimental guanxi culture exhibits a positive correlation with inbound mobility. This research not only adds to the existing theories by exploring the macro-level effects of both instrumental and sentimental guanxi practices but also introduces an innovative method for quantifying guanxi culture through big data analysis.
The brain can be represented as a network, with nodes as brain regions and edges as region-to-region connections. Nodes with the most connections (hubs) are central to efficient brain function. Current findings on structural differences in Major Depressive Disorder (MDD) identified using network approaches remain inconsistent, potentially due to small sample sizes. It is still uncertain at what level of the connectome hierarchy differences may exist, and whether they are concentrated in hubs, disrupting fundamental brain connectivity.
Methods
We utilized two large cohorts, UK Biobank (UKB, N = 5104) and Generation Scotland (GS, N = 725), to investigate MDD case–control differences in brain network properties. Network analysis was done across four hierarchical levels: (1) global, (2) tier (nodes grouped into four tiers based on degree) and rich club (between-hub connections), (3) nodal, and (4) connection.
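Two of the measures named in the Methods can be made concrete with a small sketch: node degree (used for hub identification and tiering) and global efficiency (the mean inverse shortest-path length between node pairs). The pure-Python illustration below uses a toy star graph; it is a generic illustration of the measures, not the authors' pipeline.

```python
from collections import deque

def degrees(adj):
    """Node degree: the number of connections per node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def global_efficiency(adj):
    """Mean of 1/d(u, v) over all ordered node pairs, with shortest-path
    lengths d found by breadth-first search (unweighted graph)."""
    nodes = list(adj)
    total, pairs = 0.0, 0
    for src in nodes:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        for v in nodes:
            if v != src:
                pairs += 1
                if v in dist:          # unreachable pairs contribute 0
                    total += 1.0 / dist[v]
    return total / pairs

# Toy 4-node star graph: "hub" connects to every other node.
adj = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
```

On this star graph the hub has degree 3 while the leaves have degree 1, and the global efficiency is 0.75, since leaf-to-leaf pairs sit at distance 2; lower efficiency, as reported for MDD cases, means longer average communication paths.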
Results
In UKB, reductions in network efficiency were observed in MDD cases globally (d = −0.076, pFDR = 0.033), across all tiers (d = −0.069 to −0.079, pFDR = 0.020), and in hubs (d = −0.080 to −0.113, pFDR = 0.013–0.035). No differences in rich club organization and region-to-region connections were identified. The effect sizes and direction for these associations were generally consistent in GS, albeit not significant in our lower-N replication sample.
Conclusion
Our results suggest that the brain's fundamental rich club structure is similar in MDD cases and controls, but subtle topological differences exist across the brain. Consistent with recent large-scale neuroimaging studies, our results offer a connectomic perspective at a similar scale and support the idea that minimal differences exist between MDD cases and controls.