The t-test is a workhorse of statistical analysis in HCI. There are many myths about how robust it is to deviations from normality and other assumptions. However, when faced with practical data, particularly data coming from usability studies, the claims of robustness do not stand up. This chapter re-evaluates the t-test as a test for an effect on the location of data. This leads to considering robust measures of location, such as trimmed or Winsorized means, and the associated Yuen–Welch test as a robust alternative to the traditional t-test.
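As a minimal sketch of the robust measures of location mentioned above, the following uses SciPy's `trim_mean` and `mstats.winsorize` on a small, hypothetical set of task-completion times (the data and the 20% trimming level are illustrative choices, not taken from the chapter):

```python
import numpy as np
from scipy import stats

# Hypothetical task-completion times (seconds) with one extreme value,
# as might arise in a small usability study.
times = np.array([12.0, 14.0, 15.0, 15.0, 16.0, 17.0, 18.0, 95.0])

mean = times.mean()                      # ordinary mean, pulled up by 95.0
tmean = stats.trim_mean(times, 0.2)      # 20% trimmed mean: drop the lowest
                                         # and highest 20% before averaging
wins = stats.mstats.winsorize(times, limits=0.2)  # clamp extremes instead
wmean = wins.mean()

print(mean, tmean, wmean)
```

Here the ordinary mean is pulled up to 25.25 by the single extreme value, while the trimmed mean (about 15.83) and Winsorized mean (15.875) stay close to the bulk of the data.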
Non-parametric tests, in particular rank-based tests, are often proposed as robust alternatives to parametric tests like t-tests when the assumptions of parametric tests are violated. However, non-parametric tests have their own assumptions which, when overlooked, can lead to misinterpretation and unsound conclusions. This chapter explores these problems and differentiates between the more and the less robust non-parametric tests. Modern, more robust non-parametric tests are suggested as replacements for the less robust ones.
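To illustrate the difference between an older rank-based test and one modern alternative, the sketch below runs SciPy's Mann–Whitney U and Brunner–Munzel tests on hypothetical ordinal ratings (the choice of the Brunner–Munzel test here is my own illustrative example; the chapter may propose different alternatives):

```python
from scipy import stats

# Hypothetical ratings from two interface conditions (ordinal data).
a = [3, 4, 4, 5, 5, 5, 6, 7]
b = [1, 2, 2, 3, 3, 4, 4, 5]

# Classic rank-based test: Mann-Whitney U. Interpreting it as a test of
# medians requires extra assumptions about the two distributions' shapes.
u = stats.mannwhitneyu(a, b, alternative="two-sided")

# The Brunner-Munzel test is one modern alternative that does not assume
# equal variances between the groups.
bm = stats.brunnermunzel(a, b)

print(u.pvalue, bm.pvalue)
```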
Traditional statistical testing, sometimes called null hypothesis significance testing (NHST), and the use of p-values have come under strong criticism. This chapter looks at the social pressures around achieving significance in statistics that have led to the problems of NHST. It also discusses, though, how the framework of severe testing provides a way to understand NHST as a means of uniting experiments, statistics and evidence for research ideas. When viewed this way, NHST can still be an important approach to data analysis in HCI.
The focus of statistical tests on significance can lead researchers to desperately seek significance, particularly when an experiment has 'failed'. Using the framework of severe testing, this chapter makes clear the problem of seeking significance at any cost and the resulting weakening of results through over-testing or fishing for significance. The chapter proposes some rules to guide the researcher to explore data thoroughly without going too far in pursuit of significance.
Analysis of variance (ANOVA) is a family of tests widely used in HCI, but these tests are not as robust as those who use them often claim. This chapter looks at exactly what ANOVAs test and, therefore, what makes a suitable robust alternative to ANOVA when its assumptions are not met.
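One way to see the issue concretely: classical one-way ANOVA assumes normality and equal variances, and rank-based tests such as Kruskal–Wallis are commonly reached for as alternatives (though, as the chapter on non-parametric tests notes, these carry assumptions of their own). A small sketch with simulated, unequal-variance groups (the data and the choice of Kruskal–Wallis are illustrative, not the chapter's specific recommendation):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Three hypothetical groups with unequal variances (simulated data).
g1 = rng.normal(10, 1, 30)
g2 = rng.normal(10, 5, 30)
g3 = rng.normal(12, 5, 30)

# Classic one-way ANOVA assumes normality and equal variances.
f = stats.f_oneway(g1, g2, g3)

# Kruskal-Wallis is one rank-based alternative often used when those
# assumptions fail, though it has its own assumptions.
k = stats.kruskal(g1, g2, g3)

print(f.pvalue, k.pvalue)
```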
This chapter discusses how statistics support scientific practice by providing evidence for new ideas despite the natural variation we see between people, systems and contexts. In particular, the frameworks of severe testing and new experimentalism are used to show how experiments can add to knowledge in HCI, even in the absence of strong theories of interaction.
Likert items are widely used in HCI research, both as a convenient measure on their own and combined in questionnaires to give a unified instrument. Researchers often have questions about how best to format Likert items, and this chapter looks at the most common issues: the number of response options, whether to have a midpoint and how to label the options. The chapter gives clear recommendations based on state-of-the-art research on these topics.
Correlations are important for seeing connections between different facets of people and their experiences with systems. However, considering correlation coefficients alone can be misleading and, in particular, this chapter discusses how outliers and clustering can distort the interpretation of a correlation. It also raises a note of caution for the many other methods that implicitly rely on correlation.
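A short sketch of how a single outlier can distort a correlation coefficient, using hypothetical data: ten points with essentially no relationship plus one extreme point. Pearson's r is inflated by the outlier, while a rank-based coefficient such as Spearman's rho is less affected (the data are invented for illustration):

```python
import numpy as np
from scipy import stats

# Ten points with essentially no relationship, plus one extreme point.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50.0])
y = np.array([5, 3, 6, 4, 5, 6, 4, 5, 3, 6, 40.0])

r_all, _ = stats.pearsonr(x, y)             # inflated by the outlier
r_core, _ = stats.pearsonr(x[:-1], y[:-1])  # near zero without it
rho, _ = stats.spearmanr(x, y)              # rank-based, less affected

print(r_all, r_core, rho)
```

With the outlier included, Pearson's r exceeds 0.9; without it, r is close to zero.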
Outliers are a problem for statistical analysis as they can have a disproportionate effect on means and statistical tests that rely on means. This chapter looks first at how to identify outliers and then, from thinking about what might cause an outlier, how best to analyse data when there are outliers.
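One common way to identify outliers, sketched below, is a rule based on the median absolute deviation (MAD); this is a standard robust rule, not necessarily the specific method the chapter recommends, and the data are hypothetical:

```python
import numpy as np

def mad_outliers(x, k=3.0):
    """Flag points more than k robust SDs from the median, where the
    robust SD is 1.4826 * MAD (the scaling that makes the MAD consistent
    with the standard deviation for normal data)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return np.abs(x - med) > k * 1.4826 * mad

# Hypothetical data: seven typical values and one extreme one.
data = [12.0, 14.0, 15.0, 15.0, 16.0, 17.0, 18.0, 95.0]
print(mad_outliers(data))  # only the final value is flagged
```

Unlike a rule based on the mean and standard deviation, this rule is not itself dragged around by the very outliers it is trying to detect.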
A common question of researchers learning statistics is which test to use. However, this greatly depends on the research question and what experiments have been or might be done. It is not possible to give a simple answer to this question. Instead, this chapter provides three principles that can help guide researchers to devise the right experiment and to choose the right test to address their particular research question. These three principles are articulation, simplicity and honesty.
Likert items and questionnaires are widely used in HCI, particularly to measure user experience. However, there is some confusion over which test is right for analysing data arising from these instruments. Furthermore, this book has proposed several more modern alternatives to traditional statistical tests, but there is little evidence as to whether they are better in the context of this particular sort of data. This chapter therefore reports on several simulation studies comparing the variety of tests that can be used to analyse Likert item and questionnaire data. The results suggest that this sort of data best reveals dominance effects and, therefore, that tests of dominance are the most suitable, and most robust, tests to use.
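A dominance effect can be summarised by the probability of superiority: the probability that a randomly chosen observation from one condition exceeds one from the other, with ties split. The helper below is an illustrative sketch of that measure on hypothetical 5-point Likert responses (the function name and data are my own, not from the book):

```python
import numpy as np

def prob_superiority(x, y):
    """Estimate p(X > Y) + 0.5 * p(X == Y): the probability that a random
    observation from x exceeds one from y, with ties counted as half.
    A value of 0.5 means neither group dominates."""
    x, y = np.asarray(x), np.asarray(y)
    greater = (x[:, None] > y[None, :]).mean()  # fraction of pairs x > y
    ties = (x[:, None] == y[None, :]).mean()    # fraction of tied pairs
    return greater + 0.5 * ties

# Hypothetical 5-point Likert responses from two conditions.
a = [4, 5, 3, 4, 5, 4]
b = [3, 2, 4, 3, 3, 2]
print(prob_superiority(a, b))
```

Here the estimate is 32/36, or about 0.89, indicating that condition `a` strongly dominates condition `b`.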
Bayesian statistics are often presented as a better, modern alternative to the Frequentist approaches centred on NHST and the resulting obsession with statistical significance. This chapter outlines the basic ideas of Frequentist and Bayesian statistics. It raises critiques of the Frequentist approach but also points out constraints on the Bayesian approach that are often omitted or overlooked. In particular, the chapter discusses how both Bayesian and Frequentist approaches rely on a move from statistical hypotheses to substantive hypotheses that cannot be justified by the statistics alone. Instead, both approaches can lead to sound knowledge through careful data analysis tied to the experiments that generate the data.