Universal Tests

José Hernández-Orallo

doi:10.1017/9781316594179.017

One [way of calculating the longitude at sea] is by a Watch to keep time exactly. But, by reason of the motion of the Ship, the Variation of Heat and Cold, Wet and Dry, and the Difference of Gravity in different Latitudes, such a watch hath not yet been made.

– Isaac Newton, Letter to Josiah Burchett (1721), remarks to the 1714 Commissioners for the Discovery of the Longitude at Sea, quoted by Sobel (2005, p. 60).

CAN WE DEVISE behavioural tests that are valid for every possible kind of subject, biological or artificial? How can we determine an appropriate interface and adjust to each subject's resolution? To make things more challenging, in some scenarios (e.g., collectives, hybrids, artificial life) we may need to detect the subject first, prior to measuring it. Universal tests would be a fantastic tool for universal psychometrics, not only to evaluate completely anonymous subjects, but also to falsify wrong hypotheses about any general relation between tasks and features. In practical evaluation, however, universal tests would be very inefficient for subjects about which we already have some information, such as humans and animals, and machines for which we have an operational description.

ONE TEST FOR ALL?

In Chapter 6 we discussed the different principles for the measurement of behavioural features for humans, non-human biological systems and AI systems. We argued that a unification of these principles was needed, and we have developed new foundations in the previous chapters. However, the use of a common foundation does not mean that we must necessarily use the same measurement instruments for all the subjects in the machine kingdom. This is similar to time, temperature, mass or any other measurable trait. It is usual to have specialised instruments depending on the context or the range of objects for which the instrument is intended (Thurstone, 1928, p. 547). For instance, we do not use the same kind of thermometer to measure the temperature of a person as to measure the temperature of a volcano. In fact, the medical thermometer would just melt inside the lava – as an IQ test can be gamed by an AI system. However, the mechanisms that both thermometers use are ultimately understood and related (or calibrated) to the physical notion of temperature.


16 - Universal Tests

