Increasingly sophisticated and precise molecular genetic tools are being applied to mice to study the cellular mechanisms underlying higher brain functions, including learning and memory. However, several studies have produced unclear or conflicting results. One reason is that performance in the behavioural tests used to assess learning and memory is influenced by various non-cognitive phenomena and can thus easily be affected by mutations through mechanisms unrelated to memory function. To demonstrate this for the Morris swimming navigation test, one of the most widely used paradigms for assessing memory and hippocampal function, we conducted a principal component analysis on data from 3003 mice tested using a standardized protocol. In addition, we present a meta-analysis showing that genetic background and environment alone produce sufficient variation to span the range of most, if not all, behavioural variables and can thus easily mask or mimic mutation effects if genetic studies are not designed properly. We suggest that the chance of obtaining useful results is maximized if behavioural deficits are differentiated by combining complementary behavioural protocols and by analysing multiple complementary parameters in each of them. Mutation effects must be contrasted statistically against the influences of genetic background and environment. In many situations, this is most efficiently achieved if (i) mutations are backcrossed to and maintained in one or (preferably) two well-characterized, commonly available inbred strains and (ii) mutant and wild-type littermates are analysed on a hybrid or mixed genetic background, that is, in F1 or F2 generations derived from the inbred stocks.
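As a rough illustration of the kind of principal component analysis mentioned above, the following minimal sketch runs a PCA on a small synthetic data set. Everything in it is an assumption made for illustration: the variable names (escape_latency, time_near_wall and so on), the two latent factors, the sample size and the simulated values are hypothetical and are not the variables, data or procedure of the study. The sketch only shows how a non-cognitive factor (here, a simulated tendency to stay near the wall) can load on the same component as a classical learning measure such as escape latency.

```python
import numpy as np


def pca(data):
    """PCA via SVD of the z-scored data matrix (rows = animals, columns = variables)."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of total variance per component
    scores = u * s                   # coordinates of each animal on the components
    return explained, vt, scores     # rows of vt are the component loadings


# Synthetic data only: two hypothetical latent factors drive the measured variables.
rng = np.random.default_rng(0)
n_mice = 300
wall_hugging = rng.normal(size=n_mice)    # assumed non-cognitive factor
spatial_memory = rng.normal(size=n_mice)  # assumed cognitive factor


def noise():
    return rng.normal(scale=0.5, size=n_mice)


variables = {
    "escape_latency": wall_hugging - 0.5 * spatial_memory + noise(),
    "path_length": wall_hugging - 0.5 * spatial_memory + noise(),
    "time_near_wall": wall_hugging + noise(),
    "target_quadrant_time": spatial_memory - 0.5 * wall_hugging + noise(),
    "swim_speed": rng.normal(size=n_mice),
}
names = list(variables)
data = np.column_stack([variables[name] for name in names])

explained, components, scores = pca(data)

for i, fraction in enumerate(explained, start=1):
    print(f"PC{i}: {fraction:.1%} of variance")
for name, loading in zip(names, components[0]):
    print(f"loading on PC1, {name}: {loading:+.2f}")
```

Z-scoring before the decomposition keeps variables measured in different units (seconds, metres, percentages) from dominating the components purely because of their scale; whether that choice matches the original analysis is not stated in the text.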