We combine a new data model, where the random classification is subjected to rather weak
restrictions which in turn are based on the Mammen−Tsybakov [E. Mammen and A.B. Tsybakov,
Ann. Statis. 27 (1999) 1808–1829; A.B. Tsybakov,
Ann. Statis. 32 (2004) 135–166.] small margin conditions,
and the statistical query (SQ) model due to Kearns [M.J. Kearns, J. ACM
45 (1998) 983–1006] to what we refer to as PAC + SQ model. We generalize the class
conditional constant noise (CCCN) model introduced by Decatur [S.E. Decatur, in
ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan
Kaufmann Publishers Inc. San Francisco, CA, USA (1997) 83–91] to the noise model
orthogonal to a set of query functions. We show that every polynomial time PAC + SQ learning algorithm can be
efficiently simulated provided that the random noise rate is orthogonal to the query
functions used by the algorithm given the target concept. Furthermore, we extend the
constant-partition classification noise (CPCN) model due to Decatur [S.E. Decatur, in
ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan
Kaufmann Publishers Inc. San Francisco, CA, USA (1997) 83–91] to what we call the
constant-partition piecewise orthogonal (CPPO) noise model. We show how statistical
queries can be simulated in the CPPO scenario, given the partition is known to the
learner. We show how to practically use PAC +
SQ simulators in the noise model orthogonal to the query space by
presenting two examples from bioinformatics and software engineering. This way, we
demonstrate that our new noise model is realistic.