The ideal programme evaluation is often characterised as a tightly controlled experiment assisted by carefully chosen compliant principals, properly trained co-operative teachers and large representative samples of willing students, randomly assigned to operationally defined treatments. The task of assessment may be facilitated by the ready availability of valid objective tests of all relevant outcomes, which lend themselves to multivariate statistical analyses of results – all supplemented and enriched by masterful ethnographic studies, whose observations are highly reliable and replete with insight.
In the real world, unfortunately, things are different. Samples are biased or unmatched; school principals are unco-operative; teachers defect or take maternity leave; pupils move out of the district or fall ill; contamination occurs between experimental and control groups; and tests prove too difficult or too easy for students. These and other unexpected stumbling blocks require the resourceful evaluator of a language programme to consider alternative ways of assessing the merits of new programmes, and to make informed judgements about which hallowed principles are essential, which are desirable, what might be feasible under the circumstances, and what is to be avoided at all costs.
In an effort to help disentangle such matters, and to provoke discussion of them, this paper will outline various decision points in typical evaluation exercises, and make suggestions for coping with planned and unplanned contingencies.
Who should undertake the evaluation?
We will assume that the new language programme is at an advanced stage of development and that formative evaluation has already brought about required improvements.