International Assessment of Higher Education Learning Outcomes (AHELO)

The OECD is undertaking a Feasibility Study for the International Assessment of Higher Education Learning Outcomes (AHELO).

The outline of the programme, published here, is as follows:

The OECD Assessment of Higher Education Learning Outcomes (AHELO) is a ground-breaking initiative to assess learning outcomes on an international scale by creating measures that would be valid for all cultures and languages. Between ten and thirty-thousand higher education students in over ten different countries will take part in a feasibility study to determine the bounds of this ambitious project, with an eye to the possible creation of a full-scale AHELO upon its completion.

The 21st Century is witnessing the rapid transformation of higher education. More students than ever before enter higher education and a growing number study abroad. The job market demands new skills and adaptability, and HEIs (“Higher Education Institutions”, which include universities, polytechnic schools and colleges) struggle to hold their own in a fiercely competitive marketplace. Ministers at the Athens Conference agreed that OECD countries needed to take a further step by making higher education not only more available but of better quality, and that current assessment methods were not fully adequate to meet these changes. An alternative had to be found. AHELO is the result.

There are, however, some real problems with this approach. Let’s look first at the elements of the study:

The factors affecting higher education are woven so tightly together that they must first be teased apart before an accurate assessment can be made. The AHELO feasibility study thus explores four complementary strands.

The four strands are:

    – generic skills
    – discipline-specific strands in engineering and economics
    – learning in context: physical and organisational characteristics; education-related behaviours and practices, including “student-faculty interaction, academic challenge, emphasis on applied work”; psycho-social and cultural attributes; behavioural and attitudinal outcomes
    – value-added (fraught with difficulty)

All this seems worthy and laudable, particularly the desire to move away from the reductionism of international league tables in favour of a more rounded view focused on teaching and learning. However, the approach is fundamentally flawed in its core assumption that learning outcomes can be assessed in a meaningful and comparable way, and indeed that doing so is desirable in higher education. One of the reasons that the factors affecting HE are so closely woven together is their interdependence and inseparability. The approach to assessing learning outcomes which seems to underpin the study has its origins in the Learning By Objectives movement in the USA in the last century: it failed there as a means of assuring standards of education and does not offer a way forward here. Seeking to make outcomes explicit, judge the standards of those outcomes and then compare them is misguided.

This is because explicitness about such outcomes cannot, in itself, convince us that those outcomes are being achieved, or that they are correct or even worthwhile. Even if we were able to describe the standards embodied by such outcomes satisfactorily (a questionable assumption), this could in no way be taken as assurance that such standards were being achieved, or indeed that anyone fully understood what was meant by such descriptors. There is no necessary correlation between description and understanding – rather, this would represent an extended and complicated version of a naming fallacy.

An extract from my book (with apologies for the self-referencing), ‘Dangerous Medicine: Problems with Assuring Standards and Quality in UK Higher Education’ (pp158-9), reinforces this:

Commenting on assessment in US education, Stake highlights the failure of large-scale, mandatory, externally imposed assessment in schools in the USA to improve standards. He argues that the consequences of this assessment regime need to be more fully evaluated in order better to inform policy, but the lessons for the UK are instructive. Glass, pursuing a similar theme, criticises the ‘nonchalance’ of ‘experts’ in dealing with the issue of standards, particularly in relation to those concerned, such as Mager, with the setting of behavioural objectives (from which the origins of the UK competence movement can be traced) and observes that the ‘language of performance standards is pseudoquantification, a meaningless application of numbers to a question not prepared for quantitative analysis’. He further examines the evolution of ‘criterion-referenced testing’, in which he describes the meaning of ‘criterion’ as a ‘case study in confusion and corruption of meaning’. Glass takes particular exception to the use of cut-off scores to differentiate performance where, ultimately, a decision on whether ‘to ‘pass’ 30% vs. 80% is judgmental, capricious, and essentially unexamined’, i.e. totally arbitrary.

It is important to note the culturally specific origins of these ideas, which were developed in the United States in the 1930s; adaptations of Tyler’s approach became extremely influential there in the 1960s, with the country desperately seeking technological advance and therefore open to an industrially-oriented and rational model which, in providing specified and measurable behavioural objectives, was inevitably attractive to federal and state funders. Although the circumstances are rather different in a post-millennial UK, the HE sector nevertheless appears to be moving towards adopting a new version of a 70-year-old model, a bastardised interpretation of which failed in another country 30 years ago. There are many other problems associated with the learning by objectives approach, but it is worth noting Stake, in his retrospective on his earlier paper ‘The Countenance of Educational Evaluation’, admitting, in a slightly apologetic tone, his error in stating in the paper ‘that evaluators could improve their judgements of quality by identifying congruence between intent and outcome’. As Stake acknowledges, all this does is assist with description and understanding; such congruence says nothing about the merit of the course, the programme or the individual student’s learning being evaluated. As Norris observes, objectives-based evaluation inevitably leads to an over-valuing of measurable tasks and assumes that values are relatively unimportant.

So, I would suggest that this is really the wrong approach to be taking.


Stake, R (July 1998), ‘Some Comments on Assessment in US Education’, Education Policy Analysis Archives, 6(14) http://www.olam.ed.asu/epaa
Glass, G V (1978), ‘Standards and Criteria’, Journal of Educational Measurement, 15(4), pp237-261.
Stake, R E (1991), ‘Retrospective on the Countenance of Educational Evaluation’ in McLaughlin, M W and Phillips, D C (1991), Evaluation and Education at Quarter Century: Ninetieth Yearbook of the National Society for the Study of Education, pp67-88.
Stake, R E (1967), ‘The Countenance of Educational Evaluation’, Teachers College Record, 68(7), pp52-69.
Norris, N (1990), Understanding Educational Evaluation, London: Kogan Page.