Scales, stars and numbers: the question of evaluation

By Miranda Prynne, 7 April, 2023
Marking schemes are a recurring source of contention in academic discussions, where the key word is evaluation. Daniel Jutras offers a brief reflection on the art of grading and feedback

There was a little dust-up in the cultural pages of the Montreal daily La Presse+ late last year. Theatre director René Richard Cyr railed against the rating scale for art critics adopted by the online newspaper and denounced a “paternalistic”, “laughable” and “infantilising” system. This was sparked by La Presse+ abandoning its five-star rating grid and replacing it with a decimal scale. Films, novels, plays and albums are now rated from one to 10. Cyr, who was already not particularly fond of the five-star system, reminds us that his job is to “invent worlds where we want to believe that everything is still possible and where sixteen divided by three equals a thousand suns”. Instead, he invites the critics to describe works “using analysis, intelligence and sensitivity, with words, impressions and ideas”. To hell with rating on a scale of one to 10.

This echoes the kind of debate heard at a university departmental assembly. The question of grading schemes is a recurring topic in academic discussions where the key word is evaluation – evaluation of exams, assignments, manuscripts for publication, promotion dossiers, programmes, teaching and so on.

Let’s look only at the evaluation of exams and assignments; the rest would detract from my point.

Cyr is quite right: all grading grids are reductive. But they still have a meaning, which may vary depending on the recipient. So, here’s a first question: what’s the meaning conveyed by the grade? In the academic world, the awarding of a grade, whether it’s a percentage, a letter or an honour, serves two distinct but related purposes. 

A grade is primarily feedback – definitive, monolithic, crude – on the achievement of certain learning objectives. Apart from in some specific contexts, the mark evaluates the result rather than the effort, a nuance that is not always well understood. It can be more or less precise, ranging from a binary statement of success or failure to a percentage scale, with everything in between.

The other purpose of grading is comparison. The mark given places each “performance” on a scale that situates it in relation to others, with a desirable but variable degree of accuracy and objectivity. In an ideal world, the grading grid would allow individuals to situate themselves in relation to the group – useful information in a learning journey – without having their position on the scale shared with others. Nobody likes bad grades: not artists, not restaurants, not students. But bad grades hurt even more when they are used as the basis for decisions made by people other than the one being evaluated: a potential client or viewer, a graduate school admissions committee or a potential employer.

There’s a clear tension between these two aims of evaluation. Grades and their distribution on a curve provide clear, simple and immediately usable information for third parties. As a professor, am I accountable to these third parties? Do I have to worry about how they receive and use this information? Conversely, when it is intended for the person being assessed, the grade alone does not provide sufficiently informative feedback.

As a young professor, I spent a lot of time constructing grading grids that made it possible to distinguish a B from a B- when marking essays. It was a waste of time. Students filed into my office to find out more. The conclusion, not surprisingly, was that feedback in “words, impressions and ideas”, to use Cyr’s words, is more telling when it explains what’s wrong with an essay. But this requires time and resources that aren’t always available, especially if the group consists of dozens of people, all wanting to know exactly why they didn’t do as well as they had hoped.

Over the years, I came to the conclusion that feedback “in words” is more important than marks, though my students didn’t always agree with me. I’m aware that grades have consequences and should not be given carelessly or cavalierly. But I chose to pay greater attention to the close relationship between evaluation and learning by:

  • making sure that I assessed the skills and knowledge explicitly used in my course
  • disclosing my evaluation grid and the relative weight given to each element ahead of time
  • giving each student a paper version of this grid, annotated with the contents of their own examination booklet.

My group sizes varied, from about 15 people in a seminar to close to 200 in a required course. Annotating a grid for each student took many hours and days around Christmas, as well as beautiful days in May. I didn’t please everyone, but it was worth the effort. Through trial and error, I usually – though not always – managed to fulfil my fundamental responsibility of explaining successes and failures to each person I taught.

In responding to Cyr, some argued that a rating out of 10 is intended not for the artists but for all those people looking for a way to choose among the many shows on offer. This, in my opinion, is where the jobs of professor and theatre critic part ways. Teachers should avoid worrying too much about the other “users” of the information conveyed by transcripts. To those of you who devote many hours to reading, evaluating and commenting on papers, theses and other exams, I raise my hat. This will always be the most difficult aspect of your chosen career.

PS: The play Cyr directed was rated 8.5 out of 10.

Daniel Jutras is rector of the University of Montreal.

This is an edited version of a blog, “Stars and numbers: the question of evaluation”, originally published on the University of Montreal website.
