Authentic Assessment and Progress. Keeping it Real.
This post is based on the ideas that I outlined during my workshop at #TLT14 in Southampton. It forms part of the process of rethinking assessment at KS3 now that levels have gone. This is a live discussion at my school and is very much a work in progress.
A good starting point is to revisit the many very good reasons for moving away from levels. A recent TES post by Tim Oates explains this very well.
I’ve explored a lot of these ideas in previous posts:
- The Assessment Uncertainty Principle
- Great Lessons 5: Journeys
- Assessment, Standards and The Bell-Curve.
In replacing levels, we should be seeking to implement a system that tackles some of the problems levels created. Here is a re-cap of some of the problems that I see:
- Levels create the impression that learning follows a linear progress path in equal-sized steps. This is an illusion – though it is widely held as true and enshrined in the levels-of-progress concept.
- Levels suggest precise parallel standards between subject areas within a school – a 5a in History is as good as a 5a in Science – even though almost no work is done in schools to measure this, beyond checking distributions on a bell-curve model.
- In reality, levels and sub-levels have become general bell-curve indicators for a cohort, not statements of absolute attainment – so the detail of what has been learned and understood is largely absent from the discourse between teachers and with parents.
- The moderation needed to ensure that a 5a in English in School X in Birmingham means the same as a 5a in School Y in Exeter doesn’t happen. Again, it is largely an illusion that this level of national standardisation is meaningful.
- It requires precious time and effort to explain how a piece of work can be assessed on a level scale; meaning and detail are lost in the process. Similarly, it takes precious time and effort to explain how the next level might manifest itself in a real piece of work; more detail and meaning are lost. Using levels does not help to explain the next steps in a child’s learning in most situations; it’s far more effective to explain the steps in the context of the work itself.
- Very often, the demand to show progress in incremental steps through the levels forces teachers to make arbitrary decisions and to concoct perverse attainment statements that do not fit the organic nature of their discipline.
A possible solution: Authentic assessment and progress reporting
What is authentic assessment?
In practice, there are just a few different ways to measure performance from which teachers can make deductions about learning:
- Tests. Right and wrong answers or extended answers evaluated for quality. This generates an aggregated score.
- Qualitative evaluation of a product against some criteria – a piece of writing, a painting, a piece of design, a performance. These can generate a wide range of outcomes: marks, scores, broad overall grades or levels. Teachers’ professional judgement is critical.
- Absolute benchmarks: A straightforward assessment that a student can do something – or can’t do it yet. I’d suggest that there is a very limited set of learning goals simple enough to be reduced to a can do/can’t do assessment; in most cases there is a proficiency scale of some kind.
Across the range of disciplines at KS3, different situations in different subjects lend themselves to being assessed using a particular combination of these measures. There is usually an authentic, natural, common-sense mode of assessment that teachers choose with an outcome that fits the intrinsic characteristics of the discipline. My suggestion is that we simply report how students have performed in these assessments, with data in the rawest possible state, without trying to morph the outcomes into a code where the meaning is lost.
Let’s explore an example:
In science, students learn about balancing chemical equations in Year 9. They take a test with several questions of increasing difficulty. Each question is assigned marks based on the number of elements that can be right or wrong. Some or all could be multiple-choice questions. The marking generates a score which indicates the level of a student’s performance. It could be expressed in raw terms – say, a mark out of 30 – but a percentage would also help to make comparisons with other tests.
If consistent tests are used over time, the range of marks for any cohort will tell teachers about the performance of each student in the context of that specific topic. Over time, a series of tests allows teachers to build up a profile of a student’s learning and progress. Some tests might be harder than others but teachers can see this from the pattern of performance of the whole cohort. The more tightly focused each test is on a specific set of concepts, the more precise the information will be about any student’s learning.
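To make that cohort-context idea concrete, here is a minimal sketch in Python. The names, scores and function are entirely invented for illustration – no particular school system is implied – but it shows how a low cohort average flags a hard test, so one student’s dip in raw score isn’t misread as a dip in progress:

```python
from statistics import mean

# Hypothetical records: percentage scores for one class on two
# topic-specific tests. All names and numbers are invented.
cohort_scores = {
    "Equations test": {"Amy": 72, "Ben": 55, "Cal": 81, "Dee": 60},
    "Rates test":     {"Amy": 48, "Ben": 35, "Cal": 58, "Dee": 41},
}

def cohort_context(scores, student):
    """Report each test score alongside the cohort average, so a hard
    test (low average) is visible in the whole cohort's pattern."""
    for test, results in scores.items():
        avg = mean(results.values())
        diff = results[student] - avg
        print(f"{test}: {student} scored {results[student]}% "
              f"(cohort average {avg:.0f}%, {diff:+.0f} relative)")

cohort_context(cohort_scores, "Amy")
# Equations test: Amy scored 72% (cohort average 67%, +5 relative)
# Rates test: Amy scored 48% (cohort average 46%, +2 relative)
```

Amy’s raw score drops from 72% to 48%, but relative to the cohort she is steady – exactly the distinction the raw-data approach preserves and a converted level would obscure.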
Teachers would know that a score of, say, 70% is an exceptional score for a student with a low starting point, representing excellent progress. For a High Starter (to borrow from John Tomsett), 70% might represent progress below the expected level. For both students, the feedback can focus on the details of balancing equations and the wrong answers. This is miles away from the nebulousness of a 6c. At the end of each term or year, the cumulative data from tests would represent a strong basis for a discussion with students and parents and for making an overall statement about attainment and progress in a report.
This will work if the tests are well designed to sample the curriculum and to span the range of likely performance levels. It’s no good if lots of students gain full marks in every test, because that would suggest a ceiling on their potential attainment in that area of the curriculum. The details of all the tests could be shared with parents and students (perhaps online) so that it is clear and transparent. E.g. High Starters should be aiming to achieve at least 80% on the unit tests and in the practical assessment. The tests cover the topic with questions like these…
There is a case for exemplifying standards more explicitly with samples of writing. Not all of science is made up of right and wrong answers; there is always the question of depth:
A: When someone is running they need to pump more oxygen to their muscles and take the carbon dioxide to the lungs so their heart has to beat faster.
B: During exercise, energy is released from respiration in muscle cells as they contract repeatedly. The heart rate increases in order to regulate the supply of oxygen to the cells and the rate at which the waste product carbon dioxide can be expelled via diffusion from the blood into the air via alveoli in the lungs… Etc.
Without the obfuscation of a level ladder, it is possible to illustrate different levels of depth in an extended answer. This may link to the number of marks given in an assessment and could be used as an exemplar for parents and students. It is expected that Middle Starters making excellent progress will be writing answers like Example B by the end of Year 9.
I could make up a similar example for maths. There is likely to be a series of topic-specific tests and, in conjunction with some exemplars of the increasing level of challenge of content areas through the curriculum, this would give all the information needed. In History and Geography, each unit could have specific outcomes described with success criteria for a synoptic assessment, allowing progress to be measured relative to a starting point. Exemplars for written work could be produced, and the students’ books would serve as an organic record of progress for all to see. In Art or DT, success criteria could be used, referenced to some exemplar work for students to benchmark their own work against. Grading or levelling might work here at the impressionistic level that NC Levels were originally designed for – not the basket-case of sub-levelling that we ended up with.
It might be too confusing for parents to engage with 10 very different modes of assessment across the curriculum. (One reason some people hold onto levels is the illusion of simplicity – an opiate for the masses that masks the underlying house of cards.) At KEGS, we devised a generic *, 1, 2, 3 system that was explained in detail for each subject, with specific attainment criteria defined and shared with students and parents. At Highbury Grove I think a similar system could work, but we’d need to add another dimension to account for the broader range of starting points. The principle would be the same: students with starting point X should be aiming to reach standard Y by the end of the year, with the standards defined and exemplified by subject. We haven’t started work on this yet but it is the direction of travel.
Progress will be relatively easy to report, focusing on attainment relative to the starting point and to the progress of the cohort. We’re going to use a simple four-stage code: Excelling, Good Progress, Some Concerns, Poor Progress.
A parent at KS3 could be told that, in Science, a Middle Starter child’s progress level is S (Some Concerns) because the assessments (e.g. a test average of 48%) indicate that progress isn’t yet in line with that expected for a student starting at that point. A similar assessment for a Low Starter might warrant a progress level of G (Good Progress), and for a High Starter it would be P (Poor Progress). The combination of progress and attainment is critical to understanding the full picture, but the progress measure is the most important.
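For illustration only – the thresholds below are invented placeholders, and any real bands would be defined and exemplified subject by subject, as described above – the mapping from starting point and average score to the four-stage code could be as simple as a lookup:

```python
# Invented cut-offs for an average percentage score, per starting point:
# (Excelling, Good Progress, Some Concerns); below the last is Poor Progress.
THRESHOLDS = {
    "High Starter":   (85, 70, 55),
    "Middle Starter": (75, 60, 45),
    "Low Starter":    (65, 45, 30),
}

def progress_code(starting_point, average_score):
    """Map a starting point and average test score to the four-stage code."""
    excelling, good, concerns = THRESHOLDS[starting_point]
    if average_score >= excelling:
        return "E (Excelling)"
    if average_score >= good:
        return "G (Good Progress)"
    if average_score >= concerns:
        return "S (Some Concerns)"
    return "P (Poor Progress)"

# The same 48% average reads differently by starting point,
# matching the example above:
print(progress_code("Middle Starter", 48))  # S (Some Concerns)
print(progress_code("Low Starter", 48))     # G (Good Progress)
print(progress_code("High Starter", 48))    # P (Poor Progress)
```

The point of the sketch is that the code itself is trivial; all the professional judgement sits in setting, exemplifying and moderating the thresholds.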
If I were told my son was Excelling, I wouldn’t necessarily need to know precisely how – I’d trust the teachers to know what they are doing. However, if I needed more information, I’d expect the teacher to say “your son is Excelling because, for his age and starting point, his score of 82% in the science assessment represents excellent progress”. In History, it might be a question of showing me my son’s books or an essay at parents’ evening so I could see the progress (or lack of it) with my own eyes. During lessons I’d expect my son to be informed of his areas for development in some depth; he should know which 18% he got wrong and why. Similarly, he should know where his writing in English needs improvement, based on an authentic assessment that suits the process of assessing English. Levels? Marks out of 20? Approximate GCSE grade? Whatever is most natural and retains the most detail.
(See: Formative use of summative tests.)
Standards and Moderation
An important reinforcement to this approach will be the routine moderation of work between teachers within departments and between schools. If there were a national database of tests and samples of work that exemplified standards for children of different ages, then schools could cross-reference their own standards easily. In the short term this needs to happen through school-to-school collaboration. Teachers in next-door classrooms ought to have a shared understanding of what ‘exceptional work’ might look like for their parallel Year 8 classes. Moderation should create upward pressure; if one school is getting much better work out of the Year 8s who came in with Level 6 in English, then that would lead to a review of standards. Currently, because everyone’s version of a level varies, that discussion is often reduced to an exchange of mutual suspicion about the validity of other people’s assessments. If we ‘keep it real’, that won’t happen. It will just fuel an upward spiral of challenge. That’s the theory, in any case. Let’s see!
As I said, this is a work in progress… and, as ever, I’m more or less thinking aloud.