The government recently announced a significant investment in evidence-informed practice for early years practitioners, through the Early Years Stronger Practice Hubs programme.
So, this is an important time to consider two key questions: what do we actually mean by evidence-informed practice, and why should we adopt this approach?
Back in 2013, Ben Goldacre argued that: ‘By collecting better evidence about what works best, and establishing a culture where this evidence is used as a matter of routine, we can improve outcomes for children, and increase professional independence.’
Let’s look at that through the lens of early years.
How might we decide 'what works'?
The question of what works is a tricky one. An early years setting or a school might try out a new approach and find it works well – but it might not work nearly so well a couple of miles up the road, let alone in another part of the country. As a result, we might conclude that all knowledge about ‘what works’ is essentially local.
But there is a different way to think about this. We should not base our decisions on a single piece of research, because that’s not a reliable guide to decision-making. Instead, we might consider some more reliable options.
Research synthesis and meta-analysis
In this approach, researchers analyse the findings of a large number of high-quality, robust studies. They identify the common features which led to improvement, and then average out the findings of impact across all those studies. These findings are much more likely to be applicable across a range of different contexts: to the setting where I work, and equally to other settings in other places. You can see this approach in the Education Endowment Foundation’s (EEF’s) Early Years Toolkit, which is an easy-to-use guide.
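(For readers who like to see the arithmetic, here is a tiny illustrative sketch in Python – with entirely invented study names and numbers, and not the EEF’s actual method – of how a meta-analysis might pool effect sizes, giving more weight to larger, more precise studies.)

```python
# Toy sketch of pooling effect sizes from several invented studies.
# Each study reports an effect size and a standard error; a common
# fixed-effect approach weights each study by 1 / SE^2, so larger,
# more precise studies count for more in the pooled average.

studies = [
    {"name": "Study A", "effect": 0.30, "se": 0.10},
    {"name": "Study B", "effect": 0.15, "se": 0.05},
    {"name": "Study C", "effect": 0.25, "se": 0.08},
]

weights = [1 / s["se"] ** 2 for s in studies]
pooled = sum(w * s["effect"] for w, s in zip(weights, studies)) / sum(weights)

print(f"Pooled effect size: {pooled:.2f}")  # roughly 0.20 with these invented numbers
```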
Randomised Controlled Trials (RCTs)
An RCT is a carefully designed trial in which settings are randomly allocated either to try out a new approach or intervention, or to carry on working in their normal way – ‘business as usual’. Random allocation means the characteristics of the settings and the children in the two groups are closely balanced. That means we can be confident that any difference in outcomes results from the new approach or intervention.
Imagine a new approach to helping children develop their communication and language seemed to lead to children making excellent progress in one setting.
The approach might appear to work. But without an RCT, we might not be able to tell whether that progress was down to other factors. Maybe there were unusually low numbers of children with SEND and delayed language in the setting, so the positive impact might be more about the characteristics of the children than the quality of the new approach. We can't tell.
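(If a concrete picture helps, here is a small simulated sketch in Python – imaginary children and invented numbers, not a real trial – showing why random allocation matters: when children are assigned to groups at random, a characteristic like delayed language tends to be spread roughly evenly across both groups, so it can't easily explain a difference in outcomes.)

```python
import random

random.seed(1)

# Toy simulation: 200 imaginary children, around 20% with delayed language.
children = [{"delayed_language": random.random() < 0.2} for _ in range(200)]

# Random allocation: shuffle, then split into two groups of 100.
random.shuffle(children)
intervention, control = children[:100], children[100:]

# Because allocation is random, the proportion of children with delayed
# language should be roughly similar in both groups, so it cannot easily
# explain any difference in outcomes between them.
for name, group in [("intervention", intervention), ("control", control)]:
    delayed = sum(c["delayed_language"] for c in group)
    print(f"{name}: {delayed} of {len(group)} children with delayed language")
```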
Promising approaches
While research synthesis is reliable, it doesn’t help us to see if an innovative approach might be worth trying.
RCTs are high-quality, but they are expensive and take a long time to set up.
When we are in a difficult situation – for example, the COVID-19 pandemic – trialling promising approaches can be a sensible way forward.
This builds on existing evidence, but might use a novel way of working. An example from my own experience at East London Research School is the Maths through Picturebooks intervention.
This responded to an urgent issue: there were children in their reception year who had missed a lot of their nursery education and were struggling to understand number and the composition of number. The project was based on high-quality evidence about how children understand number, summarised in the EEF’s guidance report Improving Mathematics in the Early Years and Key Stage 1.
The approach was novel for England, but it drew on high-quality research from Stanford University's DREME Network, which showed the effectiveness of using carefully selected picturebooks as a medium for teaching and learning about numbers.
However, it is important to be cautious about the results of pilots like this.
Generally, the early years practitioners who join in with a pilot will be enthusiasts for the approach. They will be highly motivated and want to make the new approach work. When things work well in pilots like this, researchers sometimes call it a halo effect. It's a form of bias: because we like engaging with certain forms of professional development, or working with certain trainers or organisations, we want the pilot to work. As a result, we put a lot of effort into making sure that's exactly what happens. Or we look for positive results and exaggerate them.
There is no guarantee that the intervention will work as well in a more ordinary setting, without this enthusiasm.
So even when a pilot works out well, the approach still needs to be studied more systematically. It would be unwise to put it into practice in more settings until then.
Are RCTs ethical?
Sometimes, practitioners worry that it isn’t fair to offer a new approach to some children and not others. This is, indeed, an important consideration. Here are some reasons why this concern might be misplaced:
- In the EEF’s RCTs, all the children will receive the new approach if it’s found to work: the 50% who do not receive it at first (the ‘control group’) will receive it later.
- Many new approaches and interventions don’t make a significant difference. So being in the control group doesn’t always have a negative impact. In some cases, the progress of children in the intervention group is actually worse than those in the control group. For example, the EEF's evaluation of Fresh Start found that children allocated to the Fresh Start intervention group made the equivalent of 2 months’ less progress, on average, than children in the control group. New approaches are not necessarily better approaches.
Careful and critical readers of research
It’s important to note that the methodology of a piece of research is not, in itself, a guarantee that it’s useful. Nor is the fact that the research is published in a peer-reviewed journal.
Let’s consider a couple of reasons why. Peer-reviewed journals are an important forum for publishing research. But not everything that’s been peer-reviewed is going to be reliable or robust. For a start, few of us have time to delve into the methodology, follow all the references, and check all the maths and the statistical analysis.
The method, methodology and write-up of a study might be tip-top, but we also need to consider the details of the study itself, such as its baseline data and the conditions under which it was carried out.
For example, a study might be a well-conducted RCT. It might be written up in a journal. But its results might be worthless.
How might this be?
Imagine you see a post on social media saying that parachutes don’t reduce the risk of death or major injury when people jump from planes.
It’s a captivating and surprising headline – so you check the source. It’s the well-respected British Medical Journal (BMJ). Even more convincingly, the findings have emerged from a randomised controlled trial.
So why is this research wrong?
The surprising answer is – it isn’t.
The researchers, led by Robert W Yeh, conducted a high-quality RCT. But what you may have missed is that all the participants jumped just 0.6 metres, from planes that were standing still. Having a parachute pack made no difference.
The researchers are kidding us, but they're making a serious point: we must always be careful and critical readers of research.
Are RCTs always best?
It's also important to note that RCTs are not always appropriate.
Imagine if we had waited for an RCT before we concluded that smoking was dangerous to health.
We’d have needed a trial in which a large sample of volunteers was recruited, with half randomly assigned to smoke for 20 years and the other half required to abstain.
Both the ethics of such a trial and the length of time needed to secure the findings make it completely unsuitable.
That’s why researchers needed to use a range of different approaches to demonstrate the health risks of smoking. These included population studies (e.g. following two distinct, healthy groups for a period of time, one of smokers and one of non-smokers, matched by age, sex, occupation etc).
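(Here is a toy sketch in Python – entirely invented data, not one of the real smoking studies – of the idea behind such a matched comparison: pair up similar people in each group, follow both groups over time, and compare the rates at which the outcome occurs.)

```python
# Toy sketch of a matched cohort comparison (entirely invented data).
# Each smoker is paired with a non-smoker of the same age band and sex,
# both groups are followed over time, and their outcome rates compared.

smokers = [
    {"age_band": "40-49", "sex": "M", "disease": True},
    {"age_band": "40-49", "sex": "F", "disease": False},
    {"age_band": "50-59", "sex": "M", "disease": True},
    {"age_band": "50-59", "sex": "F", "disease": True},
]
non_smokers = [
    {"age_band": "40-49", "sex": "M", "disease": False},
    {"age_band": "40-49", "sex": "F", "disease": False},
    {"age_band": "50-59", "sex": "M", "disease": True},
    {"age_band": "50-59", "sex": "F", "disease": False},
]

def disease_rate(group):
    """Proportion of a group who developed the disease during follow-up."""
    return sum(person["disease"] for person in group) / len(group)

print(f"Smokers:     {disease_rate(smokers):.0%}")      # 75% in this toy data
print(f"Non-smokers: {disease_rate(non_smokers):.0%}")  # 25% in this toy data
```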
RCTs can also be misleading in early years education.
It's worth noting that settings which choose to take part in RCTs, and end up in the control group, are unlikely to be ‘normal’ settings. For a start, they are interested enough in research and professional development to sign up to the trial. That probably means they keep up with lots of other research about effective practice, too.
The comparison is not always as conclusive as we might imagine.
(With thanks to Dylan Wiliam for the two examples above.)
Working with research and evidence is not straightforward. That’s why having a trusted source of evidence, like the Education Endowment Foundation, is a good move. Their teams of researchers check thousands of studies.
Using these careful summaries of high-quality research (syntheses and meta-analyses), and checking new approaches through well-designed RCTs, gives us a reliable basis for decision-making.
This is important, because the decisions we take can make a big difference to the lives of the children we work with - for good, or for ill.
Yet even so, nothing in this field is 100% reliable.
I think we should use the work of the EEF to identify the ‘best bets’ when we make our decisions. But there is no assurance that things that have worked well in many places, will necessarily work as well in every place. As Professor Dylan Wiliam memorably commented, ‘everything works somewhere; nothing works everywhere’.
Early years education is a messy place, where there are many uncertainties and variables.
What about professional judgement and experience?
Evidence-informed practice is an approach which is based on the best-quality evidence, allied to the professional judgement and expertise of early years practitioners.
It’s up to the practitioners on the ground to reflect on their experiences and use their professional judgement to consider what needs most improvement in their setting, and whether new approaches are a good fit for their context.
The more familiar we become with the evidence, the deeper our professional confidence will grow, and the more powerful our reflections will be.
It’s commonplace, for example, for all sorts of people – politicians and media commentators among them – to put forward their views about what we should and shouldn’t be doing in the early years.
Yet it would be unusual for people outside the medical profession to start telling doctors how to perform certain operations, or make choices about drug treatments.
One of the prizes of evidence-informed practice in the early years might be increased professionalism, and wider public confidence that we are making choices for the right reasons.
Influencer-informed practice
If we don’t use evidence as our key guide, what will we use? Should we decide to do something because it’s sold to us persuasively by a consultant? Or because we were inspired by a speaker at an event? These seem like poor approaches to making big decisions. As I commented earlier, these are decisions which can change children's life-chances.
Ben Goldacre refers to the dangers of ‘eminence-based medicine’, where decisions are taken because of someone’s position of power, charisma or authority. We face similar risks in early years.
The main alternative to ‘evidence-informed practice’ right now appears to be ‘influencer-informed practice’. Influencers may create terrific social media posts. They may have many thousands of followers. But if we squint a little, do we find any evidence to support their claims? How can we be sure that the approaches they are promoting are securely based in evidence and likely to work?
If I may take a bit of a liberty and go back to Ben Goldacre, with adapted wording: ‘there is a huge prize waiting to be claimed by early years practitioners. By collecting better evidence about what works best, and establishing a culture where this evidence is used as a matter of routine, we can improve outcomes for children, and increase professional independence’.
PS: In this blog, I have drawn a lot on the work of other people, most notably colleagues at the Education Endowment Foundation, the Research School Network, and the work of Professors Ben Goldacre and Dylan Wiliam. At the time of writing, I lead a Research School which is funded by the EEF.
I am not claiming originality here, but I hope I have been helpful, brief and clear. If there are errors, they’re mine and not the responsibility of those I have thanked above.
A big thank you to Caroline Vollans for subbing this blog and for her help with apostrophe's.
I'm not so sure the EEF is reliable – there have been many critiques. Examples are on Feedback, where the EEF combined studies that have nothing to do with feedback: Ekecrantz (2015), Mannion (2017), Mannion (2020) and Fletcher-Wood (2021) also cast doubt on the EEF's and Hattie's representation of Feedback. Greg Ashman (2018c) also details similar problems with the EEF's top strategy, metacognition, which Ashman says
"appears to be a chimera; a monster stitched together from quite disparate things."
Sundar & Agarwal (2021) analyse the EEF and Hattie and also warn of this:
"If learning strategies included in the meta-analysis are not consistent and logical, then beware! For example, if you find a meta-analysis that groups together "feedback" strategies including teacher praise, computer instruction, oral negative feedback, timing of feedback, and music as reinforcement, does that sound consistent to you?" Then Ashman (2018): "If true randomised controlled trials can generate misleading effect sizes like this, then what monsters wait under the bed of the meta-meta-analysis conducted by Hattie and the EEF?"
Hello George and thank you for taking the time to post this comment. It's important to check the validity of evidence and sources of information.
However, the critiques you are sharing are about the previous version of the Toolkit.
Since then, it has been updated. It is no longer based on a review of reviews, an approach which runs the risk of generating very broad-brush averages.
Instead, the updated Toolkit uses a meta-analysis of single studies, which allows the EEF to run moderator analyses on heterogeneity and to understand much more about the different interventions behind the analysis. A good example is the feedback strand, where, in the strand text, the EEF communicates a different impact for feedback using digital technology in the “Behind the Average” section.
Thanks Julian. It looks like the EEF responded to those critiques – I can see they removed all the meta-analyses they previously referenced. Did they make a public statement about this? It indicates that those critiques about combining studies that have very little to do with feedback in the classroom were correct, and admission of this fundamental change is a big deal. Steve Higgins did some webinars here in Aus in 2018-2019 and defended the EEF and Feedback with that simplistic response that combining apples and oranges is OK if you are looking at fruit. I was very disappointed with that. Hattie dominates here and he has not made those changes. Most DoEs here publish Visible Learning (2009) as the definitive work on education. It's got to the stage where, in QLD, you have to link any education initiative to Visible Learning (2009) to get funding from the DoE.
Interesting piece. I retired in 2021 but this reminds me of conversations with colleagues and my dilemmas as a head teacher; practitioner enthusiasm being such a consistent indicator of any programme’s likely success.
ReplyDeleteI really enjoy reading articles. I thanks for sharing this.