Cognitive Load Theory in Plain English
The most useful piece of cognitive science for teachers, explained without the jargon.
Published 2026-12-26
Cognitive load theory has the unusual distinction of being both one of the most important ideas in educational psychology and one of the most frequently mangled. In schools it tends to arrive as a vague recommendation to 'not overload children,' which is true but not particularly useful.
Here's the actual idea, in plain terms, and what it means for how you teach.
The basic insight
Working memory, the part of memory actively processing information right now, is very small. Estimates vary, but research suggests it can hold only around four to seven distinct items at once, and fewer still when the information is novel or complex.
Long-term memory, by contrast, is practically unlimited. Information stored in long-term memory can be retrieved into working memory as a single 'chunk' rather than as many separate items, effectively expanding working memory's capacity for familiar domains.
The implication: learning depends on getting new information from the environment into long-term memory, and that transfer has to pass through the bottleneck of working memory. If working memory is overloaded, with too much unfamiliar information to process at once, little or nothing reaches long-term memory. The child appears to be learning but isn't retaining.
Cognitive load is the total demand a task places on working memory. Too much of it and learning stalls.
Three types of cognitive load
It helps to distinguish three different sources of mental effort:
**Intrinsic load** is the inherent complexity of the material itself. Multiplying two two-digit numbers is harder than multiplying one-digit numbers. Writing a persuasive argument is harder than writing a simple recount. You can't eliminate intrinsic load — the difficulty is the point — but you can sequence learning to manage it, building simpler components before combining them.
**Extraneous load** is unnecessary mental effort created by poor instructional design. Confusing instructions. A worksheet where the layout itself requires decoding before the child can address the content. A diagram on one page and its explanation on another, requiring children to hold information in working memory while switching between them. Extraneous load wastes cognitive capacity without contributing to learning. It's the type teachers have the most control over.
**Germane load** is the mental effort of actually building and connecting knowledge — making sense of new information and connecting it to what's already known. This is the useful kind of cognitive effort. More germane processing means more learning.
The goal is to minimize extraneous load so that more working memory capacity is available for germane load.
What this looks like in practice
**Worked examples are powerful for novices.** When material is genuinely new, showing children a complete worked example is more efficient than asking them to solve a problem. The reason: discovering the solution to an unfamiliar problem requires searching through possibilities, which consumes huge amounts of working memory without necessarily producing the right connections. A worked example gives the child a complete model to study, and studying is more efficient than searching.
This is counter-intuitive for teachers raised on discovery learning. But the research is clear: for genuine novices, worked examples outperform problem-solving for initial learning. Once students have enough background knowledge, problem-solving becomes more effective — but not before.
**Reduce split attention.** Extraneous load spikes when children have to hold information in working memory while looking for something in the environment. The classic example: a diagram with labels on a separate key, requiring children to repeatedly cross-reference. The fix is to put labels directly on the diagram. This isn't just tidier — it frees cognitive capacity.
The same principle applies to instructions. If children need to refer back to instructions while completing a task, every trip back to find them adds extraneous load. Either simplify the task so that no reference is needed, or make the instructions immediately accessible at the point of use (printed at the top of the worksheet, not on the board at the front).
**Novices and experts need different approaches.** A child who has no background knowledge in an area benefits from explicit instruction: explanation, examples, guided practice, with complexity added gradually. A child who already has strong background knowledge in an area can benefit from more independent exploration — their long-term memory is doing more of the heavy lifting, freeing working memory for genuine inquiry.
The common error is using the same approach for both groups. When a novice is asked to explore independently, they search without a schema, burning cognitive capacity without productive results. When an expert is walked through an explicit explanation of something they already understand, they're not learning — they're waiting.
**Worked examples should fade.** The worked-example effect applies to novices, but the aim is always to build toward independence. As children become more familiar with a type of problem, the scaffolding should reduce: a fully worked example → a partially worked example with gaps to complete → a similar problem to solve independently. In the research literature this is known as the 'guidance fading effect,' and it's one of the most practical things cognitive load theory offers teachers.
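To make the sequence concrete, here is an illustrative sketch using two-digit multiplication. The numbers and layout are our own assumptions, chosen for illustration rather than taken from the research. The empty boxes mark the steps left for the child to complete.

```latex
\begin{align*}
\text{Fully worked:}\quad 23 \times 14 &= (23 \times 10) + (23 \times 4) = 230 + 92 = 322 \\
\text{Partially worked:}\quad 31 \times 12 &= (31 \times 10) + (31 \times \boxed{\phantom{00}}) = 310 + \boxed{\phantom{00}} = \boxed{\phantom{00}} \\
\text{Independent:}\quad 42 \times 13 &= \boxed{\phantom{00}}
\end{align*}
```

Each row keeps the same method and changes only how much of it is supplied, so the child's working memory is spent on the step being practised rather than on decoding a new format.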
**Reduce unnecessary information.** This sounds obvious but cuts against a lot of good-faith teaching practice. Interesting but irrelevant contextual information in a worked example isn't helpful — it's extraneous load. A beautifully designed worksheet with illustrations and decorative borders takes cognitive effort to parse. Complex multimedia presentations can split attention between channels rather than integrating them.
The standard isn't minimalism for its own sake. It's: does this element contribute to the learning, or does it consume working memory without returning anything useful?
The practical bottom line
You cannot teach children to do something genuinely new and complex while also expecting them to manage their resources, respond to behavioural prompts, and process elaborate instructions. Something will give, and it is usually the learning.
When a lesson doesn't work, cognitive load problems are often the explanation. Too many elements introduced at once. Instructions that required too much working memory to decode. A task that assumed more prior knowledge than children actually had. These aren't motivation failures or attention problems. They're design failures.
The fix is almost always to sequence more carefully, reduce unnecessary complexity, and give more explicit scaffolding before asking for independence. This is slower in the short term and much more effective over time.
Going deeper
On cognitive science and teaching
Books we'd recommend on the topics raised in this article.
Cognitive science in the classroom
- Powerful Teaching: Unleash the Science of Learning, by Pooja K. Agarwal and Patrice M. Bain
- Make It Stick: The Science of Successful Learning, by Peter C. Brown, Henry L. Roediger III, and Mark A. McDaniel
- Why Don't Students Like School?: A Cognitive Scientist Answers Questions About How the Mind Works, by Daniel T. Willingham
- Cognitive Load Theory, by John Sweller, Paul Ayres, and Slava Kalyuga