I acknowledge the Ngunnawal people, the traditional owners of the lands we are meeting on today. In so doing, I recognise that the issues we are discussing today have special resonance for First Nations communities. Governments that continually learn and improve will make faster progress at Closing the Gap.
1. A learning machine, not a guessing game
When a German bakery chain wanted to improve sales, it didn’t bring in consultants or introduce a sweeping new business model. Instead, it tried something much simpler: it ran a randomised trial (Friebel et al. 2017).
Some of its 193 stores were offered a modest group bonus for staff. Others weren’t. After a few months, the results were in. The bonus group had increased sales by 3 per cent. For every dollar spent on bonuses, the company gained $3.80 in revenue and $2.10 in operational profit. Encouraged by these findings, the company rolled the program out more broadly. Profit margins rose by more than 60 per cent, which might be the best thing to come out of a bakery since pretzels.
It’s a reminder that in both business and policy, good ideas are important – but better still is knowing whether they work. And that’s what randomised trials offer: the ability to learn what works, what doesn’t, and where public resources will do the most good.
We’ve seen this thinking increasingly embraced in government, too. Across Australia’s public service, we’re embedding a culture of testing and learning – through small‑scale trials, behavioural insights, and rigorous evaluation. From tax compliance nudges to SMS reminders that improve service delivery, we’re building an evidence base for better decisions.
Because being willing to learn isn’t a sign of weakness; it’s a sign of seriousness.
Almost a century ago, the philosopher John Dewey wrote that ‘a problem well put is half‑solved’. Randomised trials help us frame problems clearly. They allow us to compare options fairly. And they help ensure that taxpayer dollars are used not just efficiently, but wisely.
In a world of tight budgets and rising expectations, that kind of disciplined curiosity matters more than ever. As a government, our job isn’t just to deliver services – it’s to keep making them better. And that begins with learning.
Over the next few minutes, I want to share how randomised trials are helping us do exactly that – from small changes that improve service delivery, to better policy design, to the infrastructure we’re building to make learning part of how government does business.
2. What is government productivity – and how do we learn to improve it?
In the private sector, productivity is relatively straightforward: output per unit of input. A delivery company that reduces the cost per parcel is improving its productivity. A call centre that shortens the average handling time without compromising service is doing the same.
In government, the outputs are more complex, and arguably more important. They’re things like higher school completion rates, shorter surgery wait times, fewer people stuck in long‑term unemployment. What we care about is not profit margins, but public value.
So when we talk about government productivity, we’re talking about better outcomes for citizens – achieved with the same, or fewer, public resources.
And just like in the private sector, we improve productivity in government by understanding what works. Not just what sounds plausible, or what’s been done before, but what actually improves results.
That’s where randomised trials come in.
By comparing 2 versions of a program – one that includes a new intervention, and one that doesn’t – we can isolate the effect of that change. It might be an SMS reminder. A redesigned letter. A new digital prompt. Or a pilot coaching service for jobseekers. Some of these interventions work remarkably well. Others don’t. But each trial helps us learn, and over time, build a more effective, more responsive and more productive public sector.
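The logic of comparing 2 versions can be sketched in a few lines. This is a toy simulation, not any agency's actual data: the reporting rates, sample size and effect below are invented purely to show how random assignment isolates an effect.

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical two-arm trial: does an SMS reminder lift on-time reporting?
# The underlying rates below are invented for illustration only.
CONTROL_RATE = 0.60    # no reminder
TREATMENT_RATE = 0.70  # reminder sent

def run_trial(n=10_000):
    """Randomly assign n people to an arm, then compare average outcomes."""
    control, treatment = [], []
    for _ in range(n):
        # Coin-flip assignment is what makes the two groups comparable.
        if random.random() < 0.5:
            control.append(random.random() < CONTROL_RATE)
        else:
            treatment.append(random.random() < TREATMENT_RATE)
    # With random assignment, the difference in means is an unbiased
    # estimate of the causal effect of the reminder.
    return sum(treatment) / len(treatment) - sum(control) / len(control)

print(f"Estimated effect: {run_trial() * 100:.1f} percentage points")
```

With 10,000 participants the estimate lands close to the true 10-point gap; with a much smaller sample it becomes noisy, which is why trial size matters as much as trial design.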
Crucially, these aren’t abstract exercises. They’re grounded in real‑world decisions. Should we send this letter or that one? Should we roll out this new program nationally, or trial it first in 2 regions? Should we allocate resources toward one approach, or a better‑tested alternative?
Every trial is a chance to find out.
And as we accumulate this evidence, we’re not just improving individual programs. We’re improving the system’s ability to learn. The learning machine gets stronger with each iteration. That’s the difference between policy and guesswork. It saves us from reinventing the wheel, only to discover it’s square.
3. Learning from micro‑experiments: the text message trials
Some of the most valuable lessons in government have come not from major reforms, but from small experiments: short messages, subtle nudges, and modest changes that quietly improved the way services are delivered.
Take Services Australia. In one randomised trial, a simple text message was sent to people who had submitted a form to the department. It was a confirmation message, only 3 sentences, letting them know their submission had been received (Department of Prime Minister and Cabinet 2017a).
It was the kind of small gesture most people might ignore, but quietly appreciate: the public service version of a thumbs‑up emoji – understated, reassuring and surprisingly effective. The result? A reduction of 11 percentage points in follow‑up calls to the department.
On average, those who did call waited nearly 2 weeks longer to do so. If rolled out to everyone who sent in a form, this intervention would mean thousands fewer calls to call centres each year. That means freeing up time for staff and reducing wait times for other callers. Sometimes, the most productive thing a system can say is: ‘We’ve got it’.
Another trial tested whether a text message reminder could prompt more income support recipients to report their earnings on time (Department of Prime Minister and Cabinet 2017b). It worked. On‑time reporting increased by 13.5 percentage points. Using the most effective text message reminder nearly halved the rate of payment suspensions.
The study authors estimated that the most effective text message reminder would save 6,000 hours of staff time per year – time that could be directed towards helping those in need. And it happened without changing the underlying policy, just the way the message was delivered.
These are examples of what behavioural economists call low‑friction interventions: changes that are easy to implement, low‑cost to run and quick to test. They’re not about changing people’s values or rewriting legislation. They’re about helping people do the right thing more easily.
They also illustrate the power of randomised trials. Without them, we wouldn’t know whether the message made a difference, or whether the change would have happened anyway. But by comparing outcomes between those who received the message and those who didn’t, we can say with confidence: it worked.
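What lets us say ‘it worked’ with confidence is that the gap between the groups can be tested against chance. A minimal sketch of that check follows; the 11‑point gap echoes the confirmation‑message trial, but the group sizes and call counts are invented for the example.

```python
from math import sqrt

# Illustrative numbers: the 11-point gap mirrors the confirmation-SMS trial,
# but the group sizes and call counts below are invented for the example.
n_control, calls_control = 5_000, 2_000   # 40% rang the department
n_treated, calls_treated = 5_000, 1_450   # 29% rang the department

p_control = calls_control / n_control
p_treated = calls_treated / n_treated
effect = p_control - p_treated            # 0.11, i.e. 11 percentage points

# Standard error of a difference in proportions, and a 95% interval.
se = sqrt(p_control * (1 - p_control) / n_control
          + p_treated * (1 - p_treated) / n_treated)
low, high = effect - 1.96 * se, effect + 1.96 * se

print(f"Effect: {effect:.2f} (95% CI {low:.3f} to {high:.3f})")
```

Because the whole confidence interval sits well clear of zero, a gap this size in groups this large is very unlikely to be luck – which is exactly the assurance a randomised comparison provides.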
Other trials have followed a similar pattern. The Australian Charities and Not‑for‑profits Commission has recently completed a trial to see whether improved messaging could increase on‑time submission of annual information to the regulator. The results of that trial will be released soon, and will add to the growing body of knowledge on how small changes can improve regulatory compliance.
The Australian Taxation Office, in partnership with the Behavioural Economics Team of the Australian Government, also ran a trial involving letters to tax agents (Department of Prime Minister and Cabinet 2018). These letters gently pointed out potential over‑claiming of work‑related deductions. The result: average claims fell by $191 per taxpayer. Across the sample, that translated into more than $2 million in reduced deductions.
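Those two figures imply a third: the report summary quoted here doesn't state the sample size, but $191 per taxpayer adding up to more than $2 million implies the trial covered at least roughly 10,500 taxpayers. A back-of-envelope check:

```python
# Back-of-envelope check on the tax-agent letter figures: if average claims
# fell by $191 per taxpayer and total deductions fell by over $2 million,
# the trial must have covered at least roughly 10,500 taxpayers.
# (The actual sample size isn't quoted here; this is implied arithmetic only.)
fall_per_taxpayer = 191
total_fall = 2_000_000
implied_minimum_sample = total_fall / fall_per_taxpayer

print(f"Implied minimum sample: {implied_minimum_sample:,.0f} taxpayers")
```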
These aren’t just examples of operational efficiency. They’re case studies in how government learns. Each trial helps us discover what works, and then improve what we do. Not in theory, but in practice.
Importantly, these trials also respect the scale and complexity of government. Not every challenge needs a parliamentary inquiry. Some can start with a pilot and a control group. That kind of experimentation allows us to fail safely, adapt quickly and succeed at scale.
And because these trials often focus on small operational changes, they can usually be implemented without legislative change or major budget impact. They sit within the day‑to‑day work of government, embedded in forms, websites, phone scripts and service flows.
This is the productivity dividend of a culture of curiosity. We can make better use of the systems we already have, simply by understanding them more deeply.
And the payoff isn’t just in dollars. It’s in dignity: helping people interact with government in ways that feel smoother, fairer and more respectful of their time.
These are the kinds of lessons we want every agency to discover: what works best for our clients, what delivers value to citizens and how to do more with less – not by working harder, but by learning faster.
4. Learning from failure
When we talk about randomised trials, we often focus on the successes – the well‑designed message that lifted compliance, the small nudge that saved millions. But learning doesn’t only come from what works. Sometimes, our most important lessons come from what doesn’t.
A few years ago, the Behavioural Economics Team of the Australian Government worked with researchers to develop an app aimed at improving university students’ resilience (Department of Prime Minister and Cabinet 2021). The design was based on good evidence: a sense of belonging is known to help students bounce back from setbacks. The app offered tools to support that connection.
But when the results came in, there was no significant difference between students who used the app and those who didn’t.
That trial didn’t produce the hoped‑for outcome – but it still produced something valuable: insight. It showed us where the gap was between theory and practice. And because the trial had been carefully run, the findings were clear and credible. It didn’t move the needle – but unlike some group assignments, it still taught us something.
Importantly, the learning didn’t stop there. The Behavioural Insights Team picked up where the earlier work left off, and tested a different approach – sending short, supportive messages directly to students’ phones (Behavioural Insights Team 2021). That trial was a success. Students who received the messages reported a 7.3 per cent improvement in life satisfaction.
A similar approach was taken in the UK’s Business Basics Programme, which funded 32 productivity improvement projects for small and medium enterprises – 17 of them randomised trials (United Kingdom Department for Business, Energy and Industrial Strategy 2024). Not all succeeded, but even the failures revealed valuable lessons about what didn’t scale. One pilot using AI tools, for instance, failed to improve outcomes, while another offering targeted management training showed real promise. The point was not perfection – it was purposeful experimentation.
This is what a learning system looks like. One trial identifies a dead end. The next finds a path forward. Some lessons wear lab coats. Others wear hazard vests.
In government, this kind of adaptive thinking is especially important. The scale is large. The stakes are high. And the cost of implementing an untested program – or persisting with one that doesn’t work – can be measured in wasted resources and missed opportunities.
So we need to normalise learning from failure. That means running trials with enough integrity to trust the results – even if the results aren’t what we hoped. It means publishing findings regardless of outcome. And it means treating every evaluation as a chance to improve, not just to prove.
A government that learns from failure is a government that gets better. Not by luck. But by design.
5. Learning in policy, not just delivery
So far, we’ve looked at how randomised trials can fine‑tune how government delivers services – from form confirmations to reporting reminders. But their value doesn’t end at the front desk. Trials can also inform the deeper questions of policy design: what services to offer, how to structure them and how to allocate public resources.
That’s the next frontier of learning – not just improving the efficiency of what we do, but making better choices about what we do.
A good example comes from the Department of Employment and Workplace Relations, which has been working with the Australian Centre for Evaluation to test improvements to Workforce Australia Online. One of those trials is exploring whether a voluntary one‑on‑one coaching session improves employment outcomes for jobseekers using the online platform.
The idea was straightforward: perhaps a short, personalised conversation – even within a largely digital system – could deliver better long‑term results. The trial is now wrapping up. And the findings will be released when they’re available.
If the results show that outcomes improved without a significant increase in cost, it could help reshape how we think about designing online services – not just in employment, but across the public sector. And if the results don’t show a strong effect, that too will be valuable. A clear signal about where to focus effort and investment next.
This is the kind of decision governments face all the time: whether to add a new service, revise an existing one or shift funding between programs. Randomised trials give us a disciplined way to test those options before investing at scale.
They can also help us avoid false economies – policies that look efficient on paper but underdeliver in practice. Or conversely, interventions that appear costly but prove cost‑effective once outcomes are measured.
The goal isn’t to replace judgement with data, but to inform it – to build policies on a stronger foundation of evidence, so we’re not just doing what sounds good, but what does good.
When we apply the same learning mindset to policy as we do to service delivery, we strengthen the entire public sector. We become not just better administrators, but better stewards of public trust. That principle is now reflected in the APS’s own values, with stewardship formally recognised as a core responsibility – a reminder that good government is not just about today’s outcomes, but about building capability for tomorrow.
6. Building a system that learns
If we want a government that learns, we need to invest in systems that support learning – not just individual trials, but the infrastructure that helps us share, interpret and build on what we know.
That’s the thinking behind the new evaluation library launched by the Australian Centre for Evaluation (Australian Centre for Evaluation 2025). Hosted on the Analysis and Policy Observatory, it brings together existing evaluation reports from across government in a single, searchable place. It’s open, accessible and built to support transparency – so that knowledge doesn’t sit in filing cabinets, but feeds back into policy design.
This is a significant step. Too often in the past, valuable findings have remained siloed within departments or never published at all. A central evaluation library helps us break down those barriers. It also reinforces a simple principle: if taxpayers funded a policy, taxpayers should be able to see whether it worked.
Around the world, governments and research organisations are investing in better ways to synthesise and share evidence.
The Campbell Collaboration, for instance, brings together policymakers, researchers and practitioners to produce systematic reviews and evidence summaries – helping decision‑makers see what’s known, where gaps remain and which interventions are likely to be most effective (Campbell Collaboration n.d.).
The Evidence Synthesis Infrastructure Collaborative, supported by the Wellcome Trust and others, is exploring how AI can be used to maintain living evidence reviews – continuously updated syntheses of trial results that evolve as new data comes in (Wellcome 2024). Rather than waiting years for a final report, policymakers can draw on timely, high‑quality evidence when it’s needed most.
And in the United Kingdom, the What Works Network continues to bridge the gap between research and practice – commissioning trials, building evidence standards and translating complex findings into practical guidance (United Kingdom Evaluation Taskforce 2024).
These are the kinds of institutions that make learning routine. They don’t just enable better individual decisions. They shape a culture – one that expects evidence, shares results and rewards curiosity.
Because a learning government doesn’t rely on chance discoveries. It builds systems that help us ask better questions, find clearer answers, and keep getting better at delivering for the people we serve.
When these pieces come together – transparency, infrastructure, collaboration – they don’t just improve individual programs. They prepare the ground for something bigger: a system that learns by default, and governs with confidence.
7. A vision for a learning government
The examples I’ve shared today – from simple text messages to coaching trials, from tax nudges to system‑wide evaluation libraries – all point to a larger ambition: building a government that evolves as it governs.
This isn’t about being tentative. It’s about being rigorous. My colleague Zhi Soon MP put it perfectly in his first speech last week: ‘As governments, we need to be humble in recognising what we do and do not know and what we do and do not do well. We must test, learn and adapt’ (Soon 2025).
We already see this ethos emerging across the Australian Public Service. Randomised trials are being used more often. Evaluations are becoming more transparent. And departments are working together to embed learning into everyday practice.
But there’s more we can do.
We can make high‑quality trials and impact evaluations a routine part of how we design and deliver policy. We can compare alternative approaches to the same problem, not just to see what works, but to choose what works best – what delivers the greatest outcomes for the lowest cost.
And as the evidence base grows, we can use synthesis – not just individual findings, but aggregated knowledge – to guide the way forward.
This is a vision of a more productive government. A more transparent government. And, ultimately, a more trustworthy one.
Because when we learn openly, we govern more confidently. When we test our assumptions, we serve people better. And when we embed learning in the way we work – not as an exception, but as a standard – we honour our responsibility not just to the present, but to the future.
That is what it means to be stewards of public trust. And that is what it means to make government a learning machine. And that starts with asking the right questions – even if the answer isn’t always ‘yes, Minister’.
Note: My thanks to officials at the Australian Centre for Evaluation for assistance in preparing these remarks.
References
Australian Centre for Evaluation (2025). The ACE Evaluation Library.
Behavioural Insights Team (2021). Can nudging improve student wellbeing? Results from an RCT in Australia.
Campbell Collaboration (n.d.). Campbell Collaboration: About.
Department of Prime Minister and Cabinet (2017a). Effective use of SMS: improving government confirmation processes.
Department of Prime Minister and Cabinet (2017b). Effective use of SMS: timely reminders to report on time.
Department of Prime Minister and Cabinet (2018). Improving tax compliance: deductions for work related expenses.
Department of Prime Minister and Cabinet (2021). Gather. Grow. Graduate. Using behavioural insights and technology to help students graduate from university.
Friebel G, Heinz M, Krüger M and Zubanov N (2017). ‘Team Incentives and Performance: Evidence from a Retail Chain’, American Economic Review, 107(8):2168–2203.
Soon Z (2025). First Speech to Parliament, Hansard, 23 July 2025.
United Kingdom Department for Business, Energy and Industrial Strategy (2024). Business Basics Programme: Final Report.
United Kingdom Evaluation Taskforce (2024). Guidance: What Works Network.
Wellcome (2024). Evidence Synthesis Infrastructure.