1 February 2024

Address to the Human Technology Institute Shaping Our Future Symposium, University of Technology, Sydney

From chance to change: leveraging randomised trials and data science for policy success

I acknowledge the Gadigal people of the Eora Nation, and all First Nations people present today. Thank you for the opportunity to speak at the Human Technology Institute’s Shaping Our Future Symposium. I share the Institute’s goal of ‘building a future that applies human values to new technology’, and thank you for your willingness to partner and engage with government as part of the myGov advisory group and on issues such as AI regulation.

Introduction

Each year thousands of patients miss their hospital appointments.

It costs money – contributes to backlogs and delays – and means that appointments cannot be allocated to others in need.

Some 15 per cent of outpatient appointments at St Vincent’s Hospital – just down the road in Darlinghurst – used to be missed each year, despite patients being sent reminders.

St Vincent’s estimated that each missed appointment cost at least $125.

This could add up to $500,000 a year despite the hospital sending patients SMS reminders of their upcoming appointments.

Money that could have been spent treating other patients.
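As a rough check on those figures, the short sketch below divides the quoted annual cost by the quoted cost per missed appointment. The function and variable names are illustrative only, and the result is simply what the two quoted numbers imply, not a figure reported by the hospital.

```python
# Figures quoted in the speech.
COST_PER_MISSED_APPOINTMENT = 125  # dollars, the hospital's "at least" estimate
ANNUAL_COST = 500_000              # dollars, the "up to" annual figure


def implied_missed_appointments(annual_cost: int, cost_each: int) -> int:
    """Return the number of missed appointments implied by the two costs."""
    return annual_cost // cost_each


missed = implied_missed_appointments(ANNUAL_COST, COST_PER_MISSED_APPOINTMENT)
print(f"Implied missed appointments per year: {missed:,}")
```

On these numbers, the annual cost corresponds to several thousand missed appointments a year.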

In 2015, the UK’s Behavioural Insights Team found that SMS reminders which highlight the specific cost of a missed appointment effectively reduced hospital no‑shows by almost 3 percentage points (Behavioural Insights Team, 2015).

The UK trial found that what you say, and how you say it, makes a difference.

In 2016, the NSW Behavioural Insights Unit partnered with St Vincent’s to put the UK findings to the test in the Australian context.

They ran two randomised trials to work out if changing their existing text messages could make a difference for attendance rates at St Vincent’s (NSW Behavioural Insights Unit, 2016).

Text messages were sent to nearly 7,500 patients covering about 65 per cent of St Vincent’s outpatient appointments over 13 months.

In the first trial, seven new SMS messages were sent to patients while a control group received the hospital’s standard reminder texts.

In the second trial the two most effective texts from the first trial were sent out to patients.

The messages told patients what it would cost the hospital if they did not turn up for their appointment and that the money could have been used to help others.

Based on the trial outcomes the hospital adopted new text messaging techniques and reduced no‑shows by 19 per cent.
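A reduction quoted in per cent is not the same thing as a reduction in percentage points. The sketch below shows the difference, assuming purely for illustration that the 19 per cent is a relative reduction applied to the 15 per cent no-show rate mentioned earlier. The speech does not state the baseline for the 19 per cent figure, so both that pairing and the names used here are assumptions.

```python
BASELINE_NO_SHOW_RATE = 0.15  # 15 per cent of appointments missed (from the speech)
RELATIVE_REDUCTION = 0.19     # 19 per cent, ASSUMED here to be a relative reduction


def apply_relative_reduction(rate: float, reduction: float) -> float:
    """Return the rate left after cutting it by a relative proportion."""
    return rate * (1.0 - reduction)


new_rate = apply_relative_reduction(BASELINE_NO_SHOW_RATE, RELATIVE_REDUCTION)
# Convert the absolute change in the rate to percentage points.
point_drop = (BASELINE_NO_SHOW_RATE - new_rate) * 100
print(f"New no-show rate: {new_rate:.2%}")
print(f"Drop in percentage points: {point_drop:.2f}")
```

Under that assumption, a 19 per cent relative reduction is a fall of a little under 3 percentage points in the no-show rate.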

The trial also helped the hospital to better understand people and their contexts by analysing whose behaviour was most affected by the messages.

St Vincent’s approach has been adopted at other hospitals with success, and demonstrates the value of rigorously testing an idea and committing to using it at scale when the approach works (NSW Behavioural Insights Unit, 2021b).

Why randomised trials?

We know randomised trials can support good decision making.

They have been shown over many decades to be one of the best ways of determining whether a program is effective, if it needs modification, or if it should be dropped altogether.

Advances in data science also help governments and businesses to make good decisions.

High quality data monitoring can be used to build feedback loops that enable organisations to improve decision making through compelling evidence about how policies and programs are impacting on outcomes on the ground.

And experiments and trials can surprise us by revealing where interventions are not as effective as we had hoped.

They enable us to test new interventions against what would have happened if we had changed nothing. Randomised trials help us to understand causation and not just correlation.
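To make the idea concrete, here is a minimal simulation sketch of a two-group randomised trial. Everything in it is hypothetical: the outcome scale, the size of the effect, and the function name are invented for illustration and are not drawn from any trial described in this speech. Because a coin toss decides who receives the intervention, the control group stands in for what would have happened if nothing had changed, and the difference in average outcomes estimates the causal effect.

```python
import random
import statistics


def run_trial(n_people: int, true_effect: float, seed: int = 0) -> float:
    """Simulate a two-group randomised trial and return the estimated effect."""
    rng = random.Random(seed)
    treatment, control = [], []
    for _ in range(n_people):
        # Hypothetical outcome this person would have with no intervention.
        baseline = rng.gauss(50.0, 10.0)
        # Chance alone, not the person's characteristics, decides the group.
        if rng.random() < 0.5:
            treatment.append(baseline + true_effect)
        else:
            control.append(baseline)
    # The difference in group averages estimates the causal effect.
    return statistics.mean(treatment) - statistics.mean(control)


estimate = run_trial(n_people=20_000, true_effect=2.0)
print(f"Estimated effect: {estimate:.2f} (true effect in the simulation: 2.00)")
```

With a large enough sample the estimate lands close to the true effect built into the simulation. Setting `true_effect=0.0` gives an estimate close to zero, which is what a null result looks like.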

Importantly we now have a range of great examples of randomised trials leading to better policies across a range of domains.

An electrifying example

Have you struggled to make sense of a needlessly complicated energy bill?

Enabling consumers to make better decisions about the products they purchase is good for their hip pockets and important for the broader economy.

But it is not always easy to motivate people to change even when it is in their interests.

Most consumers will not engage with their electricity companies to find a better deal that suits their individual needs unless they are moving house or facing other significant life events.

This could be because consumers themselves are time poor, overloaded with information from multiple sources, or struggling to keep up with changes in technology, systems, and processes.

Or they simply do not understand what their energy provider is trying to tell them.

Effective communication makes a huge difference.

To this end the Australian Energy Regulator introduced mechanisms to increase consumer engagement in the energy market (Behavioural Insights Team, n.d.).

Many of these mechanisms have been tested as online randomised trials, including work focusing on how energy bills are communicated to consumers.

The Behavioural Economics Team of the Australian Government conducted a series of six online trials of consumer comprehension, testing different things such as:

  • the length of the bill,
  • the complexity of the content,
  • including information about how the bill was calculated, and
  • information about whether the consumer is on an offer that suits their needs (Behavioural Economics Team of the Australian Government, 2021b).

The results of these online trials, as well as previous randomised trials in this area, informed the regulator’s approach to issuing of consumer energy bills, including via the Better Bills Guideline (Behavioural Economics Team of the Australian Government, 2021b).

The approach has been one of testing and applying the lessons learnt over time and has led to steady improvements in the way energy bills are presented to consumers.

Randomised trials such as these should and could be routinely used to test new and existing programs across all levels of government.

With the right design randomised trials can be cheaper and simpler than often supposed.

Positive results can lead to improvements in processes, regulation, and outcomes, and they can show where further investment is warranted.

But, as we know, randomised trials can also be informative for decision‑making even when they deliver surprising results.

Understanding deeply

The fact is, sometimes when you conduct a trial you will get a null result. This might be a disappointing outcome for those who have invested in the program, but the findings are vital for decision making.

With around one in five students likely to leave university without a degree, the Behavioural Economics Team, with funding from the Department of Social Services, conducted research to understand how students, and particularly disadvantaged students, bounce back from setbacks. The research pointed to feelings of belonging as an important contributor to student resilience (Behavioural Economics Team 2021a).

The research team used these behavioural insights to build a gamified app called Grok – which means to understand something deeply, a term coined by sci‑fi writer Robert A. Heinlein (Behavioural Economics Team 2021a, page 10).

More than 4,000 students downloaded the Grok app which featured an interactive Zen Garden and activities to gather friends and grow resilience skills.

But did it work?

A randomised trial of students from the University of Newcastle and Western Sydney University found ‘no difference in key outcomes between those who had access to Grok and those who did not’.

In this case, testing the app produced evidence that it did not achieve what was hoped for (noting that the trial coincided with the beginning of the COVID‑19 pandemic, which limited in‑person social activities).

In light of these results, the researchers took a step back and reconsidered what other approaches could be used to support students’ resilience.

They ultimately decided that ‘a multi‑pronged approach, reaching more students through multiple avenues, might help to capture more at‑risk students before non‑completion’.

Findings in translation

Sometimes, we have strong expectations about what will work. But we still need to test ideas to confirm whether they do, because we can get surprising results that are counter to our predictions.

Take efforts to increase participation in the Adult Migrant English Program as an example (Behavioural Economics Team 2022).

The Department of Home Affairs runs the free language program which aims to increase social connectedness and improve employment outcomes.

Although there are about 50,000 migrants enrolled in the program at any one time, many migrants who are eligible for the program do not participate.

With the aim of increasing uptake, the Behavioural Economics Team partnered with Home Affairs to run two randomised trials in 2021 and 2022.

First, they tested whether sending a letter and email in English or translated to someone’s home language would lead to more program registrations.

Second, they tested text messages sent to participants who had left the program early, again both in English and translated to the participants’ home language. And they tested sending text messages from Home Affairs and from the program provider.

Based on the user‑testing process, researchers hypothesised that there would be higher levels of registration for those who received the translated version of the message, compared to the English version.

But, surprisingly, translating communications into someone’s home language did not increase engagement with the Adult Migrant English Program.

What this example demonstrates is that rigorous randomised trials always have something to teach us.

We should always be prepared to put the idea to the test to see if it works in practice.

Trials help us decide which programs and policies should get the green light, which need to be modified, and which to axe.

Robust trials give you the evidence that you need to make these calls with confidence.

Investing in capabilities

Knowing that experiments can be helpful is not enough on its own. Government also needs the capability and data access to deliver great trials.

The OECD ranks Australia highly on regulation, data availability and accessibility (Thodey, 2019).

We enjoy above average citizen confidence and satisfaction in public services.

But there is plenty of room for improvement.

In 2018 an independent panel was asked to examine the capability, culture and operating model of the Australian Public Service and identify ways to guide and accelerate reforms to make sure it was ready for the changes transforming the Australian economy and society.

The final report became known as the Thodey Review, after David Thodey who led the independent panel.

The review reported that the public service approach to evaluation was ‘piecemeal in both scope and quality’ and that this was ‘a significant barrier to evidence‑based policy making’.

It reported that in‑house evaluation capabilities had diminished, and the service had ‘lost its analytical capability’.

The Thodey Review recommended that a culture of evaluation and learning from experience, underpinned by evidence‑based policy, should be embedded in the public service.

It proposed that a central evaluation function be established to drive a service‑wide approach to evaluation, uphold minimum evaluation standards, and provide guidance and support for agencies on best‑practice approaches.

The Australian Government has made substantial investments to grow the public sector’s evaluation and data science capability.

Randomised trials and rigorous evaluations are being explored for a range of topics and contexts, from education to the environment, from crime to health.

Australian Centre for Evaluation

In 2023 we established the Australian Centre for Evaluation within the Treasury.

The centre aims to provide leadership and make rigorous evaluation the norm for policy development.

Over time, it will improve the volume, quality, and impact of evaluations across the public service.

It will champion high quality impact evaluation and partner with other government agencies to initiate a small number of high‑profile evaluations each year.

It will promote the use of evaluations and improve evaluation capabilities, practices, and culture across government.

It will put evidence at the heart of policy design and decision‑making.

To achieve this the Australian Centre for Evaluation is building a network of partnerships across the public sector.

It currently manages a fast‑growing Community of Practice, with over 650 members.

It is working to increase the sector’s capability with in‑person and online training modules for high‑quality impact evaluation.

But we know a single unit in the vast national public service cannot do it all.

The evaluation centre is supported by a network of units across departments and agencies.

These include the Behavioural Economics Team of the Australian Government and the Office of Impact Analysis in the Department of the Prime Minister and Cabinet.

The goal is to incorporate evidence in decision making so it becomes the systematic, routine way of working.

Eleanor Williams, the Managing Director of the Australian Centre for Evaluation, has joined us here today, and will be taking part in a panel discussion to share her insights.

Partnerships with government departments

Internal and external partnerships will be crucial to improving evaluation across government.

This includes partnerships across and within Australian Government departments to test the effectiveness of policies and programs and build capability.

One such partnership, between the Australian Centre for Evaluation and the Department of Employment and Workplace Relations, is testing changes to online employment services (Leigh and Burke, 2023).

Employment services are a significant investment for government and affect many people – 4.6 per cent of the Australian population aged from 16 to 64 receive some form of unemployment support (AIHW, 2023).

A series of five randomised trials will be conducted looking at various aspects of online employment services.

They will test variations of time spent in online services, improvements to communication methods, and support and tools for clients.

They will look at whether these changes improve employment outcomes.

The evidence generated will help improve and adapt online service delivery to meet the needs of the people using it.

Importantly, all these trials are subject to a robust ethical framework, consistent with the National Statement on Ethical Conduct in Human Research (National Statement, 2023).

The trial outputs will inform the government’s response to the House Select Committee’s inquiry into Workforce Australia Employment Services (Select Committee, 2023).

Building on this model, the Australian Centre for Evaluation is now establishing operational partnerships with other large government departments, including the Department of Health and Aged Care, to roll out further robust impact evaluations.

Partnerships with academia

The second form of partnerships for the Australian Centre for Evaluation is with organisations and individuals external to the public service, including academics.

We know that a great deal of expertise in impact evaluation, AI and data science sits outside of the public service.

Many people in this room have significant expertise in new and novel methods for considering causation and in using AI and machine learning to aid this work.

It is crucial that we build bridges across sectors and learn from each other to ensure our research has the impact we hope for.

The goal is to use the best knowledge generated in the public, private, and academic sectors to improve the lives of Australians.

For example, we know that there are new and promising approaches, such as adaptive trials, that have potential applications for government policy. The Australian Centre for Evaluation is exploring how these kinds of innovative approaches can best be used in its work.

The centre’s first Impact Evaluation Showcase will be held in the middle of this year.

It will bring together policy officers, evaluators, and academics to discuss the benefits and challenges of incorporating high‑quality evidence into public policy.

The intention is to identify great examples of experiments and evidence‑based policy making, learn from one another, and better equip the public sector to move towards this leading practice.

Data driven decision‑making

I also want to briefly reflect on the Australian Government’s other substantial moves towards using large‑scale integrated data and AI to better respond to the challenges we face.

Beyond the promise of great experiments, we recognise that there is enormous scope to use government data assets to provide useful information to help shape policy setting.

In recent years, the Australian Bureau of Statistics has developed large scale integrated datasets including its Person Level Integrated Data Asset (PLIDA) which brings together information on health, education, government payments, income and taxation, employment, and population demographics (including the Census) over time.

This presents the opportunity to better understand how people travel between services and also to track the intended and unintended outcomes of policy changes.

By November 2023, there were 204 active projects using this integrated data.

These included projects that aim to:

  • support the agriculture sector with better labour demand forecasts for on‑farm workers
  • explore the effects of health and socio‑economic factors driving poor early development outcomes for children, and to identify policy interventions or protective factors that can improve these outcomes
  • gain policy insights from investigating the elevated levels of mental health disorders reported in recipients of income support and students, and
  • produce a stronger evidence base for decisions about disaster mitigation and recovery investments, with the aim to reduce future disaster impacts through better informed policies

Importantly all projects using the integrated data for policy analysis, research and statistical purposes go through a rigorous assessment and approval process, and must meet several safety and ethical thresholds.

These examples demonstrate powerful uses of integrated, large‑scale data assets using traditional research methods.

But over time – as generative artificial intelligence develops and matures, and government builds its capabilities to use these tools – they will present new and innovative opportunities for government.

Government is conscious of the rapid evolution of artificial intelligence tools and the growing demand for guidance to help assess the risks in their uses.

It is great that people are coming together at events like today’s to discuss these exciting new technologies and their ability to improve efficiency and enhance public services.

Conclusion

I began with examples of how randomised trials have provided new insights into what works.

At the heart of a randomised trial – in medicine or public policy – is chance. When people are allocated to the treatment or control based on luck, then we know that any observed differences must be due to the impact of the treatment.

We’re often told by our parents ‘don’t leave it to chance’. But by deliberately using chance in the structure of an evaluation, we set ourselves up to succeed.

We’re not hoping for dumb luck. We’re using luck to determine causal impacts – just as pharmaceutical manufacturers do when testing whether a new treatment helps patients.

From the perspective of the trial participant, chance decides which group they fall into.

From the standpoint of the researcher, all those chance allocations add up to an approach that is as rigorous as possible.

Making good public policy can be difficult.

We need to raise the bar by making sure claims about a program’s effectiveness are based on quality evidence.

We need to be working at the intersection of technology, policy, and people.

We need to better use data and technology to track our progress towards safe, fair and inclusive outcomes.

We know that we can achieve this by bringing together expertise from government, academia and industry.

We can connect people, resources, and opportunities to increase the benefits that rigorous evidence and data science can deliver.

By deploying chance in the service of policy change, we can shape the world for the better.

References

AIHW (Australian Institute of Health and Welfare), Unemployment payments, web article, September 2023, https://www.aihw.gov.au/reports/australias‑welfare/unemployment‑payments.

Behavioural Economics Team of the Australian Government. (2021a). Gather. Grow. Educate. Using behavioural insights and technology to help students graduate from university. https://behaviouraleconomics.pmc.gov.au/sites/default/files/projects/beta‑report‑grow‑gather‑graduate‑3‑june‑2021.pdf

Behavioural Economics Team of the Australian Government. (2021b). Improving energy bills: Final report. A report prepared for the Australian Energy Regulator. https://behaviouraleconomics.pmc.gov.au/sites/default/files/projects/final‑report‑improving‑energy‑bills.pdf

Behavioural Economics Team of the Australian Government. (2022). To translate or not to translate: Increasing participation in the Adult Migrant English Program. https://behaviouraleconomics.pmc.gov.au/sites/default/files/projects/adult‑migrant‑english‑program‑trials.docx

Behavioural Insights Team. (n.d.). Testing the better offer notice on energy bills: Final report. Australian Energy Regulator, aer.gov.au

Behavioural Insights Team (2015, October 22). Reducing missed appointments. https://www.bi.team/blogs/reducing‑missed‑appointments/

Leigh, the Hon Dr Andrew, MP; Burke, the Hon Tony, MP; Service improvement trials for Workforce Australia Online, joint media release, December 2023, https://ministers.dewr.gov.au/burke/service-improvement-trials-workforce-australia-online

NSW Behavioural Insights Unit. (2016). Behavioural Insights in NSW: Update Report 2016. https://www.nsw.gov.au/sites/default/files/2021-05/Behavioural-Insights-in-NSW-2016.pdf

NSW Behavioural Insights Unit. (2021a, March 16). Reducing missed hospital appointments with better text messages. NSW government. https://www.nsw.gov.au/departments‑and‑agencies/behavioural‑insights‑unit/blog/reducing‑missed‑hospital‑appointments‑better‑text‑messages

NSW Behavioural Insights Unit. (2021b, May 25). Using behaviourally informed reminders cuts missed hospital appointments by more than a third. NSW government. https://www.nsw.gov.au/departments‑and‑agencies/behavioural‑insights‑unit/blog/using‑reminders‑to‑cut‑missed‑hospital‑appointments