November 3, 2022

Using data to support young people’s health and wellbeing: lessons from infrastructure development, governance and public involvement

In the seventh post in our blog series showcasing the DARE UK Sprint Exemplar Projects, the FAIR TREATMENT research team discuss the use of federated analytics to better understand how to support young people’s mental health.

What is the problem we are trying to solve?

Mental health problems are on the rise in the UK, and young people often do not get the help they need until their wellbeing deteriorates. New advancements in machine learning technology could make it possible to build digital tools to help practitioners identify young people in need of support earlier and get them the right kind of help.

One approach is to use existing data which is already being routinely collected by services which work with children and young people. Linking and using this data poses several challenges:

  1. technically, we must securely link de-identified data from different organisations within one region (healthcare, social care, and schools) and allow for analyses to be done across trusted research environments (TREs) created in several regions;
  2. legally, we must manage a federated (joined-up) network of multi-agency databases ethically and safely; and
  3. we must do all of this in a way that is acceptable to patients and the public.

The FAIR TREATMENT (Federated analytics and AI Research across TREs for AdolescenT MENTal health) project set out to begin addressing these issues.

What we learned: Technology

Our aim was to create three TREs, one each for Cambridge, Birmingham and Essex, and populate them with synthetic (‘dummy’) data to demonstrate how we could use them to a) link de-identified data from different local organisations, and b) run federated analytics (carry out analyses across all three multi-agency TREs without compromising security).
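The two steps in this aim can be illustrated with a minimal sketch. It is not the project's implementation: the record fields, the per-site keys, and the use of a keyed hash (HMAC) as the pseudonym are all assumptions made for illustration, and the data is invented. Each site links its own health and school records on a pseudonym, then shares only a count and a sum, so no row-level data leaves a TRE.

```python
import hashlib
import hmac


def pseudonym(identifier: str, key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(health, school, key):
    """Link two local datasets on the pseudonym; raw identifiers are dropped."""
    scores = {pseudonym(r["id"], key): r["wellbeing_score"] for r in health}
    linked = []
    for r in school:
        pid = pseudonym(r["id"], key)
        if pid in scores:  # keep only people present in both datasets
            linked.append({"pid": pid,
                           "wellbeing_score": scores[pid],
                           "absences": r["absences"]})
    return linked


def local_summary(linked):
    """Aggregate inside the TRE: only a count and a sum leave the site."""
    values = [r["wellbeing_score"] for r in linked]
    return {"n": len(values), "total": sum(values)}


def federated_mean(summaries):
    """Combine per-site aggregates into one overall mean."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / n if n else None


# Synthetic ('dummy') data for three hypothetical sites, each with its own key.
SITES = {
    "Cambridge": {
        "key": b"cambridge-demo-key",
        "health": [{"id": "c1", "wellbeing_score": 10},
                   {"id": "c2", "wellbeing_score": 20},
                   {"id": "c3", "wellbeing_score": 99}],  # no school record
        "school": [{"id": "c1", "absences": 2}, {"id": "c2", "absences": 0}],
    },
    "Birmingham": {
        "key": b"birmingham-demo-key",
        "health": [{"id": "b1", "wellbeing_score": 30}],
        "school": [{"id": "b1", "absences": 5}, {"id": "b2", "absences": 1}],
    },
    "Essex": {
        "key": b"essex-demo-key",
        "health": [{"id": "e1", "wellbeing_score": 40},
                   {"id": "e2", "wellbeing_score": 50},
                   {"id": "e3", "wellbeing_score": 60}],
        "school": [{"id": "e1", "absences": 0}, {"id": "e2", "absences": 3},
                   {"id": "e3", "absences": 1}],
    },
}

summaries = []
for site in SITES.values():
    linked = link_records(site["health"], site["school"], site["key"])
    summaries.append(local_summary(linked))

result = federated_mean(summaries)
print(result)
```

The federated mean matches what pooling the linked records would give, but only six numbers (a count and a sum per site) cross TRE boundaries. A real deployment would also need disclosure controls, such as suppressing very small counts, which this sketch omits.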

Rather than start from scratch, we chose to integrate a range of open source and commercial technologies, as follows:

  • AIMES – provides TRE infrastructure including security measures like multi-factor authentication and airlock mechanisms.
  • CRATE (Cardinal et al.) – open source software for de-identifying (anonymising) records before they are used for research.
  • InterMine (Micklem et al.) – a user-friendly interface which standardises data from different sources and restricts researchers to only the subsets of the data they are allowed to access.
  • Bitfount – enables queries to be carried out securely across different TREs (while preserving privacy), including SQL queries and machine learning tasks.

Our demonstrator illustrated the following, which forms the basis of our recommendations:

  • Database systems should automatically construct themselves within the TRE – defining standards and harmonising the data is already difficult and time-consuming.
  • The database should automatically generate user interfaces (UIs) and programming interfaces (APIs) to ensure that researchers can answer complex, multi-agency questions efficiently.
  • TREs should support secure federated queries, including the ability to train machine learning algorithms, without compromising data protection.
  • It should be easy to discover and access data rapidly and securely, within the pre-defined subsets to which the researcher has permissions.
  • Database systems should support fine-grained user access controls as an extra layer of security, including common authentication protocols (such as OAuth).
  • To test the infrastructure during development, projects should use synthetic data generated from real data sources. This can be more complicated and time-consuming than it first appears, as humans must check the synthetic data to confirm that it is truly anonymous before it can be used.
  • Differential privacy can be added to give each TRE’s data additional privacy protection during federation.
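Two of these recommendations, restricting researchers to pre-defined subsets and adding differential privacy, can be sketched together. This is an illustrative example only, not the project's code: the permissions table, dataset names, and counts are hypothetical, and the Laplace mechanism shown is one common way to achieve differential privacy for a count query (sensitivity 1), chosen here as an assumption.

```python
import random

# Hypothetical permissions: which datasets each researcher may query.
PERMISSIONS = {"researcher_a": {"school_attendance"}}

# Synthetic true counts held inside one TRE (never released directly).
TRUE_COUNTS = {"school_attendance": 120, "social_care_referrals": 45}


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_count(researcher: str, dataset: str, epsilon: float,
                  rng: random.Random) -> float:
    """Return a noisy count, but only for datasets the researcher may access."""
    if dataset not in PERMISSIONS.get(researcher, set()):
        raise PermissionError(f"{researcher} may not query {dataset}")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 1.0  # adding or removing one person changes a count by 1
    return TRUE_COUNTS[dataset] + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only so the demonstration is repeatable
noisy = release_count("researcher_a", "school_attendance", epsilon=0.5, rng=rng)
print(round(noisy, 1))
```

A smaller epsilon adds more noise and gives stronger protection at the cost of accuracy. Choosing its value, and tracking how much privacy budget repeated queries consume, are policy decisions that this sketch leaves out.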

What we learned: Governance

We worked with Information Governance Services Ltd and local information governance (IG) leaders to develop a governance framework which would be legal, ethical, and acceptable to the organisations which will be supplying the data into the TRE. Key to the work was an emphasis on our common goals to use the data to improve children’s health, as well as the detailed patient and public involvement work, which was supportive of the project throughout. The team co-created two legally viable frameworks which would work at the local level, allowing local IG groups to adopt the one that suited them best. A second layer of governance enabled federation between TREs in different locations.

Figure 1: Two options for information governance approaches for TREs; the organisations are examples of healthcare, social care and education providers within the area of an Integrated Care Board.

To support each model, a range of templates were created for use by local regions wanting to adopt the approach (available via the FAIR TREATMENT team):

  • Terms of Reference – outlines rules regulating the governing bodies (which will need to be created for the database), including membership, frequency of meetings, how decisions about accessing the database will be made, accountability and responsibilities.
  • Data Access Request Form and Terms of Use – supports researchers in requesting access to the database and outlines the rules by which they should abide if access is granted.
  • Information Security Model – outlines technical and organisational measures applied to ensure the data is stored and used securely and lawfully.
  • Privacy, Fair Processing and Transparency – informs the public about who the data controllers are, as well as how and why they are collecting and processing data, all in a way that is easy to understand with clear and plain language.
  • Data Protection Impact Assessment – reviews how the data will be processed and whether this is necessary and proportionate to the purposes of the database, assessing the risks to the rights and freedoms of patients and the measures conceived to address and mitigate these risks.

Figure 2: Example infographic co-designed with our patient and public involvement workshop participants to describe the governance process of accessing data from a TRE.

What we learned: Patient and Public Involvement

In-depth engagement with the public was critical to the process. We worked alongside three groups over the course of the programme – 11 to 15-year-olds, 16 to 24-year-olds, and parents and guardians. We placed emphasis on diversity and approached over 40 charities to support recruitment, with over 200 participants signing up over the course of the project.

We undertook a multi-step process with participants where we initially tested the approach for its acceptability, then co-designed the governance framework, and finally planned our approach to transparency and communications. A range of communication materials were developed for adult and young audiences to help support the explanation of TREs and data sharing, and were tested with a final group of participants for clarity and accuracy.

You can view an infographic summarising our findings, which was co-created with recommendations from our workshop participants.

You can also find out more about how the project engaged with young people in the video below:

Video posted by @mqmentalhealth (poster by Alisa Anokhina): “Do you know someone who is struggling with mental health problems but isn’t getting the help that they need? In this video we present findings from a series of workshops carried out with members of the public to understand their views of linking multi-agency data to create early identification tools for children and young people’s mental health.

For more information about the Timely project, contact Dr Anna Moore. Study manager: Dr Alisa Anokhina. Narrated by: Charlotte Burdge.

Special thanks to The Anna Freud Centre, Information Governance Services, Tamanna Miah, Emily Bampton, The Timely Expert By Experience group, and our workshop participants.

This project was funded by Towards Turing 2.0 under the EPSRC Grant T2_15 & The Alan Turing Institute, and by UK Research & Innovation Grant MC_PC_21025 as part of Phase 1 of the DARE UK (Data and Analytics Research Environments UK) programme, which is delivered in partnership with Health Data Research UK (HDR UK) and ADR UK (Administrative Data Research UK).

Music by ComaStudio.”

Find out more about FAIR TREATMENT and access the final project report.