The DARE UK Sprint Exemplar Projects have reached their midway point – find out how they’re progressing
The purpose of the projects – which run for eight months from January to August 2022 – is to uncover and test early thinking in the development of a joined-up and trustworthy national data research infrastructure. This will support cross-domain analysis of sensitive data at scale for public good.
The projects reached the halfway point of their funding period at the end of April, so how are they progressing? You can find a short summary of the progress of each of the nine projects below; click on the links to find out more about the projects and their aims, as well as slide decks with more detail about their progress to date.
TREEHOOSE: Trusted Research Environment and Enclave for Hosting Open Original Science Exploration
Trusted research environments (TREs) enable the secure and safe re-use of patient electronic healthcare records (EHRs) for clinical research. The TREEHOOSE project team has, together with its cloud partner, developed a portable, cloud-based TRE (accessible via a secure internet connection rather than locally), which will be released as open source, free of charge. This will enable TRE managers to deploy a common TRE infrastructure which can support cutting-edge reproducible research. The team is also developing a secure enclave that protects intellectual property (IP), so that external algorithms can be run within the TRE while ensuring the security of the software and, importantly, the EHR data. By the end of the Sprint project, all the infrastructure-as-code will be made available to all.
PRiAM: Privacy Risk Assessment Methodology
PRiAM aims to publish a best-practice privacy risk assessment framework for safe data usage in federated research networks. The project has engaged data practitioners, legal experts and members of the public to co-design the approach and understand the attitudes, responsibilities and skills needed to participate in data sharing decisions. Requirements for safe federations have been analysed, driven by use cases from public health, integrated care systems and complex hospital discharge. The risk assessment framework has been outlined and aligned with privacy and security risk modelling tools for automated and repeatable assessments. Work has started on extending the privacy domain knowledge used for modelling, based on the identified privacy requirements, whilst plans for an open community supporting privacy and security methodology and tooling have been initiated.
STEADFAST: Education outcomes in young people with diabetes – innovative involvement and governance to support public trust
The STEADFAST project is exploring the best ways to inform, engage and involve young people, their families and the wider public in important issues around the use of their sensitive data for research. The project team has worked with a co-production panel of young people with diabetes to develop the methods and messages to recruit participants to inform the project, working with five community groups to ensure these approaches are as inclusive as possible. The project team is currently engaged in the design of the focus group discussion activities, involving both the community groups and the project’s Young People’s Involvement Advisory Group. The team is also convening two stakeholder workshops on the perspectives of researchers, funders, companies and patient groups to inform a public involvement toolkit for large-scale data-driven research.
Creating a federated, cloud-based trusted research environment to facilitate collaborative research between existing institutions
This project aims to document use cases covering the main barriers faced by research collaborations using sensitive data, and then show how they can be met using cloud-based data technologies (technologies which are accessible via a secure internet connection rather than locally). The project is progressing well across all workstreams, with all key components of the technology platform now deployed and integrated. The team has had a rich thread of engagement from all interviewees, and very engaged and constructive dialogue with the project’s patient groups, which have provided both evidence that the solution in development is directionally correct and pointed to some interesting future development areas.
Overcoming technical and governance barriers to support innovation and interdisciplinary research in trusted research environments
The project team has made significant progress in developing the governance process around ‘non-traditional’ researcher access to health data, as well as standing up the analytical workbench. To support this continuing development, the team has worked with market research company Ipsos on a survey of 595 residents in South-East Scotland, and deliberative workshops involving a broadly representative sample of over 40 residents. Analysis is underway, with headline findings indicating high overall support for health data sharing. Approval for access by the NHS and universities is greater than for private-sector organisations, but the latter still secured over 50% support. Recommendations from these public exercises will inform the project’s governance design, especially priority conditions for data access and defining trustworthy organisations. The design of a bespoke training course for non-traditional researchers is also underway.
FED-NET: Creating the blueprint for a federated network of next generation, cross-council trusted research environments
FED-NET is federating data analysis across two Microsoft Azure trusted research environments (TREs), one based in Birmingham and one in Nottingham. The project is progressing well; the team has an agreed data specification and has synthetic data which combines health, biomarker, meteorological and air quality (PM 2.5) data. Working with ENSONO and Microsoft, the team has a technical blueprint for federation, and the project’s public contributors have produced the first full draft of a public-facing brochure to explain federation, which is now out for wider review. Stakeholder workshops are also being planned with patients, the public and NHS staff to present the work and gather feedback.
Multi-party trusted research environment federation: Establishing infrastructure for secure analysis across different clinical-genomic datasets
This project aims to create a UK-first demonstration of federation of genomic data by bridging the trusted research environments (TREs) of the NIHR Cambridge Biomedical Research Centre and Genomics England. The project team has made significant progress since the beginning of the Sprint: Lifebit have worked collaboratively with Genomics England to shape the target architecture, which is now ready for final sign-off; and the team has been working to put in place the downstream data pipeline to enable the intended use case of a distributed analysis on mutational signatures, as well as a method to transform Genomics England authentication into GA4GH (Global Alliance for Genomics and Health) Passport standards. They have also held a Q&A session to introduce the project, and a focus group on preferences around access review committees, with the public involvement and engagement panels of the project partners.
FAIR TREATMENT: Federated Analytics and Artificial Intelligence Research across Trusted Research Environments for Child and Adolescent Mental Health
The FAIR TREATMENT team has recruited a diverse panel of over 150 young people (aged 11-25 years), as well as parents and guardians, to support the co-creation of the project’s governance programme. The three trusted research environments (TREs) involved in the project (Cambridge, Birmingham and Essex) have also now been constructed, and the integration software (InterMine) and federation software (Bitfount) are currently being combined. The software to generate synthetic data for testing has been completed, with permission to process de-identified data to generate realistic synthetic data agreed. The information governance architecture has also been agreed, and a two-tier governance framework to support (i) multi-agency local data sharing and (ii) federation has been drafted, along with role-based access controls to support both tiers.
GRAIMatter: Guidelines and Resources for Artificial Intelligence Model Access from Trusted Research Environments
The GRAIMatter team has begun to draft its recommendations regarding AI model access from trusted research environments (TREs), and has invited other Sprint project representatives and a range of groups who run TREs to a virtual workshop in May to seek further input. The team has developed software for the TRE community to help assess and reduce the disclosure risk of trained machine learning models released from TREs, and is in the process of drafting legal clauses to support TREs in adding legal and ethical controls. The team has also run three workshops with members of the public to gain their input on the work, with another two planned, and is in the process of drafting a range of publications covering different areas of the project.