Next-Gen Catalysts
STAR-TRE: Safe and Trustworthy Assessment of Risk in TREs for Sensitive Free-Text Access
STAR-TRE is developing practical tools and guidance to enable the safe, transparent use of sensitive free-text data, such as clinical notes, within federated Trusted Research Environments.
Free-text data contains some of the richest information available for health and social research, capturing detail that structured data cannot. However, because identifiers can be embedded in free text in irregular ways, this data presents significant privacy risks that are difficult to address consistently. As a result, many Trusted Research Environments (TREs) exclude free-text data altogether, limiting the scope and value of research.
STAR-TRE aims to address this gap by developing prototype tools that support privacy-risk identification and assessment in free-text data. Using large language models (LLMs), the project will explore how semi-automated approaches can help governance teams detect and assess disclosure risks while maintaining transparency and human oversight. The focus is not on replacing human judgment, but on supporting decisions with evidence and an auditable record.
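As a rough illustration of what "semi-automated with human oversight" could look like in practice, the sketch below flags candidate identifiers in a note, leaves every flag pending until a named reviewer decides on it, and writes each decision to an audit log. It is not the project's tooling: the regular expressions are a simple stand-in for an LLM-based detector, and all names (`PATTERNS`, `Flag`, `flag_candidates`, `record_decision`) are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns standing in for an LLM-based detector.
# Real free text contains identifiers in far less regular forms.
PATTERNS = {
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "phone": re.compile(r"\b0\d{10}\b"),
    "title_name": re.compile(r"\b(?:Mr|Mrs|Ms|Dr)\.? [A-Z][a-z]+\b"),
}


@dataclass
class Flag:
    kind: str
    start: int
    end: int
    text: str
    decision: str = "pending"  # only changed by a human reviewer


def flag_candidates(note: str) -> list[Flag]:
    """Return candidate identifiers in document order; nothing is redacted."""
    flags = [
        Flag(kind, m.start(), m.end(), m.group())
        for kind, pattern in PATTERNS.items()
        for m in pattern.finditer(note)
    ]
    return sorted(flags, key=lambda f: f.start)


def record_decision(flag: Flag, reviewer: str, decision: str, audit_log: list) -> None:
    """Apply a reviewer's decision and append an auditable record of it."""
    flag.decision = decision
    audit_log.append(
        {"kind": flag.kind, "span": (flag.start, flag.end),
         "reviewer": reviewer, "decision": decision}
    )
```

The detector only proposes; the reviewer decides, and the log records who decided what. Swapping the regular expressions for a model-based detector would leave that review-and-audit structure unchanged.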
Building on previous DARE UK work, STAR-TRE will generate technical and governance insights that can be adopted across the TRE ecosystem. Its findings will inform updates to the Standardised Architecture for Trusted Research Environments (SATRE), helping to establish clearer, more consistent approaches to free-text data access.
STAR-TRE will deliver a UK-wide programme of public engagement, beginning with accessible information sessions explaining large language models and their role in de-identification. Deliberative workshops will explore trust, transparency, acceptable use, and concerns around automation, with recommendations feeding directly into tool development and governance frameworks.
By the end of the project, STAR-TRE will:
- Develop prototype tools to support privacy-risk assessment for free-text data in TREs
- Produce evidence-based guidance on the responsible use of LLMs for de-identification
- Contribute to SATRE guidance on free-text data governance
- Support more consistent and transparent decision-making across TREs
Project information
Lead organisation: University of Edinburgh
Principal investigator: Dr Arlene Casey
Project duration: 12 months
Project partners: Scottish Safe Haven Network, University of Sussex
Funding provided: £282,917
Primary contact email: arlene.casey@ed.ac.uk