The NHS data strategy: 5 things to know, from data sharing, to OSS, and synthetic data.

So, how will your recent sexual health check-up results be kept safe?

A new NHS data strategy published June 22 admits the government intends to use secondary legislation to force the sharing of "personal information for the purposes of supporting the health and care system."

That's among the aggressive government efforts being lined up to tackle what the draft NHS data strategy describes as "cultural, behavioural and structural barriers" to the broader sharing of healthcare data.

With some 10,000 analysts in the NHS, and growing capabilities in machine learning and natural language processing, the government hopes the freeing up and integration of healthcare data will yield not just curative and research dividends, but "improve decision and policymaking" via predictive analytics.

Here are five things to know about the draft NHS data strategy.

Some of the ongoing work to improve national healthcare IT architectures/structures.

1: It will require a major IT overhaul

Implementing the strategy will require a wholesale overhaul of existing IT systems across the national healthcare estate, the government admits. The ultimate goal: "A modern architecture in which data can be accessed real-time through APIs via a national gateway." The strategy acknowledges that "this goal cannot be achieved when the data is held in silos within individual electronic patient record (EPR) systems."

Among priorities will be "separating the data layer" from existing healthcare IT systems -- by creating what sounds like a single healthcare cloud datalake, although the report doesn't explicitly describe it as such. It touts creation of "a set of structured data records stored in the cloud, separate from the EPR systems themselves. The data can then be easily accessed and updated by both third party and NHS systems."

Along with the creation of cleansed datasets, this will allow "a more modular approach to EPRs, avoiding vendor lock-in and creating a more dynamic and responsive market".

The NHS data strategy includes promises to:

  • Agree a target data architecture (Winter 2021)
  • Map the technical debt for national systems (Mar. 2022)
  • Provide services to find and retrieve records (Spring 2022)
  • Develop APIs that can be accessed over the internet through multiple channels, including clinical systems, web pages and apps, so that both patients and clinicians can access patient data (Mar. 2022)
  • Improve onboarding to national systems (Nov. 2021)
  • Develop a roadmap for core NHS services using cloud (Oct. 2021)
  • Increase the APIs on the national healthcare gateway (Aug. 2021)
  • Develop data infrastructure services to support interoperability (Oct. 2021)
  • Build Centres of Excellence (CoEs) in the area of data architecture that focus on promoting best practices, support and training (Aug. 2022)

2: A Single NHS Account

As part of these efforts, the government aims to create a single NHS Account for users that will bring together patient details from different services across the system, including information found in appointment bookings, vaccination status, health records and personalised wellness services.

The draft NHS data strategy notes that "an account API will be made available nationally for strategic partners to integrate with." Data sharing (including healthcare data monetisation) remains hugely controversial, despite the benefits possible when large harmonised data sets are made available to experienced data scientists. The hashtag #NHSdatagrab started trending as the report was published.

3: Data partnerships with the private sector...

At face value, the big focus is cross-agency, cross-departmental (e.g. health and social care) data sharing that would allow the UK's public health agencies to "draw on multiple data sources to gain new insights into the public’s health, with quicker access to high quality health intelligence to inform improved decision-making and responses to crises (ongoing)".

The draft government healthcare and NHS data strategy tried to head off criticism by emphasising "we do not sell health and care data for the benefit of private companies. Where access to data is granted, having met these high thresholds, it must always have the explicit aim to improve the health and care of our citizens, or to support the improvements to the broader system".

It's a statement that on closer inspection does not mean much at all.

So what are the data sharing plans with the private sector? That's an issue that's tackled primarily in section 7 of the report, which lays out plans to give health and care bodies "guidance and tailored support to help them navigate data partnerships, so they can make decisions properly informed by legal advice and commercial best practice."

This includes support to:

  • "Provide common approaches to establishing the most appropriate partnership model, as well as consistent financial valuation models, treatment of Intellectual Property, accounting methods, and legal and contractual drafting"
  • "Ease access to the UK’s healthcare data assets, to stimulate a new wave of digital and data-driven innovation to benefit patients and the wider system."

The government has promised to develop a resource hub for healthcare leaders on data partnerships by Sep. 2021, and to "review and update NHS Digital’s data sharing contracts to reflect the Value Sharing Framework Guidance" by 2022. Ben Goldacre, meanwhile, has been tasked with finding out how "useful data science by the public and private sectors [can] be best incentivised and resourced practically? What roles must the state perform, and which are best delivered through a mixed economy? How can the system ensure true delivery is rewarded?"

4: Open source

Analysts should be encouraged to think from the outset of a project about how work can be shared, or consider "coding in the open" -- sharing code and methodology through platforms like GitHub, the report notes. It emphasises that "public services are built with public money, and so the code they are based on should be made available for digital pioneers across the health and care system, and those working with it, to reuse and build on."

NHSX itself says it will "begin to make all new source code that we produce or commission open and reusable and publish it under appropriate licences to encourage further innovation (such as MIT and OGLv3, alongside suitable open datasets or dummy data)" by the end of 2021. The government will also this year publish a "digital playbook on how to open source your code for health and care organisations with guidance on where to put the code, how to license and what licences to use."

(Martin James, VP EMEA for Percona, told The Stack: "This will require the NHS to have an existing open source development approach, in order for them to manage their releases as open source... Setting up an open source programs office would be a good start, as this will provide a team who can manage the community and ensure important tasks are followed, like providing the right kinds of documentation and updates. For open source communities, the ability to look into the code is just one element for success. Alongside this, you need to consider how you engage with the community, how you manage contributions and think ahead for your approach to updates too. This is an exciting time for open source, and it is great to see the UK Government demonstrating its support.")

5: Cybersecurity, and Synthetic data

So, how will your recent sexual health check-up results (among other things) be kept safe? Before coming to the cybersecurity component of all this data sharing, the government and NHS say they are keen to explore a range of emerging privacy enhancing technologies (PETs).

These might span:

  • Synthetic data (statistically consistent with a real dataset)
  • "Moving code to the data" using so-called federated analytics.
  • Homomorphic encryption.
  • "Differentially private algorithms: enabling useful population-level insights about a dataset to be gained, while limiting what can be learned about any individual in the dataset"
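
To make two of those ideas concrete, here is a minimal Python sketch of a differentially private count (using the Laplace mechanism) and of "moving code to the data", where each site returns only aggregates. It is an illustration under stated assumptions, not anything the strategy specifies: the function names, the toy records and the choice of the Laplace mechanism are all hypothetical.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, float]  # hypothetical record shape, for illustration only


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(rows: Sequence[Row], predicate: Callable[[Row], bool],
             epsilon: float, rng: random.Random) -> float:
    # Adding or removing one person changes a count by at most 1
    # (sensitivity 1), so Laplace noise with scale 1/epsilon is used.
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


def local_summary(rows: Sequence[Row],
                  value_of: Callable[[Row], float]) -> Tuple[float, int]:
    # Runs where the data lives; only a sum and a row count leave the site.
    return sum(value_of(r) for r in rows), len(rows)


def federated_mean(sites: Sequence[Sequence[Row]],
                   value_of: Callable[[Row], float]) -> float:
    # The analyst combines per-site aggregates and never sees raw rows.
    partials = [local_summary(rows, value_of) for rows in sites]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


if __name__ == "__main__":
    # Toy, made-up data standing in for two separate organisations.
    site_a: List[Row] = [{"age": 30.0}, {"age": 50.0}]
    site_b: List[Row] = [{"age": 40.0}]
    rng = random.Random(0)
    print(federated_mean([site_a, site_b], lambda r: r["age"]))
    print(dp_count(site_a + site_b, lambda r: r["age"] >= 40, 1.0, rng))
```

A smaller epsilon means more noise and stronger privacy. A real deployment would also need to track the cumulative privacy budget across queries, which this sketch does not attempt.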


On the cybersecurity front, the NHS has made real improvements since 2017's devastating WannaCry attack, with "radically upscaled" central defences. NHS Digital’s Cyber Security Operations Centre (CSOC) now provides local and national network monitoring, incident response and threat intelligence, and blocks a claimed 21 million items of malicious activity every month. In February 2021, the central NHSX digital unit also conducted its first incident response exercise with a primary care provider, watching over a step-by-step walk-through of its IR plans.

As datasets and IT systems become more integrated, the government will "continue to work more broadly to embed effective cyber security across the health and care system, including adult social care", it said, without adding much detail. It will need to. Sweeping datasets accessible by API from a central location will have a huge target painted on them from the word go.

Ultimately, the aim is to "reshape legacy systems and platforms into smaller discrete services by creating national platforms that can talk to each other and work together, and so can easily be used to access and share data."

Read the full government healthcare and NHS data strategy here.
