Sir David Davis writes for The Telegraph on using healthcare data in medical research


As published in The Telegraph:

This week researchers announced a significant breakthrough in the treatment of Parkinson’s disease. They found that terazosin, a drug that had been developed to treat an entirely different condition – benign prostate enlargement – could also intervene to prevent or mitigate the death of nerve cells in Parkinson’s sufferers.

It is a remarkable discovery – and all the more so for how it was reached. The scientists successfully tested the theoretical action of the drug in animals. But would it work on people? To find out, they reached for “big data”. Using a massive medical database – the IBM Watson/Truven database, containing the records of 240 million people – they compared the progress of Parkinson’s in 2,880 patients taking terazosin or similar drugs with 15,409 patients taking another drug. Such virtual experiments enabled them to show that, yes, terazosin could be effective for humans, too.

This discovery is a harbinger of the brilliant possibilities arising from the combination of big data with modern genetic and physiological research. Not only will these techniques enable researchers to find new treatments more easily, but these treatments could reach the market more quickly. Since drugs like terazosin are already approved for use, they have gone through all the safety protocols.

The UK should be leading the world in this sphere. The NHS has its own vast database of medical records that could be used to deliver dramatic improvements in treatment. Progress, however, has been pedestrian.

Admittedly, there are weaknesses in this area of research. The IBM database is largely based on people with health insurance, a non-random sample. The data is, of course, also ultra-sensitive; there is nothing more private than our health records. One use of this big data analysis is in tackling the opioid epidemic. Imagine if that data became public!

IBM and its partners describe the data as “de-identified”. They use this clumsy term because you cannot fully anonymise health records. A full health record of an adult is almost as specific as a fingerprint. My childhood vaccinations, for example, would give a minimum age. Then my records include facts such as that I have broken my nose five times. Publicly known medical conditions would pin it down for an assiduous reporter in no time.

This is what has stopped the NHS from carrying out similar experiments to the terazosin research using its completely universal database of health records. Most of the early work was carried out on the assumption that the database could be made secure enough to access via the internet. This was a ludicrously bad judgment. If security-conscious organisations from Microsoft to the Pentagon have had their secret databases broken into, what chance the IT-incompetent NHS?

But if we were able to make the NHS database security rock-solid, it would provide the best experimental underpinning of health research in the world. This can be done. Imagine if we had the database on half a dozen supercomputers, one in each of our major medical research centres. They would need to be isolated from the internet. In the parlance of the trade, they would be “air-gapped”. In effect they would be the Fort Knox of data.

Researchers would be able to carry out their experiments, testing their ideas on the supercomputers in the universities and research centres, just as they might carry out a laboratory experiment in the same places. This is not entirely new. When Dame Sally Davies, the chief medical officer, told me about the “100,000 Genomes Project”, which is sequencing whole genomes for NHS patients, she brilliantly described it as a “reading library not a lending library”.

If we set up the NHS database in this ultra-secure way, nobody would worry about their privacy being intruded upon. And since the drugs and therapies tested by such an approach have already been cleared for use, we could redesign the medical approval procedures to allow them to be brought to market easily.

This would accelerate both the discovery and deployment of new treatments for intractable diseases, and create a formidable national competitive advantage for the medical research centres that had unique access to these databases. It would save lives, and build on the extraordinary position the UK already enjoys in physiological research and genomics.