The fact that people care passionately about their health is both a help and a hindrance to researchers. On the one hand, most people are ready to provide extremely personal information to their healthcare provider if they think it will be used to make them, or others like them, feel better.
On the other, they would be incensed if this information were shared with their name, date of birth and address attached, or otherwise misused.
For that reason, information libraries like SAIL (Secure Anonymised Information Linkage) Databank, which has collected over 10 billion person-based data records from the population of Wales over a 20-year period, have a fine line to tread between holding critical research information and fulfilling their obligation to keep that information safe.
Professor David Ford, Co-Director of SAIL, says, “I think data analytics is going to make a huge difference to healthcare over the next decade or so.
“Different jurisdictions are collecting data and analysing it, similar to ourselves, right across the world, and the potential for pooling that data and being able to answer questions more definitively is enormous. I think, if we can overcome some of the legal challenges, the technology’s there to be able to do it at scale and the benefits for the whole world are great.”
SAIL’s records have already been used for numerous studies. One big data research project found that there was a positive association between social deprivation and high mortality rate following a hip fracture. Another discovered that linking primary care or prescriptive databases to congenital anomaly registries improved the quality of information on maternal medication use during pregnancy. However, big data can be used for more than just traditional research.
During the Zika virus outbreak in 2015, the WHO (World Health Organisation) analysed online media reports to gain supplementary information on where and how the virus was spreading, and to estimate the rate of transmission in order to plan its response.
Closer to home, Wrightington, Wigan, and Leigh NHS Foundation Trust is using big data analytics to monitor the time lag between referral and treatment, as well as pattern analysis to predict peak periods and help with staffing. NHS Business Services Authority has used its own database to analyse the effectiveness of prescriptions provided to patients. It has also created a dashboard to help identify surgeries overprescribing antibiotics, in an effort to help fight drug resistance.
While the potential benefits of using big data in healthcare are huge, there are also a number of challenges that must be overcome to make the most of the available information.
Professor David Ford concedes that: “In common with most big data operations, the SAIL databank struggles with inconsistent and variable data. We get data from a wide variety of different organisations, some of them collecting data in entirely different formats. Our job is to try and reconcile that and produce one consistent standing of the truth, which is no mean feat.”
Cleaning the data and making sense of it is where specialist databanks like SAIL come into their own, offering support to projects accessing their resources. Using the resulting statistics thoughtfully is also critical.
Mohsen Bayati, Associate Professor of Operations, Information and Technology at Stanford Graduate School of Business, warns: “You need the human influence as well. Companies are guilty of jumping on the ‘big data is king’ bandwagon by collecting as much data as possible, but unless you know you are asking the right questions, it’s unlikely to pay off.” 
There’s no doubt that big data must be viewed in context. SAIL holds primary care data for 75% of the GP practices in Wales. But what about the other 25%? Similarly, data collection through social media could lead to population sectors with little access to the internet, such as the elderly or disadvantaged, being digitally excluded. In order to fairly distribute the benefits of studies, such as increased funding, researchers will need to look at the bigger picture.
A key challenge in the use of big data is the need to keep it secure and to ensure that subjects cannot be identified. It is therefore vital that adequate protection is set up for each and every databank.
Before records go into SAIL’s databank, the information is split into two sets: the demographic component and the content component. The first set is coded and sent to NHS Wales Informatics Service, so that identifying features can be removed. The second goes directly to SAIL. Later, the coding allows the two components to be re-linked. The databank will only release data for genuine research purposes, for studies with potential health benefits, and until 2011 it could only be accessed on site. Now, a secure virtual environment and remote desktop protocol are used for safe access.
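For readers who want to see the idea in concrete terms, the split-and-re-link process can be sketched in a few lines of Python. This is an illustration only, not SAIL’s actual implementation: the field names, the function names (split_record, anonymise, relink) and the use of a keyed hash (HMAC-SHA256) to stand in for "removing identifying features" are all assumptions made for the example.

```python
import hashlib
import hmac
import uuid

# Hypothetical field names; the article does not describe SAIL's real schema.
DEMOGRAPHIC_FIELDS = ("name", "date_of_birth", "address")


def split_record(record):
    """Data provider: split a record into two parts that share a random code."""
    code = uuid.uuid4().hex  # the "coding", used only to re-join the parts later
    demographic = {k: record[k] for k in DEMOGRAPHIC_FIELDS}
    content = {k: v for k, v in record.items() if k not in DEMOGRAPHIC_FIELDS}
    demographic["code"] = code
    content["code"] = code
    return demographic, content


def anonymise(demographic, secret_key):
    """Trusted third party: replace identifying fields with a pseudonym.

    Assumption: a keyed hash is used so the same person always maps to the
    same key, while the key cannot be reversed without the secret.
    """
    identity = "|".join(
        str(demographic[k]).strip().lower() for k in DEMOGRAPHIC_FIELDS
    )
    pseudonym = hmac.new(
        secret_key, identity.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return {"code": demographic["code"], "person_key": pseudonym}


def relink(anonymised, content):
    """Databank: join the two parts on the shared code, then drop the code."""
    if anonymised["code"] != content["code"]:
        raise ValueError("components do not share a code")
    linked = {k: v for k, v in content.items() if k != "code"}
    linked["person_key"] = anonymised["person_key"]
    return linked
```

In this sketch the databank never sees a name, date of birth or address, and the third party never sees the clinical content. Two records for the same person end up with the same person_key, which is what makes linkage across datasets possible.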
As digital technology evolves, so does the potential to collect more data via phones and social media. Professor David Ford points out that mobile technology can be used to help us make informed choices regarding diet, exercise, drinking, and smoking. “These devices that we all put in our pockets are capable of being very significant changers of our behaviours, and also providing data to those researchers and health organisations to understand what’s going on in the population, and hopefully improve things for us.”
We all know somebody who has bought a personal fitness monitor or used mobile technology to map their exercise route and check the sugar in their favourite products. So, it makes sense to collect the resulting data. Even when it comes to more sensitive data, the public, given certain reassurances, will largely back the sharing of health information.
In January 2018, Tessa Jowell, who was diagnosed with a brain tumour in May 2017, urged the House of Lords to support the Eliminate Cancer Initiative, already underway in Australia and set to roll out in the US, the UK, and China. She said, “ECI aims to do three main things: first, link patients and doctors across the world through a clinical trials network; secondly, speed up the use of adaptive trials; and thirdly, build a global database to improve research and patient care.” 
A lot of good can come from the sharing of health data. For that reason, the cornucopia of information collected must be fiercely protected and used responsibly, in order to gain the confidence of the public. Only then can big data reach its true potential to deliver worldwide healthcare and public health advances.