Monitoring the effects of environmental change is a massive undertaking, one that can only be achieved with the power of big data.
The biodiversity of planet Earth is under threat like never before. As the human population increases, and land is appropriated for farming, housing, and industry, our world’s natural flora and fauna are being pushed ever closer to the brink of extinction. Pollution, poaching, and climate change are all exacerbating the problem.
Biodiversity is incredibly important: plants and animals all exist in complex interrelated ecosystems, where one species becoming extinct might lead to the downfall of several others. And because of the benefits of natural recycling, air and water purification, the production of biological medicines, and so on, it’s vital for long-term human survival.
By killing biodiversity – “burning the library of life” as it’s been called – we’re effectively killing ourselves. “Without biodiversity, there is no future for humanity,” said Professor David Macdonald of Oxford University, talking to The Guardian.
To solve the problem, we first need to monitor and understand it. That’s no easy task. For starters, data comes from all manner of sources: field research, satellite imaging across a variety of wavelengths, LIDAR surveys, weather stations, geographic information systems (GIS), GPS-tracked animals, even geotagged smartphone photos.
Researchers amass mountains of data every year, and the only way to effectively manage and analyse it is with scalable cloud computing systems and the big data analytics applications they enable. By applying informatics techniques to biodiversity data – taxonomic, biogeographic, and ecological information – scientists can generate detailed biological models, enabling them to better forecast the result of events such as the spread of invasive species or the impact of climate change.
The Natural History Museum’s PREDICTS project – Projecting Responses of Ecological Diversity In Changing Terrestrial Systems – aims to better monitor biodiversity to see how it responds to human pressures. To do this, the museum is attempting to combine data from 90 countries and 450 scientific papers, representing more than 40,000 species. The resulting database currently holds 2.5 million biodiversity records and is growing all the time.
“We know that the landscape is going to change a lot in the future as the population grows, but we haven’t really known how biodiversity will change in response,” says Professor Andy Purvis of the museum’s Life Sciences department. “With PREDICTS we’re building global models that help us to predict how land-use change will affect local biodiversity – and us – in the future.”
Big data helps us to better understand the essential role biodiversity plays at all levels.
The team has already used the database to simulate four different climate change scenarios. As expected, the results were somewhat depressing. But they also offered a glimmer of hope. “I had expected these scenarios to be variations on a rather dark theme,” admits Purvis. “But we could actually undo the last 50 years of damage if we mitigate against climate change by using carbon markets so that we fully value high-biodiversity forest areas.” So just by protecting forested areas, most countries would see an increase in biodiversity by the end of the century.
The NHM’s effort mirrors that of the Global Biodiversity Information Facility (GBIF). It’s an internationally funded organisation that records species occurrences around the world, using the open source Darwin Core standard. In July 2018, GBIF hit a significant milestone, surpassing 1 billion species occurrences – all of which can be freely accessed by anyone, at any time, via its network of data-holding institutions.
“If we want to address the big challenges we face around the future of land use, conservation, climate change, food security, and health, we need efficient ways to bring together all the data capable of helping us understand the changing state of the world and the essential role that biodiversity plays at all scales,” explains Donald Hobern, executive secretary of GBIF.
“This milestone shows that today’s GBIF is prepared for continued growth and ready to handle the massive volume of data we expect to see from other new technologies and sources, including environmental sequencing and remote sensing.”
Part of GBIF’s remit is also to promote the use of standards and protocols for data sharing, enabling information from disparate sources to be interlinked. Similar repositories of data include the Catalogue of Life and the Encyclopedia of Life, both of which aim to provide trusted, accessible information on every known species on Earth.
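To illustrate why shared standards matter, here is a minimal Python sketch of interlinking occurrence records from two sources. The field names (occurrenceID, scientificName, decimalLatitude, decimalLongitude, eventDate, basisOfRecord) are Darwin Core terms; the two sources, the records in them, and the helper functions are hypothetical and invented purely for illustration. This is not GBIF's own software or data.

```python
from collections import Counter

# Hypothetical records from two different sources. Because both use the same
# Darwin Core term names, they can be combined without any field mapping.
museum_records = [
    {"occurrenceID": "museum-001", "scientificName": "Panthera leo",
     "decimalLatitude": -2.33, "decimalLongitude": 34.83,
     "eventDate": "2016-07-14", "basisOfRecord": "PreservedSpecimen"},
]
citizen_records = [
    {"occurrenceID": "app-884", "scientificName": "Panthera leo",
     "decimalLatitude": -1.41, "decimalLongitude": 35.01,
     "eventDate": "2018-02-03", "basisOfRecord": "HumanObservation"},
    {"occurrenceID": "app-885", "scientificName": "Loxodonta africana",
     "decimalLatitude": -1.40, "decimalLongitude": 35.02,
     "eventDate": "2018-02-03", "basisOfRecord": "HumanObservation"},
]


def merge_occurrences(*sources):
    """Combine records from several sources, keeping one per occurrenceID."""
    merged = {}
    for source in sources:
        for record in source:
            # The first record seen for a given ID wins; later duplicates are skipped.
            merged.setdefault(record["occurrenceID"], record)
    return list(merged.values())


def count_by_species(records):
    """Count occurrence records per scientific name."""
    return Counter(record["scientificName"] for record in records)


combined = merge_occurrences(museum_records, citizen_records)
species_counts = count_by_species(combined)
print(species_counts)
```

Without a common vocabulary, each new source would need its own translation step before records like these could be counted together.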
Going one step further is the Earth BioGenome Project (EBP), a new initiative launched in January 2018 that aims to sequence the DNA of 1.5 million organisms and store the results in an open-source database. The completed project is expected to require about 1 exabyte (1 billion gigabytes) of digital storage capacity.
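To put those figures in perspective, the short Python sketch below checks the unit conversion and works out a rough storage budget per organism. It assumes decimal (SI) units, which matches the "1 billion gigabytes" figure, and it assumes the storage is split evenly across all 1.5 million organisms. That even split is purely illustrative and is not a figure published by the project.

```python
# Decimal (SI) units, consistent with "1 exabyte = 1 billion gigabytes".
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_EXABYTE = 10**18

# Figures quoted in the article.
total_storage_bytes = 1 * BYTES_PER_EXABYTE
organisms = 1_500_000

# How many gigabytes make up one exabyte.
total_gigabytes = total_storage_bytes // BYTES_PER_GIGABYTE

# Illustrative assumption: storage is shared equally between organisms.
gigabytes_per_organism = total_gigabytes / organisms

print(f"Total storage: {total_gigabytes:,} GB")
print(f"Average per organism: {gigabytes_per_organism:.0f} GB")
```

On that assumption, each organism would average roughly 667 GB of storage.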
Thanks to advances in high-performance computing, data storage, and bioinformatics, the EBP aims to achieve its goal within a decade and at a cost of $4.7 billion. By comparison, the Human Genome Project cost around $3 billion. However, the subsequent ‘genomic revolution’ – with improvements in medicine, agricultural bioscience, biotechnology, and renewable energy – has benefitted the US economy by an estimated $1 trillion.
By forming a better understanding of life on Earth, the project hopes to provide better tools to aid conservation of endangered species, while also generating new drugs, biological synthetic fuels, biomaterials, and so on – useful commercial incentives to help protect the world’s biodiversity.
“Scientists believe that by the end of the century more than half of all species will vanish from the face of the Earth, and with consequences to human life that are unknown, but are potentially catastrophic,” says Harris Lewin, professor of evolution and ecology at University of California, Davis, and chair of the EBP. “The project will lay the scientific foundation for a new bio-economy that has the potential to bring innovative solutions to health, environmental, economic, and social problems to people across the globe, especially in under-developed countries that have significant biodiversity assets.”
The number of animals living on Earth has halved since 1970. In June 2017, the journal Proceedings of the National Academy of Sciences published a study stating that we had entered the planet’s sixth mass extinction – sadly, one that scientists believe is entirely of our own making. Reversing decades of environmental abuse is a gargantuan task, but with the help of big data analytics, it’s hoped that technology can seek out solutions that will ultimately save our fragile planet.