OBSERVER: Revealing climate history through data rescue

evan

Wed, 18/09/2024 – 14:06

Historical information is crucial for investigating the trends and magnitude of climate change. To comprehend what is happening to our planet, we need high-quality, consistent data from as many locations as possible, reaching back as far as records allow. That’s why data rescue—recovering and digitising old meteorological records—is so important, as it provides the essential historical context needed to understand our changing world. By gathering data from diverse sources and ensuring open access, a vast reservoir of information becomes available to researchers and, ultimately, to policymakers. The Copernicus Climate Change Service (C3S) is supporting the coordination of such in situ data rescue activities, ensuring that invaluable historical data is preserved and made accessible for the benefit of all.

 

Climate change is a global challenge

When examining the real-world impact of weather and climate, information gathered from the lowest few metres of the atmosphere is what matters. In situ observations of variables that directly affect populations, such as temperature, precipitation, and wind, are crucial for tailoring climate adaptation and resilience efforts.

Data rescue, which ensures historical information on these variables is digitised and made accessible, is important for accurately analysing long-term trends and understanding the changes in climate patterns we are facing today. Many projects to locate and rescue these historical data are being carried out at a local scale, often led by national meteorological services or research institutions. However, the data are processed inconsistently, and the extent of the work varies greatly depending on the resources available, which differ significantly between countries. Data rescue is particularly valuable in countries with sparser in situ observation networks, as these historical records can provide climate information that would otherwise be unavailable. Because the extent and quality of data rescue efforts vary so widely from country to country, the global dataset is uneven, with gaps and inconsistencies in the information available. Since climate change impacts are global, it is essential to have comprehensive, consistent data from all regions, regardless of a country’s resources or wealth.

To coordinate these global efforts, C3S and the World Meteorological Organization (WMO) have worked together to launch a new joint data rescue portal. This builds on prior individual successes such as those led by the WMO, with strong support from the National Oceanic and Atmospheric Administration in the United States.

 

A new data rescue portal for users around the world

The new portal makes it straightforward for users to share information on past, current, and future data rescue projects and contribute data to international repositories. It promotes tools, best practices, and standards for all stages of the data rescue process.

The portal is intended for anyone involved in rescuing in situ data. As new records are uncovered, users can search the project inventories, connect with others in their region, and determine if the records have already been rescued or if there is an opportunity to collaborate and pool resources towards rescuing them.

Screenshot of a map showing the location of all projects listed in the new portal. Credit: Copernicus Climate Change Service

 

There are currently around 140 activities listed within the portal, covering a wide range of geographic areas, from a project in New South Wales, Australia, to analyse weather diaries, to one that aims to transcribe all the meteorological observations recorded at the McGill Observatory in Canada, as well as one in China that is recovering relative humidity data for Hong Kong and historical ship observations for the South China Sea and neighbouring waters.

Tools, support, and guidelines are available to assist with data rescue, including software tips on optical character recognition and version control, and best practice for formatting marine and land data.

‘The data rescue community is relatively small but dedicated. We owe it to everyone involved that we do everything we can to maximise the impact of their efforts. By merging our activities with those of the WMO and other international partners and providing a range of tools and support, we hope to have a huge impact on the breadth, diversity, quality, and usability of the data,’ said Paul Poli, In Situ Observations Manager at C3S.

For data that have been successfully recovered, quality-checked, and formatted appropriately, there is a service which supports data deposition through uploads to recognised global repositories. Data that are uploaded into these repositories are converted to a consistent format and then checked for quality and uniqueness before being passed on to the C3S Climate Data Store (CDS) for publication approximately once a year, with the aim that a user depositing data should be able to retrieve them from the CDS within 12 to 15 months.

‘The merger of the WMO and C3S portals, which KNMI, the Netherlands Royal Meteorological Institute, have managed individually for some years, signals that the importance of data rescue is becoming widely recognised. In particular, the link with the C3S ERA reanalysis products which use the rescued data to extend the record back in time makes data rescue immediately relevant for the whole climate change community. The connection to the WMO remains strong with the new joint portal, which is important for opening doors to the archives of the national meteorological services, while being easier for users to navigate and for the technical team to maintain,’ explained Marlies van der Schee, KNMI Research Scientist.

 

Rescued data are invaluable

A key dataset for investigating climate change is ERA5 from C3S. This is a reanalysis: it combines observations with computer models to recreate past conditions. The more observations that can be fed into ERA5, the higher the quality of the output will be.

Professor Peter Thorne of Maynooth University, Ireland, is leading a project to add rescued data to the C3S reanalyses, including data from the C3S-WMO portal. In particular, he is focusing on the new reanalysis release, ERA6, which is expected to replace ERA5 in 2027. ‘While climate models evolve thanks to advances in computing and climate science, the raw observations that feed into these datasets, including reanalyses such as ERA5 and ERA6, are vital,’ he explained.

Peter aims to provide the best possible access to global observations of surface meteorological parameters covering land and the ocean. Over the last seven years, his team has collected information from over 300 sources, including the new data rescue portal.

 

Prioritising data for rescue

Data rescue projects involve a wide variety of records, which may exist in many formats, including handwritten, typed, or obsolete digital formats. Over time, naming and reporting conventions may have changed, adding to the complexity of standardising and digitising these records. In addition, observations from a station might be found in multiple data sources because they have been shared more than once, or, in some cases, only part of a station's data has been shared while the rest remains undiscovered.

Collating a comprehensive set of observations can be extremely labour-intensive, as artificial intelligence (AI) and machine learning (ML) are not yet fully capable of replacing human input in the data rescue process. Therefore, it is crucial to focus limited resources on areas where they will have the most significant impact. For example, in Western Europe, where there are already good-quality data from thousands of stations, a new set of observations may add little value. However, in certain island nations where consistent observations are lacking or non-existent, adding new data could be hugely beneficial. This is why C3S is working closely with the WMO to prioritise data rescue efforts in regions such as the Pacific Islands and developing states.

 

Africa’s climate records

Covering around 20% of Earth’s land area, Africa is of huge importance to understanding global climate change. Two projects focused on rescuing data in the region highlight some of the challenges faced and the creative solutions employed.

ACMAD data rescue initiative

In the late 1980s, the African Centre of Meteorological Applications for Development (ACMAD), together with the Royal Meteorological Institute (RMI) of Belgium, contacted meteorological services across sub-Saharan Africa to request original copies of their records. These records were transferred from hard copies to microfilm and microfiche, which were the best available media at the time. Thirty years later, concerns arose that these records might have degraded because of the fragility of microfilm and microfiche. In 2021, ECMWF/C3S co-funded a priority scanning project to rescue this valuable data. This effort successfully converted over 4 million images from 44 African countries into digital format, ensuring their preservation and accessibility for future research.

Kevin Healion, a Research Assistant at Maynooth University, has been leading ongoing efforts to further process and digitise the ACMAD data collection. Despite the initial success in creating digital images, the volume of data is so large that many records still require detailed processing and integration into usable formats. Kevin, together with colleague Simon Noone, has been exploring integrating data rescue into classroom activities to manage this task, involving undergraduate students in the digitisation process.

While the work has so far only scratched the surface of the ACMAD data, some records have already been processed and should be available to users through the CDS before the end of 2024.

Data rescue project in the Congo Basin

Meanwhile, Derrick Muheki, PhD Fellow at Vrije Universiteit Brussel in Belgium, is leading a project to digitise and transcribe hydroclimatic datasets from 36 climate stations across the Democratic Republic of the Congo, covering the early 1950s to date.

The project involves digitising over 10,000 paper records of daily precipitation and temperature stored in archives in both the Democratic Republic of the Congo and Belgium. This data is then transcribed using various methods, including machine learning, followed by rigorous quality control and assessment to ensure accuracy. The resulting time series, which includes millions of observations, will significantly improve the availability of hydroclimatic data within the Congo Basin, helping research on the region’s climate patterns.

The Congo Basin is one of the cloudiest regions on Earth, making satellite data collection particularly challenging. Rescuing in situ data in regions such as this one is crucial, as it contributes to building a comprehensive global climate record, filling gaps in data and supporting more accurate climate modelling and analysis. 

Digitisation of the INERA archives in Yangambi, Democratic Republic of the Congo. Credit: Derrick Muheki

 

Every piece of data is important

Although resources are limited, the data rescue community is passionate about preserving records, ensuring that future generations have as much information as possible about historical climate and weather at their disposal.

‘We aren’t stopping at the portal and waiting for data rescue to happen on centennial timescales—climate action is now. The information is needed without delay. We must build momentum and ensure historical data are valued,’ concludes C3S’s Paul Poli.

Readers who hold or know about archives and other historical data sources are encouraged to explore the portal and consider submitting a project.

Wed, 18/09/2024 – 12:00
