Data Assimilation Software for Earth System Models

Sketch of the program flow with (right) and without (left) data assimilation. The yellow boxes show the function calls which are inserted to add the data assimilation functionality to the program.

Data assimilation combines numerical models with observational data. It is used, for example, to initialize model simulations and to improve the models themselves. In Earth system models (ESMs), this combination is performed for the different components that the ESM represents, such as the ocean or the atmosphere. A commonly used variant is ensemble-based data assimilation, in which an ensemble of model state realizations represents the uncertainty of the simulated Earth system state. All realizations are propagated forward in time so that the uncertainty is estimated dynamically.
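The ensemble idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the scalar "model" `toy_model_step`, its coefficients, and the ensemble size are hypothetical and have nothing to do with AWI-CM or PDAF. The sketch only shows how the spread of a propagated ensemble serves as a dynamic uncertainty estimate.

```python
import random
import statistics


def toy_model_step(state, rng, forcing=0.1, noise_std=0.05):
    """Hypothetical scalar model: damping, constant forcing, random perturbation."""
    return 0.9 * state + forcing + rng.gauss(0.0, noise_std)


def propagate_ensemble(ensemble, n_steps, rng):
    """Advance every ensemble member independently for n_steps time steps."""
    for _ in range(n_steps):
        ensemble = [toy_model_step(x, rng) for x in ensemble]
    return ensemble


rng = random.Random(42)  # fixed seed so the run is reproducible

# Initial ensemble: 20 realizations scattered around a first guess of 1.0
initial = [rng.gauss(1.0, 0.2) for _ in range(20)]

forecast = propagate_ensemble(initial, n_steps=10, rng=rng)

# The ensemble mean is the state estimate; the spread estimates its uncertainty
ens_mean = statistics.mean(forecast)
ens_spread = statistics.stdev(forecast)
print(f"mean = {ens_mean:.3f}, spread = {ens_spread:.3f}")
```

In a real Earth system model each member is a full model state rather than a single number, which is why propagating the ensemble dominates the computing cost.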

Ensemble data assimilation requires large computing resources because every ensemble member has to be propagated. To simplify the setup of model systems for ensemble data assimilation, the Parallel Data Assimilation Framework (PDAF) was developed at the Alfred Wegener Institute Helmholtz Center for Polar and Marine Research (AWI).

In this study, recently published in the journal Geoscientific Model Development, Lars Nerger and his colleagues from AWI develop a strategy for combining PDAF with Earth system models to perform highly efficient ensemble data assimilation on supercomputers. The approach is demonstrated with the coupled model Alfred-Wegener-Institute Climate Model (AWI-CM). PDAF provides the ensemble environment, the methods to handle the observations, and the actual data assimilation methods, i.e. the methods that combine model and observations.

The study shows that ensemble data assimilation functionality can be added to an Earth system model with only very small changes to the model source code. The extended model is run in the same way as the original model, with additional options to control the data assimilation. Because the parallelization exploits the compute performance of the supercomputer, the data assimilation increases the run time of the program by only about 11%. When more realizations are used in the ensemble, the execution time increases only marginally. This study lays the basis for applying ensemble data assimilation in AWI-CM, and it also shows how ensemble data assimilation functionality can be added to other Earth system models.
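The principle of adding assimilation through a few inserted calls can be sketched with a toy time loop. This is an illustration under stated assumptions, not the PDAF interface: the function names `model_step`, `analysis_step`, and `run` are hypothetical, the model is a scalar linear map, and the analysis is a simple scalar ensemble square-root update used as a stand-in for the filters a real framework provides. Passing no observations leaves the original program flow unchanged.

```python
import statistics


def model_step(x):
    """Hypothetical scalar model step (stands in for the full model)."""
    return 0.95 * x + 0.05


def analysis_step(ensemble, obs, obs_var):
    """Scalar ensemble square-root update (illustrative stand-in filter).

    The ensemble mean is shifted toward the observation by the Kalman gain,
    and the anomalies are scaled so the variance shrinks by (1 - gain).
    """
    mean = statistics.mean(ensemble)
    var = statistics.variance(ensemble)
    gain = var / (var + obs_var)
    new_mean = mean + gain * (obs - mean)
    scale = (1.0 - gain) ** 0.5
    return [new_mean + scale * (x - mean) for x in ensemble]


def run(ensemble, n_steps, observations=None, obs_var=0.04):
    """Time loop of the toy model; the 'if' block is the only inserted code."""
    for step in range(1, n_steps + 1):
        ensemble = [model_step(x) for x in ensemble]
        # Inserted call: assimilate when an observation exists for this step
        if observations and step in observations:
            ensemble = analysis_step(ensemble, observations[step], obs_var)
    return ensemble


initial = [0.0, 0.4, 0.8, 1.2, 1.6]
obs = {5: 2.0, 10: 2.0}  # hypothetical observations at steps 5 and 10

free_run = run(initial, n_steps=10)                    # without assimilation
assim_run = run(initial, n_steps=10, observations=obs)  # with assimilation

print("free mean :", round(statistics.mean(free_run), 3))
print("assim mean:", round(statistics.mean(assim_run), 3))
```

The design choice mirrored here is that the model code itself stays untouched apart from the inserted call, so the same executable can serve both for plain simulations and for assimilation runs.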

Nerger, L., Tang, Q., Mu, L. (2020). Efficient ensemble data assimilation for coupled models with the Parallel Data Assimilation Framework: Example of AWI-CM. Geoscientific Model Development, 13, 4305–4321, doi:10.5194/gmd-13-4305-2020