Machine learning (ML) is rapidly gaining momentum as a new toolbox for analysing atmospheric data. While there are now several workshops, fora and conferences to discuss ML applications in the weather and climate domain, discussions on ML applications for air quality remain fragmented. The ERC project IntelliAQ has explored several modern ML concepts for air quality research and we would like to engage in a discussion with the international community about the potential and limitations of ML in this field.
The workshop aims to bring together researchers from the air quality and machine learning communities for discussion of recent research progress and future priorities. We encourage oral and poster contributions from researchers in either of these areas, and particularly welcome contributions at the intersection of these fields.
We are looking forward to meeting you in Cologne or online.
This presentation describes the IntelliAQ project and its main achievements. The ERC Advanced Grant project IntelliAQ has been one of the first initiatives to explore rapidly developing deep learning concepts in the realm of air pollution research. IntelliAQ defined three core targets for which deep learning solutions should be developed: spatiotemporal interpolation, forecasting, and data quality control. Until now the project has mainly focused on the first two objectives and on tropospheric ozone as the key air pollutant. We expect, however, to broaden the scope during the remaining time of the project.
Besides the scientific development of deep learning applications, the project has also contributed to building scalable and reproducible machine learning workflows and it has allowed for the development of a modern, FAIR and open data infrastructure for global air pollution data.
Air pollution has been linked to several health problems including heart disease, stroke and lung cancer. Modelling and analyzing this dependency requires reliable and accurate air pollutant measurements collected by stationary air monitoring stations. However, usually only a low number of such stations are present within a single city. To retrieve pollution concentrations for unmeasured locations, researchers rely on land use regression (LUR) models. Those models are typically developed for one pollutant only. However, as results in different areas have shown, modelling several related output variables through multi-task learning can improve the prediction results of the models significantly.
In this work, we compared prediction results from single-task and multi-task learning multilayer perceptron models on measurements taken from the OpenSense dataset and the London Atmospheric Emissions Inventory dataset. LUR features were generated from OpenStreetMap using OpenLUR and used to train hard parameter sharing multilayer perceptron models. The results show multi-task learning with sufficient data significantly improves the performance of a
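The hard parameter sharing described above can be sketched with a multi-output MLP: the hidden layers are shared across pollutants and only the output layer is task-specific. The data below are synthetic stand-ins, not the OpenSense or LAEI measurements, and the network size is illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for OpenLUR features (e.g. road length, land use fractions)
X = rng.normal(size=(500, 6))
# Two correlated pollutant targets; shared structure is what multi-task
# learning can exploit
Y = np.stack([X[:, 0] + 0.5 * X[:, 1],
              X[:, 0] - 0.5 * X[:, 1]], axis=1) + 0.1 * rng.normal(size=(500, 2))

# A multi-output MLP implements hard parameter sharing: the hidden layers
# are shared between the tasks, and only the output layer has one unit
# per pollutant.
mtl = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                   random_state=0).fit(X, Y)
```

The single-task baseline corresponds to fitting one such network per pollutant, so that no hidden-layer parameters are shared.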
Gaps in the measurement series of atmospheric pollutants can impede the reliable assessment of their impacts and trends. Data imputation methods to close the gaps in the observation series range from simple linear interpolation to machine learning. We propose a new method for missing data imputation of the air pollutant tropospheric ozone by using the graph machine learning algorithm ‘correct and smooth’. This algorithm uses auxiliary data that characterize the measurement location and, in addition, ozone observations at neighboring sites. Specifically, we apply this method to the missing data in a preliminary 2011 dataset from 278 stations of the German Environment Agency (Umweltbundesamt, UBA) monitoring network. These data exhibit three distinct, frequently occurring gap patterns: short gaps in the range of hours, longer gaps of up to several months in length, and gaps occurring at multiple stations at once. We apply correct and smooth as a post-processing algorithm after imputing the missing data with different statistical and machine learning methods.
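The "correct" step of correct-and-smooth can be sketched as residual diffusion over the station graph: errors of the base imputation at observed stations are propagated to their neighbors and added as a correction. This is a simplified rendering of the published algorithm; the adjacency matrix, the damping factor alpha and the iteration count below are illustrative.

```python
import numpy as np

def correct_and_smooth(adj, base_pred, y_obs, obs_mask, alpha=0.8, n_iter=50):
    """Diffuse residual errors at observed stations over the station graph
    and add the propagated correction to the base predictions."""
    deg = adj.sum(axis=1)
    d = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    S = d[:, None] * adj * d[None, :]          # symmetrically normalized adjacency
    e = np.zeros_like(base_pred)
    e[obs_mask] = y_obs[obs_mask] - base_pred[obs_mask]
    for _ in range(n_iter):
        e = alpha * (S @ e)                    # propagate residuals to neighbors
        e[obs_mask] = y_obs[obs_mask] - base_pred[obs_mask]  # keep known residuals
    return base_pred + e
```

In the study, this correction is applied on top of a statistical or random forest imputation; here it is shown in isolation.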
For short gaps of up to five hours, linear interpolation is most accurate with $R^2$ values of 0.91 - 0.97, RMSEs of 2.43 - 4.44 ppb, and indexes of agreement of 0.98 - 0.99. Longer gaps at single stations are most effectively imputed by a random forest in connection with correct and smooth, with $R^2$ values of 0.86 - 0.87, RMSEs of 5.64 - 6.18 ppb, and an index of agreement of 0.96. This case exhibits strong improvement through the correct and smooth algorithm, as the RMSEs decreased by 0.57 - 0.76 ppb compared to the random forest alone. For longer gaps at multiple stations, the correct and smooth algorithm improved the random forest RMSE by 0.07 ppb, despite a lack of data in the neighborhood of the missing values. Based on these results, we suggest applying a hybrid of linear interpolation and graph machine learning for the imputation of tropospheric ozone time series.
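The short-gap branch of the suggested hybrid could look like the following sketch, which linearly interpolates only runs of up to five consecutive missing hours and leaves longer gaps for the model-based imputer (function name and threshold are illustrative, not taken from the study).

```python
import numpy as np
import pandas as pd

def impute_short_gaps(ozone: pd.Series, max_gap: int = 5) -> pd.Series:
    """Linearly interpolate only gaps of up to `max_gap` consecutive missing
    values; longer gaps are left untouched for a model-based imputer."""
    is_na = ozone.isna()
    run_id = (is_na != is_na.shift()).cumsum()        # label consecutive runs
    run_len = is_na.groupby(run_id).transform("size") # length of each run
    filled = ozone.interpolate(method="linear", limit_area="inside")
    out = ozone.copy()
    short = is_na & (run_len <= max_gap)              # only short gaps
    out[short] = filled[short]
    return out
```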
“Air quality interventions” refer to any actions that may lead to a change in air quality, whether intentional, such as clean air zones, or unintentional, such as COVID-19 lockdowns or Net Zero actions. Quantifying the impact of interventions on air quality is one of the key processes in air quality management. Observational data from monitoring networks are often used for assessing the effectiveness of interventions. However, air pollution levels do not change linearly with emissions, owing to variations in weather conditions and chemical processes. Furthermore, they vary seasonally and from year to year at any given location. Here, we will present our studies on the changes in air pollutant concentrations arising from emission changes due to clean air actions (such as clean air zones and clean heating policies) and the COVID-19 lockdowns, based on a machine learning technique and a “synthetic control” method. These methods are able to detect sudden decreases in air pollutant concentrations due to interventions, and they provide a quantitative evaluation of the “causal” effects of the interventions.
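A common ML formulation of such intervention assessment is weather normalisation: train a model on pre-intervention data to predict concentrations from meteorology, then use it to generate a business-as-usual counterfactual for the post-intervention period. The sketch below uses synthetic data and a random forest as a stand-in; it is not the authors' exact method.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic example: concentration driven by two weather variables, with a
# step decrease of 5 units after an "intervention" at t = 700.
n = 1000
met = rng.normal(size=(n, 2))
conc = 30 + 4 * met[:, 0] - 3 * met[:, 1] + rng.normal(scale=1.0, size=n)
conc[700:] -= 5.0

# Train a weather-normalisation model on the pre-intervention period only
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(met[:700], conc[:700])

# Counterfactual "business as usual" concentrations after the intervention
counterfactual = rf.predict(met[700:])
# average intervention effect (constructed here to be about -5)
effect = np.mean(conc[700:] - counterfactual)
```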
Surface ozone is an air pollutant that contributes to hundreds of thousands of premature deaths annually. Accurate short-term ozone forecasts may allow improved policy actions to reduce the risk to health, such as accurate and timely air quality warnings. However, forecasting surface ozone is a difficult problem, as its concentrations are controlled by a number of physical and chemical processes which act on varying timescales. Accounting for these temporal dependencies appropriately is a promising avenue to provide more accurate ozone forecasts. We therefore implement a state-of-the-art transformer-based model, the Temporal Fusion Transformer, trained on observational data from three European countries. In four-day forecasts of daily maximum 8-hour ozone (DMA8), our novel approach is highly skilful (MAE = 4.9 ppb, coefficient of determination R$^2$ = 0.81), and generalises well to data from 13 European countries unseen during training (MAE = 5.0 ppb, R$^2$ = 0.78). The model outperforms standard machine learning models on our data, and compares favourably to the performance of other published deep learning architectures tested on different data. Furthermore, we illustrate that the model pays attention to physical variables known to control ozone concentrations, and that the attention mechanism allows the model to use relevant days of past ozone concentrations to make accurate forecasts.
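For reference, the DMA8 target can be computed from hourly observations as the daily maximum of 8-hour running means. A sketch follows; the 6-of-8 completeness rule is one common convention and is assumed here, not taken from the abstract.

```python
import pandas as pd

def dma8(hourly_ozone: pd.Series) -> pd.Series:
    """Daily maximum 8-hour running-mean ozone from an hourly series.
    Each 8-h mean is labelled by the window's first hour and requires at
    least 6 of 8 hours to be present (one common completeness rule)."""
    trailing = hourly_ozone.rolling(window=8, min_periods=6).mean()
    forward = trailing.shift(-7)          # relabel each mean to its start hour
    return forward.resample("D").max()
```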
Providing accurate global estimates of pollution is essential for evaluating the global public health burden of disease associated with air pollution exposure, which in turn will help environmental policy making. Nevertheless, our current knowledge of air pollution suffers from large biases in model predictions and insufficient information from the current observing system of surface in situ and satellite measurements. Chemical data assimilation (DA) has made substantial progress in reproducing regional and global ozone and its health impacts. However, chemical DA is fundamentally limited by model fidelity, both in terms of representation resolution and of unresolved processes that can produce systematic patterns in model-data mismatches. Thus, the analysis and prediction of air quality at regional scales has been stymied by unresolved and highly non-linear physical and chemical processes.
In this study, we develop and utilize a novel, explainable machine learning (ML) model to break down regional bias dependence and provide scientific interpretation of ozone and its bias drivers by analyzing a large set of exogenous and input data sources provided by chemical DA and various observations. By doing so, we aim to provide new scientific insights into the factors that control bias in air quality assessment, and the drivers of global ozone trends and their impact on global air quality.
We obtained the differences between the analysis from JPL’s chemical DA system, MOMO-Chem, and the available independent ozone observations from the surface Tropospheric Ozone Assessment Report (TOAR) network for 2011-2015. These differences are used as the target of the ML model, while various global chemical concentration and meteorological parameters (>100) from MOMO-Chem, as well as high-resolution satellite measurements of relevant parameters, are used as inputs. A regression-tree randomized-ensemble ML approach was applied to model the patterns of the residual bias of the MOMO-Chem output. Importantly, we extracted and combined local and global measures of how the inputs affect the predicted bias, thereby providing ML model explanations and a quantification of the impacts. Subsequently, we are able to identify a set of distinct areas and their corresponding drivers, which in turn sheds light on both resolved and unresolved physical and chemical model processes and their importance for air pollutant predictions.
Our analysis suggests that the developed ML framework is able to predict the overall magnitude and variations of the model ozone bias over North America, Europe, and East Asia. Our results also highlight that adding high-resolution satellite data, MODIS land cover data in this case, provides additional important constraints that improve the ML prediction, especially for highly polluted events. The framework is then used to identify global patterns of surface ozone and the drivers of physical model bias, demonstrating substantial contributions from parameters related to, for example, PBL mixing, sea breeze, photolysis rates, precursors, and topography, with significant seasonal and regional variations. This study offers a unique synthesis of model-based inference and explainable ML techniques for chemical transport modeling and data assimilation, identifying regionally dependent, process-level mechanisms that drive near-surface pollution and correcting for their impact on air quality predictions.
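The bias-modelling setup described above, a randomized tree ensemble with global explanation via permutation importance, can be sketched on synthetic data as follows. The feature names are illustrative placeholders, not the actual MOMO-Chem fields.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in: rows are (station, day) samples, columns mimic model
# fields (hypothetical names), target is model-minus-observation ozone bias.
n = 2000
X = rng.normal(size=(n, 5))
feature_names = ["pbl_height", "t2m", "no2", "jno2", "elevation"]
# a bias that depends nonlinearly on two of the "drivers"
y = 2.0 * np.tanh(X[:, 0]) + X[:, 2] ** 2 + 0.3 * rng.normal(size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Global explanation: permutation importance on held-out data
imp = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
ranking = sorted(zip(feature_names, imp.importances_mean),
                 key=lambda t: -t[1])
```

Permuting a feature column and measuring the score drop gives a model-agnostic ranking of bias drivers, which is the kind of global measure combined with local explanations in the study.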
Weather and air pollution have much in common, and if it is possible to forecast weather with deep neural networks, it should also be possible to forecast air pollution. For this reason, the IntelliAQ project has focused on weather forecasting to explore deep learning methods from video prediction. The presentation will cover results from IntelliAQ as well as from the European MAELSTROM project.
Accurate weather predictions are essential for many aspects of society. Nowadays, weather prediction relies heavily on numerical weather prediction (NWP) models, which require huge computational resources. Recently, the potential of deep neural networks to generate bespoke weather forecasts has been explored in several scientific studies inspired by successful applications in the computer vision domain. The super-resolution task, which projects a low-resolution field onto a high-resolution one, is analogous to downscaling in the meteorological domain, and video prediction is similar to weather forecasting. Inspired by this, we present three case studies applying state-of-the-art deep learning approaches to short-term weather forecasting and downscaling of 2 m temperature and precipitation.
In the first study, we focus on the predictability of the diurnal cycle of near-surface temperatures. A ConvLSTM and an advanced generative network, the Stochastic Adversarial Video Prediction (SAVP) model, are applied to forecast the 2 m temperature for the next 12 hours over Europe. Results show that SAVP is significantly superior to the ConvLSTM model in terms of several evaluation metrics. Our study also investigates the sensitivity to the input data in terms of selected predictors, domain size, and the number of training samples. The results demonstrate that additional predictors, in our case the total cloud cover and the 850 hPa temperature, enhance the forecast quality. The model can also benefit from a larger spatial domain. By contrast, the effect of reducing the training dataset length from 11 to 8 years is rather small. Furthermore, we reveal a small trade-off between the MSE and the spatial variability of the forecasts when tuning the weight of the L1-loss component in the SAVP model.
In the second study, we explore a custom-tailored GAN-based architecture for precipitation nowcasting. The prediction of precipitation patterns at high spatiotemporal resolution up to two hours ahead, also known as precipitation nowcasting, is of great relevance for weather-dependent decision-making and early warning systems. We developed a novel method named Convolutional Long Short-Term Memory Generative Adversarial Network (CLGAN) to improve the nowcasting skill for heavy rain events with deep neural networks. The model constitutes a GAN architecture whose generator is built upon a U-shaped encoder-decoder network (U-Net) equipped with recurrent LSTM cells to capture spatiotemporal features. A comprehensive comparison between CLGAN, the advanced video prediction model PredRNN-v2, and the optical-flow model DenseRotation is performed. We show that CLGAN outperforms both baselines in terms of point-by-point metrics as well as scores for dichotomous events and object-based diagnostics. The results encourage future work based on the proposed CLGAN architecture to further improve the accuracy of precipitation nowcasting systems.
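The scores for dichotomous events mentioned above are derived from a contingency table of threshold exceedances; a sketch follows (the 5 mm/h threshold is illustrative, and only three of the standard scores are shown).

```python
import numpy as np

def dichotomous_scores(forecast, observed, threshold=5.0):
    """Contingency-table scores for exceedance of a rain threshold.
    Returns probability of detection (POD), false alarm ratio (FAR) and
    critical success index (CSI)."""
    f = forecast >= threshold
    o = observed >= threshold
    hits = np.sum(f & o)
    misses = np.sum(~f & o)
    false_alarms = np.sum(f & ~o)
    pod = hits / (hits + misses) if (hits + misses) > 0 else np.nan
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) > 0 else np.nan
    denom = hits + misses + false_alarms
    csi = hits / denom if denom > 0 else np.nan
    return pod, far, csi
```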
In the last case study, we make use of deep neural networks with a super-resolution approach for statistical precipitation downscaling. We apply the Swin transformer architecture (SwinIR) as well as a convolutional neural network (U-Net) combined with a generative adversarial network (GAN) and a diffusion component for probabilistic downscaling. We use short-range forecasts from the Integrated Forecasting System (IFS) on a regular spherical grid (Δx_IFS = 0.1°) and map them to the high-resolution RADKLIM radar observation data (Δx_RK = 0.01°). The neural networks are fed with nine static and dynamic predictors, similar to the study by Harris et al. (2022). All models are comprehensively evaluated by grid-point-level errors as well as error metrics for spatial variability and the generated probability distribution. Our results demonstrate that the Swin transformer model can improve accuracy at lower computational cost compared to the U-Net architecture. The GAN and diffusion components both further help the model capture the strong spatial variability of the observed data. Our results encourage further development of DNNs that can potentially be leveraged to downscale other challenging Earth system data, such as cloud cover or wind.
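Super-resolution networks of this kind commonly end in a pixel-shuffle (depth-to-space) rearrangement that turns r² channels of a coarse grid into a grid refined by factor r in each direction (r = 10 for a 0.1° to 0.01° mapping). A sketch for a single output channel, with r = 2 in the test for brevity:

```python
import numpy as np

def depth_to_space(x, r):
    """Pixel-shuffle rearrangement: map an array of shape (r*r, H, W) to a
    single field of shape (H*r, W*r), placing each channel into its
    sub-pixel position within the refined grid."""
    c, h, w = x.shape
    assert c == r * r
    x = x.reshape(r, r, h, w)
    x = x.transpose(2, 0, 3, 1)      # reorder to (H, r, W, r)
    return x.reshape(h * r, w * r)
```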
In our study, we investigated the performance of state-of-the-art deep learning models, with a focus on generative models, for weather forecasting and downscaling tasks. We analyzed the model performance in depth using various evaluation metrics. The results demonstrate that deep learning methods from computer vision attain predictive skill in weather forecasting and downscaling. In particular, the generative models help to preserve small-scale details in the predictions.
Beyond generative models, the success of vision transformer (ViT) models and graph neural networks (GNNs) in computer vision has attracted tremendous attention recently, and they have sparked a performance revolution in weather forecasting. Models such as Pangu-Weather (Bi et al., 2022; Pathak et al., 2022; Lam et al., 2022) show the potential to surpass NWP models. In future work, we will explore tailored ViT models to better address meteorological problems.
Air pollution is the largest environmental cause of disease and premature death, resulting in more than 9 million premature deaths in 2015, several times more than AIDS, tuberculosis, and malaria combined [1-2]. While significant progress has been made in reproducing regional and global ozone fields and their attributions using chemical transport models and data assimilation techniques, including JPL’s Multi-mOdel Multi-cOnstituent Chemical data assimilation (MOMO-Chem) framework [3], it remains challenging to reproduce surface ozone, especially at the urban scales relevant for human health impact assessments. The NASA Jet Propulsion Laboratory’s Scientific Understanding from Data Science (SUDS) strategic initiative is designed to form interconnected teams from the data science and physical science communities, leveraging data science techniques to improve scientific research by revealing new connections in data. The work presented here encapsulates the subgrid-scale drivers of pollution SUDS project, focusing mainly on surface ozone and its precursors as inferred from model-based inference and machine learning (ML). Combining machine learning and data science techniques with the domain knowledge and science expertise of the SUDS team, the objective of this work is to improve our understanding and prediction of global air quality with machine learning.
Using approximately 80 physical and chemical parameters from MOMO-Chem and TOAR-2 observations as the input feature space, the global bias of maximum daily 8-h average (MDA8) ozone is predicted using a Random Forest regressor pipeline. This framework has been used to investigate global surface ozone variations and their drivers using explainable ML techniques (Miyazaki et al., in prep). Recent progress in the project has integrated open-source remote sensing data from Google Earth Engine (GEE), including MODIS land cover and population density data [4-5]. A generalizable data processing tool has been built for the ML pipeline that automatically extracts and processes data from Google Earth Engine, matching it to the MOMO-Chem global grid to generate features that improve ML model performance. Early results of bias prediction across the global TOAR-2 ground station network with the added GEE features show improvements in RMSE and R-squared of 4% and 15.5%, respectively, in the January experiments, and of 2% and 8% in the July experiments. In particular, the high-resolution MODIS data provided additional constraints that improve the representation of high-ozone events. Both the MODIS and population density features rank in the top 15 of the ML model's permutation importance, showing promise for the use of the added datasets in ozone bias prediction with Random Forests, and for the potential of leveraging open-source data to improve our understanding of the global drivers of air quality. Future work will leverage the GEE tool to ingest further high-resolution data, such as the MODIS Burned Area product and VIIRS nighttime composites [6-7], to better understand the subgrid-scale drivers of global ozone, which would contribute to the broader TOAR-2 community.
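Matching high-resolution GEE rasters to a coarser model grid amounts to block aggregation; a sketch follows (the grid sizes and the mean reducer are illustrative, and the actual processing tool may differ).

```python
import numpy as np

def coarsen_to_model_grid(field, factor, reducer=np.nanmean):
    """Aggregate a high-resolution field onto a coarser model grid by
    reducing each (factor x factor) block to a single value."""
    h, w = field.shape
    assert h % factor == 0 and w % factor == 0
    blocks = field.reshape(h // factor, factor, w // factor, factor)
    return reducer(blocks, axis=(1, 3))
```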
[1] R. Fuller, P. Landrigan, K. Balakrishnan, G. Bathan, S. Bose-O’Reilly, and M. Brauer, “Pollution and health: a progress update,” The Lancet Planetary Health, vol. 6, no. 6, pp. E535–E547, 2022.
[2] World Bank, “Topics - Pollution”. Online. Available: https://www.worldbank.org/en/topic/pollution
[3] Miyazaki, K., Bowman, K. W., Yumimoto, K., Walker, T., and Sudo, K.: Evaluation of a multi-model, multi-constituent assimilation framework for tropospheric chemical reanalysis, Atmos. Chem. Phys., 20, 931–967, https://doi.org/10.5194/acp-20-931-2020, 2020.
[4] Friedl, M., Sulla-Menashe, D. (2019). MCD12Q1 MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500m SIN Grid V006 [Data set]. NASA EOSDIS Land Processes DAAC. Accessed 2023-01-23 from https://doi.org/10.5067/MODIS/MCD12Q1.006
[5] Center for International Earth Science Information Network - CIESIN - Columbia University. 2018. Gridded Population of the World, Version 4 (GPWv4): Population Count, Revision 11. Palisades, New York: NASA Socioeconomic Data and Applications Center (SEDAC). https://doi.org/10.7927/H4JW8BX5.
[6] NASA EOSDIS Land Processes Distributed Active Archive Center (LP DAAC), MCD64A1.061 MODIS Burned Area Monthly Global 500m [Data set]. Accessed 2023-01-23 from https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MCD64A1
[7] C. D. Elvidge, K. E. Baugh, M. Zhizhin, and F.-C. Hsu, “Why VIIRS data are superior to DMSP for mapping nighttime lights,” Proceedings of the Asia-Pacific Advanced Network, vol. 35, p. 62, 2013.
The AtmoRep project asks whether one can train a single neural network that represents and describes all atmospheric dynamics. AtmoRep’s ambition is hence to demonstrate that the concept of large-scale representation learning, whose feasibility and potential were established in principle by large language models such as GPT-3, is also applicable to scientific data and in particular to atmospheric dynamics. The project is enabled by the large amounts of atmospheric observations that have been collected in the past as well as by advances in neural network architectures and self-supervised learning that allow for effective training on petabytes of data. Eventually, we aim to train on the full ERA5 reanalysis and, furthermore, to fine-tune on observational data such as satellite measurements to move beyond the limits of reanalyses.
We will provide an overview of the theoretical formulation of AtmoRep, of our transformer-based network architecture, and of the training protocol for self-supervised learning that allows unlabelled data such as reanalyses, simulation outputs and observations to be used for training and refining the network. We will also present the performance of AtmoRep for applications such as downscaling and forecasting and, furthermore, demonstrate that AtmoRep has substantial zero-shot skill, i.e., it is capable of performing well on tasks it was not trained for. Although not specifically designed for air quality forecasting and analysis, we will also explain why AtmoRep provides a powerful basis for such applications and how a pre-trained AtmoRep network can be adapted to the task at limited computational cost.
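The self-supervised protocol can be illustrated by the masked-reconstruction objective familiar from language models: hide random space-time patches of the data and score the network only on the hidden ones. A schematic numpy sketch; the masking fraction and zero-token corruption are illustrative choices, and `model` is a placeholder for the transformer.

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_reconstruction_loss(tokens, model, mask_frac=0.15):
    """Hide a random subset of space-time patches ("tokens"), let the
    network reconstruct the full field, and score only the hidden patches.
    `model` is any callable mapping the corrupted array to a reconstruction
    of the same shape."""
    n = tokens.shape[0]
    mask = rng.random(n) < mask_frac
    if not mask.any():                 # ensure at least one masked token
        mask[rng.integers(n)] = True
    corrupted = tokens.copy()
    corrupted[mask] = 0.0              # zero out the masked tokens
    recon = model(corrupted)
    # the error on masked positions is the self-supervised training signal
    return np.mean((recon[mask] - tokens[mask]) ** 2)
```

Because no labels are required, the same objective applies unchanged to reanalyses, simulation outputs and observations.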