IntelliAQ workshop on Machine Learning for Air Quality
from Monday, March 6, 2023 (8:00 AM) to Tuesday, March 7, 2023 (6:00 PM)
Monday, March 6, 2023
9:00 AM
9:00 AM - 9:30 AM
Room: Lobby 4th floor
9:30 AM
9:30 AM - 10:00 AM
Room: Lecture hall 4th floor
10:00 AM
A minute treatment of atmospheric chemistry
Paul Griffiths (National Centre for Atmospheric Science, Cambridge University)
10:00 AM - 10:40 AM
Room: Lecture hall 4th floor
10:40 AM
Coffee break
10:40 AM - 11:10 AM
Room: Lobby 4th floor
11:10 AM
The IntelliAQ project
Martin Schultz (JSC)
11:10 AM - 11:40 AM
This presentation describes the IntelliAQ project and its main achievements. The ERC Advanced grant IntelliAQ has been one of the first initiatives to explore the rapidly developing advanced deep learning concepts in the realm of air pollution research. IntelliAQ defined three core targets for which deep learning solutions should be developed: spatiotemporal interpolation, forecasting, and data quality control. Until now, the project has mainly focused on the first two objectives and on tropospheric ozone as a key air pollutant. We expect, however, to broaden the scope during the remaining time of the project. Besides the scientific development of deep learning applications, the project has also contributed to building scalable and reproducible machine learning workflows, and it has allowed for the development of a modern, FAIR and open data infrastructure for global air pollution data.
11:40 AM
Evaluating the multi-task learning approach for land use regression modelling of air pollution
(Machine learning applications)
Anna Krause (University of Wuerzburg, Chair for Computer Science X Data Science)
11:40 AM - 12:00 PM
Air pollution has been linked to several health problems including heart disease, stroke and lung cancer. Modelling and analyzing this dependency requires reliable and accurate air pollutant measurements collected by stationary air monitoring stations. However, usually only a low number of such stations are present within a single city. To retrieve pollution concentrations for unmeasured locations, researchers rely on land use regression (LUR) models. Those models are typically developed for one pollutant only. However, as results in different areas have shown, modelling several related output variables through multi-task learning can improve the prediction results of the models significantly. In this work, we compared prediction results from single-task and multi-task learning multilayer perceptron models on measurements taken from the OpenSense dataset and the London Atmospheric Emissions Inventory dataset. LUR features were generated from OpenStreetMap using OpenLUR and used to train hard parameter sharing multilayer perceptron models. The results show multi-task learning with sufficient data significantly improves the performance of a LUR model.
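As an illustration of the hard parameter sharing idea named in the abstract above, the following is a minimal NumPy sketch and not the authors' implementation: one hidden layer is shared by all pollutants, and each pollutant gets its own linear output head. The layer sizes, the ReLU activation, the initialisation, and all function names are assumptions made for this example.

```python
import numpy as np


def init_params(n_features, n_hidden, n_tasks, seed=0):
    """Create weights for one shared hidden layer and one linear head per task.

    All sizes and the initialisation scale are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    return {
        "W_shared": rng.normal(0.0, 0.1, (n_features, n_hidden)),
        "b_shared": np.zeros(n_hidden),
        "W_heads": rng.normal(0.0, 0.1, (n_tasks, n_hidden)),
        "b_heads": np.zeros(n_tasks),
    }


def forward(params, X):
    """Return predictions of shape (n_samples, n_tasks)."""
    # Shared representation: every task (pollutant) reads the same hidden layer.
    hidden = np.maximum(0.0, X @ params["W_shared"] + params["b_shared"])
    # Task-specific heads: one row of W_heads per pollutant.
    return hidden @ params["W_heads"].T + params["b_heads"]


def multitask_mse(params, X, Y):
    """Mean squared error averaged over all samples and all tasks."""
    pred = forward(params, X)
    return float(np.mean((pred - Y) ** 2))
```

Because the loss averages over all tasks, a gradient step on it would update the shared layer with information from every pollutant, which is the mechanism multi-task learning relies on. A single-task baseline corresponds to `n_tasks=1`.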
12:00 PM
Graph machine learning for improved imputation of missing tropospheric ozone data
(Machine learning applications)
Clara Betancourt (Forschungszentrum Jülich)
12:00 PM - 12:20 PM
Room: Lecture hall 4th floor and lobby
Gaps in the measurement series of atmospheric pollutants can impede the reliable assessment of their impacts and trends. Data imputation methods to close the gaps in the observation series range from simple linear interpolation to machine learning. We propose a new method for missing data imputation of the air pollutant tropospheric ozone by using the graph machine learning algorithm ‘correct and smooth’. This algorithm uses auxiliary data that characterize the measurement location and, in addition, ozone observations at neighboring sites. Specifically, we apply this method to the missing data of a preliminary dataset from 278 stations of the year 2011 of the German Environment Agency (Umweltbundesamt - UBA) monitoring network. These data exhibit three distinct, frequently occurring gap patterns: shorter gaps in the range of hours, longer gaps of up to several months in length, and gaps occurring at multiple stations at once. We apply correct and smooth as a post-processing algorithm after imputing the missing data with different statistical and machine learning methods. For short gaps of up to five hours, linear interpolation is most accurate with $R^2$ values of 0.91 - 0.97, RMSEs of 2.43 - 4.44 ppb, and indexes of agreement of 0.98 - 0.99. Longer gaps at single stations are most effectively imputed by a random forest in connection with correct and smooth, with $R^2$ values of 0.86 - 0.87, RMSEs of 5.64 - 6.18 ppb, and an index of agreement of 0.96. This case exhibits strong improvement through the correct and smooth algorithm, as the RMSEs decreased by 0.57 - 0.76 ppb compared to the random forest alone. For longer gaps at multiple stations, the correct and smooth algorithm improved the random forest RMSE by 0.07 ppb, despite a lack of data in the neighborhood of the missing values. Based on these results, we suggest applying a hybrid of linear interpolation and graph machine learning for the imputation of tropospheric ozone time series.
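The short-gap baseline and the three scores quoted in the abstract above can be sketched as follows. This is a hedged illustration, not the authors' code: the abstract does not state which definitions were used, so the example assumes $R^2$ is the coefficient of determination and the index of agreement follows Willmott's original form. All function names are hypothetical.

```python
import numpy as np


def interpolate_gaps(values):
    """Fill NaN gaps in a 1-D series by linear interpolation between valid points."""
    values = np.asarray(values, dtype=float)
    idx = np.arange(values.size)
    valid = ~np.isnan(values)
    filled = values.copy()
    # np.interp holds the edge value constant for gaps at the series boundaries.
    filled[~valid] = np.interp(idx[~valid], idx[valid], values[valid])
    return filled


def rmse(obs, pred):
    """Root mean squared error."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement d (assumed variant)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    obs_mean = obs.mean()
    num = np.sum((pred - obs) ** 2)
    den = np.sum((np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    return float(1.0 - num / den)
```

In an evaluation of this kind, gaps are typically cut artificially from complete data so that the imputed values can be scored against the withheld observations.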
12:20 PM
12:20 PM - 12:30 PM
Room: Lecture hall 4th floor
12:30 PM
Lunch break
12:30 PM - 1:30 PM
Room: Lobby 4th floor
1:30 PM
Evaluating the causal effects of interventions on air quality using machine learning and synthetic control approaches
(Machine learning applications)
Zongbo Shi (University of Birmingham)
1:30 PM - 1:50 PM
“Air quality interventions” refer to any actions that may lead to a change in air quality, whether intentional, such as clean air zones, or unintentional, such as COVID-19 lockdowns or Net Zero actions. Quantifying the impact of “interventions” on air quality is one of the key processes in air quality management. Observational data from monitoring networks are often used for assessing the air quality effectiveness of interventions. However, air pollution levels do not change linearly with emissions due to variations in weather conditions and chemical processes. Furthermore, they change on a seasonal and year-by-year basis at a specific location. Here, we will present our studies on the changes in air pollutant concentrations arising from emission changes due to clean air actions (such as clean air zones and clean heating policies) and the COVID-19 lockdowns, based on a machine learning technique and a “synthetic control” method. These methods are able to detect sudden decreases in air pollutant concentrations due to “interventions”. They also provide a quantitative evaluation of the “causal” effects of the interventions.
1:50 PM
Forecasting daily ozone air pollution across Europe with transformers
(Machine learning applications)
Sebastian Hickman (Cambridge University)
1:50 PM - 2:10 PM
Room: Lecture hall 4th floor and lobby
Surface ozone is an air pollutant that contributes to hundreds of thousands of premature deaths annually. Accurate short-term ozone forecasts may allow improved policy actions to reduce the risk to health, such as accurate and timely air quality warnings. However, forecasting surface ozone is a difficult problem, as its concentrations are controlled by a number of physical and chemical processes which act on varying timescales. Accounting for these temporal dependencies appropriately is a promising avenue to provide more accurate ozone forecasts. We therefore implement a state-of-the-art transformer-based model, the Temporal Fusion Transformer, trained on observational data from three European countries. In four-day forecasts of daily maximum 8-hour ozone (DMA8), our novel approach is highly skilful (MAE = 4.9 ppb, coefficient of determination R$^2$ = 0.81), and generalises well to data from 13 European countries unseen during training (MAE = 5.0 ppb, R$^2$ = 0.78). The model outperforms standard machine learning models on our data, and compares favourably to the performance of other published deep learning architectures tested on different data. Furthermore, we illustrate that the model pays attention to physical variables known to control ozone concentrations, and that the attention mechanism allows the model to use relevant days of past ozone concentrations to make accurate forecasts.
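For readers unfamiliar with the forecast target and the error score quoted in the abstract above, here is a simplified sketch of a daily maximum 8-hour mean ozone value and of the mean absolute error (MAE). It assumes exactly 24 complete hourly values and only uses 8-hour windows that lie within the same calendar day. Operational definitions differ in window alignment and data-coverage rules, and the abstract does not say which one was used. Function names are hypothetical.

```python
import numpy as np


def daily_max_8h_mean(hourly):
    """Maximum of the running 8-hour means within one day of hourly values."""
    hourly = np.asarray(hourly, dtype=float)
    if hourly.size != 24:
        raise ValueError("expected exactly 24 hourly values")
    # mode="valid" keeps only windows fully inside the day (17 windows).
    window_means = np.convolve(hourly, np.ones(8) / 8.0, mode="valid")
    return float(window_means.max())


def mae(obs, pred):
    """Mean absolute error between observations and forecasts."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.mean(np.abs(pred - obs)))
```

For example, a day with 20 ppb throughout except for an eight-hour afternoon plateau of 80 ppb yields a value of 80 ppb, since one window covers the plateau exactly.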
2:10 PM
Driving mechanisms of global surface ozone and its bias in the chemical reanalysis products using machine-learning approach
(Machine learning applications)
Kazuyuki Miyazaki (NASA JPL)
2:10 PM - 2:30 PM
Room: Lecture hall 4th floor and lobby
Providing accurate global estimates of pollution is essential to evaluate the global public health burden of disease associated with air pollution exposure, which in turn will help environmental policy making. Nevertheless, our current knowledge of air pollution suffers from large biases in model predictions and insufficient information from the current observing system that includes surface in situ and satellite measurements. Chemical data assimilation (DA) has made substantial progress in reproducing regional and global ozone and health impacts. Nevertheless, chemical DA is fundamentally limited by model fidelity both in terms of representation resolution and unresolved processes that can exhibit systematic patterns in model and data mismatches. Thus, the analysis and prediction of air quality at regional scales has been stymied by unresolved and highly non-linear physical and chemical processes. In this study, we develop and utilize a novel, explainable machine learning (ML) model to break down regional bias dependence and provide scientific interpretation of ozone and its bias drivers by analyzing a large set of exogenous and input data sources provided by chemical DA and various observations. By doing so, we aim to provide new scientific insights into the factors that control bias in air quality assessment, and the drivers of global ozone trends and their impact on global air quality. We obtained the differences between the analysis from JPL’s chemical DA system, MOMO-Chem, and the available independent observations from the surface Tropospheric Ozone Assessment report (TOAR) network for ozone for 2011-2015. These differences are used as the outputs of interest in the ML model, while various global chemical concentrations and meteorological (>100) parameters from MOMO-Chem, as well as high-resolution satellite measurements of relevant parameters, are used as inputs into the ML model. 
A regression-tree randomized ensemble ML approach was applied to model the patterns of the residual bias of the MOMO-Chem output. Importantly, we extracted and combined local and global measures of how inputs affected the predicted bias in MOMO-Chem output, therefore, providing ML model explanations and quantification of the impacts. Subsequently, we are able to identify a set of distinct areas and their corresponding drivers, which in turn will shed light on both resolved and unresolved model physical and chemical processes and their importance to air pollutant predictions. Our analysis suggests that the developed ML framework is able to predict the overall model ozone bias magnitude and variations over North America, Europe, and East Asia. Our results also highlight that adding high-resolution satellite data, MODIS land cover data in this case, provides additional important constraints to improve the ML prediction, especially for highly polluted events. The framework is then used to identify global patterns of surface ozone and physical model bias drivers, demonstrating substantial contributions from parameters related to, for example, PBL mixing, sea breeze, photolysis rates, precursors, and topography, with significant seasonal and regional variations. This study offers a unique synthesis of model-based inference and explainable ML techniques for chemical transport modeling and data assimilation to identify regionally-dependent and process-level mechanisms driving near-surface pollution and correct for their impact on air quality predictions.
2:30 PM
Thematic discussion
2:30 PM - 3:00 PM
Room: Lecture hall 4th floor
3:00 PM
Short interventions and discussion
3:00 PM - 3:30 PM
3:30 PM
Coffee break
3:30 PM - 4:00 PM
Room: Lobby 4th floor
4:00 PM
Deep learning for weather prediction
(Machine learning applications)
Bing Gong (Jülich Supercomputing Center)
4:00 PM - 4:30 PM
Weather and air pollution have much in common and if it is possible to forecast weather with deep neural networks, it should also be possible to forecast air pollution. This is why, in the IntelliAQ project, we have focused on weather forecasting to explore deep learning methods from video prediction. The presentation will cover results from IntelliAQ as well as from the European MAELSTROM project. Accurate weather predictions are essential for many aspects of society. Nowadays, weather prediction relies heavily on numerical weather prediction (NWP) models, which require huge computational resources. Recently, the potential of deep neural networks to generate bespoke weather forecasts has been explored in a couple of scientific studies inspired by the successful applications in the computer vision domain. The super-resolution task, which aims to project a low-resolution field to a high-resolution field, is somewhat analogous to downscaling, and video prediction is similar to weather forecasting in the meteorological domain. Inspired by this, we present three case studies exploring state-of-the-art deep learning approaches for short-term weather forecasting and downscaling of 2 m temperature and precipitation. In the first study, we focus on the predictability of the diurnal cycle of near-surface temperatures. A ConvLSTM and an advanced generative network, the Stochastic Adversarial Video Prediction (SAVP) model, are applied to forecast the 2 m temperature for the next 12 hours over Europe. Results show that SAVP is significantly superior to the ConvLSTM model in terms of several evaluation metrics. Our study also investigates the sensitivity to the input data in terms of selected predictors, domain size, and the number of training samples. The results demonstrate that additional predictors, i.e., in our case the total cloud cover and the 850 hPa temperature, enhance the forecast quality. The model can also benefit from a larger spatial domain.
By contrast, the effect of reducing the training dataset length from 11 to 8 years is rather small. Furthermore, we reveal a small trade-off between the MSE and the spatial variability of the forecasts when tuning the weight of the L1-loss component in the SAVP model. In the second study, we explore a custom-tailored GAN-based architecture for precipitation nowcasting. The prediction of precipitation patterns at a high spatiotemporal resolution up to two hours ahead, also known as precipitation nowcasting, is of great relevance in weather-dependent decision-making and early warning systems. We developed a novel method named Convolutional Long Short-Term Memory Generative Adversarial Network (CLGAN) to improve the nowcasting skill for heavy rain events with deep neural networks. The model constitutes a GAN architecture whose generator is built upon a U-shaped encoder-decoder network (U-Net) equipped with recurrent LSTM cells to capture spatiotemporal features. A comprehensive comparison between CLGAN, another advanced video prediction model, PredRNN-v2, and the optical flow model DenseRotation is performed. We show that CLGAN outperforms both in terms of point-by-point metrics as well as scores for dichotomous events and object-based diagnostics. The results encourage future work based on the proposed CLGAN architecture to improve the accuracy of precipitation nowcasting systems further. In the last case study, we make use of deep neural networks with a super-resolution approach for statistical precipitation downscaling. We apply the Swin transformer architecture (SwinIR) as well as a convolutional neural network (U-Net) with a Generative Adversarial Network (GAN) and a diffusion component for probabilistic downscaling. We use short-range forecasts from the Integrated Forecast System (IFS) on a regular spherical grid with $\Delta x_{\mathrm{IFS}} = 0.1^\circ$ and map to the high-resolution observation radar data RADKLIM ($\Delta x_{\mathrm{RK}} = 0.01^\circ$).
The neural networks are fed with nine static and dynamic predictors similar to the study by Harris et al., 2022. All the models are comprehensively evaluated by grid point-level errors as well as error metrics for spatial variability and the generated probability distribution. Our results demonstrate that the Swin Transformer model can improve accuracy with lower computation cost compared to the U-Net architecture. The GAN and diffusion models both further help the model to capture the strong spatial variability from the observed data. Our results encourage further development of DNNs that can be potentially leveraged to downscale other challenging Earth system data, such as cloud cover or wind. In our study, we investigate the performance of state-of-the-art deep learning models, with a focus on generative models, for weather forecasting and downscaling tasks. We analyzed the performance of the deep learning models in depth using various evaluation metrics. The results demonstrate that deep learning from computer vision attains some predictive skill in weather forecasting and downscaling. In particular, the generative models can help to preserve small-scale details of the prediction. Beyond the generative models, the success of vision transformer models (ViT) and graph neural networks (GNN) in image generation has gained tremendous attention recently. They have also sparked a huge performance revolution in weather forecasting. Models such as Pangu-Weather (Bi et al., 2022; Pathak et al., 2022; Lam, 2022) show the potential to surpass NWP models. In future work, we will explore tailored ViT models to better address meteorological problems.
4:30 PM
4:30 PM - 5:00 PM
Room: Lobby 4th floor
7:30 PM
Conference dinner
7:30 PM - 10:30 PM
Room: https://www.mashery-hummus.de/
Tuesday, March 7, 2023
10:00 AM
Integration of open-source remote sensing data for the investigation of subgrid scale drivers of pollution inferred from model-based inference and machine learning
(Machine learning applications)
Kelsey Doerksen (University of Oxford)
10:00 AM - 10:20 AM
Air pollution is the largest environmental cause of disease and premature death, resulting in more than 9 million premature deaths in 2015 - several times more than from AIDS, tuberculosis, and malaria combined [1-2]. While significant progress has been made in reproducing regional and global ozone fields and their attributions using chemical transport models and data assimilation techniques, including JPL’s Multi-mOdel Multi-cOnstituent Chemical data assimilation (MOMO-Chem) framework [3], there is still a challenge to reproduce surface ozone, especially at urban scales relevant for human health impact assessments. The NASA Jet Propulsion Laboratory’s Scientific Understanding from Data Science (SUDS) strategic initiative is designed to form interconnected teams between the data science and physical science communities to leverage data science techniques for improving scientific research through revealing new connections in data. The work presented here encapsulates the subgrid scale drivers of pollution SUDS project, mainly focusing on surface ozone and its precursors, inferred from model-based inference and machine learning (ML). Combining machine learning and data science techniques with the domain knowledge and science expertise from the SUDS team, the objectives of this work are to improve our understanding and prediction of global air quality with machine learning. Using approximately 80 physical and chemical parameters from MOMO-Chem and TOAR-2 observations as the input feature space, maximum daily average 8 h (MDA8) ozone global bias is predicted using a Random Forest Regressor pipeline. This framework has been used to investigate global surface ozone variations and their drivers using explainable ML techniques (Miyazaki et al., in prep). Recent progress in the project has integrated open-source remote sensing data from Google Earth Engine (GEE), including MODIS land cover and Population Density data [4-5]. 
A generalizable data processing tool has been built for the ML pipeline to automatically extract and process data matching the MOMO-Chem global grid from Google Earth Engine to generate features to improve ML model performance. Early results of bias prediction across the global TOAR-2 ground station network with the added GEE features show an RMSE and R-squared performance improvement, with respective improvements of 4% and 15.5% in the January experiments and 2% and 8% in the July experiments. In particular, the high-resolution MODIS data provided additional constraints to improve the representation of high ozone events. Both the MODIS and Population Density features are ranked in the top 15 of permutation importance by the ML model, showing promise for the use of the added datasets in ozone bias prediction with Random Forest, and the potential of leveraging open-source data to improve our understanding of the global drivers of air quality. Future work will include leveraging the GEE tool to utilize further high-resolution data, such as the MODIS Burned Pixel Area Product and the VIIRS Nighttime Composites [6-7] imagery, to better understand the subgrid-scale drivers of global ozone, which would contribute to the broader TOAR-2 community.
References
[1] R. Fuller, P. Landrigan, K. Balakrishnan, G. Bathan, S. Bose-O’Reilly, and M. Brauer, “Pollution and health: a progress update,” The Lancet Planetary Health, vol. 6, no. 6, pp. E535–E547, 2022.
[2] World Bank, “Topics - Pollution”, Online. Available: https://www.worldbank.org/en/topic/pollution
[3] Miyazaki, K., Bowman, K. W., Yumimoto, K., Walker, T., and Sudo, K.: Evaluation of a multi-model, multi-constituent assimilation framework for tropospheric chemical reanalysis, Atmos. Chem. Phys., 20, 931–967, https://doi.org/10.5194/acp-20-931-2020, 2020.
[4] Friedl, M., Sulla-Menashe, D. (2019). MCD12Q1 MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500m SIN Grid V006 [Data set]. NASA EOSDIS Land Processes DAAC. Accessed 2023-01-23 from https://doi.org/10.5067/MODIS/MCD12Q1.006
[5] Center for International Earth Science Information Network - CIESIN - Columbia University. 2018. Gridded Population of the World, Version 4 (GPWv4): Population Count, Revision 11. Palisades, New York: NASA Socioeconomic Data and Applications Center (SEDAC). https://doi.org/10.7927/H4JW8BX5.
[6] NASA EOSDIS Land Processes Distributed Active Archive Center (LP DAAC), MCD64A1.061 MODIS Burned Area Monthly Global 500m [Data set]. Accessed 2023-01-23 from https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MCD64A1
[7] C. D. Elvidge, K. E. Baugh, M. Zhizhin, and F.-C. Hsu, “Why VIIRS data are superior to DMSP for mapping nighttime lights,” Asia-Pacific Advanced Network 35, vol. 35, p. 62, 2013.
10:20 AM
AtmoRep: Large Scale Representation Learning for Atmospheric Data
(Machine learning core)
Christian Lessig (Otto-von-Guericke-Universität Magdeburg)
10:20 AM - 11:00 AM
The AtmoRep project asks if one can train one neural network that represents and describes all atmospheric dynamics. AtmoRep’s ambition is hence to demonstrate that the concept of large-scale representation learning, whose feasibility in principle and potential were established by large language models such as GPT-3, is also applicable to scientific data and in particular to atmospheric dynamics. The project is enabled by the large amounts of atmospheric observations that have been made in the past as well as advances in neural network architectures and self-supervised learning that allow for effective training on petabytes of data. Eventually, we aim to train on all of the ERA5 reanalysis and, furthermore, fine-tune on observational data such as satellite measurements to move beyond the limits of reanalyses. We will provide an overview of the theoretical formulation of AtmoRep, of our transformer-based network architecture, and of the training protocol for self-supervised learning that allows for unlabelled data such as reanalyses, simulation outputs and observations to be used for training and refining the network. We will also present the performance of AtmoRep for applications such as downscaling and forecasting and, furthermore, demonstrate that AtmoRep has substantial zero-shot skill, i.e., it is capable of performing well on tasks it was not trained for. Although AtmoRep is not specifically designed for air quality forecasting and analysis, we will also explain why it provides a powerful basis for it and how a pre-trained AtmoRep network can be adapted for the task with limited computational costs.
11:00 AM
Coffee break
11:00 AM - 11:30 AM
11:30 AM
Short interventions and discussion
11:30 AM - 12:00 PM
12:00 PM
Concluding remarks: final discussion
Martin Schultz (JSC)
Paul Griffiths (National Centre for Atmospheric Science, Cambridge University)
12:00 PM - 12:30 PM
12:30 PM
Lunch break
12:30 PM - 1:30 PM
Room: Lobby 4th floor
1:30 PM
1:30 PM - 3:20 PM
Room: Lecture hall 4th floor
3:20 PM
Coffee Break
3:20 PM - 3:50 PM
Room: Lobby 4th floor
3:50 PM
3:50 PM - 5:00 PM
Room: Lecture hall 4th floor