https://doi.org/10.48550/arXiv.2407.02089
10 Oct 2024
Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

GPTCast: a weather language model for precipitation nowcasting

Gabriele Franch, Elena Tomasi, Rishabh Wanjari, Virginia Poli, Chiara Cardinali, Pier Paolo Alberoni, and Marco Cristoforetti

Abstract. This work introduces GPTCast, a generative deep-learning method for ensemble nowcasting of radar-based precipitation, inspired by advancements in large language models (LLMs). We employ a GPT model as a forecaster to learn spatiotemporal precipitation dynamics using tokenized radar images. The tokenizer is based on a Quantized Variational Autoencoder featuring a novel reconstruction loss tailored to the skewed distribution of precipitation, which promotes faithful reconstruction of high rainfall rates. The approach produces realistic ensemble forecasts and provides probabilistic outputs with accurate uncertainty estimation. The model is trained without resorting to randomness: all variability is learned solely from the data and exposed by the model at inference for ensemble generation. We train and test GPTCast using a 6-year radar dataset over the Emilia-Romagna region in Northern Italy, showing superior results compared to state-of-the-art ensemble extrapolation methods.
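
As an illustration of how sampling from a learned token distribution at inference can yield an ensemble and a probabilistic output, here is a minimal toy sketch. It is not the GPTCast implementation: the 4-entry CODEBOOK, the TRANSITIONS table (a stand-in for the forecaster that conditions only on the previous token) and all function names are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical 4-token codebook: token id -> representative rain rate (mm/h).
CODEBOOK = [0.0, 1.0, 5.0, 20.0]

# Toy stand-in for a forecaster: P(next token | current token).
# An actual model would condition on a long spatiotemporal token context.
TRANSITIONS = [
    [0.80, 0.15, 0.04, 0.01],
    [0.30, 0.50, 0.15, 0.05],
    [0.10, 0.30, 0.45, 0.15],
    [0.05, 0.15, 0.40, 0.40],
]


def sample_member(start_token, steps, rng):
    """Autoregressively sample one token sequence (one ensemble member)."""
    tokens = [start_token]
    for _ in range(steps):
        probs = TRANSITIONS[tokens[-1]]
        # Draw the next token from the predicted categorical distribution.
        next_token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        tokens.append(next_token)
    return tokens[1:]  # drop the conditioning token


def ensemble_forecast(start_token, steps, members, seed=0):
    """Generate several members; they differ only through sampling."""
    rng = random.Random(seed)
    return [sample_member(start_token, steps, rng) for _ in range(members)]


def exceedance_probability(ensemble, threshold):
    """Per lead time, the fraction of members whose decoded rate exceeds threshold."""
    steps = len(ensemble[0])
    return [
        sum(CODEBOOK[member[t]] > threshold for member in ensemble) / len(ensemble)
        for t in range(steps)
    ]


if __name__ == "__main__":
    ens = ensemble_forecast(start_token=2, steps=6, members=50, seed=1)
    print(exceedance_probability(ens, threshold=4.0))
```

In this sketch the spread between members comes entirely from drawing tokens from the predicted distribution, and the per-step exceedance fraction is one simple way to turn such an ensemble into a probabilistic forecast.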

Status: open (extended)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CEC1: 'Comment on egusphere-2024-3002', Juan Antonio Añel, 30 Oct 2024
    • EC1: 'Reply on CEC1', David Topping, 30 Oct 2024
      • CEC2: 'Reply on EC1', Juan Antonio Añel, 31 Oct 2024
        • AC1: 'Reply on CEC2', Gabriele Franch, 01 Nov 2024
  • RC1: 'Review of GPTCast - LLMs meet nowcasting', Anonymous Referee #1, 26 Dec 2024

Data sets

Dataset for "GPTCast: a weather language model for precipitation nowcasting" Gabriele Franch, Elena Tomasi, Chiara Cardinali, Virginia Poli, Pier Paolo Alberoni, and Marco Cristoforetti https://doi.org/10.5281/zenodo.13692016

Model code and software

Code for "GPTCast: a weather language model for precipitation nowcasting" Gabriele Franch, Elena Tomasi, and Marco Cristoforetti https://doi.org/10.5281/zenodo.13832526

Interactive computing environment

Jupyter Notebooks for "GPTCast: a weather language model for precipitation nowcasting" Gabriele Franch, Elena Tomasi, and Marco Cristoforetti https://github.com/DSIP-FBK/GPTCast/tree/main/notebooks

Viewed

Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.

Total article views: 181 (including HTML, PDF, and XML)

Latest update: 21 Jan 2025
Short summary
Our research introduces GPTCast, a novel method for very short-term precipitation forecasting using radar data. By applying advanced machine learning techniques inspired by large language models, we developed a system that generates accurate and realistic weather predictions. We trained the model using six years of radar data from Northern Italy, demonstrating its superior performance over leading ensemble extrapolation methods.