Preprints
https://doi.org/10.5194/egusphere-2025-4891
29 Oct 2025

Wikimpacts 1.0: A new global climate impact database based on automated information extraction from Wikipedia

Ni Li, Wim Thiery, Shorouq Zahra, Mariana Madruga de Brito, Koffi Worou, Murathan Kurfalı, Seppe Lampe, Paul Muñoz, Clare Flynn, Camila Trigoso, Joakim Nivre, Jakob Zscheischler, and Gabriele Messori

Abstract. Climate extremes such as storms, heatwaves, wildfires, droughts and floods significantly threaten society and ecosystems. However, comprehensive data on the socio-economic impacts of climate extremes remain limited. Here we present Wikimpacts 1.0, a global climate impact database built by extracting information from Wikipedia using natural language processing. Our method identifies relevant articles, extracts the information using GPT4o, post-processes the information and consolidates the database. Impact data are stored at the event, national, and sub-national levels, covering 2,928 events from 1034 to 2024, with 20,186 national and 36,394 sub-national entries. Evaluated against manually annotated data from 156 events, the database shows low error scores (on a scale from 0 to 1) for event-level information such as timing (0.05), deaths (0.03), and economic damage (0.12), and slightly higher error scores for injuries (0.21), homelessness (0.25), displacement (0.29), and damaged buildings (0.28). Wikimpacts 1.0 provides broader impact coverage on storms than EM-DAT at the sub-national level. In a comparison of impact values, 38 of 234 matched events have identical data for deaths, and 7 of 94 for injuries. However, there are notable discrepancies in information on homelessness and damage. Our public database highlights the potential of natural language processing to complement existing impact datasets and to provide robust information on climate impacts.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Journal article(s) based on this preprint

04 Jun 2026
| Highlight paper
Wikimpacts 1.0: a new global climate impact database based on automated information extraction from Wikipedia
Ni Li, Wim Thiery, Shorouq Zahra, Mariana Madruga de Brito, Koffi Worou, Murathan Kurfalı, Seppe Lampe, Paul Muñoz, Clare Flynn, Camila Trigoso, Joakim Nivre, Jakob Zscheischler, and Gabriele Messori
Nat. Hazards Earth Syst. Sci., 26, 2609–2636, https://doi.org/10.5194/nhess-26-2609-2026, 2026

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2025-4891', Anonymous Referee #1, 17 Nov 2025
    • AC1: 'Reply on RC1', Ni Li, 06 Jan 2026
  • RC2: 'Comment on egusphere-2025-4891', Anonymous Referee #2, 21 Nov 2025
    • AC2: 'Reply on RC2', Ni Li, 06 Jan 2026

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
ED: Reconsider after major revisions (further review by editor and referees) (19 Jan 2026) by Christos Giannaros
AR by Ni Li on behalf of the Authors (24 Feb 2026): Author's response, Author's tracked changes, Manuscript
ED: Referee Nomination & Report Request started (04 Mar 2026) by Christos Giannaros
RR by Anonymous Referee #1 (06 Mar 2026)
RR by Anonymous Referee #2 (22 Mar 2026)
ED: Publish subject to minor revisions (review by editor) (31 Mar 2026) by Christos Giannaros
AR by Ni Li on behalf of the Authors (10 Apr 2026): Author's response, Author's tracked changes, Manuscript
ED: Publish as is (21 Apr 2026) by Christos Giannaros
AR by Ni Li on behalf of the Authors (06 May 2026): Manuscript

Viewed

Total article views: 6,094 (including HTML, PDF, and XML)
  • HTML: 3,505
  • PDF: 2,364
  • XML: 225
  • Total: 6,094
  • Supplement: 368
  • BibTeX: 211
  • EndNote: 182
Views and downloads calculated since 29 Oct 2025.

Viewed (geographical distribution)

Total article views: 6,023 (including HTML, PDF, and XML) Thereof 6,023 with geography defined and 0 with unknown origin.
Latest update: 08 Jun 2026

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
Climate extremes threaten society and ecosystems. Understanding their impacts is critical, yet despite open databases such as EM-DAT and DesInventar, reliable impact data remain scattered across various text sources. Wikimpacts 1.0, built using GPT4o, provides comprehensive socio-economic impact data on 2,928 events from 1034 to 2024. It offers broader storm coverage and finer spatial resolution of impact data than EM-DAT, showcasing the potential of natural language processing to enhance climate impact datasets.