the Creative Commons Attribution 4.0 License.
Wikimpacts 1.0: A new global climate impact database based on automated information extraction from Wikipedia
Abstract. Climate extremes like storms, heatwaves, wildfires, droughts and floods significantly threaten society and ecosystems. However, comprehensive data on the socio-economic impacts of climate extremes remains limited. Here we present Wikimpacts 1.0, a global climate impact database built by extracting information from Wikipedia using natural language processing. Our method identifies relevant articles, extracts the information using GPT4o, post-processes the information and consolidates the database. Impact data is stored at the event, national, and sub-national levels, covering 2,928 events from 1034 to 2024, with 20,186 national and 36,394 sub-national entries. The database shows low error scores (range from 0 to 1) for event-level information like timing (0.05), deaths (0.03), and economic damage (0.12), and slightly higher error scores for injuries (0.21), homelessness (0.25), displacement (0.29), and damaged buildings (0.28) compared to manually annotated data from 156 events. Wikimpacts 1.0 provides broader impact coverage on storms than EM-DAT at the sub-national level. In comparing impact values, 38 out of 234 matched events have identical data for deaths, and 7 of 94 for injuries. However, there are notable discrepancies in information on homelessness and damage. Our public database highlights the potential of natural language processing to complement existing impact datasets and to provide robust information on climate impacts.
Status: open (until 10 Dec 2025)
- RC1: 'Comment on egusphere-2025-4891', Anonymous Referee #1, 17 Nov 2025
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 215 | 34 | 8 | 257 | 20 | 7 | 8 |
General assessment
The topic is timely and relevant. A global, open, LLM-based impact database is of clear interest to NHESS readers. The manuscript reads as a data-and-methods paper describing a new dataset, its structure, extraction pipeline, evaluation, and example applications. This fits well within the journal’s scope for data-oriented contributions. I recommend major revisions to improve clarity, strengthen the discussion of limitations, and help users understand how to interpret and apply the dataset.
Major practical comment on the database: event–article mapping and potential misclassification
While inspecting the public database, I noticed several cases where impacts appear to be drawn from broad multi-event Wikipedia pages (e.g. “Tropical cyclones in 2017”) even when a dedicated single-event article exists (e.g. “Cyclone Numa”). This can, and in practice does, lead to incorrect country lists, the inclusion of non-impacted areas, or duplicate entries for the same event. The current manuscript does not fully explain how cross-references within multi-event articles are handled (for example, when one system contributes to another), nor how potential double-counting or mis-allocation of impacts is prevented. I recommend adding a subsection clarifying the filtering logic, giving examples of typical failure modes, and explaining whether any automatic or rule-based deduplication is applied during consolidation.
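To illustrate the kind of rule-based deduplication meant here, a minimal Python sketch follows. The record fields (`event_name`, `source_article`, `is_multi_event_page`), the preference rule, and the placeholder event "Storm X" are all hypothetical and are not taken from the Wikimpacts pipeline. The sketch only shows that preferring a dedicated article over a multi-event overview page for the same event is simple to express.

```python
from collections import defaultdict


def deduplicate_events(records):
    """Prefer dedicated-article records over multi-event-page records.

    Each record is a dict with hypothetical keys (assumptions, not the
    actual Wikimpacts schema):
      event_name          -- event name, e.g. "Cyclone Numa"
      source_article      -- title of the Wikipedia page the record came from
      is_multi_event_page -- True for overview pages such as season summaries
    """
    by_event = defaultdict(list)
    for rec in records:
        # Group records by a lightly normalised event name.
        by_event[rec["event_name"].strip().lower()].append(rec)

    kept = []
    for group in by_event.values():
        dedicated = [r for r in group if not r["is_multi_event_page"]]
        # Fall back to overview-page records only when no dedicated
        # article exists for this event.
        kept.extend(dedicated if dedicated else group)
    return kept


# Toy input: "Storm X" is a made-up event with no dedicated article.
records = [
    {"event_name": "Cyclone Numa", "source_article": "Cyclone Numa",
     "is_multi_event_page": False},
    {"event_name": "Cyclone Numa", "source_article": "Tropical cyclones in 2017",
     "is_multi_event_page": True},
    {"event_name": "Storm X", "source_article": "Tropical cyclones in 2017",
     "is_multi_event_page": True},
]
deduped = deduplicate_events(records)
```

A real pipeline would also need to resolve name variants and cases where one system contributes to another, which exact name matching does not handle.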
Specific comments on the article
Regarding the role of Wikimpacts 1.0 relative to existing datasets, it would help to clarify early in the Introduction whether the authors position this work as complementary to curated impact databases such as EM-DAT and DesInventar, or as an alternative source. It would also be useful to specify for which types of analyses the dataset is most appropriate (for example, global multi-hazard comparisons or exploratory sub-national studies) and which applications require caution (such as completeness-sensitive accounting of national losses or other variables).
Coverage and Wikipedia reliance: the requirement that an English Wikipedia article must exist implies notability and language biases. Small-scale or local events, or those in regions with limited Wikipedia activity, are surely under-represented. I suggest adding a short paragraph addressing this and explaining how users should interpret the absence of events. This also helps contextualise the patterns shown in Fig. 4.
L1-L3 definitions (Sect. 2 and Sect. 3.3.3): the manuscript would benefit from a clearer and more consistent description of what “location” means at each level. At L80, levels are defined as event (L1), country (L2), and sub-national (L3). Later wording in Sect. 3.3.3 is less precise. Restating the definitions once, with consistent terminology, will help users interpret the later evaluation of location accuracy.
Hazard types (Sect. 3.4): the exclusion of events that cannot be mapped to the seven main hazard categories, such as landslides, should be made explicit as a limitation. It would help readers to understand whether landslide impacts are entirely lost or whether some are absorbed under parent storm or flood events.
Phrase “extensive spatio-temporal coverage” (L58): this wording may overstate completeness given Wikipedia's known biases. Moderating this statement or directing the reader to the evaluation and limitations sections would avoid possible misinterpretation.
Regarding the example of the 2011 European floods, the manuscript notes that the main event was categorised as an extratropical cyclone, although a flood category may be more appropriate. This is a useful illustration of hazard-type ambiguity. It would help if the authors commented briefly on how common such cases are and whether simple rule-based corrections might reduce them.
Temporal coverage: the database extends from 1034 to 2024. The sharp rise in event counts after the 19th century makes it clear that older entries are sparse. Adding one sentence advising users how to interpret pre-1900 records (highly incomplete, not suitable for quantitative trend analysis) would improve clarity.
Evaluation (Sect. 4): the structure and intent of this section are valuable. It would help the reader if Sect. 3 indicated that the extraction quality is assessed in Sect. 4. Furthermore, Sect. 4 would benefit from a short explanation of how the 70 + 156 gold-standard events were sampled (randomly or stratified), since the sampling scheme may affect the generalisability of the error rates.
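The difference between the two sampling schemes can be sketched in a few lines of Python. The event list, the `hazard` field, and the per-hazard counts below are toy assumptions chosen for illustration, not properties of the Wikimpacts gold standard. The point is only that a simple random draw may miss rare hazard types, whereas a stratified draw guarantees each type is represented.

```python
import random
from collections import defaultdict


def stratified_sample(events, key, n_per_stratum, seed=0):
    """Draw up to n_per_stratum events from each stratum defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for ev in events:
        strata[ev[key]].append(ev)

    sample = []
    for name in sorted(strata):  # sorted for a deterministic stratum order
        members = strata[name]
        k = min(n_per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Toy population with deliberately unbalanced hazard types (made-up counts).
hazards = ["flood"] * 50 + ["storm"] * 40 + ["wildfire"] * 3
events = [{"id": i, "hazard": h} for i, h in enumerate(hazards)]

# Simple random sample: rare hazards may be absent from the draw.
simple = random.Random(0).sample(events, 9)

# Stratified sample: three events per hazard type.
stratified = stratified_sample(events, "hazard", 3)
```

Error rates estimated from a stratified sample would need to be reweighted by stratum size before being reported as database-wide figures.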
Regarding the interpretation of field-specific error rates, Table 6 reveals that location has a much higher error rate than other fields. The manuscript would be strengthened by explaining what types of errors dominate (for example, administrative-level mismatches, NULL penalties, or coordinate issues) and by offering guidance on how users should interpret L2 and L3 location fields. A short paragraph identifying which extracted fields are robust for most applications and which require caution would be particularly helpful.
The sentence “Within L1, event and timing data are highly accurate, while location data is less robust,” is not intuitive because L1 represents the aggregated event level. It would be useful to clarify what “location” means at L1 and why its accuracy is lower.
Comparison with EM-DAT: the manuscript would be improved by a more cautious framing of discrepancies. Differences may arise not only from extraction errors but also from different event definitions, thresholds, and loss components. I recommend explicitly positioning Wikimpacts as complementary to curated sources and offering guidance on how users might use them together.
Fig. 4: The distribution of events confirms that Wikimpacts 1.0 reflects high-impact, media-visible disasters rather than a full record of climate extremes. Making this explicit in the Discussion would prevent users from interpreting the dataset as complete.
Discussion: The existing section is strong, but could more directly address certain limitations:
• notability and language biases linked to Wikipedia;
• the deliberate exclusion of particular hazards (for example, landslides) and implications for multi-hazard studies;
• systematic weaknesses in LLM extraction for multi-country or compound hazards beyond aggregated error rates.
A short concluding paragraph offering explicit user guidance, indicating suitable and unsuitable use cases, would be valuable. For example, the database appears well-suited for global comparative studies, exploratory sub-national analyses where data exist, or cross-hazard synthesis, but less suitable for completeness-sensitive applications or studies focused on small, local events.
Minor comments
Regarding language and clarity, a few specific examples illustrate areas where editing would improve readability:
• The phrase “administrative units at the same level can also be highly variable” (L35) needs clarification as to what variability matters for impact analysis.
• The sentence beginning “Due to the categorization based on single hazards” (L40) would benefit from rephrasing for grammar and conceptual clarity.
• The reference to DesInventar at L42 would be clearer if the specific shared limitations were briefly stated.
• The last part of the Introduction (L55–70) blends methodological details that belong in Methods or Data Availability. Ending the Introduction with a clearer statement of objectives and contributions would strengthen the structure.
In Section 2, the opening paragraph focuses on repository and accessibility information rather than internal structure. Moving that material to a Data Availability section would allow Section 2 to begin more directly with the conceptual design. Similarly, some technical field-definition details could move to Supplementary Information.
Regarding abbreviations, SI at L88 should be defined on first use.
Regarding the example referring to 2025 Wikipedia data (L89), the manuscript could clarify how information “as of 2025” was obtained when the mining cut-off appears to be 2024. A brief explanation would prevent confusion. Consider adding a corresponding note to Figure 1.