BuRNN (v1.0): A Data-Driven Fire Model
Abstract. Fires play an important role in the Earth system but remain complex phenomena that are challenging to model numerically. Here, we present the first version of BuRNN, a data-driven model simulating burned area on a global 0.5° × 0.5° grid with a monthly time resolution. We trained Long Short-Term Memory networks to predict satellite-based burned area (GFED5) from a range of climatic, vegetation and socio-economic parameters. We employed a region-based cross-validation strategy to account for the high spatial autocorrelation in our data. BuRNN outperforms the process-based fire models participating in ISIMIP3a on a global scale across a wide range of metrics. Regionally, BuRNN outperforms almost all models across a set of benchmarking metrics in all regions. However, in the African savannah regions and Australia, burned area is underestimated, leading to a global underestimation of total area burned. Through eXplainable AI (XAI) we unravel the differences in regional drivers of burned area in our models, showing that the presence/absence of bare ground and C4 grasses, along with the fire weather index, have the largest effects on our predictions of burned area. Lastly, we used BuRNN to reconstruct global burned area for 1901–2019 and compared the simulations against independent long-term historical fire observation databases in five countries and the EU. Our approach highlights the potential of machine learning to improve burned area simulations and our understanding of past fire behaviour.
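To make the setup concrete, the following is a purely illustrative sketch of an LSTM regressor of the kind described in the abstract, written in PyTorch; the layer sizes, sequence length and feature count are assumptions for the example, not the actual BuRNN configuration.

```python
# Purely illustrative LSTM regressor sketch; hidden size, feature count and the
# 36-month window are assumptions, not the authors' BuRNN architecture.
import torch
import torch.nn as nn

class BurnedAreaLSTM(nn.Module):
    def __init__(self, n_features: int, hidden_size: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                 # x: (batch, months, n_features)
        out, _ = self.lstm(x)             # out: (batch, months, hidden_size)
        return self.head(out[:, -1, :])   # burned area of the final month

# Example: a batch of 8 grid cells, 36 monthly time steps, 20 predictors
model = BurnedAreaLSTM(n_features=20)
dummy = torch.randn(8, 36, 20)
print(model(dummy).shape)                 # torch.Size([8, 1])
```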
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-3550', Anonymous Referee #1, 03 Oct 2025
- CEC1: 'Comment on egusphere-2025-3550 - No compliance with the policy of the journal', Juan Antonio Añel, 11 Oct 2025
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
In your "Code and data availability" section you do not include repositories for the ISIMIP and SPEI data, but cite webpages to get access to them. We cannot accept this. You must publish all the data necessary to train your model and their outputs in a suitable repository according to our policy (as you have done with others). Therefore, the current situation with your manuscript is irregular. Please publish your data in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it, e.g. DOI) as soon as possible, as we cannot accept manuscripts in Discussions that do not comply with our policy.
Also, you must include a modified 'Code and Data Availability' section in a potentially reviewed manuscript, containing the information of the new repositories.
I must note that if you do not fix this problem, we cannot accept your manuscript for publication in our journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-3550-CEC1
- AC1: 'Reply on CEC1', Seppe Lampe, 13 Oct 2025
Dear Dr. Juan A. Añel,
Thank you for your comment. The ISIMIP and SPEI data is already available on Zenodo and is referenced in the following sentence:
The 1901-2019 burned area simulation of BuRNN is available on Zenodo along with all pre-processed data to train BuRNN (https://zenodo.org/records/16918071; Lampe, 2025b).
We believe the confusion might come from this sentence a few lines down:
The original ISIMIP data is available through the ISIMIP data repository (https://data.isimip.org/), the authentic SPEI data from SPEIbase can be downloaded from https://spei.csic.es/database.html.
We provide a pre-processed version of the ISIMIP and SPEI data in our Zenodo repository (as requested by the topic editor) and additionally provide readers the links to the original sources of the data. We apologize for any confusing wording on our behalf in the Code and Data Availability section.
We will rephrase this to avoid any possible confusion once we have received the final reviewer comments. Additionally, we have now added the independent evaluation data and the FireMIP simulations to the repository.
We hope you consider this an adequate course of action.
Kind regards,
Seppe Lampe (on behalf of all co-authors)
Citation: https://doi.org/10.5194/egusphere-2025-3550-AC1
- CEC2: 'Reply on AC1', Juan Antonio Añel, 13 Oct 2025
Dear authors,
Thank you for your clarifications. My comment was about the original data, not the pre-processed data. It is true that citing information in the Code and Data Availability section that is not relevant and does not serve the purpose of that section, such as the ISIMIP website, is inappropriate. However, I insist: ideally you should store the original data, not only the pre-processed data. It would be good if you could do it.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-3550-CEC2
- RC2: 'Comment on egusphere-2025-3550', Anonymous Referee #2, 16 Oct 2025
General Comment
Lampe et al. developed a data-driven fire model (BuRNN) based on Long Short-Term Memory networks to estimate global gridded burned area from 1901 to 2019. BuRNN was trained on satellite-based burned area (GFED5) and multiple climate, land cover, vegetation dynamics and socio-economic datasets covering 2001 to 2020. The trained BuRNN was then used to reconstruct burned area from 1901 to 2019. Indeed, wildfire remains highly uncertain in process-based models. Data-driven methods have the potential to improve the estimation of wildfire, as has been shown in previous studies. This study contributes to our understanding of wildfire mechanisms by providing a long-term global burned area estimate and a factor importance analysis. However, I have some concerns about the reliability of the data-driven model. Please find the details in the major and specific comments below. I recommend a major revision of the current manuscript version.
Major Comments
- BuRNN consistently underestimates the burned area across different regions and at the global scale. This underestimation appears to be a systematic bias; however, I do not understand why such an error cannot be reduced in BuRNN, which is built on LSTMs. I assume machine learning approaches (including LSTMs) are effective at reducing systematic biases. Please analyse this error in detail. I suggest showing the validation at monthly scales. One possible reason is that BuRNN underestimates the maximum monthly burned area, while capturing lower values well.
- There is a lack of validation of temporal extrapolation. The authors did out-of-sample validation by splitting the global domain into 11 subregions. However, they did not split the dataset in time to validate the model performance under temporal extrapolation (a minimal sketch of such a year-blocked hold-out is given after this list). As shown in Figure 4, BuRNN cannot capture the annual trend present in the benchmark dataset (i.e., GFED5). One of the objectives of this study is to reconstruct global burned area from 1901 to 2019. The failure to capture the sensitivity to environmental changes during the training period may suggest that the reconstruction of the past is highly uncertain.
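For illustration, the following is a minimal sketch of the suggested temporal hold-out, assuming a 2001–2020 sample layout and using a simple placeholder regressor in place of the LSTM; all data and variable names here are synthetic, not the authors' pipeline.

```python
# Year-blocked temporal hold-out sketch: train on 2001-2016, evaluate on 2017-2020.
# Synthetic data and a Ridge placeholder stand in for the real predictors and LSTM.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_cells, n_feat = 500, 6
years = np.repeat(np.arange(2001, 2021), n_cells)        # one sample per cell per year
X = rng.normal(size=(years.size, n_feat))
y = X @ rng.normal(size=n_feat) + rng.normal(scale=0.1, size=years.size)

test_years = np.arange(2017, 2021)                        # hold out the last four years
train = ~np.isin(years, test_years)

model = Ridge().fit(X[train], y[train])                   # placeholder regressor
y_hat = model.predict(X[~train])

bias = float(np.mean(y_hat - y[~train]))                  # systematic over/underestimation
nme = float(np.mean(np.abs(y_hat - y[~train])) / np.mean(np.abs(y[~train])))
print(f"temporal hold-out bias: {bias:.3g}, NME: {nme:.3g}")
```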
Specific Comments
Line 29: Please specify how the burned area (i.e., 3.5–4.5 million km^2) was estimated. Since the authors later provide the burned area from satellite observations, it would be helpful to know how the earlier estimate was made.
Figure A1: Please specify which color represents training.
Line 118: Why not randomly group the regions into 11 folds?
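For illustration, a random assignment of regions to folds could look like the minimal sketch below; the 14 GFED-style region names and the number of folds are assumptions for the example.

```python
# Randomly group 14 regions into 11 cross-validation folds (illustrative only).
import numpy as np

rng = np.random.default_rng(42)
regions = np.array(["BONA", "TENA", "CEAM", "NHSA", "SHSA", "EURO", "MIDE",
                    "NHAF", "SHAF", "BOAS", "CEAS", "SEAS", "EQAS", "AUST"])
shuffled = rng.permutation(regions)
folds = {i: list(chunk) for i, chunk in enumerate(np.array_split(shuffled, 11))}
for i, members in folds.items():
    print(f"fold {i}: {members}")
```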
Line 141: Is 3 years the sequence length used in the LSTM? Please elaborate on how the sequence length was determined.
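As an illustration of what a 3-year (36-month) sequence length would imply, the following minimal sketch builds sliding-window input sequences for a single grid cell; shapes and variable names are placeholders, not the authors' preprocessing.

```python
# Build 36-month sliding windows from a per-cell monthly predictor series;
# each sample's target is the burned area of the final month in its window.
import numpy as np

def make_sequences(features: np.ndarray, target: np.ndarray, seq_len: int = 36):
    """features: (n_months, n_features); target: (n_months,)."""
    X, y = [], []
    for t in range(seq_len - 1, len(target)):
        X.append(features[t - seq_len + 1: t + 1])
        y.append(target[t])
    return np.stack(X), np.asarray(y)

feats = np.random.rand(240, 5)    # 20 years of monthly predictors for one cell
ba = np.random.rand(240)          # monthly burned fraction for the same cell
X, y = make_sequences(feats, ba)
print(X.shape, y.shape)           # (205, 36, 5) (205,)
```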
Line 155: Could the authors provide some explanation of how SHAP handles the correlation between input variables?
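For context, the minimal sketch below shows how SHAP values are typically estimated against a background sample with shap.KernelExplainer, which marginalises features as if they were independent; correlated predictors therefore share attribution and should be interpreted jointly. The model and data here are synthetic placeholders, not the BuRNN setup.

```python
# SHAP sketch with two deliberately correlated predictors (columns 0 and 1).
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=300)        # strongly correlated pair
y = X[:, 0] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=300)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
background = shap.sample(X, 50)                       # background set for marginalising
explainer = shap.KernelExplainer(model.predict, background)
shap_values = explainer.shap_values(X[:20])           # (20, 4) attributions
print(np.abs(shap_values).mean(axis=0))               # mean |SHAP| per feature
```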
Line 214: Are those process-based models calibrated? If not, it is not surprising that the data-driven method outperforms the process-based models.
Line 219: “between the FireMIP models” is repeated.
Line 223: The evaluation for the 14 fire regions is not consistent with the global monthly evaluation. According to Figure 3, the BuRNN bias does not seem significant. However, Figure 4 shows that BuRNN significantly underestimates annual burned area at both regional and global scales. Please explain this inconsistency. In addition, BuRNN systematically underestimates the burned area, and I do not understand why the LSTM cannot remove this error.
Figure 4: I suggest the authors report the slope of the annual burned area for both GFED5 and BuRNN to verify whether BuRNN can simulate the annual trend. This would demonstrate whether BuRNN can capture the sensitivity of wildfire to the changing climate in the past. For example, the global plot in Figure 4 shows that GFED5 exhibits a decreasing trend, while BuRNN exhibits no trend.
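A minimal sketch of the suggested trend comparison, assuming annual burned-area series in million km^2; the two series below are synthetic placeholders standing in for the GFED5 and BuRNN annual totals.

```python
# Fit and report a linear slope for each annual burned-area series.
import numpy as np
from scipy.stats import linregress

years = np.arange(2001, 2021)
gfed5 = 4.0 - 0.02 * (years - 2001) + np.random.default_rng(1).normal(0, 0.05, years.size)
burnn = 3.2 + 0.00 * (years - 2001) + np.random.default_rng(2).normal(0, 0.05, years.size)

for name, series in [("GFED5", gfed5), ("BuRNN", burnn)]:
    res = linregress(years, series)
    print(f"{name}: slope = {res.slope:.4f} Mkm2/yr, p = {res.pvalue:.3f}")
```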
Line 227: I cannot agree with "excellently modelled". Those regions are better than other regions, but they are still underestimated by BuRNN.
Line 228: "BuRNN captures the pattern of the interannual variability well, but consistently underestimates the amplitude and total burned area." This statement is repetitive; it was already made earlier in this paragraph.
Line 245: “in MIDE, BuRNN…”
Line 248: Could the authors try to explain why BuRNN has this behavior: "the higher the average monthly burned area in a region, the better/easier predictions are for that region"?
Line 262: Why not include those variables in BuRNN if they are considered important? There are existing global datasets of evapotranspiration that could be used.
Figure 5: The current text is hard to read; please consider increasing the font size. Is the feature importance plotted as a violin plot? I suggest changing it to a box plot, which gives the 25th, 50th and 75th percentiles. The current plot is dominated by outliers, which does not help to interpret the data.
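For illustration, a box-plot version of a feature-importance panel could be produced as in the sketch below; the feature names and attribution values are placeholders, not the actual BuRNN SHAP output.

```python
# Box plot of per-sample |SHAP| values, one box per feature, outliers hidden.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
features = ["FWI", "C4 grass", "bare ground", "population"]
abs_shap = np.abs(rng.normal(size=(1000, len(features)))) * [1.0, 0.8, 0.6, 0.2]

fig, ax = plt.subplots(figsize=(6, 3))
ax.boxplot(abs_shap, showfliers=False)         # boxes span the 25th-75th percentiles
ax.set_xticks(range(1, len(features) + 1))
ax.set_xticklabels(features)
ax.set_ylabel("|SHAP value|")
fig.tight_layout()
fig.savefig("shap_boxplot.png", dpi=150)
```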
Line 269: Why is African C4 grass considered a feature for EURO, BOAS, SHSA and GLOBAL? The absence of African C4 grass as an explanation for the lower burned area does not make sense to me. Shouldn't we focus on the land covers present in each region for the feature importance analysis?
Line 295: Please report the annual correlation.
Line 298 – Line 302: If EFFIS does not include cropland fires, I do not see the point of benchmarking BuRNN against it.
Citation: https://doi.org/10.5194/egusphere-2025-3550-RC2
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 1,512 | 102 | 21 | 1,635 | 12 | 16 |
Review of “BuRNN (v1.0): A data-driven global burned-area model”
The manuscript presents a monthly 0.5° machine-learning emulator of burned area trained on GFED5 with region-blocked cross-validation. It uses a Long Short-Term Memory (LSTM) architecture to capture temporal dependence on the input features. The trained BuRNN model is then applied to reconstruct global burned area for 1901–2019, and the simulations are compared against independent long-term fire observation databases from five countries and the EU. Overall, the manuscript is well written, and the trained model shows promising results. Yet several core claims are insufficiently supported. In particular, BuRNN is target-aligned to GFED5 when compared with process models, which makes the comparison unfair. Model biases and observational uncertainties must be clearly distinguished. The transferability of a model trained on recent conditions to historical periods requires further discussion. The SHAP analysis assumes feature independence, an assumption that is strongly violated here; this needs further evaluation.
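As an illustration of how the feature-independence concern could be quantified, the minimal sketch below computes pairwise correlations of a predictor table and flags strongly correlated pairs; the column names and data are placeholders, not the BuRNN feature set.

```python
# Flag strongly correlated predictor pairs whose SHAP attributions should be read jointly.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "fwi": rng.normal(size=500),
    "temperature": rng.normal(size=500),
    "precipitation": rng.normal(size=500),
})
df["spei"] = -0.9 * df["precipitation"] + 0.1 * rng.normal(size=500)  # built to correlate

corr = df.corr()
strong = [(a, b, round(corr.loc[a, b], 2))
          for i, a in enumerate(corr.columns)
          for b in corr.columns[i + 1:]
          if abs(corr.loc[a, b]) > 0.7]
print(strong)
```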
Major comments
Minor comments