https://doi.org/10.5194/egusphere-2022-489
08 Jul 2022

Refining data-data and data-model biome comparisons using the Earth Movers' Distance (EMD)

Manuel Chevalier, Anne Dallmeyer, Nils Weitzel, Chenzhi Li, Jean-Philippe Baudouin, Ulrike Herzschuh, Xianyong Cao, and Andreas Hense

Abstract. Biome reconstructions are commonly used in data-data and data-model comparison studies to understand past vegetation dynamics. However, most of these assessments are based on the direct comparison of dominant biomes inferred from pollen samples or vegetation simulations. Dominant biomes are deduced from pollen samples using biome affinity scores, which aggregate the pollen percentages of the taxa assigned to each biome. While this approach generates good results over a large range of temporal and spatial scales, reducing pollen assemblages to a single dominant biome can oversimplify the vegetation signal preserved in pollen samples and even bias conclusions, for instance when a minimal change in pollen percentages changes the inferred dominant biome. To resolve these issues, we propose to use the Earth Movers' Distance (EMD) as a new metric to compare distributions of biome scores. The EMD has two main advantages: (1) the distributions of biome scores do not need to be reduced to their dominant biome, so the full breadth of the data is taken into account, and (2) different weights can be given to different types of disagreement to account for ecological distance (e.g. reconstructing a temperate forest instead of a boreal forest is ecologically less wrong than reconstructing a temperate forest instead of a desert). We also introduce EMD-based statistical tests that determine whether two samples are significantly more similar than expected from a random association. This paper illustrates the use of the EMD across a series of palaeoecological data-data and data-model case studies based on published data and simulations. These applications highlight the diverse types of analysis in which the EMD adds value compared to analyses of the dominant biomes only. The EMD and the statistical tests are included in the paleotools R package (https://github.com/mchevalier2/paleotools).
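
To make the two advantages listed in the abstract concrete, the following is a minimal Python sketch of an EMD between two distributions of biome scores. It is not the paleotools implementation and does not reproduce its API: the function name `emd`, the four biomes, the sample scores and every value in the `cost` matrix of ecological distances are hypothetical and chosen only for illustration. The transport problem is solved with a simple successive-shortest-path scheme in exact rational arithmetic, which is adequate for a handful of biomes but not tuned for large problems.

```python
from fractions import Fraction

def emd(p, q, cost):
    """Earth mover's distance between score vectors p and q.

    cost[i][j] is the (hypothetical) ecological distance paid for moving
    one unit of score from biome i to biome j. Both vectors are normalised
    to sum to 1 before the transport problem is solved.
    """
    n, m = len(p), len(q)
    sp, sq = sum(map(Fraction, p)), sum(map(Fraction, q))
    if sp <= 0 or sq <= 0:
        raise ValueError("both score vectors need a positive total")
    supply = [Fraction(x) / sp for x in p]
    demand = [Fraction(x) / sq for x in q]
    c = [[Fraction(cost[i][j]) for j in range(m)] for i in range(n)]
    flow = [[Fraction(0)] * m for _ in range(n)]
    inf = float("inf")

    while any(s > 0 for s in supply):
        # Bellman-Ford on the residual graph.
        # Nodes 0..n-1 are source biomes, nodes n..n+m-1 are target biomes.
        dist = [0 if supply[i] > 0 else inf for i in range(n)] + [inf] * m
        prev = [None] * (n + m)
        for _ in range(n + m):
            changed = False
            for i in range(n):
                for j in range(m):
                    # Forward edge: move score from biome i to biome j.
                    if dist[i] + c[i][j] < dist[n + j]:
                        dist[n + j] = dist[i] + c[i][j]
                        prev[n + j] = i
                        changed = True
                    # Backward edge: undo part of an earlier move.
                    if flow[i][j] > 0 and dist[n + j] - c[i][j] < dist[i]:
                        dist[i] = dist[n + j] - c[i][j]
                        prev[i] = n + j
                        changed = True
            if not changed:
                break

        # Cheapest reachable target that still needs score.
        t = min((j for j in range(m) if demand[j] > 0),
                key=lambda j: dist[n + j])
        path, node = [], n + t
        while prev[node] is not None:
            path.append((prev[node], node))
            node = prev[node]
        src = node

        # Largest amount that can be pushed along the path.
        amount = min(supply[src], demand[t])
        for u, v in path:
            if u >= n:
                amount = min(amount, flow[v][u - n])
        for u, v in path:
            if u < n:
                flow[u][v - n] += amount
            else:
                flow[v][u - n] -= amount
        supply[src] -= amount
        demand[t] -= amount

    return float(sum(flow[i][j] * c[i][j] for i in range(n) for j in range(m)))

# Hypothetical biomes and ecological distances (illustrative values only).
biomes = ["desert", "steppe", "temperate forest", "boreal forest"]
cost = [
    [0.0, 0.4, 1.0, 1.0],
    [0.4, 0.0, 0.6, 0.7],
    [1.0, 0.6, 0.0, 0.3],
    [1.0, 0.7, 0.3, 0.0],
]

# Hypothetical biome scores, expressed as percentages.
sample_a = [5, 15, 41, 39]   # dominant biome: temperate forest
sample_b = [5, 15, 39, 41]   # dominant biome: boreal forest
sample_c = [41, 15, 5, 39]   # dominant biome: desert

print("a vs b:", emd(sample_a, sample_b, cost))
print("a vs c:", emd(sample_a, sample_c, cost))
```

In this toy setting, samples a and b have different dominant biomes but differ by only 2 % of their scores, moved between two ecologically close forests, so their EMD is small. Samples a and c also differ in their dominant biome, but 36 % of the score has to move from temperate forest to desert at the highest cost, so their EMD is far larger. A comparison of dominant biomes alone would score both pairs as equally mismatched.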

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Journal article(s) based on this preprint

30 May 2023
Refining data–data and data–model vegetation comparisons using the Earth mover's distance (EMD)
Manuel Chevalier, Anne Dallmeyer, Nils Weitzel, Chenzhi Li, Jean-Philippe Baudouin, Ulrike Herzschuh, Xianyong Cao, and Andreas Hense
Clim. Past, 19, 1043–1060, https://doi.org/10.5194/cp-19-1043-2023, 2023

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2022-489', Anonymous Referee #1, 26 Aug 2022
    • AC1: 'Reply on RC1', Manuel Chevalier, 15 Feb 2023
  • RC2: 'Comment on egusphere-2022-489', Louis François, 18 Nov 2022
    • AC2: 'Reply on RC2', Manuel Chevalier, 15 Feb 2023

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload
ED: Reconsider after major revisions (07 Mar 2023) by Thomas Hickler
AR by Manuel Chevalier on behalf of the Authors (07 Mar 2023)
ED: Publish subject to minor revisions (review by editor) (13 Apr 2023) by Thomas Hickler
AR by Manuel Chevalier on behalf of the Authors (18 Apr 2023)
ED: Publish as is (02 May 2023) by Thomas Hickler
AR by Manuel Chevalier on behalf of the Authors (03 May 2023)

Viewed

Total article views: 600 (including HTML, PDF, and XML)
  • HTML: 413
  • PDF: 172
  • XML: 15
  • Total: 600
  • BibTeX: 7
  • EndNote: 8
Views and downloads calculated since 08 Jul 2022.

Viewed (geographical distribution)

Total article views: 576 (including HTML, PDF, and XML), thereof 576 with geography defined and 0 with unknown origin.
Latest update: 02 Sep 2024

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
Data-data and data-model biome comparisons are commonly based on comparing single biome estimates. While this approach generates good results over large temporal and spatial scales, reducing pollen assemblages to a single biome can oversimplify the vegetation signal preserved in pollen samples. We propose to use a multivariate metric, the Earth Movers' Distance (EMD), to include more details about the vegetation structure when performing such comparisons.