Using ocean surface paleo-density to evaluate PMIP3 and PMIP4 Last Glacial Maximum climate simulations
Abstract. Quantitative reconstruction of ocean surface density during the Last Glacial Maximum (LGM) offers valuable insights into the ability of climate models to simulate past climate conditions, when global temperatures were about 4.5 °C to 6 °C colder than today. We assess the performance of the LGM climate simulations, as part of the 3rd and 4th phase of the Paleoclimate Modeling Intercomparisons Project, using a recent ocean surface density reconstruction based on the δ¹⁸O of foraminiferal calcite (δ¹⁸Oc). We consider the differences between the LGM and the preindustrial climates and each period separately, at both global and regional scales. Because surface density reflects the combined effects of temperature and salinity, we also examined sea surface temperature (SST) to better identify the processes underlying model–data differences.
Surface density reconstructions show greater variability than simulated surface density. Models therefore struggle to reproduce the spatial variability of the density difference (LGM – pre-industrial (PI)), but part of the mismatch may arise from the uneven spatial distribution of reconstructions, which are mostly located near coastal areas.
Density anomaly (LGM – PI) differences between data and models are largely controlled by sea surface salinity (SSS), with SST contributing to a lesser extent. This influence of SSS is directly linked to the reduction in tropical precipitation during the LGM: models that best match the large-scale density anomalies also simulate the strongest reductions in reconstructed low-latitude precipitation during the LGM, highlighting the key role of hydrological cycle changes in shaping surface density.
On a global scale, 100 % of model simulations show a statistically significant relationship with surface density reconstructions, looking at LGM and PI separately. However, on a regional scale, some features are poorly simulated, leading to weaker agreement between data and model simulations, particularly in the North Indian and Southern Oceans. Our analysis concludes with a focus in the Indo-Pacific Warm Pool. Past reconstructions indicate a LGM weakened Indian ocean west–east surface density gradient, but only 7 out of 14 models (50 %) reproduce this feature. These results highlight the need to better constrain regional hydrological cycle changes in models, as improving their representation is crucial to reduce uncertainties in both paleoclimate simulations and future climate projections.
Review by Chris Brierley
This manuscript by Barathieu et al moves paleoclimate data-model comparisons in a novel direction, by providing the first example of looking assessing ocean density changes. This is built off the foram-based compilation by Caley et al for the last glacial maximum, combined with the PMIP coupled model simulations.
I believe that this manuscript is a good fit for Climate of the Past and should be published after revision. While I outline some specific comments below, they all come under a single wide umbrella of a comment. I found this article contained so much detail, especially about the individual comparison results) that it obscured your message as times. I feel that you will gain more impact and traction from the manuscript if you are able to be more concise.
Specific comments (not in order of importance).
Sect 2.1.1 You describe the methodology of the Caley et al (2025) data compilation. However you do not describe any of the broad findings from this dataset. Neither do you provide any information about the uncertainty in the reconstructions, such as an approximate calibration error.
L210. I do not see why you would want to interpolate all these results onto a common grid resolution prior to computing the density. Surely it would be more accurate, and computationally efficient to perform all the analysis on each model’s native grid.
L214. Is there a reason that you choose to not use any density fields that had be stored on the ESFG?
Table 1. You classify HadCM3 and iLOVECLIM as CMIP6 models. HadCM3 was originally built as part of CMIP3. I accept that it has performed the PMIP4 protocol and is part of PMIP4, but surely not “HadCM3-PMIP3”. Please be more accurate in your model descriptions, as this has implications for your conclusions about the improvement of models between generations (such as in Fig 1).
L239-241. Please explain what these numbers mean, and why I need to know them.
L260. Please number the first figure you introduce in your manuscript as Fig 1.
L261. Please describe what the pseudo-proxy approach means here. I have heard the term as a way of combining changes in SST and SSS to give changes in (coral) d18O. But this clearly is not what you mean.
Fig. 1. Is this not global? So why is the x-axis labelled ‘in the basin’?
Fig. 1. Consider whether it is appropriate to subdivide between the two model generations.
(e.g.) L303. Why do you write *absolute* density anomalies. Do relative density anomalies have any meaning (I guess they will all be around 0.1%)? So what is the ‘absolute’ clarifying?
L308. Please put the values in Table 1.
Fig. 2 This is very hard to read, and especially see the difference between the model and proxy data. Consider things like different projections to minimise the amount of white space. Maybe consider a separate ensemble-mean map, which could be larger? Provide the units of the anomaly
Fig. 3. I find this very hard to interpret. Why not remove the indivuadal data points and plot as zonal means.
L348. Presumably you have been undertaking this decomposition at individual grid points, and comparing to the uncertainty at in each proxy reconstruction. However, I can’t see how this results in the Fig. 4. The bars look like there are guaranteed to add to 100%, but is there a bit of non-linearity in the equation of state. How have you accounted for this?
Fig. 4: Can you provide an estimate of the uncertainty in your decomposition?
L380. Remove repeated word: reconstruct
Fig. 5. Shouldn’t these regression lines go through the origin, by definition of being anomalies? What is the implication of the offsets?
Fig. 5. Previously you stated the global reconstruction mean of the density anomaly was 1.5 (L310), but here it is shown as 0.8. Please clarify!
L417. I believe that the proxy reconstruction IQRs are the range of the values average across the globe (or subsequently a region). But surely there is a calibration error attached to the reconstruction. How are you treating this uncertainty, and how does it alter the IQR?
Fig. 8. I do not understand what this figure is assessing, and how to interpret that from the analysis. Is it trying to measure the ability of the models to capture the spatial pattern of surface density at both the PI and LGM? If so, why don’t you present some maps. Or even better move to using a Taylor Diagram, to summarise the observational comparison?
L531. You perform a basin-by-basin analysis. But your previous section concluded the tropics and extratropics behaved quite differently. However, you chosen analysis folds these two regions into the same basin. Please justify your choice.
L542. Please comment on how this finding compares with your earlier findings.
L546. Please provide more detail. The model’s IQR are very small in the figures.
Fig. 10. I cannot see the IQR in the Southern Ocean panel, but the reason for this is never explained.
L566-7. I cannot tell from the figure which are the ‘several regions’ you are referring to.
Section 4. Personally, I would remove this section (as it replicates Fig. 8 which I didn’t understand).
L708-710. This sentence reads awkwardly. I suggest removing it.
Fig. 12. Can you please also add something to give readers an idea of how low-frequency variability in the IOD might compare to these changes in gradient.
L784. This paper also highlight that HadCM3 was the most realistic, if I recall correctly