the Creative Commons Attribution 4.0 License.
Using a data-driven statistical model to better evaluate surface turbulent heat fluxes in weather and climate numerical models: a demonstration study
Abstract. This study proposes the use of a data-driven statistical model to freeze the errors due to differences in environmental forcing when evaluating the surface turbulent heat fluxes from weather and climate numerical models against observations. It takes advantage of continuous acquisition over approximately ten years of near-surface sensible and latent heat fluxes (H and LE respectively) together with ancillary parameters over the supersite "Météopole" of the French national research infrastructure ACTRIS-FR, located in Toulouse. The statistical model consists of several multi-layer perceptrons (MLPs) with the same architecture. Thirteen variables characterizing the environmental forcing in the surface layer at an hourly time scale are used as input parameters to estimate H and LE simultaneously. The MLPs are trained using 5-year observational data under a 5-fold cross-validation. The remaining data are used to test the estimates on unknown conditions. A case study is performed with data from a regional climate simulation. The performance of the statistical model is within the range of state-of-the-art surface parametrization schemes on hourly and seasonal time scales. It also has a good generalization ability, but struggles to estimate negative H and large LE. The statistical model is used to evaluate the simulated fluxes under the simulated environment to better examine the flaws of their numerical formulation throughout the simulation. Comparison of simulated fluxes with observed and MLP-based fluxes shows different results. According to MLP-based fluxes in the simulated environment, the land surface scheme of this climate model tends to underestimate large sensible heat fluxes. Thus, it incorrectly partitions energy between surface heating and evaporation during the late summer. Our innovative method offers a different way of evaluating the simulated near-surface turbulent heat fluxes when a long period of comprehensive observations is available. It can usefully support ongoing efforts to improve surface parametrization schemes.
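The 5-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch of the data partitioning only. It assumes contiguous, unshuffled folds and one MLP trained per fold; the abstract does not state either choice, and the function name `kfold_indices` and the sample count are hypothetical.

```python
def kfold_indices(n_samples, n_folds=5):
    """Yield (train_idx, val_idx) pairs for contiguous, unshuffled folds.

    Assumption: contiguous blocks are used; the abstract does not say
    how the folds were actually drawn.
    """
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for k in range(n_folds):
        size = base + (1 if k < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


# Hypothetical example: 23 hourly samples split into 5 folds.
folds = list(kfold_indices(n_samples=23, n_folds=5))
for k, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {k}: {len(train_idx)} training samples, {len(val_idx)} validation samples")
```

Each sample falls in exactly one validation fold, so five models trained this way each see a different 80 % of the training period.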
Status: final response (author comments only)
-
CEC1: 'Comment on egusphere-2024-568 - No Compliance with GMD's policy', Juan Antonio Añel, 12 May 2024
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You have published neither the code used in your work nor the input or output data in a permanent repository, which does not comply with our policy. Therefore, you must reply to this comment as soon as possible with the requested information (both links and DOIs).
Also, you must include the modified 'Code and Data Availability' section, including the mentioned information, in any potentially reviewed version of your manuscript.
If you do not fix this problem, we will have to reject your manuscript for publication in our journal. I should note that, given this lack of compliance with our policy, your manuscript should not have been accepted in Discussions. Therefore, the current situation with your manuscript is irregular.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2024-568-CEC1
-
CC1: 'Reply on CEC1', Maurin Zouzoua, 21 May 2024
Hello,
Thank you for your message. I'm working to rectify this anomaly.
Regards,
Maurin ZOUZOUA
Citation: https://doi.org/10.5194/egusphere-2024-568-CC1
-
AC1: 'Reply on CEC1', Maurin Zouzoua, 24 May 2024
Dear Juan,
Thank you for your helpful feedback. An example of a workflow to evaluate the modelled turbulent heat fluxes using our approach can be found at this link: https://doi.org/10.5281/zenodo.11261853. The latest update of Meteopole observational data is now in use. The manuscript will be updated accordingly.
Citation: https://doi.org/10.5194/egusphere-2024-568-AC1
-
RC1: 'Comment on egusphere-2024-568', Anonymous Referee #1, 20 Jun 2024
The paper by Maurin ZOUZOUA et al. presents a demonstration study in which they propose a data-driven statistical model to better represent the turbulent heat flux exchanges in the atmospheric surface layer. The improvement is intended as a potential alternative to the classical parameterization schemes that are inherently violated in weather and climate numerical models. The developed model uses approximately 10 years of near-surface sensible and latent heat fluxes over the supersite "Météopole" of the French national research infrastructure ACTRIS-FR, located in Toulouse, for training and testing.
The authors selected thirteen variables to characterize the environmental forcing at an hourly time scale to estimate these near-surface fluxes (sensible and latent heat fluxes). The 10-year observational data were split half and half between training and testing. The suggested model struggles to predict large negative fluxes, a signature of the stable regimes that current models also struggle with.
Technically, the work is well executed. The title and the abstract are informative, succinct, and clear. The manuscript is well organized and logically and clearly conveyed. The authors took care to clearly explain the motivation behind their work, describe all the deployed measurement tools, and formulate the research questions that the manuscript addresses. This is instrumental for understanding the scope and content of the manuscript and makes it easy and pleasant to read.
I think some of my comments below are significant and I would like to see the responses; there are a number of points that must be clarified before the manuscript is accepted for publication.
I recommend acceptance with minor revisions.
General comment: Some of the grammar/language should be revisited.
Specific comments:
Line 25: Mention also the first important source of bias.
Line 74: In section 5
3.1:
Why not split the training datasets by stability category (stable, neutral, unstable) and by friction velocity range?
Line 157: (ii) instead of (iii)
3.2:
Coarse time resolution: half-hour averaged data may mask physical information at smaller time scales. At what time step were the simulations run?
Lines 201-202:
The most conventional meteorological variables, such as T and RH at 2 m agl are also available. The lowest level, as mentioned, is within 20 m (a pressure level akin to a sigma coordinate, which changes dynamically with time), so the half eta-level is still higher than 2 m. Is any interpolation applied here?
You do mention this in Lines 303-305:
Moreover, the meteorological variables are directly taken at the first half-eta level (M=1, around 8 magl) instead of diagnostic variables as much as possible.
But 2 m and 8 m are quite far apart, especially under stable conditions. You may need to comment on this.
Line 202:
The data are stored at a temporal resolution of 3 hours …
How did you compare the instantaneous 3rd hour snapshot with observations averaged in half-hour chunks?
Line 204:
Meanwhile, the surface data, mostly provided by ORCHIDEE, consist of time-centred mean over a 3 hours window…
Specify which surface data.
Lines 209-213:
Could you show in Fig. 3 the grid layout, with the geographic coordinates of the real observation site and the 2 nearest grid cells considered in the analysis?
In this context, given the locality of space (dx=dy=20 km) and the point observations collected from a tower at specific coordinates, have you done a sensitivity analysis on the effect of spatial averaging? Also, any comments on the local dynamic forcings in the two grid cells considered compared to the observations? You considered the surface type as the main criterion in selecting these 2 grid cells, but what about the topography effects and the local environmental dynamics (wind speed and directionality, temperature, …) when comparing outputs in these 2 grid cells to the observation dataset at that specific grid location?
3.3.1:
What are the criteria for selecting the training dataset?
Is it season independent?
Have you done any sensitivity analysis on the choice of dataset, i.e., dataset convergence in terms of variability in the final weights and biases of the MLP?
3.3.2:
Lines 298-300: Define these variable acronyms especially:
Eventually, 4 trigonometric temporal coordinates are added for the description of seasonal (dx, dy) and diurnal (hx, hy) cycles.
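For reference, one common way to build such trigonometric temporal coordinates is a cosine/sine pair per cycle. The sketch below is an assumption about what dx, dy, hx and hy could denote (cosine and sine of the seasonal and diurnal phase); the manuscript excerpt quoted above does not define them, and the function name `cyclic_time_features` is hypothetical.

```python
import calendar
import math
from datetime import datetime


def cyclic_time_features(t: datetime) -> dict:
    """Return cosine/sine pairs encoding the seasonal and diurnal phase of t.

    Assumption: (dx, dy) and (hx, hy) are taken as (cos, sin) pairs; the
    actual definitions used by the authors are not given in this excerpt.
    """
    days_in_year = 366 if calendar.isleap(t.year) else 365
    day_of_year = t.timetuple().tm_yday - 1  # 0 on 1 January
    hour_of_day = t.hour + t.minute / 60.0
    # Fraction of the year elapsed, including the time of day.
    year_fraction = (day_of_year + hour_of_day / 24.0) / days_in_year
    season_angle = 2.0 * math.pi * year_fraction
    diurnal_angle = 2.0 * math.pi * hour_of_day / 24.0
    return {
        "dx": math.cos(season_angle),
        "dy": math.sin(season_angle),
        "hx": math.cos(diurnal_angle),
        "hy": math.sin(diurnal_angle),
    }


midnight = cyclic_time_features(datetime(2020, 1, 1, 0, 0))
morning = cyclic_time_features(datetime(2020, 1, 1, 6, 0))
late_evening = cyclic_time_features(datetime(2020, 1, 1, 23, 30))
```

With this encoding, 23:30 and 00:00 map to neighbouring points on the unit circle, which a raw hour value (23.5 versus 0) would not do.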
Citation: https://doi.org/10.5194/egusphere-2024-568-RC1
-
AC2: 'Reply on RC1', Maurin Zouzoua, 30 Sep 2024
-
RC2: 'Comment on egusphere-2024-568', Anonymous Referee #2, 01 Aug 2024
General Comments
The paper by Zouzoua et al. shows a promising demonstration of a method for using MLPs to better understand the sources of uncertainties and errors in surface layer schemes. The work clearly outlines the method, and demonstrates its applicability at a representative site. The quality of the work is generally high, especially in setting up the background and motivation; however, some additional polish, revision for clarity, and answers to a few critical questions could improve the manuscript, especially in the presentation of the results from the demonstration site. The method has strong implications, and I greatly look forward to seeing this applied to a larger set of sites.
Specific Comments
The introduction, background, and methods are clear and well written, setting up for a promising manuscript. They detail both the significance of the work, potential applications, and provide sufficient information for others to apply a similar method to alternative flux tower sites. One exception to this may be figure 1. While a schematic figure is a good addition to the work, I found it hard to interpret. Color coding was used but not described, and it was a challenge to understand the processes shown in the figure without frequent back and forth with the text. The schematic would be stronger if revised to show a bit more detail so that it could stand independently from the text. A schematic diagram has very significant potential and I appreciate the choice to include one, however the current iteration adds little to the text.
The results section could use some additional context and discussion, as well as revision for clarity. The connection between the background/motivation/introduction and the results gets lost at times. One small change that could assist this is a more specific naming of the cases. In section 5 in particular, I found myself frequently confusing the two MLP based fluxes and struggled at times to immediately understand the comparison being made. Perhaps assigning abbreviated case names (one for each of the four (or five if you count the different grid cells): estimated fluxes, observed fluxes, fluxes in the same environment, simulated fluxes) as well as text reminding what exactly they represent within the section could improve this clarity. The authors could use the naming already present in the figures in the text, for example, to have strong consistency. Language throughout sections 4 and 5 connecting back to the goals and motivation in the beginning of the paper would also help promote cohesion and make it easier to interpret.
Finally, there are a few differences between the simulations and the tower observations (and the MLPs based on them) that hinder comparison. While the authors do not necessarily avoid talking about them, the discussion on them is scattered throughout and could be enhanced with a more detailed and focused discussion. In particular answering:
- How does the mismatch of temporal resolution (30 min vs 3 hour) affect the results, particularly since we would not expect fluxes to be stationary over 3 hours (especially during the mornings/evenings)?
- How effectively can we compare between 20 km grid cells with different (and heterogeneous) land cover and a tower with ~4m agl flux readings which is likely only reading a small area from a grassland?
- What is lost by neglecting the soil/surface temperature? Those should have a strong correlation with the sensible heat flux in particular.
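To make the temporal-resolution question concrete, the sketch below shows one way 30-minute tower fluxes could be averaged over a time-centred 3-hour window before comparison with 3-hourly model output. It is an illustration only, not the authors' procedure: it assumes each observation timestamp marks the start of its 30-minute averaging interval, and the function name `centred_window_mean` and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def centred_window_mean(times, values, centre, width=timedelta(hours=3)):
    """Mean of the values whose timestamps fall in [centre - width/2, centre + width/2).

    Assumption: each timestamp marks the start of a 30-minute averaging
    interval. Missing values (None) are skipped; an empty window gives None.
    """
    half = width / 2
    selected = [
        v for t, v in zip(times, values)
        if v is not None and centre - half <= t < centre + half
    ]
    return sum(selected) / len(selected) if selected else None


# Hypothetical half-hourly series: value i at 00:00 + i * 30 min.
start = datetime(2020, 7, 1, 0, 0)
times = [start + timedelta(minutes=30 * i) for i in range(16)]
values = [float(i) for i in range(16)]

# Window centred on 03:00 covers the intervals starting 01:30 through 04:00.
mean_0300 = centred_window_mean(times, values, datetime(2020, 7, 1, 3, 0))
```

Comparing such window means with instantaneous 3-hourly snapshots is exactly where non-stationarity around sunrise and sunset would show up, which is the concern raised in the first bullet.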
Technical Corrections
20-21:... key drivers of atmospheric boundary layer (ABL) processes, such as …
25-26: Rephrase “... source of biases in simulations with the numerical models”
107-109: I would rephrase for clarity “... we propose an evaluation approach dedicated to a full numerical simulation…”
395: I think it would be good to elaborate here, as the use of coarse grid cells with a different land use type, versus the real grassland with the flux tower site creates a mismatch in the comparison that deserves further explanation/attention as it could explain some of the results.
524: remove “well”: “... simulated heat fluxes are not very well sensitive”
Figure 1: Color coding should be described in the figure description. Also, cyan color on white is a challenge to read. More overall comments on figure 1 in section above.
Citation: https://doi.org/10.5194/egusphere-2024-568-RC2
-
AC3: 'Reply on RC2', Maurin Zouzoua, 30 Sep 2024