the Creative Commons Attribution 4.0 License.
Technical note: A framework for causal inference applied to solar radiation and temperature effects on dissolved gaseous mercury
Abstract. Environmental science usually requires researchers to rely on observational data alone. However, researchers want to identify causal relationships and not only correlations between pollutant behaviour and other environmental factors such as weather. Previously it has been shown that solar radiation associates with the volatilisation and evasion of the hazardous pollutant mercury from sea surfaces into the atmosphere. Statistical and machine learning methods can help find and quantify such associations. However, association does not imply causation, and inferring causal relationships from observational data alone remains a significant challenge. Here, we aim to create an 'easy-to-follow' framework, to be used by environmental researchers, for using prior scientific knowledge encoded as graphical causal models to enable causal inference and to estimate effect sizes of different related factors using collected field data. We demonstrate the framework through a case study estimating the effect sizes of solar radiation and sea surface temperature on dissolved gaseous mercury (DGM) in seawater measured at the west coast of Sweden. Our causal analysis reveals that 32 % of the total effect of solar radiation on DGM is mediated indirectly via changes in sea surface temperature. Wind and instrumentation acted as confounders, biasing effect estimates by 4.5 %. Results from the case study show that our proposed framework allows for a rigorous design, validation, and reporting of causal inference in environmental science. It shows potential in modelling causes of pollutant dynamics and quantifying the effect of regulating policies such as the Minamata Convention For Mercury.
Status: open (until 19 Dec 2025)
- RC1: 'Comment on egusphere-2025-4511', Anonymous Referee #1, 10 Nov 2025
- RC2: 'Comment on egusphere-2025-4511', Anonymous Referee #2, 11 Dec 2025
Review for Manuscript
Title: Technical note: A framework for causal inference applied to solar radiation and temperature effects on dissolved gaseous mercury
Author(s): Hans-Martin Heyn and Michelle Nerentorp Mastromonaco
MS No.: egusphere-2025-4511
MS type: Technical note
General
This paper provides a technical note on application of causal inference to the effects of solar radiation and water temperature on dissolved gaseous mercury (DGM). This research is really interesting, instrumental, and insightful.
This research showcases a wonderful collaboration between experimental scientists and causal inference scholars.
What a wonderful world this interdisciplinary field is.
This paper is expected to reach a wide range of readers, including those who already know or have a good command of causal inference and those who are lay people: well trained in experimental sciences, yet knowing little about causal inference and how to apply it in their own fields. The present reviewer is among the latter group. Hence, this review will focus on two aspects: (1) the experimental work and (2) how to help and guide the latter group of readers to follow, understand, and learn how to use causal inference by means of the case study provided by this paper. Some readers, if not many, may share the same or similar feedback as presented in this review.
Specific
- Paper title
The paper title uses the phrase “on dissolved gaseous mercury”. In the context of this research, this phrase is rather vague and could be more specific, e.g., on levels of DGM, on the generation process and mechanism of DGM, or on speciation of Hg. It is unclear what exactly the effect is (the effect of solar radiation on what: DGM level, dynamics, production?), since DGM itself is only one particular species of aquatic Hg.
- Experimental
Regarding the in-situ field measurement of DGM, a number of questions arise:
First, the citation for this method seems to use a less relevant paper (by Andersson ME et al., 2008b; see L140 in the present paper). I checked on this and found the relevant references probably would be:
- A description of an automatic continuous equilibrium system for measurement of dissolved gaseous mercury. By Andersson, Gardfeldt, and Wangberg, Anal. Bioanal. Chem. 391, 2277-2282, 2008a
- Seasonal and spatial evasion of mercury from the western Mediterranean Sea by Nerentorp Mastromonaco, Gardfeldt, and Wangberg, 2017 (L896-897 in the present paper).
Second, with limited time available, I consulted Ref. #2 above and had some findings as detailed below.
Ref. #2 shows that the researchers also used another manual method, i.e., purge-and-trap method, instead of the in-situ auto-method, to determine the DGM. For this manual method, first, the Hg(0) in a water sample of a certain volume is completely purged out of the water sample using zero air (or pure Ar or N2) and then collected on a Hg trap to analyze the total Hg(0) purged out of the water sample. By measuring the volume of the water sample and the total Hg(0) purged from the water sample and collected on the Hg trap, the DGM can thus be calculated to be DGM = (total Hg(0) purged)/(volume of water sample). This method gives a clear determination of the DGM for the water sample without confusion or misunderstanding.
Moreover, Ref. #2 also mentioned that they compared the DGM results from the auto-method and the manual method and found “a good correlation” between the two method results. This means that the DGM calculated using Eq. 1 and the DGM obtained by the manual method differ, although correlated, that is, one may not replace the other, but one can be obtained from another using the correlation.
However, Ref. #2 does not mention or indicate if they used the correlation (or calibration) to get the DGM corresponding to the actual DGM (calibrated by the manual method), or they simply took the DGM results calculated using the equation of DGM = Ca(1/H + ra/rw) (L141 Eq. 1 in the present paper). This missing detail is a highly important technical detail, which is connected to the credibility of this auto-method, and subsequent causal inference operations and outcomes.
I’d think the correlation (equation) should be reported and used to get the real DGM as calibrated using the correlation, rather than just using the DGM results directly from the calculations using Eq. 1, for the reasons given below. By the way, it’s understandable there is a need to have an in-situ auto method to continuously measure DGM in the field.
However, it remains unclear for the present paper under review whether all the DGM results used for the causal inference are those taken directly from the calculation using Eq. 1 or those obtained after processing with the correlation between the auto and manual methods (i.e., calibration of the auto-method by the manual method). This important technical detail needs to be clarified.
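As an illustration of what applying such a correlation could look like, here is a minimal sketch that fits a straight line between paired auto-method and manual-method results and uses it to calibrate an Eq. 1 value. The paired values, the linear form, and the names `fit_line` and `calibrate` are all hypothetical; neither the paper nor Ref. #2 is known to use this exact procedure.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical paired DGM values in pg/L (NOT taken from the paper or Ref. #2).
dgm_auto = [10.0, 14.0, 18.0, 22.0, 26.0]     # computed with Eq. 1
dgm_manual = [12.0, 17.0, 22.0, 27.0, 32.0]   # purge-and-trap

intercept, slope = fit_line(dgm_auto, dgm_manual)

def calibrate(dgm_eq1):
    """Map an Eq. 1 result onto the manual-method scale."""
    return intercept + slope * dgm_eq1

print(f"manual = {intercept:.2f} + {slope:.2f} * auto")
```

Reporting the fitted intercept and slope alongside the data would let readers see which DGM scale enters the causal analysis.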
The auto-method appears not quite straightforward in conjunction with Eq. 1. By the auto-method, a given water volume is first pumped into the inner cylinder. Then (or simultaneously) zero air is used to purge the Hg(0) in the given water to the headspace of the inner cylinder. Then the air concentration of Hg(0) in that headspace is measured by Lumex (or Tekran 2537A). By the way, the efficiency of the purging is not mentioned or discussed in this paper. The efficiency of purging is certainly critical for the manual method. Incomplete purging of the DGM can cause under-estimation of real DGM level.
It is very curious why Eq. 1 is used to calculate the real DGM of the sea water, instead of using the same approach as the manual method to get the total Hg purged out of the water left in the cylinder headspace and then the DGM thus determined. It is also highly curious why the DGM is the Hg(0) concentration in the water of the cylinder supposedly at equilibrium with the Hg(0) purged out of the same water then present in the headspace measured by Lumex. Intuitively, this is quite confusing and not revealing. The key point here is why the equilibrium of Hg(0) distribution between air and water gets involved in the DGM determination? In any context, it is the real DGM of interest, not the equilibrium DGM.
It is very hard to see and understand how this so-calculated equilibrium Hg(0) concentration can represent the real DGM in the water sample. First of all, the real DGM should be the one at the equilibrium with the ambient air Hg(0) above the sea, rather than with the Hg(0) purged out of the water sample in the cylinder headspace, unless coincidentally, the Hg(0) in the ambient air has the same concentration as the purged Hg(0) in the headspace. It is very hard to see the materialization of such a coincidence, consistently occurring all the time. Or was this coincidence confirmed experimentally?
Using the Henry’s law method to get DGM only gives the Hg(0) concentration in the water at the equilibrium, while as known, water is commonly saturated or often over-saturated with Hg(0), i.e., DGM at equilibrium < or << DGM-real.
Table 3 and Fig. 6f both show quite low levels of DGM compared to many studies that reported higher DGM levels for various waters. This suspected underestimation of the DGM might arise because the calculated DGM applies only to the equilibrium condition as calculated using Henry’s law.
The unclarity and confusion regarding the meaning and credibility of the DGM calculated using Eq. 1 need to be resolved in the first place before readers go further to see any causal inference using the DGM results.
- Causal inference general
Before and during reading this paper for a while, I always thought this causal inference model or operation can determine if two factors given are actually indeed causally related, instead of simply correlated. In other words, the expectation was that by running the causal inference (going through the entire framework and running the causal inference operations or models), it can be determined if one factor is causally related to another, followed further by the effect size.
But the more I read, the more I realized (maybe I’m still wrong or don’t get it) that, to begin the causal inference, one needs to assume in the first place that the two factors are indeed causally related. Running the causal inference through the framework would then provide more knowledge about the relationship between the two factors, such as the effect size, this percentage for this factor, or that percentage for that factor, etc.
So, up front, it would be very helpful to provide a general description of causal inference: its goal, logic, assumptions, framework, and approach; what causal inference is and can do; what we can expect it to offer; and, moreover, what it cannot offer or do. This general introduction is much needed. Otherwise readers like me will struggle with the question of whether causal inference can settle the case and determine causality or instead can only provide more inference about the relationship between two or more factors and the effect size of each factor, beyond simple correlation analysis.
So, if the causal inference cannot determine whether two or more given factors are indeed causally related, and which is the cause of which, then this nature of the causal inference needs to be stated clearly at the very beginning. This would help and benefit many readers like me who are oriented towards inference via scientific experiments and are probably encountering a detailed case like the one in this paper for the first time. For example, a lot is already known about how solar radiation can causally induce and enhance DGM generation via photochemical reactions, established by means of well-controlled manipulative experiments (with only one factor varied and the other factors fixed, to logically satisfy both the necessity and sufficiency requirements for determining a cause-effect relationship).
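To make this distinction concrete, here is a minimal simulation sketch (not the authors' analysis). The graph Z → X, Z → Y, X → Y is assumed up front, and every coefficient is hypothetical. The regression only estimates the size of the X → Y effect under that assumed graph; it does not test whether the arrow exists.

```python
import random

def slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def slopes2(x1, x2, y):
    """Least-squares slopes of y on two predictors (intercept included)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(0)
n = 20000
TRUE_EFFECT = 1.0  # hypothetical causal effect of X on Y

z = [rng.gauss(0, 1) for _ in range(n)]               # confounder
x = [0.8 * zi + rng.gauss(0, 1) for zi in z]          # Z -> X
y = [TRUE_EFFECT * xi + 1.5 * zi + rng.gauss(0, 1)    # X -> Y and Z -> Y
     for xi, zi in zip(x, z)]

naive = slope(x, y)              # ignores the confounder Z
adjusted, _ = slopes2(x, z, y)   # adjusts for Z, as the assumed graph requires
print(f"naive={naive:.3f} adjusted={adjusted:.3f} true={TRUE_EFFECT}")
```

In this toy setting the adjusted estimate recovers the simulated effect while the naive one is biased, but only because the adjustment set was read off a graph that was assumed correct.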
III. Comments and thoughts
Line 62 (L62), “Hg…water-to-air evaporation”, evaporation refers to the escape of molecules of the liquid from liquid phase of that particular molecule to gas phase (e.g., pure water evaporation), but here, there is no liquid Hg involved, only dissolved gaseous Hg or Hg atoms as the solute in water (the solvent), the liquid is water. So rigorously, Hg evasion or emission, not evaporation, is more appropriate or accurate.
By the way, as mentioned before, three issues are involved here: DGM generation, DGM emission or evasion, and DGM concentrations or levels. The title and the paper use “…causal inference applied to solar radiation and temperature effects on DGM”. Then which factor exactly are we looking at? DGM generation, emission, or concentration: which of these is the factor under consideration or treatment in the causal inference? This is unclear and is another potential point of confusion.
L103-107, the campaign was 2019-2020, but the data used for this study were from 2024 April 1 to April 25. This is another potential point of confusion. Which data were used? If the latter, why mention the 2019-2020 campaign?
L140-148, all parameters or quantities should be given together with their individual units, if any.
Here, it may be helpful to mention that the DGM, solar radiation, and T data are given or summarized in Table 3 and Fig. 6. At any rate, the data used for this study need to be presented clearly up front, rather than later. We need to know clearly, in the first place, what measurement data were used for this study. These data can help readers to inspect the potential causal relationship now, before the causal inference, either intuitively or based on previous research experience, independent of the causal inference.
Fig. 6e has no legend, but it has two parameters, which is for which?
L141, from subsequent info, we know ra/rw < 1, which means that for Eq. 1, DGM is roughly Ca/H. If so, why leave the ra/rw term in the equation? This needs to be discussed: when is the whole equation needed, and when may the approximate, simplified one be used? By the way, if the simplified equation is used, then the question regarding the meaning of the DGM so calculated arises, as discussed previously.
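One way to frame when the simplified form is adequate is to factor Eq. 1 as quoted earlier in this review. The following is only an algebraic sketch using the review's symbols (C_a, H, r_a, r_w); no numerical values are assumed.

```latex
\begin{align}
\mathrm{DGM} &= C_a\left(\frac{1}{H} + \frac{r_a}{r_w}\right)
             = \frac{C_a}{H}\left(1 + H\,\frac{r_a}{r_w}\right), \\
\frac{\mathrm{DGM} - C_a/H}{\mathrm{DGM}}
             &= \frac{H\,r_a/r_w}{1 + H\,r_a/r_w}.
\end{align}
```

Under this reading, the relative error of using DGM ≈ C_a/H is small when H·r_a/r_w ≪ 1, so the relevant comparison is between r_a/r_w and 1/H rather than between r_a/r_w and 1.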
Table 3 and Fig. 6 show the DGM levels are quite low, as mentioned before. This is curious.
L169, it is unclear which step in the framework will determine if the two or more factors are causally related, if the causal inference can determine that?
L180-185, it appears that the causal arrow is what we assign or assume before the causal inference, rather than an outcome of the causal inference. This is, among others, what confuses me.
From time to time, this becomes unclear: is the causal inference for solar and Ca or for solar and DGM?
L250-251, regarding the nature of the effect, direct or indirect, again it seems that we need to pre-assign or assume it like the causal arrow, rather than an outcome of the causal inference.
L306-308, how were the simulated data generated? From the data of Table 3 and Fig. 6, or from running the causal inference model? This is unclear. What software was used to generate the simulated data?
A general comment: throughout this paper, it is unclear which software or causal inference model(s) were used to run the causal inference. Was any commercial software used? Unless it is copyright or patent protected and thus cannot be disclosed, we need to know the names of all the software and models used in this study, and which was used in which step to do what. This important information is missing and needs to be disclosed early on as a list (like the list of chemicals and equipment for experimental work), for example in a methodology section for the causal inference.
Furthermore, each time a specific causal inference operation is performed along the way through the framework, we’d like to know what specific software or model(s) was used for that step, task, or operation, with relevant references provided for more technical details.
L390, how to verify?
L498-499, total effect = direct effect + indirect effect: this is valid only for cases where both effects are positive or both are negative, i.e., in the same direction. If one is positive and the other is negative, that sum is not valid, or what is the meaning of that sum? For example, solar radiation has two effects: on the one hand, it can enhance DGM generation, leading to more DGM in water; on the other hand, it can increase water T, which in turn can lead to a higher Henry’s coefficient and thus less DGM at the higher T, e.g., at Tw = 1 °C, DGM at equilibrium = 7.2 pg/L, at 25 °C, DGM = 3.8 pg/L. So the two effects of solar radiation are opposite in direction. Then how can these two opposite effects be additive in the causal inference? How does the causal inference handle opposite effects? Or does the direction of the effect not matter, since the causal inference tells whether the effect is operative and to what extent?
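One way to see how a linear path model would treat this question is a small simulation. The sketch below assumes a toy graph Sol → T → DGM plus a direct arrow Sol → DGM, with hypothetical coefficients of opposite sign; it is not the paper's model m4. In such a linear model the indirect effect is the product of the coefficients along the path, signs included, so opposite effects partly cancel in the sum.

```python
import random

def cov(u, v):
    """Population covariance of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

rng = random.Random(1)
n = 20000
A, B, C = 0.6, -0.5, 1.0   # hypothetical: Sol->T, T->DGM, Sol->DGM (direct)

sol = [rng.gauss(0, 1) for _ in range(n)]
temp = [A * s + rng.gauss(0, 1) for s in sol]
dgm = [C * s + B * t + rng.gauss(0, 1) for s, t in zip(sol, temp)]

# Total effect: regress DGM on Sol alone (valid here because the toy
# graph contains no confounders).
total = cov(sol, dgm) / cov(sol, sol)

# Direct effect of Sol and effect of T: regress DGM on Sol and T jointly.
det = cov(sol, sol) * cov(temp, temp) - cov(sol, temp) ** 2
direct = (cov(temp, temp) * cov(sol, dgm) - cov(sol, temp) * cov(temp, dgm)) / det
b_hat = (cov(sol, sol) * cov(temp, dgm) - cov(sol, temp) * cov(sol, dgm)) / det

a_hat = cov(sol, temp) / cov(sol, sol)   # Sol -> T
indirect = a_hat * b_hat                 # signed product along Sol -> T -> DGM

print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

With these assumed coefficients the total effect is smaller than the direct effect, which is what a signed sum means here; whether the paper's reported percentages are defined on signed or absolute contributions is a question for the authors.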
L582-583, What can the causal inference tell about the factors and their relationships that we still don’t know, as from this particular study regarding DGM? In other words, what are new from the causal inference that has not been achieved by scientific experiments and field measurements?
L588-589, pump speed or water flow rate rw: L119 mentions that rw varied between 0 and 40 L/min. First, if rw = 0, rA/rw is mathematically undefined; if rw = 40, then rA/rw is 1.5/40 = 0.0375, which is very small, so this term can be ignored and DGMcal = Cmw/H. So this pump speed variation largely limits the accuracy of this auto-method. By the way, it remains hard to understand why DGM-real can be obtained by Cmw(1/H + rA/rw), how equilibrium gets there, and why rA and rw are involved. The first term in Eq. 1 is about equilibrium and the second is about the dynamics of the sampling flow, so why does DGM involve both equilibrium and dynamics?
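The sensitivity to pump speed can be explored numerically. The sketch below evaluates Eq. 1 in the form quoted in this comment. The Henry's law constant `H = 0.3` is a purely hypothetical placeholder, the air flow `R_AIR = 1.5` L/min is assumed so that rA/rw matches the 0.0375 quoted for rw = 40 L/min, and the function names are illustrative.

```python
def dgm_eq1(c_mw, henry, r_air, r_water):
    """Eq. 1 as quoted in the review: DGM = C * (1/H + rA/rw)."""
    if r_water <= 0:
        raise ValueError("Eq. 1 is undefined for a water flow rate of zero")
    return c_mw * (1.0 / henry + r_air / r_water)

def flow_term_share(henry, r_air, r_water):
    """Fraction of the Eq. 1 result contributed by the rA/rw term."""
    flow = r_air / r_water
    return flow / (1.0 / henry + flow)

H = 0.3        # hypothetical dimensionless Henry's law constant
R_AIR = 1.5    # L/min, assumed (see lead-in)

for r_water in (5.0, 10.0, 20.0, 40.0):   # L/min
    share = flow_term_share(H, R_AIR, r_water)
    print(f"rw={r_water:5.1f} L/min  flow-term share={share:.4f}")
```

A table of this kind, computed with the actual H and flow rates, would show over which part of the 0 to 40 L/min range the second term matters.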
The pump speed involves measurement operational error or artifact, and so it is not a real physical effect for DGM like solar and/or Tw. Pump speed is not a direct effect, nor an indirect effect; it just has operational errors. One is about aquatic mechanisms and processes involving DGM generation kinetics and equilibrium and the other is about DGM measurement and measurement errors. Mixing the two in the causal inference is confusing.
A 32 % share of the effect of solar radiation is attributed to the indirect effect via water temperature. But, as mentioned before, the effects of solar radiation and T on DGM are opposite. The 32 % effect size seems to show that T has a positive effect just like solar radiation: higher solar radiation gives higher DGM, whereas, based on equilibrium, higher T gives lower DGM.
By the way, as shown by many field studies, the water T often varies much less during a day than solar radiation does, only to a small extent, as a result of the very high specific heat of water (due to the hydrogen bonding of the highly polar water molecules). But 32 % is almost 1/3, which means the effect of T is very strong.
On the other hand, T can not only change Henry’s constant and the Hg air/water distribution equilibrium (constant), but also can change the kinetic rate constants (and rates) of photochemical and/or thermal reduction of Hg(II) to Hg(0). This is another effect of water T. Then this effect is positive, enhancing DGM generation, like solar radiation. Thus, T has two opposite effects: positive to enhance the kinetics, and negative to increase H, then decrease DGM at equilibrium.
Last but not least, it would be helpful to provide a short glossary of the terms as an appendix, especially those involving causal inference.
Many thanks.
The paper introduces a Bayesian graphical causal inference framework to investigate solar radiation and temperature effects on dissolved gaseous mercury (DGM) concentrations. This is an exciting contribution with clear potential to advance environmental data analysis. However, major revisions are required to ensure that the method is applied following best practices and clearly communicated to a broader audience in environmental sciences who may not have a statistical background.
Major Comments
The study does not explicitly demonstrate that frequentist methods fail or that Bayesian inference provides a clear empirical advantage. No comparison is made (e.g., between regression or structural equation models and their Bayesian alternatives) to show instability or bias under a frequentist framework. Since Bayesian methods are technically more complex, the manuscript should clarify when and why they are preferable and under what conditions their use provides meaningful benefits.
The authors claim that previous studies suffered from temporal limitations. While this study uses high-frequency data, the model itself does not incorporate time as a structural or dynamic dimension—it treats each time step as an independent observation. The manuscript should clearly explain how this approach differs from earlier studies and whether the higher temporal resolution truly enhances inference or simply provides finer data granularity.
The assumption of a Normal likelihood for C_{MW} is weakly justified. While the Normal distribution is commonly used, its prevalence does not imply appropriateness; the appeal to the Central Limit Theorem oversimplifies environmental concentration data, which are typically multiplicative and right-skewed. Indeed, Figure 11(e) shows a long-tailed distribution. The authors could either demonstrate that residuals are approximately normal (supported by residual–fitted value plots) or acknowledge this limitation and discuss whether a log-normal likelihood would be more appropriate.
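A quick numerical check of this kind could look like the following sketch. It uses simulated log-normal values as a stand-in for right-skewed concentration data (not the paper's measurements) and compares sample skewness before and after a log transform. In practice the same check would be applied to model residuals.

```python
import math
import random

def skewness(values):
    """Sample skewness: third central moment over variance to the 3/2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(2)
# Hypothetical right-skewed concentrations; parameters are illustrative only.
conc = [rng.lognormvariate(0.0, 0.6) for _ in range(5000)]
log_conc = [math.log(c) for c in conc]

skew_raw = skewness(conc)
skew_log = skewness(log_conc)
print(f"skewness raw={skew_raw:.2f}  log-transformed={skew_log:.2f}")
```

A skewness near zero only after the log transform would support a log-normal likelihood over a Normal one.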
For model m4, the paper discusses indirect effects through Sol → T_S → C_{MW} and Sol → W → C_{MW} but omits the valid multi-step path Sol → T_S → r_W → C_{MW}. The authors should clarify whether such compound mediation effects are included in the total indirect effect and provide clearer guidance on interpreting direct, indirect, and total effects from the DAG.
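Enumerating the directed paths mechanically makes it easy to see which mediation routes a total indirect effect should cover. The sketch below encodes only the edges implied by the paths named in this comment; the direct edge Sol → C_MW is an assumption made for illustration, and the paper's full DAG may contain further nodes and edges.

```python
# Edges implied by the paths named in the comment. The direct edge
# Sol -> C_MW is an assumption; the paper's DAG may differ.
EDGES = {
    "Sol": ["T_S", "W", "C_MW"],
    "T_S": ["C_MW", "r_W"],
    "W": ["C_MW"],
    "r_W": ["C_MW"],
    "C_MW": [],
}

def directed_paths(graph, source, target, prefix=None):
    """Return every directed path from source to target in an acyclic graph."""
    prefix = (prefix or []) + [source]
    if source == target:
        return [prefix]
    paths = []
    for child in graph[source]:
        paths.extend(directed_paths(graph, child, target, prefix))
    return paths

paths = directed_paths(EDGES, "Sol", "C_MW")
indirect = [p for p in paths if len(p) > 2]   # paths with at least one mediator

for p in paths:
    print(" -> ".join(p))
```

Listing the paths this way would let the authors state explicitly which of them enter the reported direct, indirect, and total effects.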
The causal conclusions rely on the correctness of the assumed DAG structure in many respects. Beyond the independence assumptions, mis-specified relationships or omitted variables (such as unmodeled nonlinear effects or unobserved confounders) could lead to misleading causal inferences. The authors should discuss the potential impact of such DAG misspecification.
Minor Comments
The priors (e.g., Normal(0.5, 1), Normal(0.5, 0.5)) appear somewhat arbitrary and not elicited from domain experts. The study would be strengthened by (a) justifying these priors through expert input or empirical reasoning, or (b) using uninformative priors.
Please clarify how model convergence was assessed under the Bayesian MCMC framework. Including trace plots or diagnostics is important for verifying convergence. A useful reference is: Reich, Brian J., and Sujit K. Ghosh. Bayesian Statistical Methods. Chapman and Hall/CRC, 2019.
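As an example of the kind of diagnostic meant here, the sketch below computes the basic (non-split) Gelman-Rubin potential scale reduction factor for several chains of one parameter. The chains are simulated stand-ins, not output from the authors' model, and established Bayesian software typically reports more robust variants of this statistic.

```python
import random
from statistics import mean, variance

def r_hat(chains):
    """Basic Gelman-Rubin R-hat for equally long chains of one parameter."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])        # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])   # B: n * variance of chain means
    pooled = (n - 1) / n * within + between / n         # pooled variance estimate
    return (pooled / within) ** 0.5

rng = random.Random(3)
# Four simulated chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
# Four simulated chains stuck around different values (not converged).
stuck = [[rng.gauss(shift, 1) for _ in range(1000)] for shift in range(4)]

print(f"R-hat mixed={r_hat(mixed):.3f}  stuck={r_hat(stuck):.3f}")
```

Values close to 1 are consistent with convergence, while clearly larger values indicate that the chains disagree; reporting such a statistic together with trace plots would address this comment.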
Both R2 and WAIC are reported and appear consistent. However, if they diverged, how should this be interpreted? A short explanation of their conceptual difference would improve clarity.
Figure 13(b) seems to show narrower confidence intervals than (a), but this is hard to discern. The figure could be redesigned for better contrast. Also, revise the phrasing “noisier but also more reliable,” as “noisier” typically suggests lower precision.
The rationale for preferring graphical causal models over alternatives (e.g., Granger causality, potential outcomes) is generally sound. Graphical models do enhance transparency and facilitate the integration of mechanistic knowledge. However, they do not eliminate assumptions or guarantee correctness. Traditional causal frameworks are not inherently “non-transparent” but rely on different theoretical foundations. Acknowledging this nuance would make the argument more balanced.
Appendix E Figure E1, used to validate statistical independence, could be clearer. Adding fitted lines with distinct colors for different temperature levels would improve readability and interpretation.