Fractional Empirical Orthogonal Functions for Geophysical Fields with Anomalous Transport: Theory and Validation

Chishtie, Farrukh

doi:10.5194/egusphere-2026-753

Preprints

https://doi.org/10.5194/egusphere-2026-753

12 Mar 2026

Farrukh Chishtie

Abstract. Empirical Orthogonal Function (EOF) analysis and its rotated variant (REOF) are foundational tools in the geosciences for decomposing spatiotemporal variability. However, the standard methodology implicitly assumes Gaussian statistics and exponentially decaying correlations, assumptions that are violated in many geophysical systems exhibiting anomalous diffusion, heavy-tailed distributions, and long-range spatial correlations. We develop a theoretical framework for fractional EOF (fEOF) analysis that extends the standard methodology by incorporating the fractional Laplacian operator into the covariance structure. The governing dynamics are formulated using the Riemann–Liouville fractional time derivative of order μ > 0, which is not restricted to the interval (0,1] and thereby accommodates both subdiffusive and superdiffusive transport regimes within a single formalism. The resulting fractional covariance operator naturally captures power-law correlations characteristic of anomalous transport in geophysical flows. We prove that the eigenvalue spectrum of the fractional covariance operator exhibits enhanced power-law decay λ_m^(α) ∼ m^{−(1+α+β/d)}, where the spatial fractional order α ∈ (0,2) provides a tunable control parameter independent of the underlying spectral slope β. The temporal evolution of fractional principal components follows Mittag-Leffler relaxation, interpolating between stretched exponential and power-law regimes. 
We validate the theoretical predictions through three independent approaches: (i) exact analytical results for fractional Brownian surfaces across three Hurst exponents (H = 0.3, 0.5, 0.7), confirming eigenvalue steepening to within 6 % of theoretical predictions with finite-domain corrections identified; (ii) spectral analysis of fields generated by the space-time fractional diffusion equation across spectral slopes β = 2, 3, 4, recovering predicted exponents to within 2–6 %; and (iii) Monte Carlo experiments over 50 realizations demonstrating that the eigenvalue scaling is distribution-independent, holding identically for Gaussian and heavy-tailed Student-t fields while achieving 5- to 8-fold reductions in the number of modes required for 95 % variance capture. The sensitivity analysis across fractional orders α ∈ [0, 1.8] confirms the predicted linear steepening relation ν_α = ν₀ + α to within 3 % throughout. The framework applies to a broad class of geophysical fields exhibiting anomalous transport, including oceanic tracer dispersion, flood inundation dynamics, atmospheric constituent spreading, and soil moisture redistribution. Connections to Okubo's empirical oceanic diffusion scaling and the Forecasting Inundation Extents using REOF (FIER) framework are discussed as illustrative applications.
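To make the phrase "Mittag-Leffler relaxation" concrete, here is a minimal sketch that evaluates the one-parameter Mittag-Leffler function from its power series. It assumes the relaxation takes the common form E_μ(−(t/τ)^μ); the manuscript's exact expression is not reproduced on this page, and the names `mittag_leffler`, `relaxation`, and `tau` are illustrative only. The plain series is adequate only for moderate arguments and for μ not too small.

```python
import math

def mittag_leffler(z, mu, n_terms=200):
    """One-parameter Mittag-Leffler function E_mu(z), summed from its power series.

    E_mu(z) = sum_j z**j / Gamma(mu*j + 1). The direct series is only reliable
    for moderate |z| and mu not too small, because of cancellation.
    """
    if z == 0.0:
        return 1.0
    log_abs_z = math.log(abs(z))
    total = 0.0
    for j in range(n_terms):
        # Logarithms keep Gamma(mu*j + 1) from overflowing for large j.
        term = math.exp(j * log_abs_z - math.lgamma(mu * j + 1.0))
        # Odd powers of a negative argument carry a minus sign.
        total += -term if (z < 0 and j % 2 == 1) else term
    return total

def relaxation(t, mu, tau=1.0):
    """Assumed relaxation form E_mu(-(t/tau)**mu); tau is a hypothetical time scale."""
    return mittag_leffler(-((t / tau) ** mu), mu)

# Known closed forms serve as reference points:
#   mu = 1   -> exp(-t)              (classical exponential relaxation)
#   mu = 1/2 -> exp(t) * erfc(sqrt(t))
#   mu = 2   -> cos(t)               (oscillatory regime for mu > 1)
for mu in (0.5, 1.0, 2.0):
    print(mu, [round(relaxation(t, mu), 4) for t in (0.5, 1.0, 2.0)])
```

The three closed forms illustrate how a single function covers slower-than-exponential decay (μ < 1), the exponential case (μ = 1), and oscillatory behavior (μ = 2).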

Received: 07 Feb 2026 – Discussion started: 12 Mar 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1:
'Comment on egusphere-2026-753', Anonymous Referee #1, 29 Apr 2026
General comment:

The paper contributes to a method (EOF analysis) that is used extensively in numerous disciplines related to geophysics. The classical method relies on Gaussian statistics and exponentially decaying correlations, which are known limiting factors. From that perspective, using fractional derivatives to introduce some global behavior is an interesting idea.
The development of the theory from the Riemann–Liouville derivative is theoretically sound, clear, and well targeted at applications. The limitations of the method appear clearly, and the results are well highlighted.
The paper remains general while also incorporating ideas from the author’s expertise (flood/risk assessment?), which provide some applications that are enjoyable to read even for people outside of the field. Part 6, for example, gives cases where applying fEOF rather than EOF might be valuable.
I personally enjoyed the paper; however, my main remark is that I find the order somewhat confusing at times:
Some concepts are explained/illustrated later on, whereas some intuition would be needed at earlier stages. The elements are there most of the time, but the order means that the reader has to do a lot of back and forth. It would help tremendously to introduce the concepts at the right time (see specific comments below).

Similarly, the numerical configurations would benefit from a brief synthesis before presenting the results, along with some regrouping. As it stands, the configurations are described together with the results (in Section 4), which weakens the message. Furthermore, the progression between the configurations (fBm, spectral slopes, MC, fBm) does not seem natural to me and could perhaps be reconsidered. A large number of configurations/parameter choices are used, but no clear picture emerges, as new choices are constantly introduced throughout the results.

Specific comments:
Section 2.2.2 would benefit from some improvements. A short introduction (perhaps just a line or two) is needed to explain why a generalization of the exponential is required (to highlight the departure from exponential behavior and its relevance in fractional calculus, linking with the Laplacian next). Section 4.6 seems out of place among the results and should be moved here. This would improve understanding at early stages and help streamline the results section later on. More emphasis could also be placed on the description of the two parameters (relaxation and initial conditions).

L137: it would already be interesting to discuss memory here, since this differentiates classical Fickian diffusion from anomalous diffusion. More intuition on the role of \alpha would be appreciated at this stage, given its importance in the study.

Sec. 2.3: similarly, the physical interpretation given in L261–267 would be better placed here, even if it involves some repetition.

L149: I would emphasize the distinction between space and time more strongly.

L152–155: I am not sure I fully follow. Is this due to the restriction of \alpha \in [0,1]? If \mu > 1, is the ratio necessarily greater than 1/2? What do you mean by not needing the spatial operator? Does this imply that the fractional Laplacian is no longer required, and that a time-fractional diffusion alone is sufficient to characterize the superdiffusive regime?

Sec. 3.1: I would emphasize the role of eigenvectors and eigenvalues, which represent spatial modes and scales here, if possible. Eigenvalues are particularly important, as small values correspond to smooth/large-scale modes, while large values correspond to high-frequency/small-scale modes. The Fiedler value \mu_2 also relates to the connectivity of the graph and to the scale at which the system has essentially forgotten its initial conditions.

L171–177: could you elaborate more on these two cases? Is there also a connection with the explanation of fractional weighting in L226–230? A clearer link between the two would be appreciated.

L198: Definition 3.1 — could you refer directly to the equation instead, or use a more explicit way to reference it?

L204–209: does the steepening imply that the information is contained in fewer modes (i.e., the leading ones)?

L226–230: I find the explanation interesting, but I would appreciate a stronger connection with Sec. 3.1 to clarify it.

L266: as noted earlier, Fig. 6 should probably be Fig. 1 and introduced already in Sec. 2.2.2, so that the reader does not have to refer to a figure located at the end of the paper.

Fig. 6: there is a mismatch in the color coherence of the \mu values between the two panels.

Sec. 4: as mentioned in the general comments, I find the order somewhat confusing. The progression from fBm to spectra to MC and back to fBm, with additional elements on Mittag-Leffler dynamics in between, does not feel natural. The results following the MC section seem disconnected from the rest and are not clearly announced in the structure. A regrouping of configurations would be beneficial. I also find it problematic that the configurations, methods, and results are introduced simultaneously, sometimes with new parameter choices appearing within the results! This makes interpretation more difficult and does not do justice to your results. A summary table of configurations and numerical parameters presented before (and separately from) the results would greatly improve clarity.

L294: the time dimension should be specified more clearly, particularly when the process is white in time and the snapshots are independent.

L311–313: are the spectra computed spatially on a single 32 × 32 field? In that case, time would not be relevant here, correct? Has ensemble averaging been considered to estimate uncertainty in the measurements? If so, it should be specified.

Are the Hurst exponent and \alpha related, since both introduce a form of memory? The connection is not clearly established in the text and could be strengthened using earlier explanations from Secs. 2.1 and 3.1.

Fig. 1 & 2: it would be nice to superimpose the reference slopes onto both EOF and fEOF results.

Sec. 4.2: I find it problematic that numerical configuration, method, and results are mixed within the same paragraph. Time is also introduced via independent realizations, which is not entirely clear and may suggest a treatment similar to fBm.

L342–344: the physical interpretation should be presented earlier, ideally at the beginning.

L363: again, the treatment of time is confusing. Parameters such as N_t = 200, 1024 are introduced later and should instead be specified earlier or clearly separated from the results.

L375: Theorem 3.2 - are you referring to equation (22)?

Secs. 4.4 and 4.5: these sections could be integrated into a unified fBm results section.

Sec. 4.6: this section does not seem to belong in the results.

L478–479: does this imply fractality?

L583–585: Caputo derivatives were not used because they are restricted to the subdiffusive regime, but has any comparison been performed when possible?
Citation: https://doi.org/10.5194/egusphere-2026-753-RC1
- AC1: 'Reply on RC1', Farrukh Chishtie, 15 Jun 2026
  
  Please see attached pdf file with the responses to referee RC1.
  
  Citation: https://doi.org/10.5194/egusphere-2026-753-AC1
RC2:
'Comment on egusphere-2026-753', Anonymous Referee #2, 18 May 2026
The author proposes in this manuscript a linear generalization of the Empirical Orthogonal Functions (EOF) technique, namely the Fractional Empirical Orthogonal Functions (fEOF), for dimensionality reduction. An application linked to the fractional diffusion equation is also proposed. Some background is provided in Sec. 2 and the mathematical details of the dimensionality reduction technique itself are presented in Sec. 3. Section 4 applies the derived formulas on synthetic low-dimensional random datasets, Sec. 5 further exposes some properties of fEOF and Sec. 6 discusses some potential applications in geophysics. Section 7 summarizes the work and further discusses some technical points.
As noted in the introduction of the manuscript, geophysical fluids are subject to anomalous diffusion, which translates to power laws of the form ∼k^{−β} with β ≠ 2. Because of this, the usual EOF technique selects the largest-scale Fourier modes (and the explained variances are linked to the power spectrum of the field). The main idea of the manuscript is essentially to compute EOFs based on a modified covariance matrix C_α (eq. (15)), which is the usual covariance matrix left- and right-multiplied by a fractional power of the Laplacian, i.e. by L^{−α/2} for α > 0 (the case α = 0 is the usual EOF technique). As shown by the author, the spectrum of eigenvalues of C_α can be made steeper by increasing α, making the largest eigenvalues even larger with respect to the small ones (see also below).
I think there is a major flaw in the technique proposed in the manuscript, and I therefore recommend not publishing this manuscript.
Main comment:
The eigenvalues of C_α are interpreted as the variances explained by the associated eigenvectors (cf. the M₉₅ values in Table 1), leading to the conclusion that the eigenvectors of C_α represent the field x more efficiently. This is not true: the explained variances of the field x are the eigenvalues of the usual covariance matrix. In fact, the author has not shown or argued that this generalization of the covariance matrix and the associated eigenvalues are interesting objects for dimensionality reduction. A steeper eigenvalue spectrum of C_α for α > 0 does not by itself make it meaningful to base a dimensionality reduction method on it.
Further intuition about the technique proposed by the author is gained by examining the properties of C, C_α and L^{−α/2} in a Fourier basis. It is indeed expected in the geosciences community that eigenvectors of the covariance matrix are close to Fourier modes. From what I found in the literature, eigenvectors are actually exactly Fourier modes if the field is periodic and spatially homogeneous (see note at the end of the comment). The L^{−α/2} operator multiplies each Fourier mode by the norm of its wavevector to the power −α (eq. (10)), so L^{−α/2} is also diagonal in a basis of Fourier modes. C and L^{−α/2} commute as matrices, so that C_α = L^{−α}C, and C and C_α have the same eigenvectors (which are Fourier modes):

C_α ϕ = L^{−α} C ϕ = λ L^{−α} ϕ = λ k^{−2α} ϕ

where k is the norm of the wavevector associated with ϕ, and λ is the eigenvalue of ϕ with respect to C (and is related to the power spectrum of the field evaluated at the wavevector of ϕ, see line 193).

The eigenvalues of C_α are λk^{−2α}, consistent with eq. (17), and the eigenvectors are the same as those of C, i.e. Fourier modes (this is supported by Fig. 5 of the manuscript). Since C and C_α have the same eigenvectors, why would the EOF technique based on C_α be more efficient at representing the field x? (I personally think that there is no useful information in C_α and its eigenvalues which is not already present in C and its eigenvalues.)
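The referee's computation can be checked numerically in the idealized setting it assumes. The sketch below builds a covariance C on a periodic 1D grid from an assumed power-law spectrum, defines L^{−α/2} through its Fourier multiplier |k|^{−α}, and tests the commutation and eigenvalue relations. The grid size, the values of α and β, and the removal of the k = 0 mode are illustrative assumptions, not settings taken from the manuscript.

```python
import numpy as np

n = 64                      # grid points on a periodic 1D domain
alpha, beta = 1.0, 2.0      # illustrative fractional order and spectral slope

# Integer wavenumber norms |k| in FFT ordering: 0, 1, ..., n/2, ..., 2, 1.
k = np.abs(np.fft.fftfreq(n, d=1.0 / n))
nonzero = k > 0

# Assumed power-law eigenvalues of C; the k = 0 (mean) mode is removed.
lam = np.zeros(n)
lam[nonzero] = k[nonzero] ** (-beta)

# Fourier multiplier of L^{-alpha/2}: |k|^{-alpha}, set to 0 on the mean mode.
weight = np.zeros(n)
weight[nonzero] = k[nonzero] ** (-alpha)

F = np.fft.fft(np.eye(n)) / np.sqrt(n)          # unitary DFT matrix

def from_multiplier(m):
    """Real circulant matrix F^H diag(m) F for a multiplier symmetric in k."""
    return ((F.conj().T * m) @ F).real

C = from_multiplier(lam)
L_half = from_multiplier(weight)                # plays the role of L^{-alpha/2}
C_alpha = L_half @ C @ L_half

commute = np.allclose(C @ L_half, L_half @ C)
same_as_L_alpha_C = np.allclose(C_alpha, L_half @ L_half @ C)

# A real Fourier mode is an eigenvector of both C and C_alpha.
m = 3
phi = np.cos(2 * np.pi * m * np.arange(n) / n)
eig_C = np.allclose(C @ phi, lam[m] * phi)
eig_C_alpha = np.allclose(C_alpha @ phi, lam[m] * m ** (-2 * alpha) * phi)

# The full spectrum of C_alpha is lam * |k|^{-2 alpha}.
same_spectrum = np.allclose(np.linalg.eigvalsh(C_alpha), np.sort(lam * weight**2))

print(commute, same_as_L_alpha_C, eig_C, eig_C_alpha, same_spectrum)
```

This only covers the exactly periodic, homogeneous case. With finite samples or non-periodic boundaries, C and L^{−α/2} need not commute exactly, which is the regime where a more detailed comparison of the two sets of eigenvectors would be informative.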
The proposed technique can be seen in yet another way: the fractional covariance matrix C_α computed from x is equal to the usual covariance matrix C (= C₀) computed from the field L^{−α/2}x, so that the eigenvalues of C_α are actually the explained variances of the usual EOFs of L^{−α/2}x. Thinking in a Fourier basis, the field L^{−α/2}x is the original field where each Fourier mode has been weighted by k^{−α}. Multiplying by k^{−α} has the effect of reducing the contribution of small scales to x relative to the large scales (for α > 0), and thus steepens the power spectrum (or equivalently, the spectrum of eigenvalues of the covariance). In these terms, my critique of the technique is that x and L^{−α/2}x are simply different (but related) fields, with different (but related) properties. Why would we base a dimensionality reduction technique on L^{−α/2}x? The fact that L^{−α/2}x has better properties than x does not make it meaningful to base the dimensionality reduction on it.
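The equivalence stated in this paragraph is an algebraic identity for sample covariances, so it can be verified on arbitrary data. The following sketch uses synthetic snapshots and a spectral definition of L^{−α/2} on a periodic grid; the data, the grid, and the value of α are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_t, n, alpha = 200, 32, 1.0        # snapshots, grid points, fractional order (all illustrative)

k = np.abs(np.fft.fftfreq(n, d=1.0 / n))
weight = np.zeros(n)
weight[k > 0] = k[k > 0] ** (-alpha)     # multiplier of L^{-alpha/2}; mean mode dropped

# Synthetic snapshots (one per row) with a red spatial spectrum, converted to anomalies.
X = np.cumsum(rng.standard_normal((n_t, n)), axis=1)
X -= X.mean(axis=0)

# Route 1: weight the ordinary sample covariance on both sides.
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
L_half = ((F.conj().T * weight) @ F).real
C = X.T @ X / (n_t - 1)
C_alpha = L_half @ C @ L_half

# Route 2: filter each snapshot with L^{-alpha/2}, then take the ordinary covariance.
Y = np.fft.ifft(np.fft.fft(X, axis=1) * weight, axis=1).real
C_filtered = Y.T @ Y / (n_t - 1)

identical = np.allclose(C_alpha, C_filtered)
print(identical)
print(np.trace(C), np.trace(C_alpha))   # total variance of x versus that of the filtered field
```

The two traces differ, which is the referee's point in numerical form: variance fractions computed from C_α describe the filtered field, not x itself.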
Conclusion of the main comment:
I apologize if I missed the point of the proposed dimensionality reduction method, or any argument supporting it, but I currently do not see the relevance of the proposed method for dimensionality reduction. Therefore, I recommend not publishing this manuscript. The author needs to motivate (with either heuristic or formal arguments) the consideration of eigenvalues of C_α for dimensionality reduction or, equivalently, to motivate the consideration of the covariance of L^{−α/2}x to reduce the dimension of x.
I actually see two points which, according to me, totally prevent any use of the proposed dimensionality reduction technique:
The parameter α is totally unconstrained, so the spectrum of eigenvalues of C_α can be made arbitrarily steep, and M₉₅ (Table 1) can be made equal to 1 if α is increased sufficiently. Any field could then be represented with 1 mode. I think that the high level of arbitrariness here shows that the proposed technique does not provide a more efficient representation of the physical processes governing the considered field.

EOF/PCA is known to be the most efficient linear dimensionality reduction method in terms of the Mean Square Error (L² norm), while the proposed method here is indeed linear and using the L² norm (since it uses explained variances). If another technique is more efficient, it will either be in terms of another metric, or the method itself will be nonlinear.
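The first of these two points can be illustrated with the eigenvalue relation λk^{−2α} from the main comment. The sketch below assumes a 1D power-law spectrum with β = 2 and one mode per wavenumber, computes M₉₅ from the eigenvalues of C_α for several values of α, and reports how much of the variance of the original field x those same leading modes capture. The helper name `modes_for_fraction` and all numerical values are illustrative assumptions.

```python
import numpy as np

def modes_for_fraction(eigenvalues, fraction=0.95):
    """Smallest number of leading modes whose eigenvalues reach `fraction` of their sum."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(lam) / lam.sum()
    return int(np.searchsorted(cumulative, fraction) + 1)

k = np.arange(1, 513, dtype=float)   # wavenumbers of the shared Fourier modes
beta = 2.0
lam = k ** (-beta)                   # assumed eigenvalues of the ordinary covariance C
x_cumulative = np.cumsum(lam) / lam.sum()

alphas = [0.0, 0.5, 1.0, 2.0, 4.0]
m95 = []          # modes needed for 95 % of the C_alpha eigenvalue sum
x_fraction = []   # fraction of the variance of x captured by those same modes
for alpha in alphas:
    lam_alpha = lam * k ** (-2.0 * alpha)    # eigenvalues of C_alpha
    m = modes_for_fraction(lam_alpha)
    m95.append(m)
    x_fraction.append(float(x_cumulative[m - 1]))

for alpha, m, frac in zip(alphas, m95, x_fraction):
    print(f"alpha={alpha:3.1f}  M95={m:3d}  variance of x captured={frac:.3f}")
```

Under these assumptions M₉₅ falls to 1 as α grows, while the variance of x captured by that single mode stays well below 95 %, since it is fixed by the spectrum of C alone.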

Other major comments:
The space-time fractional diffusion equation is also mentioned in the manuscript. However, it is not clear to me what the purpose of introducing this equation is in this context. Is it to propose a prognostic model for geophysical fields? The fractional diffusion equation has already been used in geophysical applications, by Kavvas et al. (2020) for example. What is the difference between the latter work and the application proposed in this manuscript?
The caption of Fig. 5 claims that the eigenvectors of C_α are different from the usual EOFs. I think however that they are the same (see computation above). Only small differences are visible in Fig. 5, which would be due to the finite number of samples used for numerical computations (as much as there is a ~5% accuracy on the measured steepening of the eigenvalues). A more detailed comparison is needed to support the method.
I also think that the current organization of the manuscript is too much oriented on mathematics. Section 2.2 in particular feels like a textbook introduction on fractional calculus, introducing equations which are not used in the following (eqs. (6), (11) and (13), and it is not clear to me whether eqs. (7), (8) and (9) are useful later on). To be the most accessible to the geoscientific community, I think that fractional operators should be introduced by working definitions instead of analytical equations. Equations (10) and (14) fill this role for the fractional Laplacian, but it is for example not explained how to compute fractional time derivatives of time-gridded fields.
To further connect with the works in the geoscientific community, the techniques proposed in this manuscript should be actually applied to real datasets (Sec. 6 only suggests geophysical applications). Such a work would show how the proposed fractional EOFs encode the physics more efficiently.
Note on the link between the eigenvectors of the covariance matrix and Fourier modes:
Eigenvectors of the covariance matrix are Fourier modes if the field is periodic and spatially homogeneous. For vectors (1D), this is because the covariance matrix is a circulant matrix, for which eigenvectors are Fourier modes (Gray (2006), see also Senn et al. (2026)). When computing eigenmodes of d-dimensional fields, the d-dimensional arrays are flattened to vectors. The covariance matrix is in this case block-circulant with circulant blocks, and the eigenvectors are d-dimensional Fourier modes when reshaped back in the original shape (Henriques et al., 2013).
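The two-dimensional statement in this note can be checked directly on a small grid. The sketch below constructs the covariance of a homogeneous field on an n × n periodic grid from an assumed positive spectrum, flattens the grid row by row, and tests that a flattened 2D Fourier mode is an eigenvector. The grid size and the form of the spectrum are illustrative assumptions.

```python
import numpy as np

n = 8                                            # n x n periodic grid
freq = np.fft.fftfreq(n, d=1.0 / n)              # integer wavenumbers in FFT ordering
kx, ky = np.meshgrid(freq, freq, indexing="ij")
spectrum = (1.0 + kx**2 + ky**2) ** (-1.5)       # assumed positive, isotropic power spectrum

# Covariance as a function of the separation (dx, dy) on the periodic grid.
kernel = np.fft.ifft2(spectrum).real

# Flatten the grid (row-major) and fill C[p, q] = kernel[x_p - x_q, y_p - y_q].
ix, iy = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
px, py = ix.ravel(), iy.ravel()
C = kernel[(px[:, None] - px[None, :]) % n, (py[:, None] - py[None, :]) % n]

# A 2D Fourier mode, flattened the same way, should be an eigenvector of C
# with eigenvalue given by the spectrum at its wavevector.
mx, my = 1, 2
mode = np.cos(2 * np.pi * (mx * ix + my * iy) / n).ravel()
is_eigenvector = np.allclose(C @ mode, spectrum[mx, my] * mode)

# The eigenvalues of C coincide with the spectrum values.
same_spectrum = np.allclose(np.linalg.eigvalsh(C), np.sort(spectrum.ravel()))

print(is_eigenvector, same_spectrum)
```

Because C depends only on the periodic separation between grid points, it has the block-circulant structure described above, and reshaping an eigenvector back to n × n recovers a plane-wave pattern.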
References:
Gray, R. M. (2006). Toeplitz and Circulant Matrices: A Review. Foundations and Trends in Communications and Information Theory, 2(3), 155–239. https://doi.org/10.1561/0100000006
Henriques, J. F., Carreira, J., Caseiro, R., & Batista, J. (2013). Beyond hard negative mining: Efficient detector learning via block-circulant decomposition. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2760–2767).
Kavvas, M. L., Tu, T., Ercan, A., & Polsinelli, J. (2020). Fractional governing equations of transient groundwater flow in unconfined aquifers with multi-fractional dimensions in fractional time. Earth Syst. Dynam., 11, 1–12. https://doi.org/10.5194/esd-11-1-2020
Senn, G., Tjelmeland, H., Glatt-Holtz, N., Walker, M., & Holbrook, A. (2026). Bayesian Semi-Blind Deconvolution at Scale. arXiv preprint arXiv:2601.09677.
Citation: https://doi.org/10.5194/egusphere-2026-753-RC2
- AC2: 'Reply on RC2', Farrukh Chishtie, 15 Jun 2026
  
  Please see responses to Referee RC2 in pdf file attached.
  
  Citation: https://doi.org/10.5194/egusphere-2026-753-AC2

Viewed

Total article views: 898 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
585	252	61	898	77	67

Views and downloads (calculated since 12 Mar 2026)

Month	HTML	PDF	XML	Total
Mar 2026	401	138	53	592
Apr 2026	104	54	2	160
May 2026	57	41	5	103
Jun 2026	8	7	0	15
Jul 2026	15	12	1	28

Viewed (geographical distribution)

Total article views: 881 (including HTML, PDF, and XML) Thereof 881 with geography defined and 0 with unknown origin.

Latest update: 25 Jul 2026

Short summary

Standard methods for decomposing geophysical data assume simple diffusion, but ocean currents, floods, and atmospheric plumes spread with long-range correlations. We developed a generalized decomposition using fractional calculus that captures these behaviors. Tests show the method needs five to eight times fewer patterns to represent the same variability. This has direct implications for flood forecasting and ocean tracer analysis.

