the Creative Commons Attribution 4.0 License.
Approximating the universal thermal climate index using sparse regression with orthogonal polynomials
Abstract. The Universal Thermal Climate Index (UTCI) is a measure of thermal comfort that quantifies how humans experience environmental conditions. Due to its robustness and versatility as a bioclimatic indicator, it has been extensively employed across a wide range of studies in bioclimatology and is increasingly used as an operational measure of outdoor thermal comfort. At the same time, calculating the UTCI value from the relevant environmental parameters is not straightforward, which is why a 6th-degree polynomial approximation has become the standard way to calculate UTCI values. However, although it is computationally efficient, the error of this polynomial approximation can be substantial. The goal of this study was to develop an improved version of the polynomial approximation, one that retains comparable computational efficiency but is more robust in terms of numerical stability and substantially more accurate, particularly in reducing the frequency of larger errors. This goal was achieved using sparse orthogonal regression, namely sparse regression with an orthogonal polynomial basis, which not only substantially reduces the average errors (i.e., the mean error, the mean absolute error, and the root mean square error) but also drastically reduces the frequency of large errors. By leveraging Legendre polynomial bases, approximation models could be constructed that efficiently populate a Pareto front of accuracy versus complexity and exhibit stable, hierarchical coefficient structures across varying model capacities. Training the new approximation models on only 20 % of the data, with testing performed on the remaining 80 %, highlights successful generalization, with the results also being robust under bootstrapping. The decomposition effectively approximates the UTCI as a Fourier-like expansion in an orthogonal basis, yielding results near the theoretical optimum in the L2 (least squares) sense.
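The workflow summarized in the abstract (a Legendre basis, a sparse fit, and a 20 % / 80 % train/test split) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins rather than UTCI values, the helper names `legendre_design` and `stlsq` are hypothetical, and sequentially thresholded least squares is swapped in as one common sparse-regression scheme because the abstract does not name the specific algorithm.

```python
from itertools import product

import numpy as np
from numpy.polynomial import legendre


def legendre_design(X, degree):
    """Tensor-product Legendre features of total degree <= degree.

    X has shape (n_samples, n_vars); each column is assumed scaled to [-1, 1].
    """
    n_vars = X.shape[1]
    # legvander gives P_0..P_degree evaluated at every sample of one variable
    V = [legendre.legvander(X[:, j], degree) for j in range(n_vars)]
    exps = [e for e in product(range(degree + 1), repeat=n_vars) if sum(e) <= degree]
    A = np.column_stack(
        [np.prod([V[j][:, e[j]] for j in range(n_vars)], axis=0) for e in exps]
    )
    return A, exps


def stlsq(A, y, threshold, n_iter=10):
    """Sequentially thresholded least squares: zero small terms, refit, repeat."""
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(coef) >= threshold
        coef[~keep] = 0.0
        if not keep.any():
            break
        coef[keep] = np.linalg.lstsq(A[:, keep], y, rcond=None)[0]
    return coef


# Synthetic stand-in data (NOT the UTCI data set): two inputs, two active terms.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
A, exps = legendre_design(X, degree=3)
true_coef = np.zeros(len(exps))
true_coef[exps.index((1, 0))] = 2.0
true_coef[exps.index((2, 1))] = -0.5
y = A @ true_coef

# Random 20 % / 80 % train/test split, mirroring the proportions in the abstract.
perm = rng.permutation(len(y))
train, test = perm[:100], perm[100:]
coef = stlsq(A[train], y[train], threshold=0.1)
rmse = np.sqrt(np.mean((A[test] @ coef - y[test]) ** 2))
print(np.count_nonzero(coef), "of", len(exps), "terms kept; test RMSE =", rmse)
```

Sweeping `threshold` in such a sketch yields models with different numbers of retained terms, which is one way an accuracy-versus-complexity trade-off could be explored.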
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-5461', Anonymous Referee #1, 15 Feb 2026
- AC1: 'Reply on RC1', Sabin Roman, 16 Apr 2026
- RC2: 'Comment on egusphere-2025-5461', Anonymous Referee #2, 26 Mar 2026
The article presents a new approximation for the Universal Thermal Climate Index by applying sparse regression with orthogonal polynomials to the Fiala thermo-physiological model. The proposed approach improves predictive accuracy and numerical stability over the existing standards (particularly in extrapolation), while maintaining comparable computational efficiency.
The manuscript is very well written and well illustrated. I understand that the techniques (sparse model discovery) used for the regression are standard and that no novelty is claimed on that front. However, I would add at least the kind of equations that these methods aim at. This will help in interpreting the results (e.g., Table 1 or Figure 3: why does the number of parameters change for a given polynomial degree?).
Is the proposed function a linear combination of Legendre polynomials? Are Ta, va, Tr−Ta and rH its input variables?
Please present the shape of the polynomial basis expansions that you are fitting.
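For orientation, a generic sparse expansion in a tensor-product Legendre basis has the form sketched below. This is an illustrative assumption, not the formula from the manuscript: the inputs $x_1,\dots,x_4$ are taken to be the four variables named above, each rescaled to $[-1, 1]$, and the symbols $\hat{f}$, $c_{ijkl}$, $S$, and $d$ are introduced here only for illustration.

```latex
\hat{f}(x_1, x_2, x_3, x_4)
  = \sum_{(i,j,k,l) \in S} c_{ijkl}\, P_i(x_1)\, P_j(x_2)\, P_k(x_3)\, P_l(x_4),
\qquad
S \subseteq \{(i,j,k,l) : i + j + k + l \le d\}

\left|\{(i,j,k,l) : i + j + k + l \le d\}\right| = \binom{d+4}{4},
\qquad
\binom{10}{4} = 210 \quad \text{for } d = 6
```

Under this reading, a dense fit of total degree 6 in four variables would have 210 coefficients, while a sparse fit keeps only the index set $S$. Different sparsity levels would then give different parameter counts at the same polynomial degree.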
Minor comments
- In the introduction, when it is explained that the water vapor is not included, I would add that the relative humidity is included to account for its effect.
- "Training is conducted on only 20% of the available data, while performance is assessed on the remaining 80%"
Are these sets taken randomly?
Is this needed for a better fitting?
Is the number of points a limitation of the regression method?
Citation: https://doi.org/10.5194/egusphere-2025-5461-RC2
- AC2: 'Reply on RC2', Sabin Roman, 16 Apr 2026
Data sets
ESM 4 Peter Bröde et al. https://static-content.springer.com/esm/art%3A10.1007%2Fs00484-011-0454-1/MediaObjects/484_2011_454_MOESM2_ESM.zip
Model code and software
Code for Approximating the universal thermal climate index (UTCI) using sparse regression with orthogonal polynomials Sabin Roman https://zenodo.org/records/17465548
Viewed
Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.
- HTML: 191
- PDF: 0
- XML: 1
- Total: 192
- BibTeX: 0
- EndNote: 0
Summary:
The Universal Thermal Climate Index is a measure of the thermal comfort or discomfort perceived by humans, and is estimated by an environmental model from measured values of air temperature, radiation, humidity, etc. The model is complex to run, and so polynomial approximations have been developed for quick, albeit not totally accurate, estimates. The standard polynomial approximation incurs errors that are deemed too large. The present study presents another approximation method, based on orthogonal polynomial regression, that seems to provide more accurate results.
Recommendation: The manuscript is well written and the study seems to have no technical flaws. From that standpoint I have very few comments. However, I do have a more general question on the motivation of the study, which I think the authors should address or justify more thoroughly.
Main point
1) The manuscript mentions an alternative method, namely interpolation from an available look-up table that contains about 100 thousand values. This is also the approach recommended by Bröde (2021a). The manuscript argues that storing 100 thousand values makes the calculation cumbersome, but I clearly disagree. This storage would amount to roughly 1 MB of data, which is very little. Intuitively, I would argue that interpolating that table with a simple spline or linear algorithm can produce very accurate values. So the question arises as to what the advantages of the algorithm presented in this manuscript would be relative to look-up table interpolation.
I am not arguing that the study is not valuable, as it presents a possible way of producing more accurate estimates of the index, but readers will ask themselves whether it is really worth the effort.
Bröde (2021a) argues that "This chapter provides hints and guidelines on how to handle these issues, and especially encourages the application of the hardly used look-up table approach, which will help avoiding many, if not all concerns related to UTCI calculation via the regression polynomial".
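The storage estimate and the look-up table alternative in point 1 can be illustrated with a short sketch. Everything specific here is an assumption: the grid axes, their ranges, and the toy table contents are hypothetical and are not the actual UTCI look-up table, and `interp_table` is an illustrative helper doing plain multilinear interpolation. With 8-byte floats, 100,000 values occupy 800,000 bytes, consistent with the "roughly 1 MB" quoted above.

```python
from itertools import product

import numpy as np


def interp_table(axes, table, point):
    """Multilinear interpolation of a regular-grid table at a single point."""
    idx, frac = [], []
    for ax, p in zip(axes, point):
        # index of the grid node to the left of p, clipped so i + 1 stays valid
        i = int(np.clip(np.searchsorted(ax, p, side="right") - 1, 0, len(ax) - 2))
        idx.append(i)
        frac.append((p - ax[i]) / (ax[i + 1] - ax[i]))
    value = 0.0
    # weighted sum over the 2^n corners of the enclosing grid cell
    for corner in product((0, 1), repeat=len(axes)):
        w = 1.0
        for c, t in zip(corner, frac):
            w *= t if c else 1.0 - t
        value += w * table[tuple(i + c for i, c in zip(idx, corner))]
    return value


# Hypothetical 4-D grid with 20 * 10 * 25 * 20 = 100,000 nodes (not the real table).
axes = [
    np.linspace(-50.0, 50.0, 20),
    np.linspace(0.5, 30.0, 10),
    np.linspace(-30.0, 70.0, 25),
    np.linspace(5.0, 100.0, 20),
]
G = np.meshgrid(*axes, indexing="ij")
# Toy table contents: a linear function, so multilinear interpolation is exact.
table = G[0] + 2.0 * G[1] - G[2] + 0.5 * G[3]

point = (21.3, 4.2, 12.5, 55.0)
exact = point[0] + 2.0 * point[1] - point[2] + 0.5 * point[3]
approx = interp_table(axes, table, point)
print(table.size, "values,", table.nbytes / 1e6, "MB; abs. error =", abs(approx - exact))
```

For a real, nonlinear table the interpolation error would depend on the grid spacing, which is the quantity a comparison with the polynomial approach would need to examine.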
Minor points
2) The labels in Figure 1 are too small. This is also the case, to a lesser degree, in other figures. Figure 3 is fine, so I would recommend homogenizing the font size across all figures.
3) In Table 2, the reader has to infer which number is the train loss and which is the test loss. It seems that the train loss is the upper number, but this could be indicated more explicitly. It seems that the train loss numbers need to be wrapped in [].