Journal Article

A singularity regression kriging for spatial prediction

Authors: Kai Ren, Yongze Song*, Min Chen, Qiang Yu

Journal: GIScience & Remote Sensing, 2026

Keywords: Spatial prediction; Kriging; Singularity; Spatial block cross-validation; Geochemical mapping

https://doi.org/10.1080/15481603.2026.2690341

Overview

This paper develops singularity regression kriging (SRK) for spatial prediction in heterogeneous environments. SRK derives multi-scale singularity features from environmental covariates, incorporates them into random forest trend estimation, and applies ordinary kriging to the residuals. The model is evaluated through simulations and trace-element mapping of zinc and cobalt in a mineralised region of Western Australia.

Abstract

Accurate spatial prediction remains challenging in heterogeneous environments where environmental variables exhibit nonlinear, multiscale, and non-Gaussian characteristics. Traditional kriging methods rely on simple trend models that cannot adequately capture nonlinear and multiscale spatial structures, leading to degraded prediction performance under complex conditions. This study proposes a singularity regression kriging (SRK) model that explicitly incorporates singularity-based anomaly features derived from environmental covariates into random forest trend estimation, followed by ordinary kriging of the resulting residuals. By computing covariate singularity indices at multiple spatial scales, SRK captures local multiscale heterogeneity without requiring response observations at prediction locations, ensuring robust generalisation under spatial block cross-validation. Simulation experiments demonstrate that SRK consistently improves prediction accuracy relative to ordinary kriging, with particularly significant improvements under skewed and long-tailed distributions. The SRK model is implemented in trace-element mapping of Zn and Co in a mineralised region of Western Australia and evaluated using spatial block cross-validation and parameter sensitivity analysis. Compared with ordinary kriging, regression-based methods, and machine-learning approaches, SRK produces more coherent spatial patterns and reduced prediction uncertainty. These results indicate that explicitly incorporating covariate singularity features into trend estimation enhances both the accuracy and reliability of spatial predictions in heterogeneous environments.

Method Implementation

SRK keeps the regression-kriging decomposition, but changes the trend model by adding singularity information derived from environmental covariates. The singularity indices are calculated from \(X_k\), not from the response \(Z\) and not from random-forest residuals, so they can be evaluated both at sampled locations and at unsampled prediction locations.

\[ \hat{Z}(\mathbf{s}_0) = \hat{\mu}(\mathbf{s}_0) + \hat{\varepsilon}(\mathbf{s}_0), \qquad \hat{\mu}(\mathbf{s}_0) = g\!\left(\mathcal{F}(\mathbf{s}_0)\right) \]

Construct singularity features from environmental covariates

For each environmental covariate \(X_k(\mathbf{s})\), local covariate intensity is measured within neighbourhoods \(A(\mathbf{s},r)\) across multiple spatial scales. In the paper, this intensity is estimated as the mean absolute covariate value in a square window:

\[ \widehat{C}_k\!\left(A(\mathbf{s},r)\right) = \frac{1}{\left|A(\mathbf{s},r)\right|} \sum_{\mathbf{s}' \in A(\mathbf{s},r)} \left|X_k(\mathbf{s}')\right| \]

Repeating this over scales \(r_1 < r_2 < \cdots < r_J\) gives a local log-log relationship:

\[ \log \widehat{C}_k\!\left(A(\mathbf{s},r)\right) = \left(\alpha_k(\mathbf{s}) - 2\right)\log r + c_k(\mathbf{s}) + e_{k,r} \]

The singularity index is then estimated from the fitted slope \(\hat{\beta}_{1,k}(\mathbf{s})\) of this regression of \(\log \widehat{C}_k\) on \(\log r\), as \(\alpha_k(\mathbf{s}) = \hat{\beta}_{1,k}(\mathbf{s}) + 2\). A value near 2 represents a locally uniform field, \(\alpha_k(\mathbf{s}) < 2\) indicates local enrichment, and \(\alpha_k(\mathbf{s}) > 2\) indicates local depletion. In the case study, singularity indices were computed over window scales from 2 to 20 km at 2-km intervals. Each scale required at least three valid neighbouring samples, at least two valid scales were required for the slope estimate, and near-constant singularity features were removed using a standard-deviation threshold of 0.5.
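
The windowed log-log fit described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `singularity_index`, the use of a Chebyshev distance to form square windows, the treatment of `r` as a half-width, and the choice to count the centre sample among the neighbours are all assumptions made for the example.

```python
import numpy as np

def singularity_index(coords, values, scales, min_neighbours=3, min_scales=2):
    """Estimate alpha_k(s) at every sample from one covariate only.

    coords : (n, 2) array of x, y positions (same units as `scales`)
    values : (n,) array holding the covariate X_k
    scales : increasing window half-widths r_1 < ... < r_J (assumed meaning of r)
    """
    coords = np.asarray(coords, dtype=float)
    abs_values = np.abs(np.asarray(values, dtype=float))
    alpha = np.full(len(coords), np.nan)
    for i, centre in enumerate(coords):
        # Chebyshev distance <= r selects a square window around the centre.
        cheb = np.max(np.abs(coords - centre), axis=1)
        log_r, log_c = [], []
        for r in scales:
            inside = cheb <= r
            if inside.sum() < min_neighbours:
                continue  # too few samples: this scale is not valid
            mean_abs = abs_values[inside].mean()
            if mean_abs <= 0:
                continue  # logarithm undefined: skip this scale
            log_r.append(np.log(r))
            log_c.append(np.log(mean_abs))
        if len(log_r) >= min_scales:
            # Slope of log C against log r is (alpha - 2).
            slope = np.polyfit(log_r, log_c, 1)[0]
            alpha[i] = slope + 2.0
    return alpha
```

Because only covariate values enter the calculation, the same function can be called on prediction locations as long as the covariate is known there. Locations with fewer than `min_scales` valid scales are returned as `NaN` in this sketch.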

Build the augmented feature matrix

The original covariates and the retained singularity features are combined into one predictor set. If \(q\) singularity features pass the filtering step, the feature vector at location \(\mathbf{s}\) is:

\[ \mathcal{F}(\mathbf{s}) = \left\{ X_1(\mathbf{s}),\ldots,X_p(\mathbf{s}), \alpha_{k_1}(\mathbf{s}),\ldots,\alpha_{k_q}(\mathbf{s}) \right\} \]

This is the central distinction of SRK: the random forest receives both the original environmental variables and their covariate-based multiscale anomaly descriptors.
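
As a rough sketch of this step, the snippet below filters candidate singularity columns and appends the survivors to the original covariates. The helper name `build_augmented_features` is hypothetical, and the reading of "near-constant" as "standard deviation below the threshold" is an assumption about how the 0.5 threshold is applied.

```python
import numpy as np

def build_augmented_features(covariates, singularity, sd_threshold=0.5):
    """Return the augmented matrix F and the mask of retained singularity columns.

    covariates  : (n, p) array of original covariates X_1..X_p
    singularity : (n, m) array of candidate singularity indices
    """
    covariates = np.asarray(covariates, dtype=float)
    singularity = np.asarray(singularity, dtype=float)
    # Assumption: a feature is "near-constant" when its sd is below the threshold.
    keep = np.nanstd(singularity, axis=0) >= sd_threshold
    features = np.hstack([covariates, singularity[:, keep]])
    return features, keep
```

The returned `keep` mask should be reused when assembling features at prediction locations, so that training and prediction matrices share the same columns.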

Estimate the nonlinear trend with random forest

A random forest model \(g(\cdot)\) is trained on the augmented feature matrix to estimate the deterministic trend:

\[ \hat{\mu}(\mathbf{s}_i) = g\!\left(\mathcal{F}(\mathbf{s}_i)\right) \]

Because the singularity variables describe local multiscale heterogeneity, the RF trend component can absorb more structured variation before kriging is applied. The real-data implementation used a random forest with 500 trees.
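
A minimal sketch of the trend step is shown below using scikit-learn, which is an assumption of this example and not necessarily the software used in the paper. Only the tree count of 500 comes from the text. The synthetic data, the variable names, and the remaining hyperparameters (left at library defaults) are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the augmented feature matrix F and the response Z.
F_train = rng.normal(size=(200, 4))
z_train = (np.sin(F_train[:, 0]) + 0.5 * F_train[:, 3] ** 2
           + rng.normal(scale=0.1, size=200))

# 500 trees as reported for the real-data implementation.
rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(F_train, z_train)

# Trend at sampled locations and at new (unsampled) locations.
mu_hat_train = rf.predict(F_train)
F_new = rng.normal(size=(5, 4))
mu_hat_new = rf.predict(F_new)
```

Whether fitted values at sampled locations are taken in-sample or out-of-bag is not stated in this summary; the sketch uses in-sample predictions for simplicity.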

Calculate residuals at sampled locations

The residual at each observed sample is computed by subtracting the singularity-augmented RF trend from the observed target value:

\[ \varepsilon(\mathbf{s}_i) = Z(\mathbf{s}_i) - \hat{\mu}(\mathbf{s}_i) \]

These residuals are expected to be more concentrated around zero and closer to stationarity than residuals from a model using only the original covariates.

Interpolate the residuals using ordinary kriging

A variogram is fitted to the residuals, and ordinary kriging estimates the residual component at an unsampled location:

\[ \hat{\varepsilon}(\mathbf{s}_0) = \sum_{i=1}^{n} \lambda_i\,\varepsilon(\mathbf{s}_i), \qquad \sum_{i=1}^{n}\lambda_i = 1 \]

In the case study, variogram fitting and residual kriging were implemented with autoKrige from the R package automap, using an isotropic ordinary-kriging setup.
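
To make the weight constraint concrete, the following NumPy sketch solves the ordinary-kriging system for a single target location. It does not reproduce autoKrige, which selects and fits the variogram automatically. Here the exponential model and its parameters (`nugget`, `psill`, `range_m`) are fixed, hypothetical values chosen for illustration.

```python
import numpy as np

def exp_variogram(h, nugget=0.0, psill=1.0, range_m=5000.0):
    """Exponential semivariogram with gamma(0) = 0 (parameters are illustrative)."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + psill * (1.0 - np.exp(-h / range_m))
    return np.where(h > 0, gamma, 0.0)

def ordinary_kriging(coords, resid, target, variogram=exp_variogram):
    """Krige residuals to one target; returns (estimate, variance, weights)."""
    coords = np.asarray(coords, dtype=float)
    resid = np.asarray(resid, dtype=float)
    n = len(coords)
    # Pairwise distances between sampled locations.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Augmented system: variogram matrix bordered by the unbiasedness constraint.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    b = np.append(variogram(d0), 1.0)
    sol = np.linalg.solve(A, b)
    weights, lagrange = sol[:n], sol[n]
    estimate = weights @ resid
    variance = weights @ b[:n] + lagrange  # ordinary-kriging variance
    return estimate, variance, weights
```

The last row of the system enforces \(\sum_i \lambda_i = 1\). With a zero nugget, the estimate at a sampled location reproduces the residual observed there and the kriging variance is zero.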

Combine RF trend and kriged residuals

The final prediction is obtained by adding the random-forest trend and the ordinary-kriging residual estimate:

\[ \hat{Z}(\mathbf{s}_0) = \hat{\mu}_{\mathrm{RF+singularity}}(\mathbf{s}_0) + \hat{\varepsilon}_{\mathrm{OK}}(\mathbf{s}_0) \]

The kriging variance from the residual interpolation also provides a kriging-based uncertainty measure, \(\delta(\mathbf{s}_0)=\sqrt{\sigma_K^2(\mathbf{s}_0)}\). Model performance in the paper was evaluated with 5-fold spatial block cross-validation using a 15 km block size.
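
One possible way to set up the block-wise folds is sketched below. The 15 km block size and the five folds come from the text, but the square grid anchored at the coordinate origin, the random assignment of blocks to folds, and the function name `spatial_block_folds` are assumptions of this example, not details confirmed by the paper.

```python
import numpy as np

def spatial_block_folds(coords, block_size=15000.0, n_folds=5, seed=0):
    """Assign each sample a fold so that whole blocks are held out together."""
    coords = np.asarray(coords, dtype=float)
    # Integer grid cell of every sample (floor also handles negative coordinates).
    cells = np.floor(coords / block_size).astype(int)
    blocks, block_id = np.unique(cells, axis=0, return_inverse=True)
    block_id = np.ravel(block_id)
    # Shuffle the blocks, then deal them out to folds in turn.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(blocks))
    fold_of_block = np.empty(len(blocks), dtype=int)
    fold_of_block[order] = np.arange(len(blocks)) % n_folds
    return fold_of_block[block_id]
```

In each fold, the singularity features, random forest, and residual variogram would be refitted on the training blocks only, and predictions for the held-out block would be formed as trend plus kriged residual, with \(\delta(\mathbf{s}_0)\) taken as the square root of the kriging variance.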

Figures

Figure 1. Spatial distributions and histograms of the three simulated datasets: (a) normal, (b) skewed, and (c) long-tailed.

Figure 2. Illustration of the SRK modelling process using three simulated datasets, including the normal (a–d), skewed (e–h), and long-tailed (i–l) cases, showing the spatial distribution of the covariate singularity index (a, e, i), the relationship between the singularity index and the response variable for training samples (b, f, j), the random forest variable importance comparing original covariate and singularity features (c, g, k), and comparisons between SRK and ordinary kriging predictions (d, h, l).

Figure 3. Study area in Western Australia (b), lithological distribution (a), and spatial distributions of Zn (c) and Co (d) samples within the study area.

Figure 4. Workflow of the singularity regression kriging (SRK) model for trace element prediction in a mining region.

Figure 5. Spearman correlation matrices between trace elements and explanatory variables (lithological variables, elevation, slope, and aspect) for (a) Zn and (b) Co.

Figure 6. Analysis of covariate singularity features and residual improvement in the SRK model for Zn (a–d) and Co (e–h). (a, e) Spatial distributions of the selected covariate singularity feature, *sv(hm)*. (b, f) Random forest variable importance, showing the relative contribution of original covariates and singularity features. (c, g) Residual density distributions for RF and SRK, illustrating the effect of singularity feature augmentation on residual concentration and spread. (d, h) Relationships between the singularity feature *sv_hm* and the observed element concentration, with the fitted trend line and correlation coefficient shown.

Figure 7. Sensitivity analysis of the SRK model under different parameter combinations for Zn and Co. (a, c) Heatmaps of mean R² for Zn and Co, respectively, across combinations of maximum singularity scale and singularity feature selection threshold. (b, d) Heatmaps of mean RMSE for Zn and Co, respectively, under the same parameter settings. (e) Relative changes in R² and RMSE with respect to the baseline parameter settings with the maximum singularity scale of 20 km and singularity selection threshold of 0.5 for Zn and Co, with dashed lines indicating the ±5% range used to assess model robustness.

Figure 8. Comparisons between observations and predictions of Zn and Co concentrations for the validation data under spatial block cross-validation of the SRK and benchmark models: ordinary kriging (OK), inverse distance weighting (IDW), linear model (LM), random forest (RF), and random forest kriging (RFK).

Figure 9. Spatial predictions of trace elements, Zn (a) and Co (b), in the study area using the SRK model and benchmark methods (OK, IDW, LM, RF, and RFK). For each element, the six prediction maps are displayed using a unified colour scale to ensure direct visual comparability across models.

Figure 10. Comparisons of predicted Zn and Co concentrations derived from SRK and benchmark models along a horizontal cross-section at y = −3733945 m in the study area.

Figure 11. Uncertainty analysis and comparison between the SRK and RFK models for Zn and Co prediction. (a, d) Spatial distributions of prediction uncertainty derived from the RFK and SRK models for Zn and Co, respectively. (b, e) Probability density distributions of uncertainty values for RFK and SRK, with dashed lines indicating mean uncertainty levels. (c, f) Spatial differences in uncertainty, defined as Δ = δ_RFK − δ_SRK, where positive values indicate lower uncertainty for SRK than for RFK.

Data and Code Availability

The data and code supporting this study are publicly available through both Figshare and GitHub.

Figshare: https://doi.org/10.6084/m9.figshare.31073173
GitHub: https://github.com/renkaigis/Singularity_Regression_Kriging

Funding

This work was supported by the National Natural Science Foundation of China (Outstanding Young Scholars Program, Grant No. 42325107) and the Natural Science Foundation of Inner Mongolia Autonomous Region of China (2024QN03070).
