ATLAS-SOC Core v0.1
To access high-resolution data for your area(s) of interest, contact our team.
High-Level Description
ATLAS-SOC Core v0.1 is the first release of Perennial Climate Inc.'s soil organic carbon (SOC) quantification framework, which is based on digital soil mapping. ATLAS-SOC, short for Advanced Terrestrial machine-Learning Analysis System for Soil Organic Carbon, uses machine learning to predict SOC stocks and stock changes at high spatial resolution with broad geographic applicability. ATLAS-SOC Core v0.1 has been peer-reviewed: see Fu et al. (2024) in the MDPI journal Remote Sensing.
The Core class of models is designed for general-purpose SOC quantification without requiring new soil sample data, making it ideal for large-scale analyses, such as policy development, identification of degraded lands, carbon accounting at national or regional scales, and baseline assessments. Trained on up to 350,000 soil samples from Perennial's extensive sample archive, the Core class of models provides reliable SOC stock estimates while balancing spatial coverage and predictive performance.
Unlike ATLAS-SOC Fine-Tuned models—optimized for specific farms or carbon projects using localized soil data—Core v0.1 provides an "out-of-the-box" solution with wide applicability but without site-specific calibration.
Technical Specifications
- Release Year: 2021
- Resolution: 10 m (native)
- Spatial Extent: Continental United States (CONUS) croplands
- Domain: Non-tree row crop agricultural lands
- Depth of Prediction: 0–30 cm
- Outputs and Units:
  - SOC % by mass (primary model output)
  - SOC stock (t C/ha), derived by combining SOC % predictions with depth-weighted bulk density values from SoilGrids v2.0
- Inputs:
  - Long-term physical climate proxies (e.g., WorldClim variables)
  - Short-term climate and weather data (e.g., NCEP CFSv2 summaries)
  - Soil texture and edaphic variables (e.g., SoilGrids)
  - Topographic variables (e.g., USGS DEM)
  - Optical and radar remote sensing time-series (e.g., Sentinel-2, MODIS, SMAP, Sentinel-1 SAR)
- Temporal Range: Based on recent soil samples (2020–2021) and RaCA legacy samples (2010–2011)
- Temporal Frequency: Static SOC stock map (0–30 cm depth) with potential for future time series modeling
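The SOC stock output listed above can be illustrated with a short calculation. The following is a minimal sketch, not Perennial's implementation: it assumes bulk density is already expressed in g/cm³, that the three SoilGrids layers covering 0–30 cm (0–5, 5–15, and 15–30 cm) are weighted by thickness, and that no coarse-fragment correction is applied. The function names and sample values are hypothetical.

```python
# Thicknesses (cm) of the three SoilGrids layers spanning 0-30 cm:
# 0-5 cm, 5-15 cm and 15-30 cm.
LAYER_THICKNESS_CM = (5.0, 10.0, 15.0)


def depth_weighted_bulk_density(bd_layers):
    """Thickness-weighted mean bulk density (g/cm^3) over 0-30 cm."""
    if len(bd_layers) != len(LAYER_THICKNESS_CM):
        raise ValueError("expected one bulk density value per layer")
    total = sum(LAYER_THICKNESS_CM)
    weighted = sum(bd * t for bd, t in zip(bd_layers, LAYER_THICKNESS_CM))
    return weighted / total


def soc_stock_t_per_ha(soc_percent, bulk_density_g_cm3, depth_cm=30.0):
    """SOC stock in t C/ha from SOC % by mass and bulk density.

    (SOC % / 100) * BD [g/cm^3] * depth [cm] gives g C per cm^2 of land.
    Multiplying by 100 converts to t C/ha (1 ha = 1e8 cm^2, 1 t = 1e6 g).
    """
    return (soc_percent / 100.0) * bulk_density_g_cm3 * depth_cm * 100.0


# Hypothetical pixel: bulk density per layer and a predicted SOC of 2 %.
bd = depth_weighted_bulk_density([1.20, 1.30, 1.40])
stock = soc_stock_t_per_ha(2.0, bd)  # roughly 80 t C/ha
```

Because the stock is a product of SOC % and bulk density, any error in the bulk density layer carries through proportionally to the stock estimate.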
Algorithm Theoretical Basis
ATLAS-SOC Core v0.1 is a digital soil mapping (DSM) framework that employs a gradient-boosted regression tree (XGBoost) as its core predictive algorithm. The model integrates multiple environmental covariates—climate data, soil properties, topography, and remote sensing time-series—to predict SOC content in the top 30 cm of soil.
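To make the gradient-boosting idea concrete, here is a minimal pure-Python sketch that fits depth-1 regression trees (stumps) to the residuals of a squared-error loss. It is a simplified stand-in, not XGBoost itself, which adds regularization, second-order gradient information, and deeper trees. The covariates (an NDVI value and a clay percentage) and all numbers are hypothetical toy data.

```python
def fit_stump(xs, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    best = None  # (sse, feature index, threshold, left value, right value)
    for j in range(len(xs[0])):
        values = sorted(set(row[j] for row in xs))
        # Candidate thresholds are midpoints between consecutive unique values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(xs, residuals) if row[j] <= thr]
            right = [r for row, r in zip(xs, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:
        raise ValueError("all features are constant; no split is possible")
    return best[1:]


def predict_stump(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_boosted(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Hypothetical covariates per sample: [NDVI, clay %]; target: SOC % by mass.
X = [[0.2, 10], [0.3, 15], [0.5, 20], [0.6, 25], [0.7, 30], [0.8, 35]]
y = [0.8, 1.0, 1.5, 1.9, 2.3, 2.6]
model = fit_boosted(X, y)
print(predict_boosted(model, [0.55, 22]))
```

Each round corrects a fraction (the learning rate) of the remaining error, which is why many shallow trees can jointly represent a nonlinear relationship between covariates and SOC.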
Key methodological highlights:
- Core v0.1 was trained on over 5,230 in-situ soil samples across 47 U.S. states, including USDA RaCA legacy data and recent field samples from Perennial's archive.
- Uses time-series summaries of remote sensing data (Sentinel-2, MODIS land surface temperature, and SMAP soil moisture) to capture the effects of vegetation dynamics and soil moisture variability on SOC.
- Feature importance analysis shows Sentinel-2 optical indices (e.g., NDVI, SAVI) as primary predictors, followed by weather-related variables and soil texture.
- Cross-validation demonstrated a strong relationship between predicted and observed SOC values at both sample and field levels (R² = 0.81, RMSE = 0.041% SOC by mass at the field level).
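The optical indices and time-series summaries mentioned above can be sketched as follows. The NDVI and SAVI formulas are the standard ones (SAVI with the commonly used soil adjustment factor of 0.5). The choice of summary statistics, the handling of cloud-masked dates as `None`, and the reflectance values are assumptions for illustration and are not taken from the ATLAS-SOC pipeline.

```python
from statistics import median


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index with soil adjustment factor L."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def summarize(series):
    """Summary statistics over the valid (non-masked) values of a time series."""
    valid = [v for v in series if v is not None]
    if not valid:
        return None  # no usable observation for this pixel
    return {
        "min": min(valid),
        "median": median(valid),
        "max": max(valid),
        "n_valid": len(valid),
    }


# Hypothetical (NIR, red) surface reflectance for one pixel on six dates;
# None marks acquisitions removed by a cloud mask.
observations = [(0.30, 0.10), None, (0.45, 0.05), (0.40, 0.08), None, (0.25, 0.12)]

ndvi_series = [ndvi(*obs) if obs is not None else None for obs in observations]
savi_series = [savi(*obs) if obs is not None else None for obs in observations]

ndvi_summary = summarize(ndvi_series)
savi_summary = summarize(savi_series)
```

Summaries of this kind turn an irregular image time series into a fixed-length set of covariates per pixel, which is the form a tree-based regressor expects.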
Model Assumptions & Constraints
- SOC % Only: ATLAS-SOC Core v0.1 predicts SOC % by mass but does not model bulk density directly. SOC stock values (t C/ha) were therefore derived by combining the SOC % outputs with depth-weighted bulk density values from SoilGrids v2.0. More recent versions of ATLAS-SOC Core predict SOC % and bulk density in tandem.
- Generalization over Specificity: The model is designed for broad geographic applicability and may not capture localized management effects without fine-tuning.
- Depth Standardization: SOC predictions are standardized for the 0–30 cm depth profile, following carbon accounting best practices.
- Static Representation: Version 0.1 represents SOC at a fixed time point, assuming stable land use and management over the data period.
- Sample Bias: While the model includes samples from diverse regions, some agricultural areas may be underrepresented, affecting localized accuracy.
- Data Limitations: Legacy soil samples (e.g., RaCA) lack corresponding modern remote sensing data, introducing potential gaps in model sensitivity for these areas.
- No Uncertainties Included: This version does not provide quantified uncertainties for the map values. Uncertainty estimates were added in more recent versions.
Known Issues
- Lower Performance in High Variability Soils: The model may underpredict SOC in soils with high organic content (>3%) and overpredict in sandy soils with low organic matter.
- Temporal Gaps in Covariate Data: Some areas exhibit reduced prediction accuracy due to data gaps (e.g., cloud cover in Sentinel-2 composites).
- Boundary Effects: SOC estimates near field or landcover boundaries may be less accurate due to mixed pixel effects.
What’s New in v0.1?
- First Implementation of ATLAS-SOC Framework: Establishes the baseline methodology for future Core and Fine-Tuned models.
- Incorporation of Time-Series Remote Sensing: Sentinel-2 and MODIS time-series data improve sensitivity to land management effects.
- Field-Scale Validation: Cross-validated predictions at the field level (n = 165 fields) outperform existing public SOC maps (e.g., SoilGrids v2.0) (Kellner et al. 2025, in review).
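Field-level validation of the kind described above can be sketched by averaging observed and predicted SOC % within each field and then computing RMSE and R² on the field means. This is an illustrative sketch only: the record layout and values are hypothetical, and R² is computed here as the coefficient of determination, which may differ from the definition used in the cited studies.

```python
from collections import defaultdict
from math import sqrt


def field_means(records):
    """Average observed and predicted SOC % per field.

    records: iterable of (field_id, observed, predicted) tuples.
    """
    groups = defaultdict(list)
    for field_id, observed, predicted in records:
        groups[field_id].append((observed, predicted))
    obs, pred = [], []
    for pairs in groups.values():
        obs.append(sum(o for o, _ in pairs) / len(pairs))
        pred.append(sum(p for _, p in pairs) / len(pairs))
    return obs, pred


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out samples: (field id, observed SOC %, predicted SOC %).
records = [
    ("A", 1.0, 1.2), ("A", 1.4, 1.2),
    ("B", 2.0, 2.1), ("B", 2.4, 2.1),
    ("C", 3.0, 2.8), ("C", 3.2, 3.0),
]

sample_rmse = rmse([o for _, o, _ in records], [p for _, _, p in records])
field_obs, field_pred = field_means(records)
field_rmse = rmse(field_obs, field_pred)
field_r2 = r_squared(field_obs, field_pred)
```

In this toy data the field-level error is smaller than the sample-level error because averaging within a field cancels part of the within-field scatter.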
Data Acknowledgements & Partners
- WorldClim v2 – Long-term climate data
  - WorldClim version 2.1 provides high-resolution global climate data, including 19 bioclimatic variables and monthly temperature and precipitation estimates, based on the 1970–2000 period. These datasets are available at multiple spatial resolutions, ranging from 10 arc-minutes to 0.5 arc-minutes. For detailed information and access, visit the WorldClim website.
- NCEP CFSv2 – Short-term weather data
  - The National Centers for Environmental Prediction (NCEP) Climate Forecast System Version 2 (CFSv2) is a coupled atmosphere-ocean-land model that provides operational weather forecasts and reanalysis data. Operational since March 2011, CFSv2 offers comprehensive short-term weather data essential for various environmental modeling applications.
- SoilGrids v2.0 – Global soil texture, pH, and bulk density datasets
  - SoilGrids 2.0 is a global gridded soil information system developed by ISRIC – World Soil Information. It offers maps of soil properties, including texture fractions, pH, and bulk density, at a 250 m spatial resolution for six standard depth intervals up to 2 meters. These maps are generated using machine learning models trained on over 230,000 soil profiles and more than 400 environmental covariates.
- RaCA – Soil samples
  - The Rapid Carbon Assessment (RaCA), led by the USDA NRCS, provides nationwide soil organic carbon (SOC) data across diverse U.S. land uses, supporting carbon stock assessments and land management. The dataset is publicly available and widely used in soil science and carbon markets.
Key References:
- Fu et al., 2024. Accurate Quantification of 0–30 cm Soil Organic Carbon in Croplands over the Continental United States Using Machine Learning. Remote Sensing, 16, 2217.
- Kellner et al., 2025. Digital soil mapping in support of voluntary carbon market programs in agricultural land. PLOS ONE. In review.
Accessing Data
Contact our team to request high-resolution data for your area(s) of interest.