What does Scale Prognostics actually predict?

Capacity retention vs. cycle number for silicon-graphite lithium-ion cells, plus per-cycle decomposition into degradation mechanisms (LLI, LAM-Si, LAM-Gr, transport-driven failure). The engine simulates 1,000 cycles in ~30 ms.

How accurate is the model?

Calibrated against 8 published silicon-anode datasets — 6 at A+ (RMSE 0.20–0.78 percentage points) and 2 at A on knee-prone cells (RMSE 1.12–1.17 pp). Mean RMSE 0.64 pp. Held-out forward-prediction RMSE on knee-prone cells is materially higher (mean 11.86 pp at 70% retention cutoff) and is the focus of ongoing architecture-sprint work; see /methodology for the full breakdown. Validation spans 2.5–20% silicon, 0.5C to 2C charge rates, and 25°C to 45°C temperatures.

How does this compare to PyBaMM or running my own model?

PyBaMM solves the full Doyle-Fuller-Newman (DFN) PDE system, which is rigorous but roughly 1,000x slower and requires extensive parameterization. Our engine runs 1,000 cycles in ~30 ms with 10 coupled aging mechanisms, built-in uncertainty quantification, and auto-calibration. Use PyBaMM for voltage-level predictions; use us for lifetime predictions at scale.

What chemistries are supported?

Silicon-graphite anodes with NMC cathodes, with silicon content from 2.5% up to 20%. Pure-graphite cells work but the competitive advantage is highest for silicon-containing cells. LFP cathode support is on the roadmap.

What does it cost?

Free trial: 50 simulations/month, no credit card. Starter: $499/month for 2,500 simulations. Professional: $1,499/month for 10,000 simulations. Enterprise pricing on request.

Methodology & Validation

What we measure, how we measure it, and where the model is honest about its limits. Last updated 2026-07-28. Engine version v18.5 (July 2026 bounded-physics recalibration).

Two metrics, not one

Two distinct accuracy numbers describe a battery degradation model and they are not interchangeable. We report both because the difference matters.

Calibrated-fit RMSE: the residual between the model and a published retention curve after the model has been fit to that curve. It measures whether the engine can reproduce the data it has been calibrated against.
Held-out forward-prediction RMSE: the residual between the model and the published curve when the model is calibrated on only the early portion of the data (e.g., cycles 0–500, covering 100% down to 90% retention) and then extrapolated forward to predict the rest. It measures what matters for cycle-life forecasting: how well the model predicts data it has not seen.
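In symbols, the two metrics differ only in which cycles the parameters were fitted on and which cycles are scored. The notation here is an assumption introduced for illustration, not taken from the engine's code: $Q_n$ is the published retention at cycle $n$ (in percent), $\hat{Q}_n(\theta)$ is the model's retention with parameters $\theta$, $\mathcal{A}$ is the set of all published cycles, $\mathcal{E} \subset \mathcal{A}$ is the early-cycle subset used for calibration, and $\theta_{\mathcal{S}}$ denotes parameters fitted on subset $\mathcal{S}$.

```latex
\mathrm{RMSE}_{\text{fit}}
  = \sqrt{\frac{1}{|\mathcal{A}|}
    \sum_{n \in \mathcal{A}} \bigl(\hat{Q}_n(\theta_{\mathcal{A}}) - Q_n\bigr)^2}
\qquad
\mathrm{RMSE}_{\text{held-out}}
  = \sqrt{\frac{1}{|\mathcal{A} \setminus \mathcal{E}|}
    \sum_{n \in \mathcal{A} \setminus \mathcal{E}}
    \bigl(\hat{Q}_n(\theta_{\mathcal{E}}) - Q_n\bigr)^2}
```

Because both residuals are differences of retention percentages, both RMSE values come out in percentage points (pp).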

Calibrated-fit numbers are reported throughout the published literature and in the marketing copy of most commercial degradation tools. Held-out forward prediction has been the standard since Severson et al. (Nature Energy, 2019) introduced it as the appropriate benchmark for cycle-life prediction. We publish both.

Calibrated-fit performance (8 datasets)

Each preset is a hand-tuned PhysicsConfig that fits the published curve. Calibration uses Nelder–Mead in log-space across 4–8 rate constants (SEI, Si-cracking, plating, transport). Six of the eight are fully attributable peer-reviewed curves; two are synthesized references (one proxy fit, one literature composite) and are labeled accordingly in the table below.
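To illustrate what "Nelder–Mead in log-space" means, here is a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the two-term toy aging law, the rate names `k_sei` and `k_crack`, the synthetic target curve, and the simplified simplex routine (reflection, expansion, inside contraction, shrink). It is not the engine's PhysicsConfig or its calibration code. The point it shows is the parameterization: the optimizer searches over log10(rate), so every candidate rate is positive and constants of very different magnitude are stepped on a comparable scale.

```python
import math

# TOY model (hypothetical, not the engine's physics):
# retention = 100 - k_sei*sqrt(n) - k_crack*n, in percent.
CYCLES = list(range(0, 1001, 50))
TRUE_RATES = (0.2, 0.005)  # used only to synthesize a target curve


def toy_retention(rates, n):
    k_sei, k_crack = rates
    return 100.0 - k_sei * math.sqrt(n) - k_crack * n


TARGET = [toy_retention(TRUE_RATES, n) for n in CYCLES]


def rmse_pp(rates):
    """RMSE between the toy model and the target, in percentage points."""
    sq = [(toy_retention(rates, n) - y) ** 2 for n, y in zip(CYCLES, TARGET)]
    return math.sqrt(sum(sq) / len(sq))


def objective(log_rates):
    # The optimizer sees log10(rate); exponentiating keeps rates positive.
    return rmse_pp([10.0 ** v for v in log_rates]) ** 2


def nelder_mead(f, x0, step=0.5, max_iter=2000, tol=1e-14):
    """Simplified Nelder-Mead simplex search (illustrative variant)."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the worst vertex and the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [(b + x) / 2 for b, x in zip(best, p)] for p in simplex[1:]
                ]
    return min(simplex, key=f)


# Start far from the true rates: k_sei = 1.0, k_crack = 0.001.
fitted_log = nelder_mead(objective, [0.0, -3.0])
fitted_rates = [10.0 ** v for v in fitted_log]
print("fitted rates:", fitted_rates, "RMSE (pp):", rmse_pp(fitted_rates))
```

With 4–8 rate constants instead of two, the same pattern applies; only the length of the log-rate vector changes.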

Dataset	Provenance	Si %	C-rate	T (°C)	RMSE (pp)	R²	Grade
LG_M50T	SYNTHESIZED (Kirkaldy 2022 proxy)	2.5	1.0	25	0.65	0.995	A+
NatComms_20Si_LPD	Nat. Commun. 2021, 12, 2811	20	0.5	25	0.43	0.996	A+
NatComms_20Si_HPD	Nat. Commun. 2021, 12, 2811	20	0.5	25	1.12	0.994	A (knee-prone)
HPQ_GEN3_18650	HPQ Silicon / Novacium GEN3, 2024	18	0.5	25	0.20	0.992	A+
SiGr_5pct_45C_1C	SYNTHESIZED (Dressler-style composite)	5	1.0	45	0.45	0.998	A+
Kirk_2024_Moderate	Kirk et al., ACS Energy Lett. 2024	10	0.5	25	0.35	0.997	A+
Kirk_2024_FastCharge	Kirk et al., ACS Energy Lett. 2024	10	2.0	30	0.78	0.998	A+ (knee captured)
Dose_2023_1C	Dose et al., J. Power Sources 2023	8 (nano)	1.0	25	1.17	0.981	A

Summary (6 attributable + 2 synthesized): 6 A+ (RMSE 0.20–0.78 pp), 2 A (RMSE 1.12–1.17 pp). Mean RMSE across all 8: 0.64 pp. Numbers reflect the v18.5 bounded-physics recalibration (July 2026): both knee cells improved (Kirk_FastCharge 1.01 → 0.78, NatComms_HPD 1.23 → 1.12) and several smooth cells loosened slightly. That is the deliberate trade for physically bounded behavior beyond each cell's validated cycle range: collapse-class extrapolation bugs are fixed and fenced. The two synthesized references (LG_M50T proxy, SiGr_5pct_45C_1C composite) reproduce literature-typical aging shapes for chemistries where the original cycler data was not publicly available at the time of build. They are useful for regression testing but should not be cited as independent validation against those specific cells. Re-sourcing real data for both is on the post-sprint backlog.

Threshold convention disclosure

Our grade scheme uses the following thresholds, defined in model/battery_model_v18.py (ValidationMetrics.grade — the single grading source for the API, reports, and this page):

A+: RMSE < 1.0 percentage points
A: 1.0 ≤ RMSE < 2.0 pp
B: 2.0 ≤ RMSE < 3.5 pp
C: 3.5 ≤ RMSE < 5.0 pp
D: RMSE ≥ 5.0 pp

Stricter conventions in some national lab reviews use A+ < 0.5 pp / A < 1.0 pp. By those thresholds, our calibrated count would be 4 A+ (HPQ_GEN3 0.20, Kirk_Moderate 0.35, NatComms_LPD 0.43, SiGr_5pct 0.45) plus 2 A (LG_M50T 0.65, Kirk_FastCharge 0.78) plus 2 B (NatComms_HPD 1.12, Dose_2023 1.17). Under either convention, all eight datasets have an RMSE below 1.5 pp against the published curves.
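The grade counts under both conventions can be re-derived from the RMSE column of the table. The sketch below is a standalone illustration, not the code of ValidationMetrics.grade; the function and constant names are hypothetical. One assumption is needed to make it run: only the A+ and A cut-offs of the stricter convention are stated, so the B bound of 2.0 pp used here is a placeholder.

```python
from collections import Counter

# Calibrated-fit RMSE in percentage points, copied from the table.
RMSE_PP = {
    "LG_M50T": 0.65,
    "NatComms_20Si_LPD": 0.43,
    "NatComms_20Si_HPD": 1.12,
    "HPQ_GEN3_18650": 0.20,
    "SiGr_5pct_45C_1C": 0.45,
    "Kirk_2024_Moderate": 0.35,
    "Kirk_2024_FastCharge": 0.78,
    "Dose_2023_1C": 1.17,
}

# (exclusive upper bound in pp, grade) pairs, checked in order.
PAGE_BOUNDS = [(1.0, "A+"), (2.0, "A"), (3.5, "B"), (5.0, "C")]
# ASSUMPTION: the 2.0 pp bound for B is not stated in the text.
STRICT_BOUNDS = [(0.5, "A+"), (1.0, "A"), (2.0, "B")]


def grade(rmse_pp, bounds, fallback="D"):
    """Return the first grade whose upper bound exceeds rmse_pp."""
    for upper, label in bounds:
        if rmse_pp < upper:
            return label
    return fallback


page_counts = Counter(grade(v, PAGE_BOUNDS) for v in RMSE_PP.values())
strict_counts = Counter(
    grade(v, STRICT_BOUNDS, fallback="below B") for v in RMSE_PP.values()
)
mean_rmse = sum(RMSE_PP.values()) / len(RMSE_PP)

print(dict(page_counts), dict(strict_counts), round(mean_rmse, 2))
```

The same table yields the 0.64 pp mean quoted in the summary.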

Held-out forward-prediction performance

Held-out protocol: calibrate the model using only data up to the 90% retention cutoff, then run forward prediction to the 70% retention cutoff (typical end-of-warranty checkpoint). Compute RMSE on the held-out portion only.
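A minimal sketch of the scoring half of this protocol, assuming retention is given as a list of percentages ordered by cycle. The helper names, the example numbers, and the boundary convention (the first point strictly below a cutoff starts the next segment) are all assumptions for illustration; in practice the forecast would come from a model calibrated only on the points before the held-out window.

```python
import math


def first_below(values, threshold, start=0):
    """Index of the first value strictly below threshold, else len(values)."""
    for i in range(start, len(values)):
        if values[i] < threshold:
            return i
    return len(values)


def held_out_slice(retention_pct, calib_cutoff=90.0, eval_cutoff=70.0):
    """Slice covering points after the calibration cutoff, down to eval cutoff."""
    start = first_below(retention_pct, calib_cutoff)
    stop = first_below(retention_pct, eval_cutoff, start)
    return slice(start, stop)


def rmse_pp(predicted, measured):
    """Root-mean-square error in percentage points."""
    sq = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(sq) / len(sq))


# Illustrative numbers only (not from any dataset on this page).
measured = [100, 97, 94, 91, 89, 85, 80, 74, 69, 60]
forecast = [100, 97, 94, 91, 90, 86, 82, 77, 73, 66]

window = held_out_slice(measured)  # points between the 90% and 70% cutoffs
held_out_rmse = rmse_pp(forecast[window], measured[window])
print(window, held_out_rmse)
```

Points before `window.start` are the calibration data; points at or after `window.stop` fall below the 70% cutoff and are not scored.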

Cell class	Calibrated-fit RMSE	Held-out forward RMSE	Status
Linear-degradation cells (5 datasets)	0.20–0.78 pp	~1–3 pp (mean)	Production-ready
LG_M50T-class (low-Si EV)	0.65 pp	4.98 pp (Track 2 expanded fit)	Sprint target hit
Knee-prone cells (HPD-graphite, fast-charge)	1.12–1.17 pp	~12 pp (mean)	Active research direction

The architecture sprint of May 4–10, 2026 narrowed the gap on LG_M50T-class cells from 14.10 pp to 4.98 pp by expanding the fit-parameter set to engage v18.5's cathode LAM, electrolyte depletion, and R-overpotential coupling mechanisms (Track 2). The remaining held-out gap on knee-prone cells is the target of the N≥30 cross-cell validation panel beginning 2026-05-11.

What this means for the cycle-life forecasting use case

Use the calibrated-fit numbers when evaluating whether the model can reproduce a curve you've already characterized — design-of-experiments sweeps, sensitivity analysis, mechanism decomposition on cells you have full data for.

Use the held-out forward-prediction numbers when evaluating whether the model can predict cycle life from limited early-cycle data — warranty modeling, fleet replacement-rate forecasting, BMS calibration from sparse field measurements. Linear-degradation cells are production-ready; knee-prone cells are active research.

The /api/calibrate response surfaces both numbers when you supply enough data to compute them. The dispatch_to field on the pre_screen endpoint flags cells that route to the identifiability-limited branch — those are the ones where the held-out gap matters.

Out-of-validated-envelope behavior

Calibrated envelope: 2.5–20% silicon, 0.5C–2C charge, 25–45°C ambient. Outside this envelope, every predict response includes a validation_warnings array with field-specific messages and an out_of_validated_range boolean.
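As a rough sketch of how such a check could be reproduced client-side: the envelope bounds below come from the text, while the input field names (`si_pct`, `c_rate`, `temp_c`), the function name, and the message wording are hypothetical and may not match the actual API.

```python
# Calibrated envelope from the text: (low, high), both inclusive.
ENVELOPE = {
    "si_pct": (2.5, 20.0),   # silicon content, percent (hypothetical key)
    "c_rate": (0.5, 2.0),    # charge rate, C (hypothetical key)
    "temp_c": (25.0, 45.0),  # ambient temperature, deg C (hypothetical key)
}


def check_envelope(inputs):
    """Build one warning per out-of-range field, plus an overall flag."""
    warnings = []
    for field, (low, high) in ENVELOPE.items():
        value = inputs[field]
        if not low <= value <= high:
            warnings.append(
                f"{field}={value} is outside the calibrated range {low} to {high}"
            )
    return {
        "validation_warnings": warnings,
        "out_of_validated_range": bool(warnings),
    }


inside = check_envelope({"si_pct": 10.0, "c_rate": 1.0, "temp_c": 25.0})
cold_fast = check_envelope({"si_pct": 10.0, "c_rate": 3.0, "temp_c": -10.0})
print(inside)
print(cold_fast)
```

Both regimes described in the known model gaps (sub-zero charging and 3C or faster) fall outside these bounds, so they would always carry the flag.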

Two known model gaps are called out in the customer-visible warning:

Cold-charge plating (T < 0°C): the model's plating mechanism does not activate at sub-zero charging. Real silicon-graphite cells plate severely in this regime. The validation warning correctly flags T < 25°C as outside the calibrated envelope; the underlying model behavior underestimates degradation here. Do not use for cold-climate cycle-life forecasting until this is resolved.
Extreme fast charge (≥3C): mechanical fatigue and transport collapse activate but the magnitude is not validated against published >3C cycling data. Treat predictions as directional, not certified.

Reproducibility

Every simulation run produces an AuditRecord with deterministic run_id (SHA-256 of config + protocol), config_hash, model_version, output_schema_version, timestamp, platform info, and invariant-check messages. Run the same config + protocol against the same model version at any point in the future and you get an identical retention curve.
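One way a deterministic run_id of this kind can be built is to hash a canonical serialization of the inputs. The sketch below assumes canonical JSON (sorted keys, fixed separators) as that serialization; the engine's actual byte layout is not documented here, so IDs produced by this code should not be expected to match the service's. The config and protocol keys are made up for the example.

```python
import hashlib
import json


def run_id(config, protocol):
    """SHA-256 hex digest of a canonical JSON encoding of config + protocol."""
    payload = json.dumps(
        {"config": config, "protocol": protocol},
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace variation
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical inputs; same content, different key order.
config_a = {"si_pct": 10.0, "k_sei": 0.2}
config_b = {"k_sei": 0.2, "si_pct": 10.0}
protocol = {"c_rate": 1.0, "temp_c": 25.0, "cycles": 1000}

id_a = run_id(config_a, protocol)
id_b = run_id(config_b, protocol)
id_changed = run_id({**config_a, "k_sei": 0.21}, protocol)
print(id_a == id_b, id_a != id_changed)
```

Canonicalizing before hashing is what makes the ID depend on content rather than on how the caller happened to order the fields.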

Version retention policy: model versions are pinnable via the ?model_version= query parameter and remain callable for a minimum of 24 months after a successor version ships. See /sla for full details.
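Pinning amounts to adding the query parameter to the request URL. A small sketch using only the standard library follows; the host `api.example.invalid` is a placeholder, and the exact version string the service accepts (shown here as `v18.5`) is an assumption.

```python
from urllib.parse import urlencode


def pinned_url(base_url, path, model_version):
    """Append ?model_version=... to an endpoint URL."""
    query = urlencode({"model_version": model_version})
    return f"{base_url.rstrip('/')}{path}?{query}"


# Placeholder host; the version string format is assumed.
url = pinned_url("https://api.example.invalid/", "/api/calibrate", "v18.5")
print(url)
```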

Open peer review welcome

We publish both calibrated-fit and held-out numbers because the difference matters and because we expect peer reviewers, national lab researchers, and warranty actuaries to compute their own. If you've reproduced these numbers and disagree, write us at jason@scaleprognostics.com with the data and we'll publish the disagreement here.