NOW LET US – AI RAG SaaS Studio TP.HCM
NOW LET US
Digital Product Studio
Back to news
AGENTIC-SYSTEMS...1 min read

How Uncertainty Estimation Scales with Sampling in Reasoning Models

Share
NOW LET US Article – How Uncertainty Estimation Scales with Sampling in Reasoning Models

The study explores how reasoning models estimate uncertainty using verbalized confidence and self-consistency, finding that a hybrid approach significantly outperforms individual methods even with minimal sampling.

Computer Science > Artificial Intelligence

Title: How Uncertainty Estimation Scales with Sampling in Reasoning Models

Uncertainty estimation is critical for deploying reasoning language models, yet remains poorly understood under extended chain-of-thought reasoning. We study parallel sampling as a fully black-box approach using verbalized confidence and self-consistency. Across three reasoning models and 17 tasks spanning mathematics, STEM, and humanities, we characterize how these signals scale.

Both self-consistency and verbalized confidence scale in reasoning models, but self-consistency exhibits lower initial discrimination and lags behind verbalized confidence under moderate sampling. Most uncertainty gains, however, arise from signal combination: with just two samples, a hybrid estimator improves AUROC by up to +12 on average and already outperforms either signal alone even when scaled to much larger budgets, after which returns diminish. These effects are domain-dependent: in mathematics, the native domain of RLVR-style post-training, reasoning models achieve higher uncertainty quality and exhibit both stronger complementarity and faster scaling than in STEM or humanities.

© 2026 Now Let Us. All rights reserved.

Source: arXiv cs.AI Recent

Advertisement
Ad slot ready: 5887729102

More in this category

EXPLORE TOPICS

Discover All Categories

Deep dive into the specific technology sectors that matter most to you.