Corrector Sampling in Language Models

Itai Gat, Neta Shaul, Uriel Singer, Yaron Lipman

Advances in Neural Information Processing Systems 38 (NeurIPS 2025) Main Conference Track

Autoregressive language models accumulate errors due to their fixed, irrevocable left-to-right token generation. To address this, we propose a new sampling method called Resample-Previous-Tokens (RPT). RPT mitigates error accumulation by iteratively revisiting and potentially replacing tokens in a window of previously generated text. Fine-tuning a pretrained 8B parameter model with RPT for only 100B tokens resulted in ~10% relative improvements on reasoning and coding benchmarks compared to standard sampling.
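
To make the sampling scheme concrete, here is a minimal Python sketch of an RPT-style loop: after each ordinary autoregressive step, a window of previously generated tokens is revisited and a resampled proposal replaces the current token if it scores better in full context. The `ToyModel` stub and the accept/replace heuristic are illustrative assumptions, not the paper's exact parameterization or fine-tuning objective.

```python
# Sketch of a Resample-Previous-Tokens (RPT)-style sampling loop.
# ToyModel and the acceptance rule are hypothetical stand-ins.
import random

class ToyModel:
    """Stand-in for a language model with a tiny vocabulary."""
    vocab = list(range(10))

    def sample_next(self, context):
        # A real model would sample from its next-token distribution.
        return random.choice(self.vocab)

    def token_prob(self, ids, pos):
        # Placeholder score; a real model would return the token's
        # probability given both left and right context.
        return random.random()

def rpt_sample(model, prompt_ids, max_new_tokens, window=4):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        ids.append(model.sample_next(ids))           # standard AR step
        # Revisit a window of previously generated tokens (not the prompt).
        start = max(len(prompt_ids), len(ids) - window)
        for pos in range(start, len(ids) - 1):
            proposal = model.sample_next(ids[:pos])  # resample token at pos
            new_ids = ids[:pos] + [proposal] + ids[pos + 1:]
            # Keep the proposal only if it scores higher in full context
            # (a simple replacement heuristic, assumed for illustration).
            if model.token_prob(new_ids, pos) > model.token_prob(ids, pos):
                ids[pos] = proposal
    return ids

print(rpt_sample(ToyModel(), prompt_ids=[0, 1], max_new_tokens=8))
```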