Behavior Injection: Preparing Language Models for Reinforcement Learning

Cen, Zhepeng; Yao, Yihang; Han, William; Liu, Zuxin; ZHAO, DING

Behavior Injection: Preparing Language Models for Reinforcement Learning

Zhepeng Cen, Yihang Yao, William Han, Zuxin Liu, DING ZHAO

Advances in Neural Information Processing Systems 38 (NeurIPS 2025) Main Conference Track

Abstract

Reinforcement learning (RL) has emerged as a powerful post-training technique to incentivize the reasoning ability of large language models (LLMs). However, LLMs can respond very inconsistently to RL finetuning: some show substantial performance gains, while others plateau or even degrade. To understand this divergence, we analyze the per-step influence of the RL objective and identify two key conditions for effective post-training: (1) RL-informative rollout accuracy, and (2) strong data co-influence, which quantifies how much the training data affects performance on other samples. Guided by these insights, we propose behavior injection, a task-agnostic data augmentation scheme applied prior to RL. Behavior injection enriches the supervised finetuning (SFT) data by seeding exploratory and exploitative behaviors, effectively making the model more RL-ready. We evaluate our method across two reasoning benchmarks with multiple base models. The results demonstrate that our theoretically motivated augmentation can significantly increase the performance gain from RL over the pre-RL model.

Abstract

Name Change Policy