From Noise to Diversity: Random Embedding Injection in LLM Reasoning
We isolate the simplest form of soft prompt, the Random Soft Prompt (RSP): training-free vectors resampled afresh from a Gaussian fitted to the embedding table. RSPs reach accuracy comparable to optimized soft prompts on math reasoning benchmarks, suggesting that the gain comes from the act of injection rather than from learned content. The mechanism unfolds in two stages. First, attention absorbs a never-before-seen random position, which flattens early-token distributions and branches reasoning trajectories. Second, later generation dilutes this influence, so the response commits. At inference, RSPs raise early-stage token diversity and improve Pass@N; the same effect carries over to DAPO training, where it yields practical gains.
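As a rough illustration of the injection step described above, the following minimal sketch fits a Gaussian to a toy embedding table and draws a fresh random soft prompt for each sample. Several details are assumptions rather than something the abstract specifies: the fit is per-dimension (diagonal), the vectors are prepended to the prompt embeddings, and the helper names `fit_diagonal_gaussian`, `sample_random_soft_prompt`, and `inject` are hypothetical. The toy table stands in for a real model's embedding matrix.

```python
import random
import statistics


def fit_diagonal_gaussian(embedding_table):
    """Per-dimension mean and std of the embedding rows (assumed diagonal fit)."""
    columns = list(zip(*embedding_table))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds


def sample_random_soft_prompt(means, stds, num_vectors, rng):
    """Draw num_vectors fresh vectors from the fitted Gaussian; nothing is trained."""
    return [
        [rng.gauss(mu, sigma) for mu, sigma in zip(means, stds)]
        for _ in range(num_vectors)
    ]


def inject(prompt_embeddings, soft_prompt):
    """Prepend the random vectors to the prompt embeddings (assumed placement)."""
    return soft_prompt + prompt_embeddings


# Toy stand-in for a model's embedding table: 1000 tokens, 8 dimensions.
table_rng = random.Random(0)
embedding_table = [
    [table_rng.gauss(0.0, 0.02) for _ in range(8)] for _ in range(1000)
]
means, stds = fit_diagonal_gaussian(embedding_table)

# Embeddings of a hypothetical 3-token prompt.
prompt_embeddings = [embedding_table[i] for i in (5, 17, 42)]

# Each sampled response gets its own freshly drawn soft prompt.
sample_rng = random.Random(1)
inputs_a = inject(
    prompt_embeddings, sample_random_soft_prompt(means, stds, 4, sample_rng)
)
inputs_b = inject(
    prompt_embeddings, sample_random_soft_prompt(means, stds, 4, sample_rng)
)
```

In this sketch, `inputs_a` and `inputs_b` share the same prompt embeddings but differ in their four leading random vectors, which is the property that would let N samples of the same question start from different early-token distributions.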