Prompting as Scientific Inquiry

Ari Holtzman, Chenhao Tan

Advances in Neural Information Processing Systems 38 (NeurIPS 2025) Position Paper Track

Prompting is the primary method by which we study and control large language models. It is also one of the most powerful: nearly every major capability attributed to LLMs—few-shot learning, chain-of-thought reasoning, constitutional AI—was first unlocked through prompting. Yet prompting is rarely treated as science and is frequently dismissed as alchemy. We argue that this is a category error. If we treat LLMs as a new kind of organism—complex, opaque, and trained rather than programmed—then prompting is not a workaround. It is behavioral science. Where mechanistic interpretability peers into the neural substrate, prompting probes the model through its native interface: language. We argue that prompting is not inferior, but rather a key component in the science of LLMs.