
New research finds that ChatGPT-5.2 can generate original mathematical proofs, introducing “vibe-proving” as a new AI reasoning method. AI accelerates discovery, but human verification remains essential.
Researchers at VUB’s Data Analytics Lab report that commercial language models can produce original mathematical proofs. In their study, the team shows that OpenAI’s large language model ChatGPT-5.2 (Thinking) was able to solve a mathematical problem on its own.
The case focused on proving a 2024 conjecture proposed by mathematicians Ran and Teng. A conjecture is a statement believed to be true based on patterns or repeated results, but it has not yet been formally proven. Once a valid proof is established, the conjecture becomes a theorem.
According to the study, the final proof emerged from seven chat sessions with ChatGPT and four evolving versions of the argument. The model played a key role in exploring possible approaches, while human researchers ensured the reasoning was correct and logically complete.
ChatGPT’s Role in Mathematical Discovery
The researchers found that ChatGPT-5.2 (Thinking) developed much of the proof’s structure with limited human input. As they note, “With the Data Analytics Lab, we are one of the first to demonstrate that a commercially available LLM can independently develop original mathematical proofs.”
“I had long suspected that ChatGPT could help me prove unsolved mathematical problems,” says Brecht Verbeken, a postdoctoral researcher in VUB’s Data Analytics Lab. “And yet I was surprised at how efficiently that worked out.”
The team places this work within a broader approach they call vibe-proving, in which language models help organize and explore complex theoretical ideas. They also raise the question of whether this method could advance as quickly as AI-assisted programming, known as vibe-coding, which has already progressed from simple tools to near-autonomous code generation. “We often hear that the creativity of these systems is fundamentally limited to reformulations of their training data,” says VUB professor Vincent Ginis (Data Analytics Lab). “Glad we can dispel that misconception with our work as well.”
Human Verification and the Future of AI Research
Despite the model’s strong contribution, the researchers stress that human involvement remains essential for final verification and resolving any remaining gaps in the proof. The process also highlights where language models are most helpful and where challenges in validation still exist.
This work represents a significant step for AI in theoretical research. Beyond supporting coding or writing tasks, language models may now contribute to original mathematical discoveries when paired with careful human oversight. “Formulating candidate proofs can now be much faster, but the bottleneck then becomes human verification. That takes time. But language models will help us there too,” concludes VUB professor Andres Algaba (Data Analytics Lab).
Reference: “Early Evidence of Vibe-Proving with Consumer LLMs: A Case Study on Spectral Region Characterization with ChatGPT-5.2 (Thinking)” by Brecht Verbeken, Brando Vagenende, Marie-Anne Guerry, Andres Algaba and Vincent Ginis, 21 February 2026, arXiv. DOI: 10.48550/arXiv.2602.18918