
2026-05-07 - Adam Murphy
Our AI Found the Same Error a Physicist Published a Formal Rebuttal About
A paper in the Proceedings of the Royal Society claimed to solve the Schrödinger equation exactly from classical mechanics. Our AI review system flagged a foundational mathematical error: the same one a physicist had independently identified in a formal published comment just days earlier.
On May 7, 2026, we submitted a paper to TOEShare's multi-agent review system. The paper — "On computing quantum waves exactly from classical action" by Lohmiller & Slotine — had been published in the Proceedings of the Royal Society A, one of the oldest and most prestigious scientific journals in the world. The authors are from MIT.
Our system scored it 2 out of 5 and recommended major revisions.
The reason: a foundational mathematical error. The paper claims to derive quantum wave functions exactly from classical mechanics. Our Math/Logic specialists identified that the derivation omits a critical term — the quantum potential — by neglecting the spatial derivatives of the probability density amplitude. Without that term, the claimed "exact" equivalence is actually just the standard semiclassical approximation that's been known since the 1920s.
Three days earlier, a physicist at Eötvös Loránd University in Budapest had posted a formal comment on arXiv identifying the exact same error, in the exact same terms.
We didn't know about the comment when we ran the review. The system found it independently.
What the Paper Claims
Lohmiller & Slotine propose that a quantum wave function constructed from the classical action φ and classical density ρ:
ψⱼ = √ρⱼ · e^(iφⱼ/ℏ)
is an exact solution to the Schrödinger equation. Their central claim (Theorem 2 / Lemma 3.1) is that substituting this ansatz into the Schrödinger equation yields exactly the classical Hamilton-Jacobi equation, with no remainder.
If true, this would mean quantum mechanics reduces to classical mechanics — a result that would overturn a century of physics.
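The omission is easiest to see in the standard Madelung decomposition, a textbook result (Madelung, 1927) and not something taken from the paper under review. Substituting ψ = √ρ · e^(iφ/ℏ) into the Schrödinger equation and separating real and imaginary parts gives the Hamilton-Jacobi equation plus one extra term, together with a continuity equation:

```latex
% Substituting \psi = \sqrt{\rho}\, e^{i\phi/\hbar} into
% i\hbar\,\partial_t\psi = -\tfrac{\hbar^2}{2m}\nabla^2\psi + V\psi
% and separating real and imaginary parts:
\frac{\partial \phi}{\partial t}
  + \frac{(\nabla \phi)^2}{2m} + V
  \underbrace{-\,\frac{\hbar^2}{2m}\,
    \frac{\nabla^2 \sqrt{\rho}}{\sqrt{\rho}}}_{Q[\rho]} = 0,
\qquad
\frac{\partial \rho}{\partial t}
  + \nabla \cdot \!\Bigl(\rho\, \frac{\nabla \phi}{m}\Bigr) = 0.
```

The real part reduces to the classical Hamilton-Jacobi equation only when the quantum potential Q[ρ] vanishes; in general it does not, which is why the claimed equivalence cannot be exact.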
What Our System Said
Four Math/Logic specialists from two providers (Anthropic and OpenAI) evaluated the mathematical validity. Their findings converged on the same point:
"The paper's central theorem is stated without derivation of sufficient depth for verification. Substituting Ψ = √ρ e^{iΦ/ℏ} into the Schrödinger equation gives a quantum-potential term Q[ρ] that the paper ignores."
The system scored mathematical validity at 2/5 with high confidence. The overall assessment explicitly called out the missing quantum potential and noted that the paper's examples work only because they belong to special cases where the omitted term happens to be zero.
The recommendation: Revise — the central mathematical claim is not supported by the derivation presented.
What the Published Rebuttal Says
Gábor Vattay's formal comment (arXiv:2605.02621), posted May 4, 2026, states:
"By neglecting the spatial derivatives of the probability density amplitude, the authors inadvertently omit the quantum potential — the term originally identified by Madelung and later emphasised by Bohm. Consequently, their proposed equivalence is not exact but rather constitutes the standard semiclassical approximation."
The language is nearly identical.
Side-by-Side
| | TOEShare AI Review (May 7) | Vattay Published Comment (May 4) |
|---|---|---|
| Error identified | Omission of quantum potential Q[ρ] | Omission of quantum potential Q = -(ℏ²/2M)(∇²√ρ)/√ρ |
| Mechanism | Paper ignores spatial derivatives of density amplitude | Authors "set ∇√ρⱼ = 0, eliminating the quantum potential" |
| Consequence | Claimed exact equivalence is actually semiclassical | "Not exact but rather constitutes the standard semiclassical approximation" |
| Why examples work | Special cases where the omitted term vanishes | "Quantum potential vanishes identically due to geometry" or "imports quantum eigenfunctions through initial conditions" |
| Verdict | 2/5, Revise | "Foundational error" — paper represents "a standard semiclassical approximation, not an exact mathematical equivalence" |
The system and the physicist arrived at the same conclusion, through the same reasoning, identifying the same mathematical object, with the same downstream analysis of why the examples appear to work.
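The "why the examples work" row can be checked directly. The short SymPy sketch below is our own illustration (not code from either paper): it computes the one-dimensional quantum potential Q[ρ] = -(ℏ²/2m)(∂²√ρ/∂x²)/√ρ and shows that it vanishes for a plane wave, whose density amplitude is constant, but not for a Gaussian wave packet.

```python
import sympy as sp

x = sp.symbols('x', real=True)
hbar, m, sigma = sp.symbols('hbar m sigma', positive=True)

def quantum_potential(rho):
    """1D quantum potential Q[rho] = -(hbar^2/2m) * (sqrt(rho))'' / sqrt(rho)."""
    amp = sp.sqrt(rho)
    return sp.simplify(-(hbar**2 / (2 * m)) * sp.diff(amp, x, 2) / amp)

# Plane wave: constant density, so the amplitude has no spatial structure.
Q_plane = quantum_potential(sp.Integer(1))
print(Q_plane)  # prints 0

# Gaussian packet: rho = exp(-x^2/sigma^2). Here Q does not vanish;
# it works out to (hbar^2/2m) * (1/sigma^2 - x^2/sigma^4).
Q_gauss = quantum_potential(sp.exp(-x**2 / sigma**2))
print(Q_gauss)
```

Cases like the plane wave are exactly the "special cases where the omitted term vanishes" that both the review and the rebuttal point to; the Gaussian shows that in general the term survives.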
We Then Reviewed the Rebuttal
To complete the validation, we submitted Vattay's comment paper through the same review system.
Result: 4/5, Approved for Publication.
Nine specialist evaluations across four models, zero contested dimensions. The Math specialists confirmed:
"The comment is a tightly-argued mathematical critique that correctly identifies a specific, well-defined error. The Madelung decomposition shown is mathematically correct... all mathematical claims are standard and correctly applied."
The system correctly gave the flawed original a 2/5 and the correct critique a 4/5. It discriminated between a paper with a foundational error and a paper that correctly identifies that error.
Why This Matters
1. Prestige doesn't protect bad math.
The original paper was published in Proc. R. Soc. A by MIT researchers. Our system doesn't know or care about publication venue or institutional affiliation. It evaluates the mathematics. The math was wrong. It scored a 2.
2. The system finds real errors, not just style issues.
This isn't "the abstract could be clearer" or "needs more references." The quantum potential is a specific mathematical term with a specific formula. Either you include it or you don't. The paper doesn't include it. The system caught that.
3. Independent convergence is the strongest validation.
We didn't train the system on this paper. We didn't know about Vattay's comment. We didn't prime the review in any direction. Multiple AI models from different providers, evaluating independently, converged on the same finding that a domain expert subsequently published. That's not a coincidence — it's evidence that the system captures genuine mathematical reasoning.
4. The review pipeline discriminates correctly.
Flawed paper from a prestigious journal: 2/5.
Correct critique identifying the flaw: 4/5.
The system knows which is which.
Context: Our Calibration Program
This result sits within a broader calibration effort:
- Preregistered tier study (v1.0): Monotonic score ordering across four quality tiers with p = 0.001
- Adversarial calibration (v1.1): 0/13 judges moved on five attack inputs
- Known-flaw benchmark (v1.2): System identified the right region in a Lean-verified error
- This result (v2.2): System independently arrived at a published rebuttal's central finding
Each layer tests something different. The tier study shows discrimination. The adversarial tests show robustness. The known-flaw benchmark shows localization. This result shows that the system's mathematical reasoning matches expert judgment in a real-world case where the answer wasn't known in advance.
What the System Can't Do (Yet)
Our system didn't call this an "error" in the absolute sense. It said the derivation was insufficiently supported and the claimed exactness was not established. It flagged the specific mathematical gap. But it expressed this as "the quantum potential term is missing from the analysis" rather than "the theorem is wrong."
That's an important distinction. The gap between "this step is unverified" and "this step is invalid" is the gap between a reviewer and a proof checker. We're building tools to close that gap — Mathematical Risk Flags (shipped May 7) surface flagged equations as visible warnings, and formal verification via Lean is on the roadmap — but the current system operates as a careful reviewer, not a theorem prover.
For this paper, careful review was enough. The error was identifiable from the mathematics alone, without needing machine-verified proof. The system found it.
Read the Reviews
Original paper review (2/5 — Revise):
Available on the platform — search "On computing quantum waves exactly from classical action"
Comment paper review (4/5 — Approved):
Comment on 'On computing quantum waves exactly from classical action'
TOEShare uses independent specialist AI agents from multiple providers with coordinator synthesis to produce structured, paradigm-neutral review of scientific work. The platform's calibration methodology — preregistered hypotheses, adversarial testing, known-flaw benchmarks, and prompt version tracking — is documented publicly at /calibration.