Paper Review Profile

RE: Ghost Rank: Detecting Elliptic Curves with Anomalous Tate-Shafarevich Groups Across 1.9 Million Cremona Curves

Published by Adam Murphy | Created 3/3/2026 | 1 review

3.7/5

Composite

A large-scale empirical study of the Birch and Swinnerton-Dyer (BSD) conjecture using 1.9 million elliptic curves from Cremona's database, revealing anomalous Tate-Shafarevich groups and establishing a power-law distribution with exponent α ≈ 2. The work provides numerical verification of BSD identities for rank-0 curves and identifies systematic patterns in Sha group sizes across unprecedented dataset scales.


Internal Consistency

4/5

The rebuttal is largely self-consistent in how it reframes claims: it retracts any implication that BSD is “proven,” narrows Kolyvagin’s role to finiteness results, and correctly reclassifies §4.1 as a pipeline consistency check given that Cremona’s |Ш| values are BSD-derived. The remaining tension is the strong statement “Within our dataset, there is no ambiguity” while also limiting the explicit Manin-constant verification to optimal curves with conductor ≤ 500,000 and treating non-optimal curves via “absorbed into Cremona’s tabulated values.” That is not a formal contradiction, but it is an overreach unless the dataset is explicitly restricted or the normalization pathway is made uniform and explicit.
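One way to make the "no ambiguity" claim checkable is to count how many curves fall outside the regime where c_E = 1 is stated to be verified. The sketch below is only illustrative: the `CurveRecord` fields, the regime names, and the sample labels are hypothetical and do not come from the manuscript or from Cremona's file formats; only the conductor bound of 500,000 is taken from the rebuttal.

```python
from collections import Counter
from dataclasses import dataclass

# Conductor bound quoted in the rebuttal for the verified c_E = 1 range.
VERIFIED_CONDUCTOR_BOUND = 500_000


@dataclass(frozen=True)
class CurveRecord:
    """Hypothetical per-curve record; field names are illustrative only."""
    label: str
    conductor: int
    is_optimal: bool


def normalization_regime(curve: CurveRecord) -> str:
    """Classify a curve by how its c_E = 1 assumption would be justified."""
    if not curve.is_optimal:
        # Rebuttal: handled by "absorption into Cremona's tabulated values".
        return "non_optimal"
    if curve.conductor <= VERIFIED_CONDUCTOR_BOUND:
        return "verified"
    return "optimal_above_bound"


def audit(curves):
    """Count curves per normalization regime."""
    return Counter(normalization_regime(c) for c in curves)


# Made-up records, purely to exercise the function.
sample = [
    CurveRecord("curve_A", 11, True),
    CurveRecord("curve_B", 11, False),
    CurveRecord("curve_C", 600_001, True),
]

for regime, count in sorted(audit(sample).items()):
    print(f"{regime}: {count}")
```

If every curve in the 1.9M-curve dataset landed in the "verified" bucket, the strong statement would follow directly; any nonzero count elsewhere marks the subset for which a separate justification is still needed.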

Mathematical Validity

2/5

A key mathematical vulnerability remains around normalization: removing c_E^2 from the BSD working formula and declaring “all subsequent equations inherit c_E = 1” is only valid if every curve’s invariants are computed in a manner consistent with c_E=1 (e.g., always using optimal curves within a verified range) or if the inputs are all already Cremona-normalized in a way that makes c_E irrelevant in practice. The rebuttal provides only a partial justification (verified c_E=1 for optimal curves up to a conductor bound; non-optimal curves handled by absorption into tabulated values), which is not sufficient for a 1.9M-curve study without a clear, uniform accounting. Additionally, the rebuttal references key equations ((1), (2), (6)) but does not present them, preventing audit of factors (period conventions across components, Tamagawa numbers, torsion, etc.) and making it difficult to assess whether any multiplicative-factor mistakes could contaminate derived statistics.
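For orientation, the standard rank-0 form of the BSD leading-term identity and the usual definition of the Manin constant are written out below. This is the textbook statement, not a reconstruction of the manuscript's Equations (1), (2), or (6), which the rebuttal does not show; the notation (Ω_E as the integral of the Néron differential over E(R), c_p for Tamagawa numbers, φ for the modular parametrization, f for the associated newform) is an assumption about how the paper labels these quantities.

```latex
% Rank-0 BSD identity (the regulator equals 1 when the rank is 0)
\frac{L(E,1)}{\Omega_E}
  = \frac{|\text{Ш}(E/\mathbb{Q})| \, \prod_{p} c_p}{|E(\mathbb{Q})_{\mathrm{tors}}|^{2}},
\qquad
\Omega_E = \int_{E(\mathbb{R})} |\omega_E|

% Manin constant c_E of a modular parametrization \varphi : X_0(N) \to E
\varphi^{*}\omega_E = c_E \cdot 2\pi i \, f(z)\, dz
```

Note that the Tamagawa numbers c_p and the Manin constant c_E are unrelated quantities despite the similar notation. Because Ω_E is defined through ω_E, a period obtained from the newform side differs from Ω_E by a factor depending on c_E, which is why a blanket substitution c_E = 1 matters for every curve where it has not been checked.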

Falsifiability

4/5

As an empirical paper, the central claims are testable: the reported distributional pattern for |Ш| (including the fitted exponent and sensitivity to cutoffs) and the pipeline-consistency reproduction of Cremona/LMFDB quantities can be independently replicated on the same database with the authors’ code and parameter settings. The work becomes less falsifiable only where it leans on quantities computed via BSD itself (e.g., using tabulated |Ш| to “verify” BSD), but the rebuttal now explicitly frames that as non-independent.

Clarity

4/5

The rebuttal is clear about what was changed (language around “proven” vs “verified,” and the intended interpretation of §4.1), and it states the key methodological limitation (only 38 unique |Ш| values for power-law fitting). Clarity is still limited by the absence of the actual revised manuscript text and by not showing the referenced equations and exact normalization conventions in-line, which are central to interpreting the results.

Novelty

4/5

The scale (≈1.9 million Cremona curves) and the focus on empirical patterns in anomalous Tate–Shafarevich group sizes across such a large corpus are meaningfully novel as a data-driven contribution, especially if accompanied by reproducible code and careful normalization. The conceptual framing (connections to Delaunay-type heuristics and tail behavior near α≈2) is not wholly new, but the dataset-scale synthesis and empirical characterization plausibly are.

Completeness

4/5

The rebuttal addresses each reviewer concern directly, includes concrete action items (language revisions, added paragraph on Manin constant), and acknowledges methodological limitations rather than over-claiming. What remains incomplete for review is (i) the actual revised text and (ii) a rigorous, dataset-wide normalization/implementation description sufficient to rule out multiplicative-factor errors (especially concerning c_E and representative choice within isogeny classes).

The rebuttal makes genuinely constructive improvements in scientific positioning. In particular, it corrects the strongest scope issue by removing “proven/theorem” language about BSD exactness and restricting Kolyvagin’s citation to finiteness results, which aligns well with established theory. It also appropriately acknowledges the circularity inherent in using Cremona-tabulated |Ш| values computed via BSD, reframing §4.1 as a pipeline consistency check rather than an independent validation.

Where the submission still appears fragile is the mathematical bookkeeping around normalization, especially the Manin constant. The move to delete c_E^2 from the working BSD equation and then declare that subsequent equations “inherit c_E=1” is a high-impact choice: if even a subset of the 1.9M curves are outside the explicitly verified regime (optimal, conductor ≤ 500k) or if different representatives/models are mixed, the resulting |Ш| or derived statistics can be off by rational factors. The rebuttal gestures at “absorption into Cremona’s tabulated values” for non-optimal curves, but that does not yet read like a dataset-wide proof of consistent normalization; it reads like a plausible practice that still needs to be specified precisely.

On the empirical side, the tail-fit discussion is reasonably cautious: the authors acknowledge that Clauset–Shalizi–Newman-style testing would be preferable and that n=38 unique |Ш| values limits statistical leverage. That restraint is appropriate. However, the headline claim of a power-law with α≈2 will be received as stronger than “notable observation” unless the manuscript very clearly separates (a) descriptive fit on binned/unique values from (b) inferential claims about an underlying distribution.

Overall, this looks like a potentially valuable large-scale empirical study, but it is not yet in a state where a reader can be confident that the key computed quantities are uniformly normalized across the entire dataset. Fixing that would substantially raise confidence in every downstream statistic and would likely upgrade both mathematical validity and the credibility of the empirical findings.
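The Clauset–Shalizi–Newman-style estimate mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the authors' method (the rebuttal reports an OLS fit): it uses the continuous-data maximum-likelihood estimator, whereas |Ш| values are discrete, and it runs on synthetic data generated inside the block. The function names, the sample size, and the seeds are assumptions made for the example.

```python
import math
import random


def powerlaw_mle(values, x_min):
    """Continuous power-law MLE for p(x) ~ x^(-alpha), x >= x_min.

    Returns (alpha_hat, standard_error, n_tail).
    """
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if len(tail) < 2 or log_sum == 0.0:
        raise ValueError("need at least two distinct tail values")
    n = len(tail)
    alpha = 1.0 + n / log_sum
    return alpha, (alpha - 1.0) / math.sqrt(n), n


def bootstrap_alpha(values, x_min, n_resamples=200, seed=0):
    """Percentile bootstrap interval (about 2.5% to 97.5%) for alpha_hat."""
    rng = random.Random(seed)
    tail = [x for x in values if x >= x_min]
    estimates = sorted(
        powerlaw_mle(rng.choices(tail, k=len(tail)), x_min)[0]
        for _ in range(n_resamples)
    )
    lo = estimates[int(0.025 * n_resamples)]
    hi = estimates[min(n_resamples - 1, int(0.975 * n_resamples))]
    return lo, hi


def sample_power_law(alpha, x_min, n, seed=1):
    """Draw synthetic power-law samples by inverse-transform sampling."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


# Synthetic data only; these are not values from the paper.
synthetic = sample_power_law(alpha=2.0, x_min=1.0, n=2000)
alpha_hat, std_err, n_tail = powerlaw_mle(synthetic, x_min=1.0)
print(f"alpha_hat = {alpha_hat:.3f} +/- {std_err:.3f} (n = {n_tail})")
print("bootstrap interval:", bootstrap_alpha(synthetic, x_min=1.0))
```

With only 38 unique values the bootstrap interval would be far wider than in this synthetic run, which is one concrete way to show readers how much statistical leverage the tail fit actually has.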

Strengths

+Corrected overstatement about BSD being “proven”; claims are now framed as numerical verification/consistency rather than theorem-level results
+Explicit recognition that reproducing BSD identities using BSD-derived tabulated |Ш| is not an independent test, and sensible framing of §4.1 as a pipeline integrity check
+Large-scale empirical scope (1.9M curves) with potentially novel aggregate patterns in |Ш| behavior
+Appropriately cautious language around power-law fitting given limited unique |Ш| support

Areas for Improvement

-Provide a dataset-wide, unambiguous normalization/implementation specification: exactly which curve representative is used (optimal vs non-optimal; within isogeny class), which quantities come from Cremona vs recomputed, and how factors (c_E, Ω_E conventions, Tamagawa numbers, torsion) are handled consistently
-Do not assert “no ambiguity” unless the paper proves the needed conditions hold for the full dataset; otherwise qualify the claim (e.g., ‘under Cremona’s conventions’ or ‘for the subset with …’)
-Include the explicit forms of the referenced equations (1), (2), (6) and define all symbols so readers can audit multiplicative factors
-Strengthen the distributional analysis presentation: clearly separate descriptive fits from inferential statements; consider adding robustness checks beyond R² (even if CSN09 MLE/KS is underpowered, you can still report alternative fits, bootstrap stability, or Bayesian model comparisons with clear caveats)
-Discuss computational error propagation and numerical precision choices (L(E,1) evaluation, period computation, rounding/thresholding for |Ш|) and how these could affect tail counts
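The rounding and thresholding step in the last item can be made explicit with a small sketch. It solves the rank-0 BSD identity for |Ш| under the assumption c_E = 1 and accepts the result only if it is close to a positive perfect square (the order of Ш, when finite, is a square). The function names, the tolerance, and the input numbers are hypothetical; they do not come from the manuscript or from any real curve.

```python
import math


def analytic_sha(l_value, real_period, tamagawa_product, torsion_order):
    """Solve the rank-0 BSD identity for |Sha|, assuming c_E = 1."""
    return l_value * torsion_order ** 2 / (real_period * tamagawa_product)


def round_to_square(sha_float, tol=1e-6):
    """Round to the nearest integer and check plausibility.

    Returns (nearest, ok), where ok requires that the float is within
    tol of an integer and that this integer is a positive perfect square.
    """
    nearest = round(sha_float)
    root = math.isqrt(nearest) if nearest >= 0 else 0
    ok = (abs(sha_float - nearest) <= tol
          and nearest > 0
          and root * root == nearest)
    return nearest, ok


# Made-up inputs chosen so the arithmetic is easy to follow:
# 0.8 * 5**2 / (1.0 * 5) = 4.0
value, ok = round_to_square(analytic_sha(0.8, 1.0, 5, 5))
print(value, ok)
```

Reporting how many curves fail the `ok` check at several tolerances would show directly whether precision choices could move curves in or out of the tail counts.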

Author Rebuttal — Ghost Rank v3.2.1

Paper: Ghost Rank: Detecting Elliptic Curves with Anomalous Tate-Shafarevich Groups Across 1.9 Million Cremona Curves
Version under review: 3.2 → Revised version: 3.2.1
Date: 2026-02-28 PST
Responding to: Mathematical Validity score (3/5)

Reviewer Concern 1: "Overstates the proven status of BSD exactness"

Reviewer: "Citing only Kolyvagin for unconditional exactness of the full leading-term formula is insufficient."

Response: We agree this was a valid criticism of v3.2. In v3.2.1, we have:

Removed all uses of "proven" and "theorem" when referring to BSD for rank-0 curves.
Replaced the heading "Scope of the theorem" with "Scope of verification."
Added the explicit statement: "Our claim is not that BSD is proven — it is that (1) holds as a verified numerical identity for all curves in our dataset."
Changed every downstream reference (§4.4 caveat, §7.5 limitations) to use "verified" rather than "proven."

We cite Kolyvagin specifically and only for what he proved: finiteness of E(Q) and Sha for rank-0 curves. We do not claim he proved the full leading-term formula. The numerical verification is attributed to Cremona's high-precision computations.

Action taken: Language corrected throughout. Four separate passages revised.

Reviewer Concern 2: "Normalization ambiguities (real period, Manin constant)"

Reviewer: "Potential normalization ambiguities could affect α_BSD by rational factors."

Response: The paper addresses this in three places:

§2.1 defines Ω_E+ explicitly following Cremona's convention (integral over all real components of E(R), not just the identity component).
§2.1 states that the Manin constant c_E = 1 has been verified for all optimal curves with conductor ≤ 500,000, and that for non-optimal curves, period adjustments are absorbed into Cremona's tabulated values.
v3.2.1 addition: We removed c_E² from Equation (1) and added a dedicated "On the Manin constant" paragraph explaining that c_E = 1 is substituted into the working formula, with an explicit note that "All subsequent equations (2), (6) inherit c_E = 1 from this convention." A parenthetical at Equation (6) cross-references §2.1.

We acknowledge that for curves outside Cremona's verified range, these conventions would need re-examination. Within our dataset, there is no ambiguity.

Action taken: Manin constant bookkeeping made explicit in v3.2.1.

Reviewer Concern 3: "§4.1 BSD validation is not logically independent"

Reviewer: "The database |Ш| values are computed via BSD."

Response: We agree, and the paper already says so. Section 4.1 is titled "Numerical Consistency Check" (not "validation" or "test"), and includes the explicit statement:

"This is not an independent test of BSD — Cremona's tabulated |Ш| values are themselves computed via the BSD formula. Rather, it checks that our PARI/GP pipeline and data ingestion reproduce the same L(E,1) values that Cremona used."

The section exists to verify pipeline integrity (i.e., that we loaded and computed the right numbers), not to make a theoretical claim. Removing it would leave the reader unable to assess whether our data pipeline is trustworthy.

Action taken: None needed — the paper already frames this correctly. We invite the reviewer to re-read §4.1's opening paragraph.

Reviewer Concern 4: "Power-law fitting lacks formal goodness-of-fit testing beyond R²"

Reviewer: "Power-law fitting methodology lacks formal goodness-of-fit testing beyond R²."

Response: Partially fair. The paper reports:

OLS exponent with standard error (α = 2.02 ± 0.07)
R² = 0.955
BIC comparison favoring power law over quadratic (ΔBIC = 2.1)
Cutoff sensitivity analysis showing exponent variation from 2.02 to 1.77

We acknowledge that the gold standard for power-law identification (Clauset, Shalizi & Newman 2009, MLE + KS distance) would be more rigorous. However, with n = 38 unique |Sha| values, we are at the practical lower bound for such methods. The paper states in §7.5: "38 unique values provide limited statistical leverage."

We view the tail law as an empirical observation consistent with Delaunay's theoretical heuristics, not as a strong distributional claim. The exponent α ≈ 2 is notable primarily because it sits at the convergence/divergence threshold for E[|Sha|], connecting our observation to known theory.
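The threshold remark can be checked with a short calculation. The following is a sketch under two assumptions not stated in the rebuttal: that α denotes the exponent of the density (rather than of the cumulative tail), and that a continuous approximation with lower cutoff x_min is acceptable.

```latex
% Density p(x) = C x^{-\alpha} on x \ge x_{\min}, with \alpha > 1
C = (\alpha - 1)\, x_{\min}^{\alpha - 1}

\mathbb{E}[X] = C \int_{x_{\min}}^{\infty} x^{1-\alpha}\, dx
  = \begin{cases}
      \dfrac{\alpha - 1}{\alpha - 2}\, x_{\min}, & \alpha > 2, \\[1ex]
      \infty, & 1 < \alpha \le 2.
    \end{cases}
```

At α = 2 the integrand is 1/x, so the mean diverges logarithmically; under these assumptions, that boundary case is the one the fitted exponent sits near.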

Action taken: None. We believe the current presentation is appropriately cautious for the available data.

Summary

Concern	Status in v3.2.1
BSD "proven" overstatement	Fixed: four passages revised
Manin constant bookkeeping	Fixed: c_E² removed from Eq. (1), convention stated explicitly
§4.1 circularity	Already addressed in v3.2: section titled and framed as consistency check
Power-law methodology	Acknowledged limitation: insufficient data for MLE methods

We respectfully suggest that concerns 1 and 2 have been substantively addressed in v3.2.1, concern 3 was already addressed in v3.2, and concern 4 represents a genuine but bounded limitation of the available data rather than a methodological error.

Math: gpt-5.2-2025-12-11

Internal 4/5, Mathematical 3/5

The rebuttal improves logical discipline by retracting any suggestion that BSD is proven and by clearly acknowledging the circularity inherent in using database-computed |Ш| values; these changes enhance internal coherence. The framing of §4.1 as a consistency check rather than a test is mathematically and logically sound. The main remaining mathematical vulnerability is normalization—especially the Manin constant. Declaring c_E=1 and deleting c_E^2 from the working BSD identity is only valid under specific conditions (optimal parametrization, verified conductor range, or consistently normalized tabulated quantities). Given the scale of the dataset, the rebuttal as presented does not supply a rigorous justification that those conditions hold uniformly. Consequently, while the rebuttal is largely internally consistent, the mathematical validity of the computations that depend on the exact BSD normalization remains partially underdetermined and potentially fragile without additional explicit constraints or proofs about the dataset’s parametrizations.

+ Correctly distinguishes (and explicitly states) that reproducing BSD identities using tabulated |Ш| computed via BSD is not an independent test, only a pipeline consistency check.
+ Appropriately limits what is claimed from Kolyvagin’s results (finiteness results rather than the full BSD leading-term formula), avoiding an invalid implication.
+ Acknowledges statistical limitations (only 38 unique |Ш| values) and does not present the power-law exponent as a mathematically established theorem.

- Manin constant handling: globally substituting c_E=1 (and removing c_E^2 from Eq. (1)) is not justified by “verified for all optimal curves with conductor ≤ 500,000” if the dataset includes non-optimal curves or conductors above that; this can introduce rational-factor errors in any BSD-based computations.
- Normalization claim “Within our dataset, there is no ambiguity” is stronger than the stated premises; either the dataset must be explicitly restricted to the verified regime or the paper must show that all used invariants are already normalized consistently (e.g., via Cremona’s optimal-curve data).
- Key equations (Eq. (1), (2), (6)) are referenced but not shown here; without explicit formulas and variable definitions, the mathematical correctness (including periods over components, Tamagawa factors, torsion, regulators) cannot be verified from this rebuttal.
- If the study mixes “non-optimal curves” with “period adjustments absorbed into Cremona’s tabulated values,” it must be mathematically explicit about which curve in an isogeny class is used for each invariant; otherwise the BSD product formula can mismatch across representatives.

Sources: claude-sonnet-4-20250514

Completeness 4/5

This rebuttal demonstrates strong completeness in addressing reviewer feedback systematically. The authors show good scientific practice by explicitly correcting overstated theoretical claims (BSD 'proven' → 'verified'), clarifying normalization conventions, and acknowledging methodological limitations. The response provides specific version tracking and identifies exactly what changes were made. While some limitations remain (particularly around statistical methodology for power-law fitting), the authors appropriately acknowledge these constraints rather than overclaiming. The work appears to maintain appropriate scope boundaries and realistic claims about what can be concluded from numerical verification across 1.9 million curves.

+ Systematic addressing of all reviewer concerns with specific version tracking and action items
+ Clear acknowledgment of limitations including statistical power constraints and dataset boundaries
+ Explicit correction of overstated claims about BSD 'proof' status to 'verification'

- Cannot verify actual implementation of claimed textual changes without access to revised manuscript
- Limited discussion of how computational errors might propagate through the 1.9M curve analysis
- Power-law methodology limitations acknowledged but not fully addressed with alternative approaches


theoryofeverything.ai/review-profile/paper/82edb350-3026-474e-9754-1367ae7d3279

This review was conducted by TOE-Share's multi-agent AI specialist pipeline. Each dimension is independently evaluated by specialist agents (Math/Logic, Sources/Evidence, Science/Novelty), then synthesized by a coordinator agent. This methodology is aligned with the multi-model AI feedback approach validated in Thakkar et al., Nature Machine Intelligence 2026.

TOE-Share — theoryofeverything.ai