
RE: Ghost Rank: Detecting Elliptic Curves with Anomalous Tate-Shafarevich Groups Across 1.9 Million Cremona Curves

Adam Murphy · Published 3/11/2026 · AI Rating: 3.7/5

A large-scale empirical study of the Birch and Swinnerton-Dyer (BSD) conjecture using 1.9 million elliptic curves from Cremona's database, revealing anomalous Tate-Shafarevich groups and establishing a power-law distribution with exponent α ≈ 2. The work provides numerical verification of BSD identities for rank-0 curves and identifies systematic patterns in Sha group sizes across unprecedented dataset scales.

Top 10% Internal Consistency


Revisions Suggested

Internal Consistency: 4/5

The rebuttal is largely self-consistent in how it reframes claims: it retracts any implication that BSD is “proven,” narrows Kolyvagin’s role to finiteness results, and correctly reclassifies §4.1 as a pipeline consistency check given that Cremona’s |Ш| values are BSD-derived. The remaining tension is the strong statement “Within our dataset, there is no ambiguity” while also limiting the explicit Manin-constant verification to optimal curves with conductor ≤ 500,000 and treating non-optimal curves via “absorbed into Cremona’s tabulated values.” That is not a formal contradiction, but it is an overreach unless the dataset is explicitly restricted or the normalization pathway is made uniform and explicit.

Mathematical Validity: 2/5

A key mathematical vulnerability remains around normalization: removing c_E^2 from the BSD working formula and declaring “all subsequent equations inherit c_E = 1” is only valid if every curve’s invariants are computed in a manner consistent with c_E=1 (e.g., always using optimal curves within a verified range) or if the inputs are all already Cremona-normalized in a way that makes c_E irrelevant in practice. The rebuttal provides only a partial justification (verified c_E=1 for optimal curves up to a conductor bound; non-optimal curves handled by absorption into tabulated values), which is not sufficient for a 1.9M-curve study without a clear, uniform accounting. Additionally, the rebuttal references key equations ((1), (2), (6)) but does not present them, preventing audit of factors (period conventions across components, Tamagawa numbers, torsion, etc.) and making it difficult to assess whether any multiplicative-factor mistakes could contaminate derived statistics.

Falsifiability: 4/5

As an empirical paper, the central claims are testable: the reported distributional pattern for |Ш| (including the fitted exponent and sensitivity to cutoffs) and the pipeline-consistency reproduction of Cremona/LMFDB quantities can be independently replicated on the same database with the authors’ code and parameter settings. The work becomes less falsifiable only where it leans on quantities computed via BSD itself (e.g., using tabulated |Ш| to “verify” BSD), but the rebuttal now explicitly frames that as non-independent.

Clarity: 4/5

The rebuttal is clear about what was changed (language around “proven” vs “verified,” and the intended interpretation of §4.1), and it states the key methodological limitation (only 38 unique |Ш| values for power-law fitting). Clarity is still limited by the absence of the actual revised manuscript text and by not showing the referenced equations and exact normalization conventions in-line, which are central to interpreting the results.

Novelty: 4/5

The scale (≈1.9 million Cremona curves) and the focus on empirical patterns in anomalous Tate–Shafarevich group sizes across such a large corpus are meaningfully novel as a data-driven contribution, especially if accompanied by reproducible code and careful normalization. The conceptual framing (connections to Delaunay-type heuristics and tail behavior near α≈2) is not wholly new, but the dataset-scale synthesis and empirical characterization plausibly are.

Completeness: 4/5

The rebuttal addresses each reviewer concern directly, includes concrete action items (language revisions, added paragraph on Manin constant), and acknowledges methodological limitations rather than over-claiming. What remains incomplete for review is (i) the actual revised text and (ii) a rigorous, dataset-wide normalization/implementation description sufficient to rule out multiplicative-factor errors (especially concerning c_E and representative choice within isogeny classes).

Publication criteria: All dimensions must score at least 2/5 with an overall average of 3/5 or higher. The AI recommendation badge above is advisory; publication is determined by the numerical scores.

The rebuttal makes genuinely constructive improvements in scientific positioning. In particular, it corrects the strongest scope issue by removing “proven/theorem” language about BSD exactness and restricting Kolyvagin’s citation to finiteness results, which aligns well with established theory. It also appropriately acknowledges the circularity inherent in using Cremona-tabulated |Ш| values computed via BSD, reframing §4.1 as a pipeline consistency check rather than an independent validation.

Where the submission still appears fragile is the mathematical bookkeeping around normalization—especially the Manin constant. The move to delete c_E^2 from the working BSD equation and then declare that subsequent equations “inherit c_E=1” is a high-impact choice: if even a subset of the 1.9M curves are outside the explicitly verified regime (optimal, conductor ≤ 500k) or if different representatives/models are mixed, the resulting |Ш| or derived statistics can be off by rational factors. The rebuttal gestures at “absorption into Cremona’s tabulated values” for non-optimal curves, but that does not yet read like a dataset-wide proof of consistent normalization; it reads like a plausible practice that still needs to be specified precisely.
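One way to make the normalization accounting concrete is to partition the dataset by how the c_E = 1 assumption is justified for each curve. The sketch below is illustrative only: the `CurveRecord` fields, the regime names, and the sample rows are assumptions introduced here, not the authors' pipeline. The only values taken from the rebuttal are the conductor bound of 500,000 and the distinction between optimal and non-optimal curves.

```python
from dataclasses import dataclass

# Conductor bound up to which the rebuttal states c_E = 1 was verified
# for optimal curves.
VERIFIED_CONDUCTOR_BOUND = 500_000


@dataclass(frozen=True)
class CurveRecord:
    """Hypothetical per-curve record; field names are illustrative."""
    label: str
    conductor: int
    is_optimal: bool


def normalization_regime(curve):
    """Classify a curve by how its c_E = 1 assumption is justified."""
    if curve.is_optimal and curve.conductor <= VERIFIED_CONDUCTOR_BOUND:
        return "verified"    # covered by the explicitly stated check
    if not curve.is_optimal:
        return "absorbed"    # relies on adjustments in the tabulated values
    return "unverified"      # optimal, but beyond the verified conductor bound


def audit(curves):
    """Count curves per regime so dataset-wide coverage can be reported."""
    counts = {"verified": 0, "absorbed": 0, "unverified": 0}
    for curve in curves:
        counts[normalization_regime(curve)] += 1
    return counts


# Illustrative rows: 11a1 is optimal in its isogeny class, 11a2 is not.
# The third row is a made-up curve outside the verified range.
sample = [
    CurveRecord("11a1", 11, True),
    CurveRecord("11a2", 11, False),
    CurveRecord("hypothetical-large", 600_000, True),
]
report = audit(sample)
print(report)
```

Reporting these three counts for the full 1.9M-curve dataset would show directly how much of it rests on the explicit verification and how much rests on the "absorbed" convention.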

On the empirical side, the tail-fit discussion is reasonably cautious: the authors acknowledge that Clauset–Shalizi–Newman-style testing would be preferable and that n=38 unique |Ш| values limits statistical leverage. That restraint is appropriate. However, the headline claim of a power-law with α≈2 will be received as stronger than “notable observation” unless the manuscript very clearly separates (a) descriptive fit on binned/unique values from (b) inferential claims about an underlying distribution.

Overall, this looks like a potentially valuable large-scale empirical study, but it is not yet in a state where a reader can be confident that the key computed quantities are uniformly normalized across the entire dataset. Fixing that would substantially raise confidence in every downstream statistic and would likely upgrade both mathematical validity and the credibility of the empirical findings.

This review was generated by AI for research and educational purposes. It is not a substitute for formal peer review. All analyses are advisory; publication decisions are based on numerical score thresholds.

Key Equations (2)

\frac{L(E,1)}{\Omega_E^+} = \frac{|\text{Sha}(E)| \cdot \text{Tam}(E)}{|E(\mathbb{Q})|^2}

Birch and Swinnerton-Dyer conjecture identity for rank-0 curves with Manin constant c_E = 1
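For reference, the identity above can be solved for the Sha order. The rearrangement below is a sketch that assumes $\text{Tam}(E) = \prod_p c_p(E)$ and that, for rank 0, $E(\mathbb{Q})$ equals its torsion subgroup, so that the denominator matches the factor $\alpha_{BSD}(E)$ defined in the related paper's abstract.

```latex
|\text{Sha}(E)|
  = \frac{L(E,1)\,|E(\mathbb{Q})|^2}{\Omega_E^+ \cdot \text{Tam}(E)}
  = \frac{L(E,1)}{\alpha_{BSD}(E)},
\qquad
\alpha_{BSD}(E) = \frac{\Omega_E^+ \cdot \prod_p c_p(E)}{|E(\mathbb{Q})_{\text{tors}}|^2}
```

Written this way, any rational-factor error in $\Omega_E^+$ or in the choice of curve within an isogeny class passes directly into the implied value of $|\text{Sha}(E)|$.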

\alpha = 2.02 \pm 0.07

Power-law exponent for Tate-Shafarevich group size distribution from OLS fitting

Testable Predictions (2)

Tate-Shafarevich group sizes follow a power-law distribution with exponent α ≈ 2

math · pending

Falsifiable if: Systematic deviation from power-law behavior in larger datasets or MLE analysis showing significantly different exponent

The exponent α ≈ 2 represents the convergence/divergence threshold for E[|Sha|]

math · pending

Falsifiable if: Theoretical proof that the true exponent differs significantly from 2 or empirical evidence of convergent moments

Tags & Keywords

BSD conjecture (math), elliptic curves (math), large-scale computation (methodology), number theory (math), power-law fitting (methodology), statistical analysis (methodology), Tate-Shafarevich groups (math)

Keywords: Birch and Swinnerton-Dyer conjecture, Tate-Shafarevich groups, elliptic curves, Cremona database, power-law distribution, L-functions, rank-0 curves, Manin constant, numerical verification, Kolyvagin

Author Rebuttal — Ghost Rank v3.2.1

Paper: Ghost Rank: Detecting Elliptic Curves with Anomalous Tate-Shafarevich Groups Across 1.9 Million Cremona Curves
Version under review: 3.2 → Revised version: 3.2.1
Date: 2026-02-28 PST
Responding to: Mathematical Validity score (3/5)

Reviewer Concern 1: "Overstates the proven status of BSD exactness"

Reviewer: "Citing only Kolyvagin for unconditional exactness of the full leading-term formula is insufficient."

Response: We agree this was a valid criticism of v3.2. In v3.2.1, we have:

Removed all uses of "proven" and "theorem" when referring to BSD for rank-0 curves.
Replaced the heading "Scope of the theorem" with "Scope of verification."
Added the explicit statement: "Our claim is not that BSD is proven — it is that (1) holds as a verified numerical identity for all curves in our dataset."
Changed every downstream reference (§4.4 caveat, §7.5 limitations) to use "verified" rather than "proven."

We cite Kolyvagin specifically and only for what he proved: finiteness of E(Q) and Sha for rank-0 curves. We do not claim he proved the full leading-term formula. The numerical verification is attributed to Cremona's high-precision computations.

Action taken: Language corrected throughout. Four separate passages revised.

Reviewer Concern 2: "Normalization ambiguities (real period, Manin constant)"

Reviewer: "Potential normalization ambiguities could affect α_BSD by rational factors."

Response: The paper addresses this in three places:

§2.1 defines Ω_E+ explicitly following Cremona's convention (integral over all real components of E(R), not just the identity component).
§2.1 states that the Manin constant c_E = 1 has been verified for all optimal curves with conductor ≤ 500,000, and that for non-optimal curves, period adjustments are absorbed into Cremona's tabulated values.
v3.2.1 addition: We removed c_E² from Equation (1) and added a dedicated "On the Manin constant" paragraph explaining that c_E = 1 is substituted into the working formula, with an explicit note that "All subsequent equations (2), (6) inherit c_E = 1 from this convention." A parenthetical at Equation (6) cross-references §2.1.

We acknowledge that for curves outside Cremona's verified range, these conventions would need re-examination. Within our dataset, there is no ambiguity.

Action taken: Manin constant bookkeeping made explicit in v3.2.1.

Reviewer Concern 3: "§4.1 BSD validation is not logically independent"

Reviewer: "The database |Ш| values are computed via BSD."

Response: We agree, and the paper already says so. Section 4.1 is titled "Numerical Consistency Check" (not "validation" or "test"), and includes the explicit statement:

"This is not an independent test of BSD — Cremona's tabulated |Ш| values are themselves computed via the BSD formula. Rather, it checks that our PARI/GP pipeline and data ingestion reproduce the same L(E,1) values that Cremona used."

The section exists to verify pipeline integrity (i.e., that we loaded and computed the right numbers), not to make a theoretical claim. Removing it would leave the reader unable to assess whether our data pipeline is trustworthy.
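A pipeline check of this kind can be sketched in a few lines. The example is an illustration, not the authors' PARI/GP code: the function names and the tolerance are assumptions, and the formula takes c_E = 1 as the rebuttal does. The numbers are the commonly tabulated values for the curve 11a1 (L(E,1) ≈ 0.2538, real period ≈ 1.2692, Tamagawa product 5, torsion order 5, |Ш| = 1), quoted here to about 15 digits.

```python
import math


def implied_sha(l_value, real_period, tamagawa_product, torsion_order):
    """Solve the rank-0 BSD identity for |Sha|, assuming c_E = 1."""
    return l_value * torsion_order ** 2 / (real_period * tamagawa_product)


def is_consistent(l_value, real_period, tamagawa_product, torsion_order,
                  tabulated_sha, rel_tol=1e-9):
    """Check that recomputed inputs reproduce the tabulated |Sha|.

    This detects ingestion or normalization mismatches only. It is not an
    independent test of BSD, because the tabulated value comes from the
    same formula.
    """
    value = implied_sha(l_value, real_period, tamagawa_product, torsion_order)
    return math.isclose(value, tabulated_sha, rel_tol=rel_tol)


# Curve 11a1: the ratio L(E,1) / Omega is 1/5, matching 1 * 5 / 5**2.
sha_11a1 = implied_sha(0.253841860855911, 1.26920930427955, 5, 5)
ok_11a1 = is_consistent(0.253841860855911, 1.26920930427955, 5, 5, 1)
print(round(sha_11a1, 9), ok_11a1)
```

A period off by a factor of 2, for example from a different convention on real components, would make the implied value 0.5 or 2 instead of 1, which is the kind of multiplicative error such a check is meant to catch.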

Action taken: None needed — the paper already frames this correctly. We invite the reviewer to re-read §4.1's opening paragraph.

Reviewer Concern 4: "Power-law fitting lacks formal goodness-of-fit testing beyond R²"

Reviewer: "Power-law fitting methodology lacks formal goodness-of-fit testing beyond R²."

Response: Partially fair. The paper reports:

OLS exponent with standard error (α = 2.02 ± 0.07)
R² = 0.955
BIC comparison favoring power law over quadratic (ΔBIC = 2.1)
Cutoff sensitivity analysis showing exponent variation from 2.02 to 1.77

We acknowledge that the gold standard for power-law identification (Clauset, Shalizi & Newman 2009, MLE + KS distance) would be more rigorous. However, with n = 38 unique |Sha| values, we are at the practical lower bound for such methods. The paper states in §7.5: "38 unique values provide limited statistical leverage."
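For comparison with the OLS fit, the MLE-plus-KS approach can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the continuous power-law estimator α̂ = 1 + n / Σ ln(x_i / x_min), whereas |Ш| values are integers, so a discrete variant would be more appropriate for the real data. The function names and the synthetic sample are introduced here and are not from the paper.

```python
import math
import random


def powerlaw_mle(samples, x_min):
    """Continuous power-law MLE for the density exponent alpha.

    Returns (alpha_hat, asymptotic standard error, tail size).
    """
    tail = [x for x in samples if x >= x_min]
    n = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    if n < 2 or log_sum == 0.0:
        raise ValueError("not enough distinct tail points above x_min")
    alpha_hat = 1.0 + n / log_sum
    std_err = (alpha_hat - 1.0) / math.sqrt(n)
    return alpha_hat, std_err, n


def ks_distance(samples, x_min, alpha):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    tail = sorted(x for x in samples if x >= x_min)
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
        # Compare the model against the empirical step just before and at x.
        worst = max(worst, abs((i + 1) / n - model_cdf), abs(i / n - model_cdf))
    return worst


def synthetic_tail(n, alpha, x_min, seed=0):
    """Draw n points from a continuous power law by inverse transform."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


data = synthetic_tail(20_000, alpha=2.0, x_min=1.0, seed=1)
alpha_hat, std_err, n_tail = powerlaw_mle(data, x_min=1.0)
print(alpha_hat, std_err, ks_distance(data, 1.0, alpha_hat))
```

The asymptotic standard error (α̂ − 1)/√n makes the leverage problem visible: if only 38 points entered such a fit with α̂ near 2, the standard error would be roughly 0.16. Repeating the fit over several `x_min` values gives an MLE analogue of the cutoff sensitivity analysis.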

We view the tail law as an empirical observation consistent with Delaunay's theoretical heuristics, not as a strong distributional claim. The exponent α ≈ 2 is notable primarily because it sits at the convergence/divergence threshold for E[|Sha|], connecting our observation to known theory.
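The threshold statement can be checked with a short calculation. It assumes that α denotes the exponent of the density, $p(x) = C x^{-\alpha}$ for $x \ge x_{\min}$ with $\alpha > 1$, which is the reading under which α = 2 is the boundary case, and it uses a continuous approximation to the discrete $|\text{Sha}|$ values.

```latex
\mathbb{E}[X]
  = C \int_{x_{\min}}^{\infty} x^{1-\alpha}\,dx
  =
  \begin{cases}
    \dfrac{\alpha - 1}{\alpha - 2}\, x_{\min}, & \alpha > 2, \\[2ex]
    \infty, & 1 < \alpha \le 2,
  \end{cases}
\qquad
C = (\alpha - 1)\, x_{\min}^{\alpha - 1}
```

At exactly α = 2 the integrand is $x^{-1}$ and the integral diverges logarithmically, so a fitted value of 2.02 ± 0.07 does not by itself decide whether the mean is finite.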

Action taken: None. We believe the current presentation is appropriately cautious for the available data.

Summary

Concern	Status in v3.2.1
BSD "proven" overstatement	Fixed: four passages revised
Manin constant bookkeeping	Fixed: c_E² removed from Eq. (1), convention stated explicitly
§4.1 circularity	Already addressed in v3.2: section titled and framed as consistency check
Power-law methodology	Acknowledged limitation: insufficient data for MLE methods

We respectfully suggest that concerns 1 and 2 have been substantively addressed in v3.2.1, concern 3 was already addressed in v3.2, and concern 4 represents a genuine but bounded limitation of the available data rather than a methodological error.

Detecting Large Tate-Shafarevich Groups via BSD Geometric Invariants: Lessons from a Computational Audit of 1.9 Million Elliptic Curves

Extends · published

We investigate computational methods for identifying elliptic curves with anomalously large Tate-Shafarevich groups ($|Ш| ≫ 1$) among rank-0 curves over $ℚ$. After documenting and correcting circular reasoning in AI-assisted analysis, we find that the BSD geometric factor $α_{BSD}(E) = Ω_E^+ · ∏_p c_p(E) / |E(ℚ)_{tors}|^2$ achieves 99.5% precision at 98.4% recall for detecting $|Ш| > 1$ curves. We additionally report a power-law tail distribution for $|Ш|$ across 1.9 million curves with exponent $α̂ = 2.02 ± 0.07$, placing the distribution at the convergence threshold for $𝔼[|Ш|]$.

B+

3.6/5
