Blog
Methodology notes, calibration updates, and product transparency.

2026-03-20 · Adam Murphy
Same quantum-centric supercomputing prompt, five frontier models, one blind multi-agent review panel — a 1.9-point score spread, a dispute that held, and QCAB going from 3.3 to published in four revision cycles.
Read more →
2026-03-18 · Adam Murphy
A Washington State University study found ChatGPT identifies false scientific claims correctly only 16.4% of the time. Here's why multi-agent architecture changes that equation.
Read more →
2026-03-17 · Adam Murphy
Every measurement instrument needs calibration. We're applying that same discipline to AI-powered peer review — running papers of known quality through the system and publishing the results.
Read more →
2026-03-16 · Adam Murphy
We're running a preregistered calibration study to demonstrate how our multi-agent AI review system performs across papers of known quality. Here's the methodology and what we're testing.
Read more →