Tagewerk II
Independent formal-verification audit of the MCR whitepaper (Z3 + Lean)
An external formal-verification pass over the MCR whitepaper — its four theorems and the §13 conclusion, adjudicated by machine, not by judgement. Eight problems (P1–P8) were checked with Z3 4.16.0 and the Lean 4.31 kernel. Theorem 1 is the free theorem, holding even for a no-op stub (vacuous); the universality syllogism is invalid (Z3 countermodel); the order-1 error floor min(q, 1−q) > 0 is proven (Z3 UNSAT on its negation); the derivability claim collapses to 0 = 1 on the integers (Lean, 0 sorry); the entropy bound is ill-posed (a hapax exceeds log₂N by ~13 bits); the sample-complexity bound is true but weaker (O(N ln N) survives the union bound). The §13 AGI conclusion is left NOT-PROVEN — unsupported, not shown false — and one genuinely-true, exponentially-costly weaker statement (P8) is proven as the honest thing the work can build from. Nothing in Theorems 1–4 supports the AGI conclusion; every verdict is backed by a re-runnable artifact.
Verdicts — machine-adjudicated (8)
- VACUOUS P1 Theorem 1 — 'level invariance'
It is the free theorem: the learn/predict naturality square commutes for the real counter AND for a no-op stub that learns and predicts nothing — insensitive to the body.
- REFUTED P2 Corollary 1 — universality syllogism
Under the honest reading (Theorem 1 gives Representable ⇒ Runnable, not ε-Learnable), P1 & P2 ⊬ C; Z3 finds a countermodel (SAT). Only an equivocated P2 rescues the entailment.
- REFUTED P3 Order-1 error floor (flagship)
Counterexample proved: on the mode-X/Y stream the error floor at the ambiguous state a is min(q, 1−q) > 0 (unconditional per-symbol ≈ half that, still > 0); Z3 returns UNSAT on the negation, so the floor holds for every q ∈ (0,1).
- REFUTED (Lean) P4 Derivability claim
A map that is at once strictly increasing (count) and constant (assign) on a LinearOrder is a subsingleton; on ℤ it forces 0 = 1. Kernel-checked in Lean 4.31 + Mathlib, 0 errors, 0 sorry.
- ILL-POSED P5 Entropy bound E ≤ log₂ N
Surprisal E = −log₂ p(w) is bounded by log₂(corpus size), not log₂ N. A hapax in a 10⁶-token, N=100 vocabulary gives E ≈ 19.93 > log₂100 ≈ 6.64 — a ~13.3-bit violation.
- TRUE-BUT-WEAKER P6 Sample-complexity bound
The two-sided Hoeffding constant is ln(2/δ); with the missing union bound the corrected total is O(N ln N), which SURVIVES (ln(2N/δ) = ln(2/δ) + ln N). Weaker than stated, not false.
- NOT-PROVEN P7 §13 AGI conclusion
Deliberately downgraded from REFUTED: the AGI claim is unsupported by Theorems 1–4, not shown false. The honest verdict is that nothing in the formal core reaches it.
- PROVEN P8 The steelman — provable weaker statement
The one genuinely-true statement the work can build from — a valid but exponentially-costly weaker claim. Offered as the constructive path, not as support for §13.
Re-runnable artifacts
- mcr_p4_not_derivable.lean
- mcr_audit_artifacts.py