MMXXVI Cairo · 30.04°N Transmission N° 01 LIVE

Intelligence is prediction. Everything else is engineering.

We're not scaling transformers.
We're replacing them.

Substrate Predictive Computation Theory
Status Implementation Phase
Scroll ↓ settle
I The Problem
Transformer
memorizes.
  • fixed forward pass
  • no runtime adaptation
  • scales with data, not thought
Brain
predicts.
  • hierarchical prediction
  • error-driven learning
  • adaptive computation

Transformers memorize. Brains predict.

The dominant architecture in AI — the transformer — is a powerful pattern matcher. But it doesn’t understand. It can’t adapt on the fly, allocate more thought to harder problems, or learn from a single example without retraining on billions of tokens.

Intelligence requires a fundamentally different computational theory — one rooted in how biological systems actually work: hierarchical prediction, error-driven learning, and adaptive computation.

II The Architecture
the name we gave it
ماهر
M  A  H  E  R
noun. one who is skilled; a master of a craft.

Predictive Computation Theory, implemented.

PCT is a computational framework where neural columns compete, specialize, and settle into stable representations through iterative prediction. Instead of a single forward pass, PCT layers think until they converge — naturally spending more compute on harder inputs.

Maher implements PCT as a language model. It demonstrates capabilities transformers cannot achieve without external scaffolding: adaptive computation, online learning, and test-time scaling.

SPECIFICATIONS · RESEARCH FINDINGS v0.3 / internal
  • 01 · Architecture A pure-PCT language model competitive with transformers at matched parameter scale, without inherited attention or feedforward blocks.
  • 02 · Computation Adaptive compute allocation emerges from the settling dynamics — a third of tokens converge in half the steps, with no auxiliary controller.
  • 03 · Convergence Settling reliably converges across depth and width, verified across multiple architectural variants.
  • 04 · Specialization Columns specialize without supervision — selectivity emerges from the competitive dynamics alone.
III Live Visualization

Watch it settle.

Below, 48 columns compete to predict the next token. Red = high prediction error. Each column iterates until its error stabilizes. Harder tokens take more steps; easier ones converge fast. This is adaptive compute, running.

maher.predict — layer 07 — 48 columns RUNNING
Token
Step 0
Error 0.00
Compute
History
IV Founder
Ibrahim Ahmed
Founder · Sole Researcher
18 yo 1st year undergrad Cairo Alone

Intelligence is a substrate problem. The right architecture, designed from first principles, will exhibit cognition the way physics exhibits gravity.

Eighteen years old. First-year university student. Building alone from Egypt.

No co-founders. No compute sponsors. No VC pitch deck. One researcher, one architecture, one thesis — carried far enough that the first predictions are coming in.

1researcher
0co-founders
conviction

If you’re building what comes next,
so are we.

Open to: researchers · compute · correspondence Not open to: scaling laws