Active · Shadow Demo underway

A 10M model that
thinks like a billion.

The Adaptive Reasoning Architecture is an open research project building a small AI that reasons in iterative steps — updating its own weights as it thinks, rather than guessing in one shot.

10M
Parameters
40/60
Core / Plastic split
~775M
Effective capacity
~120MB
Memory footprint
≥75%
GSM8K target

Not a bigger model.
A smarter one.

Standard AI models read a problem, run one forward pass, and output an answer. ARA does something fundamentally different — it reasons in steps, checks its own work, and rewrites its own memory as it thinks.

🧠

Frozen Reasoning Core (40%)

Four million parameters permanently locked after Phase 1 training. This frozen core proposes reasoning operations, checks logical coherence, and drives backtracking when contradictions appear. It never changes during inference — it is the stable mathematical conscience of the system.

Plastic Workspace (60%)

Six million parameters that update in real time — during inference, while solving a problem. This workspace holds intermediate state, absorbs retrieved knowledge from external libraries, and builds a self-organising memory hierarchy: free scratch space, procedural skills, and consolidated intuition.

🔄

Iterative Inference

Instead of one forward pass, ARA runs up to 1,000 micro-iterations per problem. Each iteration proposes a step, checks it, accepts or backtracks, and updates the plastic weights. Harder problems naturally get more iterations — something a fixed-depth Transformer fundamentally cannot do.

📚

External Library Retrieval

Standard models use 50–70% of their capacity to store facts. ARA stores no factual knowledge in its weights; it retrieves what it needs on demand, so all 10M parameters go to reasoning. Knowledge gaps trigger structured queries that progressively deepen in specificity until the model can proceed.

The 10M parameter prototype

8 Transformer layers, d_model=128, d_ff=512. Every weight matrix is partitioned per neuron, not per layer: 204 frozen neurons and 308 plastic neurons coexist in every FFN, together making up its 512 hidden units.
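A minimal sketch of how a per-neuron split could be expressed as a binary mask over the 512 hidden units of one FFN. The function name `build_ffn_mask` is hypothetical, and placing the frozen neurons at the lowest indices is an assumption; the text gives only the counts.

```python
D_FF = 512            # FFN hidden width from the text
CORE_FRACTION = 0.40  # frozen share from the text


def build_ffn_mask(d_ff=D_FF, core_fraction=CORE_FRACTION):
    """Return one 0/1 flag per hidden neuron (0 = frozen core, 1 = plastic)."""
    n_core = int(d_ff * core_fraction)  # 204.8 rounds down to 204
    # Assumption: the core occupies the lowest indices; the source does not say.
    return [0] * n_core + [1] * (d_ff - n_core)


mask = build_ffn_mask()
n_plastic = sum(mask)
n_core = len(mask) - n_plastic
print(n_core, n_plastic)  # 204 308
```

Rounding the 40% share down is what yields 204 frozen and 308 plastic neurons out of 512.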

ARA 10M Model: 8 Transformer Layers
Input Embedding · 8000 × 128 · 1.02M params
Each of the 8 layers contains:
  • Multi-Head Self-Attention (h=4) · Q, K, V, O projections · 128² × 4
  • LayerNorm + Residual · mean=0, var=1
  • Feed-Forward Network · 128 → 512 → 128 · the 40/60 split lives here
      CORE · 204 neurons · FROZEN
      WORKSPACE · 308 neurons · PLASTIC
● Reasoning Core — 4M params (40%)

Permanently frozen after Phase 1

  • Proposes reasoning operations at each step
  • Runs symbolic coherence checker
  • Triggers backtracking on contradictions
  • Detects knowledge gaps, formulates queries
  • Drives the iterative reasoning loop
  • Encodes 25+ mathematical primitives
● Plastic Workspace — 6M params (60%)

Updates in real time at every micro-iteration

  • Holds all intermediate problem state
  • Absorbs library knowledge via mini fine-tuning
  • Accumulates adaptive resistance (EWC-based)
  • Self-organises into 3 memory tiers over time
  • Initialised as a random reservoir (Phase 1)
  • ~120MB total including optimizer state

Every problem is solved
in iterative steps.

ARA does not output an answer from a single forward pass. It runs an algorithm — one micro-iteration at a time — until the problem is solved or the budget is exhausted.

01
📥

Parse Problem

The input problem is converted to a symbolic state s₀ — a set of variable bindings and assertions. The backtrack stack is initialised. Plastic weights reset to baseline.

02
🎯

Propose Operation

The frozen reasoning core selects the most likely next operation from 25+ mathematical primitives given the current state. No weights change here — this is pure forward inference through the core.

03

Coherence Check

The proposed new state s_t' is evaluated for logical consistency against all prior assertions. If any contradiction exists, the step is rejected. This is a symbolic checker — fast and deterministic.

04
🔁

Update or Backtrack

Coherent step: accept s_t, update plastic weights via the coherence loss, increment resistance on active neurons. Contradiction: roll back, reduce resistance, pop the stack, reset the learning rate.

05
📖

Query Library (if needed)

When progress stalls for 10 consecutive iterations, the core detects a knowledge gap and queries the external library. The retrieved content is absorbed into the plastic workspace via 3–5 gradient steps.

06
🏁

Terminate or Continue

If h(s_t, s_goal) = 0, the solution is complete. If the iteration budget T_max is exhausted, the best partial solution is returned with its coherence score. The entire loop runs in ~120MB of memory.
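The six steps above can be sketched as a toy search loop. This is only an illustration under heavy assumptions: states are plain integers, the "coherence check" is a stand-in rule (a candidate may not overshoot the goal), and plastic weight updates and library queries are left out entirely. The names `solve`, `h`, and `ops` are hypothetical.

```python
def solve(start, goal, ops, t_max=1000):
    """Toy propose / check / accept-or-backtrack loop over integer states.

    Returns (solved, state): the goal state if reached, else the closest
    state seen before the budget or the search space ran out.
    """
    def h(state):                        # distance to goal; 0 means solved
        return abs(goal - state)

    if h(start) == 0:
        return True, start
    stack = [(start, 0)]                 # frames: (state, next untried op index)
    best = start
    for _ in range(t_max):               # iteration budget T_max
        if not stack:                    # every branch rejected or exhausted
            break
        state, next_op = stack[-1]
        if next_op >= len(ops):          # nothing left to try here: backtrack
            stack.pop()
            continue
        stack[-1] = (state, next_op + 1)
        candidate = ops[next_op](state)  # step 02: propose
        if candidate > goal:             # step 03: toy coherence rule
            continue                     # reject; the frame stays for the next op
        stack.append((candidate, 0))     # step 04: accept
        if h(candidate) == 0:            # step 06: terminate
            return True, candidate
        if h(candidate) < h(best):
            best = candidate
    return False, best                   # best partial solution


ops = [lambda s: s * 2, lambda s: s + 3]
print(solve(1, 11, ops))  # (True, 11)
```

In the run shown, the loop doubles 1 up to 8, rejects 16 as overshooting, then falls back to the second operation and reaches 11.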

Your coordination hub
for building ARA.

The ARA Hub is where the team thinks, builds, and ships together. Here is everything inside and how to navigate it.

Getting started on the platform
When you open the platform, an onboarding screen asks for your name, your role, and an avatar colour. Once inside, all navigation lives in the sidebar on the left. The top bar has a quick New Message button for posting to the group chat. Badges on sidebar items show unread counts. On mobile, use the ☰ hamburger menu to open the sidebar.

Every claim is backed
by a derivation.

The ARA Mathematical Companion derives all architecture claims from first principles, starting from high-school algebra and working up through gradient descent and Elastic Weight Consolidation (EWC). Here are the four core equations that govern the system.

Masked Gradient Update
ΔW = −η · M_plastic ⊙ ∇_W L

Only the plastic neurons (60%) receive gradient updates. The ⊙ (Hadamard product) with the binary mask M_plastic zeroes every update to the frozen core, so the core is mathematically untouchable: its weights are not merely suppressed, they are never updated.
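A minimal element-wise sketch of the masked update, using plain Python lists in place of weight tensors. The function name `masked_update` and the example numbers are illustrative assumptions.

```python
def masked_update(weights, grads, mask, lr):
    """Apply W <- W - lr * (mask * grad) element-wise.

    mask holds 0 for frozen-core entries and 1 for plastic entries.
    """
    return [w - lr * m * g for w, g, m in zip(weights, grads, mask)]


weights = [0.5, -0.2, 0.1, 0.9]
grads = [1.0, 1.0, -2.0, 0.5]
mask = [0, 0, 1, 1]  # first two entries stand in for frozen-core weights
updated = masked_update(weights, grads, mask, lr=0.1)
```

Entries with a mask of 0 come back unchanged whatever their gradient is, while the masked-in entries take an ordinary gradient-descent step.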

Coherence Loss (inference-time)
L_coh = −μ₁·log p(sₜ|sₜ₋₁,θ) − μ₂·Prog(t)

Drives real-time plastic weight updates. Two components: consistency (how likely is this state under the model's own distribution?) and progress (how much closer is the state to clearing the remaining subgoals?). μ₁=0.6, μ₂=0.4.
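A small sketch of the coherence loss as written. It assumes Prog(t) is a score that rises as the state advances (for example a fraction in [0, 1]), so that the minus sign rewards progress; the source does not define Prog(t) precisely, and `coherence_loss` is a hypothetical name.

```python
import math

MU_1, MU_2 = 0.6, 0.4  # weights quoted in the text


def coherence_loss(p_state, progress):
    """L_coh = -mu1 * log p(s_t | s_{t-1}, theta) - mu2 * Prog(t).

    p_state:  model probability of the new state, in (0, 1].
    progress: assumed progress score; larger means closer to the goal.
    """
    if not 0.0 < p_state <= 1.0:
        raise ValueError("p_state must be a probability in (0, 1]")
    return -MU_1 * math.log(p_state) - MU_2 * progress


likely_step = coherence_loss(p_state=0.9, progress=0.5)
unlikely_step = coherence_loss(p_state=0.1, progress=0.5)
```

With progress held equal, the less probable state incurs the larger loss, which is the consistency pressure the first term is meant to supply.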

Adaptive Resistance (EWC)
L_inf = L_coh + (λ_R/2)·Σ Rᵢ·(θᵢ−θᵢ₀)²

Fisher-weighted springs resist overwriting neurons already proven useful. Resistance Rᵢ grows with each successful use (+0.02) and shrinks on failure (−0.05). Over time, this self-organises the workspace into three memory tiers.
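A hedged sketch of the resistance term and its bookkeeping. The value of λ_R, the clamp at zero, and the function names are assumptions; the text specifies only the quadratic form and the +0.02 / −0.05 increments.

```python
LAMBDA_R = 1.0  # assumed; the text does not give a value for lambda_R


def resistance_penalty(theta, theta0, resistance, lambda_r=LAMBDA_R):
    """(lambda_R / 2) * sum_i R_i * (theta_i - theta_i0)^2."""
    return 0.5 * lambda_r * sum(
        r * (t - t0) ** 2 for t, t0, r in zip(theta, theta0, resistance)
    )


def update_resistance(r, success, up=0.02, down=0.05):
    """Grow R_i after a successful use, shrink it after a failure.

    Clamping at zero is an assumption so the penalty stays non-negative.
    """
    return r + up if success else max(0.0, r - down)


penalty = resistance_penalty(
    theta=[1.0, 2.0], theta0=[0.0, 2.0], resistance=[0.5, 3.0]
)
print(penalty)  # 0.25
```

The second weight has high resistance but has not moved from its baseline, so it contributes nothing; only drift away from θᵢ₀ is penalised.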

Scaling Identity
N_eff = N · T^β · (1−f_facts)⁻¹

At T=1000 iterations, β=0.5, f_facts=0.6: a 10M model reaches ~775M parameter-equivalents. This is the theoretical ceiling the Shadow Demo is built to test empirically.
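For reference, the identity can be evaluated directly. With the stated values the plain formula gives roughly 790M, in the same range as the ~775M quoted; the source does not say which rounding or additional factor accounts for the difference. The function name `effective_params` is hypothetical.

```python
def effective_params(n, t, beta, f_facts):
    """N_eff = N * T^beta * (1 - f_facts)^-1."""
    return n * t ** beta / (1.0 - f_facts)


n_eff = effective_params(n=10e6, t=1000, beta=0.5, f_facts=0.6)
print(f"{n_eff / 1e6:.1f}M parameter-equivalents")
```

The iteration term contributes a factor of about 31.6 (the square root of 1000) and the freed fact-storage capacity contributes a factor of 2.5.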

Be part of building
something new.

The Shadow Demo is at the empirical threshold. The mathematics are coherent. The architecture is specified. Now it needs engineers, researchers, and builders to make it real. If you are drawn to small models with big ideas — come help build the ARA.

No sign-up required to explore · Open research project