Active · Shadow Demo underway

A 10M model that
thinks like a billion.

The Adaptive Reasoning Architecture is an open research project building a small AI that reasons in iterative steps — updating its own weights as it thinks, rather than guessing in one shot.

⬡


10M

Parameters

40/60

Core / Plastic split

~775M

Effective capacity

~120MB

Memory footprint

≥75%

GSM8K target

What is ARA

Not a bigger model.
A smarter one.

Standard AI models read a problem, run one forward pass, and output an answer. ARA does something fundamentally different — it reasons in steps, checks its own work, and rewrites its own memory as it thinks.

🧠

Frozen Reasoning Core (40%)

Four million parameters permanently locked after Phase 1 training. This frozen core proposes reasoning operations, checks logical coherence, and drives backtracking when contradictions appear. It never changes during inference — it is the stable mathematical conscience of the system.

⚡

Plastic Workspace (60%)

Six million parameters that update in real time — during inference, while solving a problem. This workspace holds intermediate state, absorbs retrieved knowledge from external libraries, and builds a self-organising memory hierarchy: free scratch space, procedural skills, and consolidated intuition.

🔄

Iterative Inference

Instead of one forward pass, ARA runs up to 1,000 micro-iterations per problem. Each iteration proposes a step, checks it, accepts or backtracks, and updates the plastic weights. Harder problems naturally get more iterations — something a fixed-depth Transformer fundamentally cannot do.

📚

External Library Retrieval

Standard models use 50–70% of their capacity to store facts. ARA stores nothing — it retrieves what it needs on demand. All 10M parameters go to reasoning. Knowledge gaps trigger structured queries that progressively deepen in specificity until the model can proceed.

Architecture

The 10M parameter prototype

8 Transformer layers, d_model=128, d_ff=512. Every weight matrix is partitioned per neuron, not per layer: 204 frozen neurons and 308 plastic neurons coexist in every FFN.
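As a rough illustration of the per-neuron partition, the sketch below splits the 512 FFN hidden units 40/60 and builds a binary plastic mask. The function names and the choice to place frozen neurons first are assumptions made for the example; the project may lay out the split differently.

```python
import math
from typing import List, Tuple

D_FF = 512             # FFN hidden width stated in the text
CORE_FRACTION = 0.40   # frozen share stated in the text


def split_neurons(d_ff: int, core_fraction: float) -> Tuple[int, int]:
    """Return (frozen, plastic) neuron counts; the frozen share is rounded down."""
    frozen = math.floor(d_ff * core_fraction)
    return frozen, d_ff - frozen


def plastic_mask(d_ff: int, core_fraction: float) -> List[int]:
    """Binary mask over hidden units: 0 = frozen core, 1 = plastic workspace.

    Assumption: frozen neurons occupy the first indices (hypothetical layout).
    """
    frozen, plastic = split_neurons(d_ff, core_fraction)
    return [0] * frozen + [1] * plastic


frozen, plastic = split_neurons(D_FF, CORE_FRACTION)
mask = plastic_mask(D_FF, CORE_FRACTION)
print(frozen, plastic, sum(mask))
```

Rounding the frozen share down gives 204 frozen and 308 plastic neurons, which together account for all 512 hidden units.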

ARA 10M Model — 8 Transformer Layers

Input Embedding · 8000 × 128 · 1.02M params

→

× 8 layers below

Multi-Head Self-Attention (h=4) Q, K, V, O projections · 128² × 4

LayerNorm + Residual mean=0, var=1

Feed-Forward Network · 128 → 512 → 128 THE 40/60 SPLIT LIVES HERE

CORE · 204 neurons · FROZEN

WORKSPACE · 308 neurons · PLASTIC

↑ repeated × 8 layers ↑

● Reasoning Core — 4M params (40%)

Permanently frozen after Phase 1

Proposes reasoning operations at each step
Runs symbolic coherence checker
Triggers backtracking on contradictions
Detects knowledge gaps, formulates queries
Drives the iterative reasoning loop
Encodes 25+ mathematical primitives

● Plastic Workspace — 6M params (60%)

Updates in real time at every micro-iteration

Holds all intermediate problem state
Absorbs library knowledge via mini fine-tuning
Accumulates adaptive resistance (EWC-based)
Self-organises into 3 memory tiers over time
Initialised as a random reservoir (Phase 1)
~120MB total including optimizer state

How it works

Every problem is solved
in iterative steps.

ARA does not output an answer from a single forward pass. It runs an algorithm — one micro-iteration at a time — until the problem is solved or the budget is exhausted.

📥

Parse Problem

The input problem is converted to a symbolic state s₀ — a set of variable bindings and assertions. The backtrack stack is initialised. Plastic weights reset to baseline.

🎯

Propose Operation

The frozen reasoning core selects the most likely next operation from 25+ mathematical primitives given the current state. No weights change here — this is pure forward inference through the core.

✅

Coherence Check

The proposed new state s_t' is evaluated for logical consistency against all prior assertions. If any contradiction exists, the step is rejected. This is a symbolic checker — fast and deterministic.

🔁

Update or Backtrack

Coherent step: accept s_t, update plastic weights via the coherence loss, increment resistance on active neurons. Contradiction: roll back, reduce resistance, pop the stack, reset the learning rate.

📖

Query Library (if needed)

When progress stalls for 10 consecutive iterations, the core detects a knowledge gap and queries the external library. The retrieved content is absorbed into the plastic workspace via 3–5 gradient steps.

🏁

Terminate or Continue

If the remaining distance to the goal, h(s_t, s_goal), reaches 0, the solution is complete. If the iteration budget T_max is exhausted first, the best partial solution is returned with its coherence score. The entire loop runs in ~120MB of memory.
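The six steps above can be sketched as a toy loop. This is a minimal illustration under heavy assumptions, not the project's implementation: states are plain integers, the "core" is a greedy chooser, the coherence check is a simple bounds-and-repeat test, and the library is a list of extra operations. The names solve, coherent, ops, and library are hypothetical, and plastic weight updates and resistance are not modelled here.

```python
from typing import Callable, Dict, List, Set, Tuple

Op = Tuple[str, Callable[[int], int]]


def coherent(state: int, seen: Set[int], upper: int) -> bool:
    """Stand-in symbolic check: stay within bounds, never repeat an active state."""
    return 0 <= state <= upper and state not in seen


def solve(start: int, goal: int, ops: List[Op], library: List[Op],
          t_max: int = 1000, stall_limit: int = 10) -> Tuple[bool, List[str]]:
    """Return (solved, path); on failure the path is the best partial one found."""
    ops, library = list(ops), list(library)              # local copies
    stack: List[Tuple[int, List[str]]] = [(start, [])]   # backtrack stack
    seen: Set[int] = {start}                             # states on the active path
    tried: Dict[int, Set[str]] = {}                      # ops already proposed per state
    best_h, best_path, stall = abs(goal - start), [], 0

    for _ in range(t_max):
        state, path = stack[-1]
        if state == goal:                                # h(s_t, s_goal) = 0
            return True, path
        if stall >= stall_limit and library:             # knowledge gap: query library
            ops.append(library.pop(0))
            stall = 0
        used = tried.setdefault(state, set())
        candidates = [op for op in ops if op[0] not in used]
        if not candidates:                               # dead end: backtrack
            if len(stack) > 1:
                stack.pop()
                seen.discard(state)
            stall += 1
            continue
        # Propose: pick the op whose result lands closest to the goal.
        name, fn = min(candidates, key=lambda op: abs(goal - op[1](state)))
        used.add(name)
        new_state = fn(state)
        if not coherent(new_state, seen, upper=goal):    # reject incoherent step
            stall += 1
            continue
        stack.append((new_state, path + [name]))         # accept the step
        seen.add(new_state)
        h = abs(goal - new_state)
        if h < best_h:
            best_h, best_path, stall = h, path + [name], 0
        else:
            stall += 1
    return stack[-1][0] == goal, best_path               # budget exhausted


add3: Op = ("+3", lambda x: x + 3)
double: Op = ("*2", lambda x: x * 2)

# Greedy proposals overshoot at 8, so the loop backtracks to 4 and tries "+3".
print(solve(1, 10, ops=[add3, double], library=[]))
# With only "*2" available the search stalls, then pulls "+3" from the library.
print(solve(1, 10, ops=[double], library=[add3]))
```

In both calls the loop ends with three "+3" steps (1, 4, 7, 10); the second call only gets there after the stall counter reaches the limit and the missing operation is retrieved.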

The Platform

Your coordination hub
for building ARA.

The ARA Hub is where the team thinks, builds, and ships together. Here is everything inside and how to navigate it.

⬡

Open ARA Cognitives Platform ara-cognitives.pages.dev

🏠

Welcome

Your orientation to the project: architecture overview, the mathematics, resistance tiers, and the Shadow Demo targets. Start here if you are new.

First stop

⬡

Dashboard

Live project status, active sprint, recent notices, and quick links. A snapshot of what the team is working on right now.

Overview

💬

Group Chat

A single shared channel for the entire team. Discuss architecture decisions, share results, coordinate sprints, and link pull requests in real time.

Live

📋

Notice Board

Post announcements, questions, and architectural decisions. Urgent notices appear on the Dashboard for team-wide visibility.

Updates

👥

Members

See who is on the team, their roles (ML Engineer, Researcher, Contributor), and what they are currently working on.

Team

🔗

Repository

Direct link to the GitHub repository. Code, issues, pull requests, and the full README with implementation details.

Code

📚

Project Notes

Structured documentation: architecture decisions, experiment logs, benchmark results, and mathematical derivations. The living knowledge base of the project.

Docs

🗄

Supabase DB

Backend database for persistent user profiles, chat history, notices, and experiment tracking. Linked directly from the Resources section of the sidebar.

Backend

Getting started on the platform

When you open the platform you will be greeted by an onboarding screen — enter your name, role, and pick an avatar colour. Once inside, the sidebar on the left contains all navigation. The top bar has a quick New Message button for posting to the group chat. Notice badges on sidebar items show unread counts. On mobile, use the ☰ hamburger menu to open the sidebar.

Mathematics

Every claim is backed
by a derivation.

The ARA Mathematical Companion derives all architecture claims from first principles — from high-school algebra through gradient descent and EWC. Here are the four core equations that govern the system.

Masked Gradient Update

ΔW = −η · M_plastic ⊙ ∇_W L

Only the 60% plastic neurons receive gradient updates. The Hadamard product (⊙) with the binary mask M_plastic zeroes the update for every frozen weight, so the core is not merely suppressed: it is never updated.
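A minimal sketch of the masked update on a flat list of weights, assuming plain gradient descent with learning rate η. The function name and the sample numbers are illustrative only.

```python
from typing import List


def masked_update(weights: List[float], grads: List[float],
                  mask: List[int], lr: float) -> List[float]:
    """Apply W <- W - lr * (mask * grad) elementwise; mask 0 leaves a weight untouched."""
    return [w - lr * m * g for w, g, m in zip(weights, grads, mask)]


weights = [0.5, -0.2, 0.1, 0.8, -0.4]
grads = [0.1, 0.3, -0.2, 0.5, -0.1]
mask = [0, 0, 1, 1, 1]          # first two weights belong to the frozen core

updated = masked_update(weights, grads, mask, lr=0.1)
print(updated)
```

The first two entries come back bit-for-bit identical to their inputs, while the three plastic entries move against their gradients.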

Coherence Loss (inference-time)

L_coh = −μ₁·log p(sₜ|sₜ₋₁,θ) − μ₂·Prog(t)

Drives real-time plastic weight updates. Two components: consistency (how likely is this state under the model's own distribution?) and progress (how much closer is the state to clearing the remaining subgoals?). The weights are μ₁=0.6 and μ₂=0.4.
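The loss can be evaluated directly once both terms are available. In the sketch below, Prog(t) is assumed to be the fraction of subgoals completed, a number in [0, 1]; the source does not define it precisely, so treat that choice and the function name as hypothetical.

```python
import math

MU_1 = 0.6   # consistency weight, from the text
MU_2 = 0.4   # progress weight, from the text


def coherence_loss(p_state: float, progress: float) -> float:
    """L_coh = -mu1 * log p(s_t | s_{t-1}) - mu2 * Prog(t).

    p_state:  model probability of the new state given the previous one.
    progress: assumed here to be the fraction of subgoals completed (0..1).
    """
    if not 0.0 < p_state <= 1.0:
        raise ValueError("p_state must be in (0, 1]")
    return -MU_1 * math.log(p_state) - MU_2 * progress


likely = coherence_loss(p_state=0.9, progress=0.5)
unlikely = coherence_loss(p_state=0.5, progress=0.5)
print(likely < unlikely)
```

A more probable state or more progress both lower the loss, which is the direction the plastic weights are pushed in.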

Adaptive Resistance (EWC)

L_inf = L_coh + (λ_R/2)·Σ Rᵢ·(θᵢ−θᵢ₀)²

Fisher-weighted springs resist overwriting neurons already proven useful. Resistance Rᵢ grows with each successful use (+0.02) and shrinks on failure (−0.05). Self-organises into three memory tiers.
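A small sketch of the resistance bookkeeping and the penalty term, written over plain lists. The +0.02 and −0.05 steps come from the text; clamping resistance at zero, the value of λ_R, and the function names are assumptions made for the example.

```python
from typing import List


def update_resistance(r: float, success: bool,
                      gain: float = 0.02, drop: float = 0.05) -> float:
    """Raise resistance after a successful use, lower it after a failure.

    Assumption: resistance is clamped at 0 so the penalty never turns negative.
    """
    return max(0.0, r + gain if success else r - drop)


def inference_loss(l_coh: float, theta: List[float], theta0: List[float],
                   resistance: List[float], lambda_r: float) -> float:
    """L_inf = L_coh + (lambda_R / 2) * sum_i R_i * (theta_i - theta_i0)^2."""
    penalty = sum(r * (t - t0) ** 2
                  for r, t, t0 in zip(resistance, theta, theta0))
    return l_coh + 0.5 * lambda_r * penalty


# Two weights with equal resistance; only the second has drifted from its anchor.
loss = inference_loss(1.0, theta=[1.0, 2.0], theta0=[1.0, 1.0],
                      resistance=[0.5, 0.5], lambda_r=2.0)
print(loss)
```

Because the penalty scales with Rᵢ, a weight that has been useful many times costs more to move than a fresh one displaced by the same amount.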

Scaling Identity

N_eff = N · T^β · (1−f_facts)⁻¹

At T=1000 iterations, β=0.5, f_facts=0.6: a 10M model reaches ~775M parameter-equivalents. This is the theoretical ceiling the Shadow Demo is built to test empirically.
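The identity is easy to evaluate numerically. The sketch below plugs in the stated values; the function name is illustrative. With exactly T=1000, β=0.5, and f_facts=0.6 the formula comes out near 790M, slightly above the ~775M quoted, so the quoted figure presumably reflects rounding or slightly different constants.

```python
def effective_params(n: float, t: float, beta: float, f_facts: float) -> float:
    """N_eff = N * T^beta * (1 - f_facts)^-1."""
    return n * t ** beta / (1.0 - f_facts)


n_eff = effective_params(n=10e6, t=1000, beta=0.5, f_facts=0.6)
print(f"{n_eff / 1e6:.1f}M parameter-equivalents")
```

The two multipliers contribute separately: T^β gives about 31.6× from iteration, and (1 − f_facts)⁻¹ gives 2.5× from not storing facts.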

A 10M model that thinks like a billion.

Not a bigger model. A smarter one.