Samsung SAIT’s Montreal team has unveiled the Tiny Recursive Model (TRM), a compact, two-layer, 7-million-parameter model that is making waves in the AI community. TRM outperforms far larger language models on the ARC-AGI benchmark, reporting roughly 44.6-45% accuracy on ARC-AGI-1 and 7.8-8% on ARC-AGI-2. That puts it ahead of models such as DeepSeek-R1, o3-mini-high, and Gemini 2.5 Pro, all of which have orders of magnitude more parameters.

**What Sets TRM Apart?**

TRM departs from prior work in several ways. It replaces the Hierarchical Reasoning Model’s (HRM) two-module hierarchy with a single tiny network that recursively updates a latent ‘scratchpad’ (z) and a current solution embedding (y). The model alternates between ‘think’ and ‘act’ phases: during ‘think’, it updates the scratchpad (z ← f(x, y, z)) for several inner steps, and during ‘act’, it refines the solution embedding (y ← g(y, z)).
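To make the loop concrete, here is a minimal PyTorch-style sketch of one think/act block. The split into `think_net` and `act_net`, the hidden size, and the inner step count are illustrative assumptions, not the released implementation (which reuses one tiny two-layer network over token sequences).

```python
import torch
import torch.nn as nn

class TinyRecursiveCore(nn.Module):
    """Minimal sketch of TRM's think/act recursion (illustrative, not the official code)."""

    def __init__(self, dim: int = 512, n_think: int = 6):
        super().__init__()
        self.n_think = n_think  # inner 'think' steps per block (n = 6 in the reported setup)
        # The paper reuses one tiny network; two small MLPs are used here purely for readability.
        self.think_net = nn.Sequential(nn.Linear(3 * dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.act_net = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x, y, z):
        # 'Think': refine the latent scratchpad z, conditioned on the input x
        # and the current solution embedding y.
        for _ in range(self.n_think):
            z = self.think_net(torch.cat([x, y, z], dim=-1))
        # 'Act': refine the solution embedding y from the updated scratchpad.
        y = self.act_net(torch.cat([y, z], dim=-1))
        return y, z
```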

Each ‘think’-then-‘act’ block is deeply supervised: it is unrolled up to 16 times during training with a learned halting head, and the signal carries across steps via (y, z). Unlike HRM, TRM backpropagates through all recursive steps, which the team found crucial for generalization.
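The outer training loop could then be sketched as follows. The halting threshold, loss weighting, the `decoder` and `halt_head` modules, and the zero initialization of (y, z) are assumptions made for illustration; only the 16-step unrolling, the learned halting head, and full backpropagation through the recursion come from the description above.

```python
import torch
import torch.nn.functional as F

def train_step(core, halt_head, decoder, optimizer, x, target, max_steps: int = 16):
    """Illustrative sketch of deep supervision over up to 16 unrolled think/act blocks.

    Within each block, gradients flow through every recursive update (no one-step
    approximation); (y, z) are detached between supervised blocks so the signal
    carries forward through their values.
    """
    y = torch.zeros_like(x)   # initial solution embedding (assumed initialization)
    z = torch.zeros_like(x)   # initial latent scratchpad (assumed initialization)
    for step in range(max_steps):
        y, z = core(x, y, z)                         # one fully backpropagated block
        logits = decoder(y)                          # predict the full answer grid
        loss = F.cross_entropy(logits.flatten(0, -2), target.flatten())

        # Learned halting head: predict whether the current answer is already correct.
        p_halt = torch.sigmoid(halt_head(y).mean())
        is_correct = (logits.argmax(-1) == target).all().float()
        loss = loss + F.binary_cross_entropy(p_halt, is_correct)

        optimizer.zero_grad()
        loss.backward()                              # backprop through the whole recursion
        optimizer.step()

        y, z = y.detach(), z.detach()                # carry (y, z) into the next block
        if p_halt.item() > 0.5:                      # halt early once the model is confident
            break
```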

**Architectural Details**

The best-performing setup for ARC/Maze retains self-attention, while Sudoku’s small fixed grids use an MLP-Mixer-style token mixer. A small exponential moving average (EMA) over the weights stabilizes training on limited data. Rather than stacking layers, depth is created effectively by recursion (e.g., T = 3, n = 6), and ablations show that two layers generalize better than deeper variants at the same effective compute.
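As a rough illustration of the weight-EMA trick, a shadow copy of the parameters can be maintained along these lines; the class name and decay value are assumptions, not the paper’s exact settings.

```python
import copy
import torch

class WeightEMA:
    """Simple exponential moving average over model weights (illustrative sketch)."""

    def __init__(self, model: torch.nn.Module, decay: float = 0.999):
        self.decay = decay
        self.shadow = copy.deepcopy(model).eval()   # frozen copy used for evaluation
        for p in self.shadow.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model: torch.nn.Module):
        # shadow <- decay * shadow + (1 - decay) * current weights, after each optimizer step
        for s, p in zip(self.shadow.parameters(), model.parameters()):
            s.mul_(self.decay).add_(p, alpha=1.0 - self.decay)
```

At evaluation time, predictions would come from the smoothed `shadow` copy rather than the raw training weights.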

**Understanding the Results**

The headline numbers compare as follows:

| Model | ARC-AGI-1 | ARC-AGI-2 |
| --- | --- | --- |
| TRM-Attn (7M) | 44.6% | 7.8% |
| HRM (27M) | 40.3% | 5.0% |
| Gemini 2.5 Pro | 37.0% | 4.9% |
| o3-mini-high | 34.5% | 3.0% |
| DeepSeek-R1 (671B) | 15.8% | 1.3% |

On Sudoku-Extreme, TRM (with the attention-free mixer) scores 87.4% versus HRM’s 55.0%. On Maze-Hard, TRM scores 85.3%, improving on HRM’s 74.5%.

**Why Does TRM Excel?**

TRM’s success can be attributed to several factors. Its ‘decision-then-revision’ approach drafts a full candidate solution and then improves it via latent iterative consistency checks, reducing exposure bias from autoregressive decoding on structured outputs. It also allocates test-time compute to recursive refinement, yielding better generalization at constant compute than adding layers. For small fixed grids, attention-free mixing reduces overcapacity and improves bias/variance trade-offs.

**Key Takeaways**

TRM is a compact, 2-layer recursive solver that alternates ‘think’ and ‘act’ updates, unrolling up to 16 steps with deep supervision and full gradient propagation. It reports impressive results on ARC-AGI, surpassing much larger LLMs. Its efficiency and results demonstrate that allocating test-time compute to recursive refinement can beat parameter scaling on symbolic-geometric tasks.

**Editorial Comments**

While TRM’s results are promising, ARC-AGI remains unsolved at scale. The contribution is an architectural efficiency result rather than a general reasoning breakthrough. The research team has released code on GitHub, inviting further exploration and improvement.
