Papers
arxiv:2602.07061

TACIT: Transformation-Aware Capturing of Implicit Thought

Published on Feb 5
Abstract

TACIT is a diffusion-based transformer that performs interpretable visual reasoning entirely in pixel space using rectified flow, demonstrating rapid maze-solving with emergent reasoning patterns resembling human insight.

AI-generated summary

We present TACIT (Transformation-Aware Capturing of Implicit Thought), a diffusion-based transformer for interpretable visual reasoning. Unlike language-based reasoning systems, TACIT operates entirely in pixel space using rectified flow, enabling direct visualization of the reasoning process at each inference step. We demonstrate the approach on maze-solving, where the model learns to transform images of unsolved mazes into solutions. Key results on 1 million synthetic maze pairs include:

- 192x reduction in training loss over 100 epochs
- 22.7x improvement in L2 distance to ground truth
- Only 10 Euler steps required (vs. 100-1000 for typical diffusion models)

Quantitative analysis reveals a striking phase transition phenomenon: the solution remains invisible for 68% of the transformation (zero recall), then emerges abruptly at t=0.70 within just 2% of the process. Most remarkably, 100% of samples exhibit simultaneous emergence across all spatial regions, ruling out sequential path construction and providing evidence for holistic rather than algorithmic reasoning. This "eureka moment" pattern -- long incubation followed by sudden crystallization -- parallels insight phenomena in human cognition. The pixel-space design with noise-free flow matching provides a foundation for understanding how neural networks develop implicit reasoning strategies that operate below and before language.
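The few-step sampling described above follows from how rectified flow works: the model learns a velocity field along (approximately) straight paths from the source image to the target, so integrating it with a handful of fixed-step Euler updates can suffice. A minimal sketch of that Euler integration loop, using NumPy and a toy constant velocity field in place of TACIT's trained transformer (the function and variable names here are illustrative, not from the paper):

```python
import numpy as np

def euler_sample(x0, velocity_fn, num_steps=10):
    """Integrate a rectified-flow velocity field from t=0 to t=1
    with fixed-step Euler, starting from the source image x0."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)  # x_{t+dt} = x_t + dt * v(x_t, t)
    return x

# Toy velocity field: for a perfectly straight flow toward a fixed target,
# the optimal velocity is constant, v(x_t, t) = x1 - x0.
x0 = np.zeros((8, 8))   # stand-in for the unsolved-maze image
x1 = np.ones((8, 8))    # stand-in for the solution image
x_final = euler_sample(x0, lambda x, t: x1 - x0, num_steps=10)
print(np.allclose(x_final, x1))  # True: Euler is exact on a constant field
```

With a learned, non-constant velocity field the straight-path property holds only approximately, which is why a small number of steps (10 here, per the abstract) still incurs some discretization error in practice.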
