# Step-3.5-Flash-Distill-MolmoWeb-8B-TQ-Mixed
This model is a "God-tier" reasoning visual agent built specifically to speed-run complex web account-management tasks. By distilling the 196B Step-3.5-Flash "Brain" into the 8B MolmoWeb "Eyes" and applying TurboQuant extreme compression, we created a specialist that fits in ~3 GB of VRAM while retaining frontier-level reasoning.
## Model Details

### Model Description
- Developed by: @macmacmacmac
- Model type: Vision-Language Model (VLM) / Agentic Specialist
- Language(s) (NLP): English (optimized for account registration, logins, MFA)
- License: Apache-2.0
- Finetuned from model: allenai/MolmoWeb-8B
- Teacher Model: stepfun-ai/Step-3.5-Flash (196B Sparse MoE)
### Model Sources
- Repository: https://huggingface.co/macmacmacmac/Step-3.5-Flash-Distill-MolmoWeb-8B-TQ-Mixed
- Trajectory Data: allenai/MolmoWeb-HumanTrajs (Identity Subset)
- Quantization Tech: TurboQuant (March 2026 Release)
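TurboQuant's internals are not documented in this card, so as a rough illustration only: the "TQ-Mixed" suffix and the 3.5-bit figure mentioned below suggest a mixed-precision scheme. The sketch below shows a generic round-trip of symmetric k-bit quantization with different bit widths per layer group (the layer names and the 4-bit/3-bit split are hypothetical, not TurboQuant's actual algorithm):

```python
import numpy as np

def quantize_symmetric(w, bits):
    """Round-trip symmetric k-bit quantization: returns the dequantized tensor."""
    qmax = 2 ** (bits - 1) - 1                # e.g. 7 for 4-bit, 3 for 3-bit
    scale = np.abs(w).max() / qmax
    scale = scale if scale > 0 else 1.0       # guard against all-zero tensors
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale

# "Mixed" scheme sketch: keep attention projections at 4-bit and MLP weights
# at 3-bit, so the average storage cost lands around 3.5 bits per weight.
layers = {"attn.q_proj": np.random.randn(8, 8), "mlp.up_proj": np.random.randn(8, 8)}
deq = {name: quantize_symmetric(w, 4 if "attn" in name else 3) for name, w in layers.items()}
```

The per-tensor scale bounds the reconstruction error at half a quantization step, which is why aggressive bit widths show up downstream as small numerical drift rather than outright failures.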
## Uses

### Direct Use
Specifically intended for Identity Steering:
- Creating new accounts (Sign-up flows).
- Managing existing credentials (Login/MFA handling).
- Password recovery and security settings adjustment.
- Navigating high-anxiety UI (Cookie walls, pop-ups, system prompts).
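The card does not specify the model's action output format. Assuming a textual action grammar like `click(x, y)` / `type("...")` (a common convention for GUI agents, hypothetical here), a host process driving these flows could extract actions from the raw output like so:

```python
import re

# Hypothetical action grammar: click(x, y) | type("text") | press("key")
ACTION_RE = re.compile(r'(?P<op>click|type|press)\((?P<args>[^)]*)\)')

def parse_action(output: str):
    """Extract the first action call from the model's raw text output."""
    m = ACTION_RE.search(output)
    if m is None:
        return None
    op, args = m.group("op"), m.group("args")
    if op == "click":
        x, y = (int(v) for v in args.split(","))
        return ("click", x, y)
    return (op, args.strip().strip('"'))

parse_action('I will click the sign-up button. click(412, 220)')
# -> ("click", 412, 220)
```

Whatever grammar the model actually emits, keeping the parser strict (reject rather than guess on malformed actions) is the safer default for credential-handling flows.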
### Out-of-Scope Use
- General purpose chatbot tasks (Poetry, coding, general trivia).
- High-stakes financial transfers without human-in-the-loop.
- Medical diagnosis or legal advice.
- Non-web based automation (OS-level file management).
## Bias, Risks, and Limitations
- Coordinate Drift: Extreme TurboQuant compression (3.5-bit) can occasionally cause 2-5px drift in click accuracy on ultra-high-density displays.
- Hallucination: While reasoning is aligned with Step-3.5, the model may occasionally misinterpret legacy HTML that deviates significantly from standard human trajectories.
- Privacy: While the model runs locally, the content of the screen is processed. Users must ensure the environment is secure.
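One practical mitigation for the few-pixel coordinate drift above is to snap predicted clicks to the center of the nearest known element bounding box (boxes taken from the DOM or accessibility tree). The helper below is an illustrative host-side sketch, not part of the model:

```python
def snap_click(x, y, boxes, max_drift=8):
    """Snap (x, y) to the center of the nearest bounding box within max_drift px.

    boxes: list of (left, top, right, bottom) tuples. Returns (x, y) unchanged
    if no box center lies within max_drift (Chebyshev distance).
    """
    best, best_d = (x, y), max_drift + 1
    for left, top, right, bottom in boxes:
        cx, cy = (left + right) // 2, (top + bottom) // 2
        d = max(abs(cx - x), abs(cy - y))
        if d < best_d:
            best, best_d = (cx, cy), d
    return best

snap_click(414, 222, [(400, 210, 424, 230)])   # -> (412, 220)
```

Because the observed drift is only 2-5 px, a small `max_drift` budget recovers accuracy without letting the agent "teleport" to unrelated elements.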
### Recommendations
Users should deploy this model with the Web UI Overlay (link soon) so that the agent's internal reasoning (`<|thought|>` spans) stays transparent to the user, reducing anxiety during automated actions.
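The overlay is not yet released. As a sketch of how a host UI could surface the reasoning, assuming the model wraps its reasoning in `<|thought|>…<|/thought|>` markers (only the opening marker appears in this card; the closing form is an assumption):

```python
import re

# Assumed delimiters: <|thought|> ... <|/thought|> (closing marker is a guess).
THOUGHT_RE = re.compile(r"<\|thought\|>(.*?)<\|/thought\|>", re.DOTALL)

def split_thoughts(output: str):
    """Separate reasoning spans (for the overlay) from the remaining action text."""
    thoughts = THOUGHT_RE.findall(output)
    actions = THOUGHT_RE.sub("", output).strip()
    return thoughts, actions

split_thoughts('<|thought|>The email field is focused.<|/thought|>type("a@b.com")')
# -> (["The email field is focused."], 'type("a@b.com")')
```

Displaying the thought spans while executing only the action text keeps the "show your work" property the overlay is meant to provide.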
## How to Get Started with the Model

```python
import turboquant as tq  # TurboQuant runtime for the 3.5-bit mixed weights
from transformers import AutoModelForImageTextToText

# Optimized for ~3GB VRAM deployment
model = AutoModelForImageTextToText.from_pretrained(
    "macmacmacmac/Step-3.5-Flash-Distill-MolmoWeb-8B-TQ-Mixed",
    device_map="auto",
    trust_remote_code=True,
)
```