---
license: apache-2.0
language:
- en
tags:
- video-editing
- instruction-following
- structured-generation
- text-to-json
- ffmpeg
- gearcut
- sparse-transformer
pipeline_tag: text-generation
inference: false
---
# GearCut Editor (gc_editor)
**gc_editor** is a compact instruction-to-operations model that powers
GearCut, an ultra-lightweight, FFmpeg-based
video editor. It translates a plain-English editing instruction into a list of
structured editing **operations** (JSON) that GearCut's `project -> ffmpeg`
compiler then executes. It is designed to run **locally, on CPU**, so the editor
needs no cloud service and no user video ever leaves the machine.
Developed by **AMEFORGE**. Built on the in-house **SparseMind** architecture
(sparse attention + sparse FFN, dynamic neuron typing, and episodic memory).
## What it does
- **Input:** the current timeline state + a natural-language instruction.
- **Output:** a JSON array of editing operations.
```text
INPUT
clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>
OUTPUT
[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]
```
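The prompt layout in the example above can be assembled programmatically. The following is a minimal sketch, not GearCut's own code: the `Clip` type and `build_prompt` helper are hypothetical, and both the single-space separator between multiple clips and the one-decimal timestamp format are assumptions inferred from the single-clip example.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """One timeline clip (hypothetical helper type, not part of GearCut)."""
    clip_id: str
    filename: str
    start: float
    end: float


def build_prompt(clips, instruction):
    """Serialize timeline + instruction in the layout of the example above."""
    # Assumption: multiple clips are separated by a single space.
    timeline = " ".join(
        f"{c.clip_id}={c.filename}({c.start:.1f}-{c.end:.1f})" for c in clips
    )
    return f"clips: {timeline} | {instruction.strip()} =>"


prompt = build_prompt(
    [Clip("c1", "intro.mp4", 0.0, 8.0)],
    "remove the first 3 seconds of the clip",
)
print(prompt)
```

For the single-clip case this reproduces the INPUT line shown above exactly; check the multi-clip format against the training data generator before relying on it.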
Supported operations (v1): `trim`, `split`, `import`, `append`, `delete`,
`reorder`, `export`.
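Because the output is plain JSON, it can be checked against the v1 operation list before it reaches the compiler. A minimal sketch, assuming each operation is a JSON object with an `"op"` key as in the example; `parse_operations` is a hypothetical helper, and per-operation field checks are left out because only the `trim` fields are documented here.

```python
import json

# Operation names listed for v1 in this model card.
SUPPORTED_OPS = ("trim", "split", "import", "append", "delete", "reorder", "export")


def parse_operations(raw):
    """Parse model output; reject anything that is not a list of v1 operations."""
    ops = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) if invalid
    if not isinstance(ops, list):
        raise ValueError("expected a JSON array of operations")
    for op in ops:
        if not isinstance(op, dict) or op.get("op") not in SUPPORTED_OPS:
            raise ValueError(f"unsupported operation: {op!r}")
    return ops


ops = parse_operations('[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]')
print(ops[0]["op"], ops[0]["in"], ops[0]["out"])
```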
## Model details
| Property | Value |
|---|---|
| Architecture | SparseMind (decoder-only, sparse) |
| Parameters | 28,759,300 (~28.8M) |
| Hidden size / layers | 384 / 8 |
| Context length | 256 tokens |
| Tokenizer | GearCut dedicated SentencePiece-BPE, vocab 682 |
| Precision | fp32 |
## Evaluation
Measured on a held-out synthetic validation split. Perplexity is not the
meaningful metric here; what matters is whether the generated operations are usable:
| Metric | Score |
|---|---|
| Valid JSON | 100.0% |
| Exact match (operations == reference) | 76.5% |
| Best exact match during training | 88.0% |
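The two headline metrics can be computed along these lines. This is a sketch of one plausible definition, not the evaluation script used for the numbers above: it assumes exact match compares the parsed JSON values (so whitespace and key order are ignored) and that `references` is non-empty. The `score` helper is hypothetical.

```python
import json


def score(predictions, references):
    """Return (valid_json_rate, exact_match_rate) for paired output strings."""
    valid = exact = 0
    for pred, ref in zip(predictions, references):
        try:
            parsed = json.loads(pred)
        except json.JSONDecodeError:
            continue  # invalid JSON can never be an exact match
        valid += 1
        # Compare parsed values, so whitespace and key order do not matter.
        exact += parsed == json.loads(ref)
    n = len(references)
    return valid / n, exact / n


preds = [
    '[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]',  # matches its reference
    '[{"op":"trim","clip":"c1","in":2.0,"out":8.0}]',  # valid JSON, wrong value
    '[{"op":"trim","clip":"c1"',                       # truncated, invalid JSON
]
refs = [
    '[{"clip":"c1","op":"trim","in":3.0,"out":8.0}]',
    '[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]',
    '[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]',
]
valid_rate, exact_rate = score(preds, refs)
print(f"valid JSON: {valid_rate:.1%}, exact match: {exact_rate:.1%}")
```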
## Training data
Trained for 3,000 steps on **85,000** synthetically generated
`(timeline + instruction -> operations)` examples. The generator covers the v1
operation set with varied phrasings, clip references, file names, timestamps, and presets.
## Intended use & scope
Intended as the natural-language command layer inside the GearCut editor. It is
**not** a general-purpose assistant and only emits GearCut operations.
## Limitations
- **Synthetic training data.** The model is strongest on phrasings close to the
generator's templates. Unusual real-world wording may be handled less reliably
until the data is expanded with real examples.
- **English only (v1).** A bilingual (EN/FR) version is planned.
- **Narrow operation set (v1).** Transitions, multi-track, and effects are not
yet covered.
- **Custom architecture.** The HF inference widget is disabled; load and run the
model with the snippet below.
## How to use
```python
# Download gc_editor.pt and the GearCut tokenizer (gearcut_tok.model) from this
# repo, rebuild the SparseMind model from the config stored in the checkpoint,
# then load the weights.
import json

import sentencepiece as spm
import torch

ckpt = torch.load("gc_editor.pt", map_location="cpu")
cfg = ckpt["config"]  # the exact training config

# SparseMind and Config are the model classes; they are not defined in this snippet.
# model = SparseMind(Config(**cfg)); model.load_state_dict(ckpt["model"]); model.eval()

sp = spm.SentencePieceProcessor()
sp.Load("gearcut_tok.model")

prompt = "clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>"
ids = sp.EncodeAsIds(prompt)
# Generate from `ids` with the model, stop at EOS, decode, then json.loads() the output.
```
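The generation step that the snippet above leaves as a comment could look like the loop below. This is a sketch under stated assumptions: the card does not say which decoding strategy is used, so greedy decoding is assumed, and `next_scores` is a hypothetical stand-in for the SparseMind forward pass (a callable that returns one score per vocabulary entry for the next token). A toy scorer replaces the real model so the loop runs on its own.

```python
def greedy_decode(next_scores, prompt_ids, eos_id, max_len=256):
    """Greedy decoding: repeatedly append the highest-scoring next token.

    next_scores(ids) stands in for the model's forward pass and must return
    one score per vocabulary entry for the token that follows `ids`.
    """
    ids = list(prompt_ids)
    generated = []
    while len(ids) < max_len:  # stay within the 256-token context length
        scores = next_scores(ids)
        token = max(range(len(scores)), key=scores.__getitem__)
        if token == eos_id:
            break
        ids.append(token)
        generated.append(token)
    return generated


# Toy stand-in over a 4-token vocabulary: emits 1, then 2, then EOS (id 3).
def toy_scores(ids):
    target = {2: 1, 3: 2}.get(len(ids), 3)
    return [1.0 if i == target else 0.0 for i in range(4)]


print(greedy_decode(toy_scores, prompt_ids=[0, 0], eos_id=3))
```

With the real model, the returned ids would be turned back into text with the tokenizer (for example `sp.DecodeIds(generated)`, with `sp.eos_id()` as the stop token if the tokenizer defines one) and then passed to `json.loads`.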
## Citation
```bibtex
@misc{gearcut_editor,
title = {GearCut Editor: an instruction-to-operations model for lightweight video editing},
author = {AMEFORGE},
year = {2026},
note = {Built on the SparseMind architecture}
}
```