---
license: apache-2.0
language:
- en
tags:
- video-editing
- instruction-following
- structured-generation
- text-to-json
- ffmpeg
- gearcut
- sparse-transformer
pipeline_tag: text-generation
inference: false
---

# GearCut Editor (gc_editor)

**gc_editor** is a compact instruction-to-operations model that powers
GearCut, an ultra-lightweight, FFmpeg-based video editor. It translates a
plain-English editing instruction into a list of structured editing
**operations** (JSON) that GearCut's `project -> ffmpeg` compiler then executes.
It is designed to run **locally, on CPU**, so the editor needs no cloud service
and no user video ever leaves the machine.
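
To make the `project -> ffmpeg` step more concrete, here is a minimal sketch of how a single `trim` operation could be turned into an FFmpeg command line. The function `trim_to_ffmpeg_args`, the `sources` mapping, and the output file name are hypothetical and do not describe GearCut's actual compiler; only the `trim` fields (`clip`, `in`, `out`) are taken from the example output in this card.

```python
import shlex


def trim_to_ffmpeg_args(op, sources, output):
    """Build an FFmpeg argument list for one `trim` operation.

    `sources` maps clip ids (e.g. "c1") to file paths. Both the mapping and
    this function are illustrative assumptions, not GearCut's real compiler.
    """
    if op.get("op") != "trim":
        raise ValueError(f"expected a trim operation, got {op.get('op')!r}")
    return [
        "ffmpeg",
        "-i", sources[op["clip"]],  # input file for the referenced clip
        "-ss", str(op["in"]),       # start of the kept range, in seconds
        "-to", str(op["out"]),      # end of the kept range, in seconds
        "-c", "copy",               # stream copy: no re-encode, cuts may not be frame-accurate
        output,
    ]


op = {"op": "trim", "clip": "c1", "in": 3.0, "out": 8.0}
args = trim_to_ffmpeg_args(op, {"c1": "intro.mp4"}, "intro_trimmed.mp4")
print(shlex.join(args))
# ffmpeg -i intro.mp4 -ss 3.0 -to 8.0 -c copy intro_trimmed.mp4
```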

Developed by **AMEFORGE**. Built on the in-house **SparseMind** architecture
(sparse attention + sparse FFN, dynamic neuron typing, and episodic memory).

## What it does

- **Input:** the current timeline state + a natural-language instruction.
- **Output:** a JSON array of editing operations.

```text
INPUT
clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>

OUTPUT
[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]
```
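
The INPUT line in the example above can be assembled from a timeline with a small helper. This is only a sketch: `Clip` and `build_prompt` are hypothetical names, and both the comma separator between clips and the one-decimal timestamp format are assumptions, since the card shows only the single-clip form.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    clip_id: str  # e.g. "c1"
    source: str   # e.g. "intro.mp4"
    start: float  # seconds
    end: float    # seconds


def build_prompt(clips, instruction):
    """Format timeline + instruction like the single-clip example above."""
    # Assumption: several clips are joined with ", "; only the one-clip
    # format is documented in this card.
    timeline = ", ".join(
        f"{c.clip_id}={c.source}({c.start:.1f}-{c.end:.1f})" for c in clips
    )
    return f"clips: {timeline} | {instruction.strip()} =>"


prompt = build_prompt(
    [Clip("c1", "intro.mp4", 0.0, 8.0)],
    "remove the first 3 seconds of the clip",
)
print(prompt)
# clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>
```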

Supported operations (v1): `trim`, `split`, `import`, `append`, `delete`,
`reorder`, `export`.
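
Before handing generated operations to a compiler, it is reasonable to check that the output really is a JSON array of v1 operations. The sketch below does that; `parse_operations` is a hypothetical helper, only `trim` is checked field by field because it is the only operation whose fields appear in this card, and the `in < out` rule is an assumption.

```python
import json

V1_OPS = {"trim", "split", "import", "append", "delete", "reorder", "export"}


def parse_operations(raw):
    """Parse model output and check it is a JSON array of known v1 operations."""
    ops = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(ops, list):
        raise ValueError("expected a JSON array of operations")
    for i, op in enumerate(ops):
        if not isinstance(op, dict) or op.get("op") not in V1_OPS:
            raise ValueError(f"operation {i} is not a supported v1 operation: {op!r}")
        if op["op"] == "trim":
            # Field names taken from the example output in this card.
            if not {"clip", "in", "out"} <= op.keys():
                raise ValueError(f"trim operation {i} is missing fields")
            if not op["in"] < op["out"]:  # assumed sanity rule
                raise ValueError(f"trim operation {i} has in >= out")
    return ops


ops = parse_operations('[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]')
print(ops[0]["out"] - ops[0]["in"])  # 5.0
```

Rejecting unknown operations up front keeps a malformed generation from reaching FFmpeg at all.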

## Model details

| Property | Value |
|---|---|
| Architecture | SparseMind (decoder-only, sparse) |
| Parameters | 28,759,300 (~28.8M) |
| Hidden size / layers | 384 / 8 |
| Context length | 256 tokens |
| Tokenizer | GearCut dedicated SentencePiece-BPE, vocab 682 |
| Precision | fp32 |
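
As a back-of-the-envelope check on the CPU footprint, the size of the weights follows from the parameter count and fp32 precision (4 bytes per parameter) listed in the table. This covers weights only; activations and any runtime overhead are not included.

```python
PARAMS = 28_759_300   # parameter count from the table
BYTES_PER_FP32 = 4    # fp32 stores each parameter in 4 bytes

weight_bytes = PARAMS * BYTES_PER_FP32
print(weight_bytes)                    # 115037200
print(round(weight_bytes / 1e6, 1))    # 115.0 (MB)
print(round(weight_bytes / 2**20, 1))  # 109.7 (MiB)
```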

## Evaluation

Measured on a held-out synthetic validation split. Perplexity is not the metric
that matters here; what matters is whether the generated operations are usable:

| Metric | Score |
|---|---|
| Valid JSON | 100.0% |
| Exact match (operations == reference) | 76.5% |
| Best exact match during training | 88.0% |
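
The two kinds of metric in the table can be computed along the following lines. This is a sketch, not the project's evaluation script: `score` is a hypothetical helper, it compares parsed JSON rather than raw strings (the card does not say which was used), and the `delete` operation's fields in the sample data are made up for illustration.

```python
import json


def score(predictions, references):
    """Return (valid_json_rate, exact_match_rate) as percentages."""
    valid = exact = 0
    for pred, ref in zip(predictions, references):
        try:
            pred_ops = json.loads(pred)
        except json.JSONDecodeError:
            continue  # invalid JSON counts against both metrics
        valid += 1
        # Compare parsed structures so whitespace and key order do not matter.
        if pred_ops == json.loads(ref):
            exact += 1
    n = len(references)
    return 100.0 * valid / n, 100.0 * exact / n


preds = ['[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]', '[{"op":"trim"']
refs = [
    '[{"op": "trim", "clip": "c1", "in": 3.0, "out": 8.0}]',
    '[{"op":"delete","clip":"c2"}]',  # hypothetical fields for `delete`
]
print(score(preds, refs))  # (50.0, 50.0)
```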

## Training data

Trained on **85,000** synthetically generated `(timeline + instruction -> operations)`
examples for 3,000 steps. The generator covers the v1 operation set with
varied phrasings, clip references, file names, timestamps, and presets.
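
To illustrate what template-based generation of such pairs can look like, here is a minimal sketch for one case (trimming from the start of a clip). The templates, file names, and value ranges are invented for illustration; the real generator and its templates are not published in this card.

```python
import json
import random

# Hypothetical phrasing templates, not the real generator's.
TRIM_TEMPLATES = [
    "remove the first {n} seconds of the clip",
    "cut {n} seconds off the start",
    "trim the first {n} seconds",
]


def make_trim_example(rng):
    """Return one (prompt, target) pair for a trim-from-start instruction."""
    duration = float(rng.randint(5, 60))   # clip length in seconds
    n = rng.randint(1, int(duration) - 1)  # seconds to remove from the start
    name = rng.choice(["intro", "beach", "interview"]) + ".mp4"
    instruction = rng.choice(TRIM_TEMPLATES).format(n=n)
    prompt = f"clips: c1={name}(0.0-{duration:.1f}) | {instruction} =>"
    target = json.dumps(
        [{"op": "trim", "clip": "c1", "in": float(n), "out": duration}],
        separators=(",", ":"),  # compact JSON, as in the example output
    )
    return prompt, target


prompt, target = make_trim_example(random.Random(0))
print(prompt)
print(target)
```

Seeding the `random.Random` instance makes a generated dataset reproducible.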

## Intended use & scope

Intended as the natural-language command layer inside the GearCut editor. It is
**not** a general-purpose assistant; it emits only GearCut operations.

## Limitations

- **Synthetic training data.** The model is strongest on phrasings close to the
  generator's templates. Unusual real-world wording may be handled less reliably
  until the data is expanded with real examples.
- **English only (v1).** A bilingual (EN/FR) version is planned.
- **Narrow operation set (v1).** Transitions, multi-track, and effects are not
  yet covered.
- **Custom architecture.** The HF inference widget is disabled; load and run the
  model as described in the How to use section.

## How to use

```python
# Download gc_editor.pt and the GearCut tokenizer (gearcut_tok.model) from this
# repo, rebuild the SparseMind model from the config stored in the checkpoint,
# then load the weights.
import json

import sentencepiece as spm
import torch

ckpt = torch.load("gc_editor.pt", map_location="cpu")
cfg = ckpt["config"]  # the exact training config
# SparseMind and Config are the model classes; they are not defined in this
# snippet and must be imported from the SparseMind code.
# model = SparseMind(Config(**cfg)); model.load_state_dict(ckpt["model"]); model.eval()
sp = spm.SentencePieceProcessor(model_file="gearcut_tok.model")
prompt = "clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>"
ids = sp.encode(prompt)  # token ids for the prompt
# Generate from ids, stop at EOS, decode, then json.loads() the decoded text.
```
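
The "generate, stop at EOS, parse" step can be sketched as a plain greedy loop. The version below is independent of the model code: `next_token` is a hypothetical callable standing in for a forward pass plus argmax over the last position's logits, and the stand-in "model" simply replays a fixed answer so the loop can be exercised without the checkpoint.

```python
import json

MAX_CONTEXT = 256  # context length from the model details table


def greedy_generate(next_token, prompt_ids, eos_id, max_context=MAX_CONTEXT):
    """Append one token at a time until EOS or until the context is full.

    `next_token` is a hypothetical callable mapping the ids so far to the
    most likely next id.
    """
    ids = list(prompt_ids)
    generated = []
    while len(ids) < max_context:
        tok = next_token(ids)
        if tok == eos_id:
            break
        ids.append(tok)
        generated.append(tok)
    return generated


# Stand-in "model": replays a fixed answer character by character.
answer = '[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]'
EOS = -1
script = [ord(ch) for ch in answer] + [EOS]
prompt_ids = [0, 1, 2]


def fake_model(ids):
    return script[len(ids) - len(prompt_ids)]


out_ids = greedy_generate(fake_model, prompt_ids, EOS)
ops = json.loads("".join(chr(i) for i in out_ids))
print(ops)
```

Capping the loop at the context length matters here because the 256-token window has to hold both the prompt and the generated operations.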

## Citation

```bibtex
@misc{gearcut_editor,
  title  = {GearCut Editor: an instruction-to-operations model for lightweight video editing},
  author = {AMEFORGE},
  year   = {2026},
  note   = {Built on the SparseMind architecture}
}
```