	---
	license: apache-2.0
	language:
	- en
	tags:
	- video-editing
	- instruction-following
	- structured-generation
	- text-to-json
	- ffmpeg
	- gearcut
	- sparse-transformer
	pipeline_tag: text-generation
	inference: false
	---

	# GearCut Editor (gc_editor)

	gc_editor is a compact instruction-to-operations model that powers
	GearCut, an ultra-lightweight, FFmpeg-based
	video editor. It translates a plain-English editing instruction into a list of
	structured editing operations (JSON) that GearCut's `project -> ffmpeg`
	compiler then executes. It is designed to run locally, on CPU, so the editor
	needs no cloud service and no user video ever leaves the machine.

	Developed by AMEFORGE. Built on the in-house SparseMind architecture
	(sparse attention + sparse FFN, dynamic neuron typing, and episodic memory).

	## What it does

	- Input: the current timeline state + a natural-language instruction.
	- Output: a JSON array of editing operations.

	```text
	INPUT
	clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>

	OUTPUT
	[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]
	```
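
To make the input format concrete, here is a minimal sketch of how an editor could serialize its timeline into that prompt shape. The `Clip` dataclass and `build_prompt` helper are hypothetical names introduced for illustration, and the separator between multiple clips is an assumption: the card only shows the single-clip case.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    clip_id: str  # e.g. "c1"
    source: str   # e.g. "intro.mp4"
    start: float  # seconds
    end: float    # seconds


def build_prompt(clips: list[Clip], instruction: str) -> str:
    """Serialize the timeline and the instruction into the prompt shape shown above."""
    # ASSUMPTION: multiple clips are joined with a single space; only the
    # single-clip form appears in the example.
    timeline = " ".join(
        f"{c.clip_id}={c.source}({c.start:.1f}-{c.end:.1f})" for c in clips
    )
    return f"clips: {timeline} | {instruction.strip()} =>"


prompt = build_prompt(
    [Clip("c1", "intro.mp4", 0.0, 8.0)],
    "remove the first 3 seconds of the clip",
)
print(prompt)
# clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>
```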

	Supported operations (v1): `trim`, `split`, `import`, `append`, `delete`,
	`reorder`, `export`.
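
Because the operation set is closed, a caller can cheaply reject anything outside it before touching the timeline. The sketch below is one way to do that; `parse_operations` is a hypothetical helper, and it checks only the `op` field because the per-operation schemas are not listed here.

```python
import json

# The v1 operation names listed above.
SUPPORTED_OPS = {"trim", "split", "import", "append", "delete", "reorder", "export"}


def parse_operations(raw: str) -> list[dict]:
    """Parse model output and reject anything outside the v1 operation set."""
    ops = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(ops, list):
        raise ValueError("expected a JSON array of operations")
    for op in ops:
        if not isinstance(op, dict) or op.get("op") not in SUPPORTED_OPS:
            raise ValueError(f"unsupported or malformed operation: {op!r}")
    return ops


ops = parse_operations('[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]')
print(ops[0]["op"])  # trim
```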

	## Model details

	| Property | Value |
	|---|---|
	| Architecture | SparseMind (decoder-only, sparse) |
	| Parameters | 28,759,300 (~28.8M) |
	| Hidden size / layers | 384 / 8 |
	| Context length | 256 tokens |
	| Tokenizer | GearCut dedicated SentencePiece-BPE, vocab 682 |
	| Precision | fp32 |
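
As a rough check on the local-CPU claim, the weight footprint follows directly from the parameter count and the precision. This is a back-of-the-envelope sketch: it counts the weights only and ignores activations and anything else stored in the checkpoint file.

```python
PARAMS = 28_759_300   # parameter count from the table above
BYTES_PER_PARAM = 4   # fp32 stores each parameter in 4 bytes

weight_bytes = PARAMS * BYTES_PER_PARAM
print(f"{weight_bytes:,} bytes")          # 115,037,200 bytes
print(f"{weight_bytes / 1e6:.1f} MB")     # 115.0 MB
print(f"{weight_bytes / 2**20:.1f} MiB")  # 109.7 MiB
```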

	## Evaluation

	Measured on a held-out synthetic validation split. Perplexity is not the
	meaningful metric here; what matters is whether the generated operations are usable:

	| Metric | Score |
	|---|---|
	| Valid JSON | 100.0% |
	| Exact match (operations == reference) | 76.5% |
	| Best exact match during training | 88.0% |
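
For readers who want to reproduce metrics of this kind on their own outputs, the sketch below computes both rates from paired strings. It is not the project's evaluation script. The `score` function is a hypothetical helper, and it assumes exact match is judged on the parsed JSON (so whitespace and key order are ignored) rather than on the raw string.

```python
import json


def score(predictions: list[str], references: list[str]) -> dict[str, float]:
    """Return the valid-JSON rate and exact-match rate over paired outputs."""
    valid = exact = 0
    for pred, ref in zip(predictions, references):
        try:
            pred_ops = json.loads(pred)
        except json.JSONDecodeError:
            continue  # invalid JSON can never be an exact match
        valid += 1
        # ASSUMPTION: compare parsed values, so key order and spacing do not matter.
        exact += pred_ops == json.loads(ref)
    n = len(references)
    return {"valid_json": valid / n, "exact_match": exact / n}


preds = [
    '[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]',         # correct
    '[{"op":"trim","clip":"c1","in":2.0,"out":8.0}]',         # valid, wrong value
    '[{"op":"trim","clip":"c1"',                              # truncated, invalid
    '[{"op": "trim", "clip": "c2", "in": 0.0, "out": 5.0}]',  # correct, extra spaces
]
refs = [
    '[{"clip":"c1","op":"trim","in":3.0,"out":8.0}]',
    '[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]',
    '[{"op":"trim","clip":"c1","in":3.0,"out":8.0}]',
    '[{"op":"trim","clip":"c2","in":0.0,"out":5.0}]',
]
print(score(preds, refs))  # {'valid_json': 0.75, 'exact_match': 0.5}
```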

	## Training data

	Trained on 85,000 synthetically generated `(timeline + instruction -> operations)`
	examples for 3,000 steps. The generator covers the v1 operation set with
	varied phrasings, clip references, file names, timestamps, and presets.
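
To illustrate what one synthetic example might look like, here is a toy generator for a single case (trimming from the start of a clip). Everything in it is hypothetical: the templates, file names, and value ranges are invented for illustration and are not the ones used to train the model.

```python
import json
import random

# Hypothetical phrasing templates; the real generator's templates are not shown here.
TRIM_TEMPLATES = [
    "remove the first {n} seconds of the clip",
    "cut the first {n} seconds off the clip",
    "trim {n} seconds from the start of the clip",
]


def make_trim_example(rng: random.Random) -> tuple[str, str]:
    """Return one (prompt, target) pair for a trim-from-start instruction."""
    name = rng.choice(["intro", "demo", "outro"]) + ".mp4"
    duration = float(rng.randint(5, 60))
    cut = float(rng.randint(1, int(duration) - 1))  # always shorter than the clip
    instruction = rng.choice(TRIM_TEMPLATES).format(n=int(cut))
    prompt = f"clips: c1={name}(0.0-{duration:.1f}) | {instruction} =>"
    target = json.dumps(
        [{"op": "trim", "clip": "c1", "in": cut, "out": duration}],
        separators=(",", ":"),  # compact form, matching the example output above
    )
    return prompt, target


rng = random.Random(0)  # seeded so the generated data is reproducible
example_prompt, example_target = make_trim_example(rng)
print(example_prompt)
print(example_target)
```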

	## Intended use & scope

	Intended as the natural-language command layer inside the GearCut editor. It is
	not a general-purpose assistant and only emits GearCut operations.
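
To show where the model's output goes next, the sketch below turns one `trim` operation into an FFmpeg argument list. This is not GearCut's `project -> ffmpeg` compiler, whose internals are not described here; `trim_to_ffmpeg` and the `sources` mapping are hypothetical, and only standard FFmpeg options (`-i`, `-ss`, `-to`, `-c copy`) are used.

```python
def trim_to_ffmpeg(op: dict, sources: dict[str, str], output: str) -> list[str]:
    """Build an ffmpeg argument list for a single trim operation."""
    if op.get("op") != "trim":
        raise ValueError("this sketch only handles trim")
    return [
        "ffmpeg",
        "-i", sources[op["clip"]],  # input file for the referenced clip
        "-ss", str(op["in"]),       # start time in seconds
        "-to", str(op["out"]),      # end time in seconds
        "-c", "copy",               # stream copy: fast, but cuts may not be frame-accurate
        output,
    ]


cmd = trim_to_ffmpeg(
    {"op": "trim", "clip": "c1", "in": 3.0, "out": 8.0},
    {"c1": "intro.mp4"},
    "intro_trimmed.mp4",
)
print(" ".join(cmd))
# ffmpeg -i intro.mp4 -ss 3.0 -to 8.0 -c copy intro_trimmed.mp4
```

Returning a list rather than a shell string means the command can be passed to `subprocess.run` without quoting file names.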

	## Limitations

	- Synthetic training data. The model is strongest on phrasings close to the
	generator's templates. Unusual real-world wording may be handled less reliably
	until the data is expanded with real examples.
	- English only (v1). A bilingual (EN/FR) version is planned.
	- Narrow operation set (v1). Transitions, multi-track, and effects are not
	yet covered.
	- Custom architecture. The HF inference widget is disabled; load and run the
	model with the snippet below.

	## How to use

	```python
	# Download gc_editor.pt and the GearCut tokenizer from this repo, then rebuild the
	# SparseMind model from the config stored in the checkpoint and load the weights.
	import json  # needed for json.loads on the generated text

	import sentencepiece as spm
	import torch

	ckpt = torch.load("gc_editor.pt", map_location="cpu")
	cfg = ckpt["config"]  # the exact training config
	# SparseMind and Config are the custom architecture classes (not defined in this snippet).
	# model = SparseMind(Config(**cfg)); model.load_state_dict(ckpt["model"]); model.eval()

	sp = spm.SentencePieceProcessor()
	sp.Load("gearcut_tok.model")
	prompt = "clips: c1=intro.mp4(0.0-8.0) | remove the first 3 seconds of the clip =>"
	ids = sp.EncodeAsIds(prompt)
	# Generate from ids, stop at EOS, decode the new tokens, then json.loads the output.
	```
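
The generation step left as a comment above can be sketched as a plain greedy loop. The card does not state which decoding strategy GearCut uses, so greedy decoding is an assumption, and `next_logits` is a hypothetical wrapper around the model's forward pass that returns scores for the next token. The example uses a toy stand-in so it runs without the checkpoint.

```python
from typing import Callable, Sequence

MAX_CONTEXT = 256  # context length from the model details table


def greedy_generate(
    next_logits: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    eos_id: int,
    max_context: int = MAX_CONTEXT,
) -> list[int]:
    """Append the highest-scoring token until EOS appears or the context is full."""
    ids = list(prompt_ids)
    generated: list[int] = []
    while len(ids) < max_context:
        logits = next_logits(ids)
        token = max(range(len(logits)), key=logits.__getitem__)  # argmax
        if token == eos_id:
            break
        ids.append(token)
        generated.append(token)
    return generated


def toy_model(ids: Sequence[int]) -> list[float]:
    """Stand-in model over a 4-token vocabulary: emits 2, then 3, then EOS (id 1)."""
    script = [2, 3, 1]
    step = len(ids) - 2  # the demo prompt below has 2 tokens
    logits = [0.0] * 4
    logits[script[min(step, 2)]] = 1.0
    return logits


print(greedy_generate(toy_model, [0, 0], eos_id=1))  # [2, 3]
```

With the real model, `prompt_ids` would come from `sp.EncodeAsIds(prompt)`, `eos_id` from `sp.eos_id()`, and the returned ids would go through `sp.DecodeIds(...)` before `json.loads`.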

	## Citation

	```bibtex
	@misc{gearcut_editor,
	title = {GearCut Editor: an instruction-to-operations model for lightweight video editing},
	author = {AMEFORGE},
	year = {2026},
	note = {Built on the SparseMind architecture}
	}
	```