	---
	title: Starry
	emoji: ✨
	colorFrom: blue
	colorTo: yellow
	sdk: docker
	pinned: false
	license: bsd
	short_description: End-to-end Optical Music Recognition (OMR) demo
	app_port: 7860
	---

	# STARRY — Optical Music Recognition

	Starry is a demo application for an end-to-end Optical Music Recognition (OMR) system. It transforms scanned sheet music images into structured digital music notation through a multi-stage ML pipeline, and provides an interactive platform for reviewing, editing, and managing the recognition results.

	## What Makes Starry Different

	- Multi-stage ML pipeline — Seven specialized models work in sequence: layout detection, staff gauge prediction, foreground/background mask separation, symbol semantic recognition, text location detection, OCR, and bracket recognition. Each model focuses on a specific subtask, enabling high overall accuracy.
	- Automatic regulation — An ONNX-based solver analyzes recognized measures for timing consistency and automatically corrects quantization errors, reducing the need for manual intervention.
	- Measure-level quality tracking — The system identifies and flags problematic measures for human review, with an annotation workflow that tracks corrections per-measure.
	- Real-time progress — WebSocket streaming keeps the UI updated during long-running recognition and regulation tasks.
	- Score collection management — Organize recognized scores into music sets with tagging and categorization, useful for building OMR datasets or curating repertoire.
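
To make the timing-consistency idea concrete, here is a minimal sketch of flagging measures whose note durations do not add up to the time signature. It is not the ONNX-based solver described above, which also corrects errors. It only illustrates the kind of check that could mark a measure for review. The function name, the duration-string format, and the single-voice assumption are all hypothetical.

```python
from fractions import Fraction


def flag_inconsistent_measures(measures, time_signature):
    """Return indices of measures whose durations do not fill the bar.

    Hypothetical input format (an assumption, not Starry's data model):
    each measure is a list of duration strings in whole-note units,
    e.g. "1/4" for a quarter note, and holds a single voice.
    """
    numerator, denominator = time_signature
    target = Fraction(numerator, denominator)  # length of one full measure
    flagged = []
    for index, durations in enumerate(measures):
        # Exact rational arithmetic avoids floating-point drift with tuplets.
        total = sum((Fraction(d) for d in durations), Fraction(0))
        if total != target:
            flagged.append(index)
    return flagged


measures = [
    ["1/4", "1/4", "1/4", "1/4"],  # complete 4/4 measure
    ["1/4", "1/8", "1/4", "1/4"],  # one eighth note short
    ["1/2", "1/4", "1/8", "1/8"],  # complete 4/4 measure
]
print(flag_inconsistent_measures(measures, (4, 4)))  # prints [1]
```

A real regulation step would also have to handle multiple voices, pickup measures, and tuplets, which this sketch deliberately leaves out.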

	## Architecture

	```
	Frontend (React)
	↓ nginx (port 7860)
	├─ /api/score → cluster-server (NestJS)
	└─ /api/* → omr-service (Fastify)
	   ├─ async task worker
	   ├─ PostgreSQL
	   └─ ZMQ → Python ML services (×7)
	```
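
The routing in the diagram can be read as a longest-prefix match, which is how nginx chooses among plain prefix locations. The sketch below models that rule in Python. It is an illustration rather than the project's actual nginx configuration, which is not shown here. The upstream labels come from the diagram, while the `frontend` fallback and the example paths are assumptions.

```python
# Prefix-to-upstream table taken from the diagram; labels are illustrative.
ROUTES = [
    ("/api/score", "cluster-server"),  # NestJS service
    ("/api/", "omr-service"),          # Fastify service
]
DEFAULT_UPSTREAM = "frontend"  # assumption: other paths serve the React app


def resolve_upstream(path: str) -> str:
    """Pick the upstream whose prefix is the longest match for the path."""
    # Checking longer prefixes first means /api/score wins over /api/.
    for prefix, upstream in sorted(ROUTES, key=lambda r: len(r[0]), reverse=True):
        if path.startswith(prefix):
            return upstream
    return DEFAULT_UPSTREAM


for example in ("/api/score/123", "/api/tasks", "/index.html"):
    print(example, "->", resolve_upstream(example))
```

Sorting by prefix length keeps the result independent of the order in which routes are declared, so adding a more specific route later does not require reordering the table.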

	## Note

	This is a lightweight demo deployment on Hugging Face Spaces. The ML prediction services (layout, gauge, mask, semantic, text, OCR, brackets) are not included in this Space, so it showcases the review and editing workflow using pre-computed results.