Port Harbor Visualiser from Gradio to FastAPI + Hugging Face theme
- FastAPI backend (Docker Space) replacing the Gradio app
- Hugging Face themed SPA: dark slate + yellow/orange, 🤗 logo
- Browse Harbor-tagged HF datasets live (other=harbor), no stale cache
- Large datasets list via shallow Hub tree listing; per-task lazy fetch
(2k-task datasets list in ~2s instead of bulk-downloading the repo)
- Task master-detail view: collapsible task side-panel, in-place switching
- Per-task copy-able `harbor run` command
- Deep-link + dataset-card badge; real Space URL via $SPACE_HOST
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
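
The shallow-listing idea in the commit message can be illustrated with a minimal sketch. It assumes that a task is any top-level directory holding a `task.toml` and that the paths come from a flat repo file listing. The helper name `task_ids_from_paths` is hypothetical and is not taken from `viewer/hub.py`, whose contents are not shown here.

```python
from pathlib import PurePosixPath


def task_ids_from_paths(paths: list[str]) -> list[str]:
    """Return sorted task ids: top-level dirs that hold a task.toml.

    Hypothetical helper; assumes repo paths use "/" separators.
    """
    ids = set()
    for raw in paths:
        parts = PurePosixPath(raw).parts
        # Only "<task>/task.toml" counts; root-level and nested files are ignored.
        if len(parts) == 2 and parts[1] == "task.toml":
            ids.add(parts[0])
    return sorted(ids)


listing = [
    "README.md",
    "task-b/task.toml",
    "task-a/task.toml",
    "task-a/tests/test.sh",
    "notes/readme.txt",
]
print(task_ids_from_paths(listing))
```

Because only path strings are inspected, no file content has to be downloaded to build the task list; the files of a single task can be fetched later, when that task is opened.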
- Dockerfile +20 -0
- README.md +14 -14
- app.py +124 -547
- requirements.txt +2 -2
- static/app.js +515 -0
- static/index.html +35 -0
- static/style.css +285 -0
- viewer/__init__.py +2 -1
- viewer/hub.py +121 -0
- viewer/load.py +55 -4
Dockerfile
ADDED
@@ -0,0 +1,20 @@
+# Hugging Face Docker Space — FastAPI Harbor Visualiser.
+FROM python:3.11-slim
+
+# git: needed for gh:// dataset clones. (harbor CLI installs via pip for harbor://.)
+RUN apt-get update && apt-get install -y --no-install-recommends git \
+    && rm -rf /var/lib/apt/lists/*
+
+RUN useradd -m -u 1000 user
+USER user
+ENV PATH="/home/user/.local/bin:$PATH" \
+    HARBOR_VIEWER_CACHE=/tmp/.harbor-viewer-cache
+WORKDIR /app
+
+COPY --chown=user requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+COPY --chown=user . .
+
+# HF routes public traffic to app_port (7860, set in README.md frontmatter).
+CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md
CHANGED
@@ -1,21 +1,20 @@
 ---
-title: Harbor Visualiser
-emoji:
-colorFrom:
-colorTo:
-sdk:
-
-app_file: app.py
+title: Hugging Face Harbor Visualiser
+emoji: 🤗
+colorFrom: yellow
+colorTo: orange
+sdk: docker
+app_port: 7860
 pinned: false
 license: apache-2.0
-short_description: Browse Harbor task specs from HF, GitHub, or local
+short_description: Browse Harbor task specs from HF Hub, GitHub, or local
 ---
 
-# Harbor Visualiser
+# 🤗 Hugging Face Harbor Visualiser
 
-A
+A FastAPI Space for browsing [Harbor](https://www.harborframework.com/) task spec directories — the dataset format used by Harbor for agent evaluation + RL environments.
 
-Drop in a Hugging Face dataset id, a GitHub repo, or a local Harbor dataset directory; the viewer renders every task's metadata, instruction, oracle patch, test script, and Dockerfile side-by-side.
+Drop in a Hugging Face dataset id, a GitHub repo, or a local Harbor dataset directory; the viewer renders every task's metadata, instruction, oracle patch, test script, and Dockerfile side-by-side. Large datasets (2k+ tasks) list and open instantly — task ids come from a shallow Hub listing and only the opened task's files are fetched, so nothing is bulk-downloaded.
 
 ## Use it
 
@@ -41,7 +40,7 @@ https://huggingface.co/spaces/AdithyaSK/harbor-visualiser?dataset=<owner>/<datas
 
 ```bash
 pip install -r requirements.txt
-
+uvicorn app:app --port 7860
 # → http://127.0.0.1:7860
 ```
 
@@ -85,7 +84,8 @@ Either of these:
 
 ## Stack
 
-- [
--
+- [FastAPI](https://fastapi.tiangolo.com/) + [uvicorn](https://www.uvicorn.org/) — server
+- Vanilla-JS single-page UI (hash-routed) with a Hugging Face theme
+- [huggingface_hub](https://github.com/huggingface/huggingface_hub) — Hub listing + per-task download
 - `git` (system binary) — GitHub clone
 - Python stdlib `tomllib` — task.toml parsing
app.py
CHANGED
|
@@ -1,17 +1,20 @@
|
|
| 1 |
-
"""Harbor Visualiser —
|
| 2 |
|
| 3 |
Run locally:
|
| 4 |
pip install -r requirements.txt
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
Or deploy to a Hugging Face Space — the `README.md` frontmatter pins
|
| 8 |
-
`sdk: gradio` and `app_file: app.py`, so the Space picks this up directly.
|
| 9 |
|
| 10 |
-
|
| 11 |
-
https://<space>/?dataset=owner/name
|
| 12 |
-
https://<space>/?dataset=harbor://org/name@tag
|
| 13 |
-
https://<space>/?dataset=gh://owner/repo
|
| 14 |
-
https://<space>/?d=owner/name (short alias)
|
| 15 |
"""
|
| 16 |
|
| 17 |
from __future__ import annotations
|
|
@@ -19,567 +22,141 @@ from __future__ import annotations
|
|
| 19 |
import logging
|
| 20 |
from pathlib import Path
|
| 21 |
|
| 22 |
-
import
|
| 23 |
|
| 24 |
-
from viewer import
|
| 25 |
-
|
| 26 |
-
fetch_dataset,
|
| 27 |
-
list_tasks,
|
| 28 |
-
load_task,
|
| 29 |
-
parse_dataset_uri,
|
| 30 |
-
)
|
| 31 |
|
| 32 |
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(name)s %(message)s")
|
| 33 |
logger = logging.getLogger("harbor-visualiser")
|
| 34 |
|
| 35 |
|
| 36 |
-
|
| 37 |
-
# File-tree definition + helpers
|
| 38 |
-
# ---------------------------------------------------------------------------
|
| 39 |
-
|
| 40 |
-
# Virtual entry id for the metadata overview (not a real file).
|
| 41 |
-
_OVERVIEW = "__overview__"
|
| 42 |
-
|
| 43 |
-
# Folder pseudo-ids — selecting one routes to its first present child.
|
| 44 |
-
_FOLDER_PREFIX = "__folder__"
|
| 45 |
-
|
| 46 |
-
|
| 47 |
-
# Map suffix → Gradio Code language token. Everything else falls back to "shell".
|
| 48 |
-
_EXTENSION_LANGUAGE: dict[str, str] = {
|
| 49 |
-
".toml": "yaml", # Gradio Prism has no TOML; YAML is the closest fit
|
| 50 |
-
".diff": "python", # closest available; +/- lines render fine
|
| 51 |
-
".patch": "python",
|
| 52 |
-
".sh": "shell",
|
| 53 |
-
".bash": "shell",
|
| 54 |
-
".py": "python",
|
| 55 |
-
".json": "json",
|
| 56 |
-
".yaml": "yaml",
|
| 57 |
-
".yml": "yaml",
|
| 58 |
-
".md": "markdown",
|
| 59 |
-
".markdown": "markdown",
|
| 60 |
-
".txt": "shell",
|
| 61 |
-
".csv": "shell",
|
| 62 |
-
".tsv": "shell",
|
| 63 |
-
".ini": "yaml",
|
| 64 |
-
".cfg": "yaml",
|
| 65 |
-
".conf": "shell",
|
| 66 |
-
".html": "html",
|
| 67 |
-
".css": "css",
|
| 68 |
-
".js": "javascript",
|
| 69 |
-
".ts": "typescript",
|
| 70 |
-
}
|
| 71 |
-
|
| 72 |
-
|
| 73 |
-
def _file_language(filename: str) -> str:
|
| 74 |
-
"""Pick a Gradio Code `language=` token for a given filename."""
|
| 75 |
-
if filename.endswith("Dockerfile"):
|
| 76 |
-
return "dockerfile"
|
| 77 |
-
suffix = Path(filename).suffix.lower()
|
| 78 |
-
return _EXTENSION_LANGUAGE.get(suffix, "shell")
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
def _read_task_file(task: HarborTask, file_id: str) -> str | None:
|
| 82 |
-
"""Fetch the content for a file_id; None when the file isn't present.
|
| 83 |
-
|
| 84 |
-
task.toml and instruction.md have special handling (task.toml uses the
|
| 85 |
-
pre-captured raw text; instruction.md falls back to the inline `task.instruction`
|
| 86 |
-
field from task.toml when no instruction.md file is on disk). Everything else
|
| 87 |
-
is a direct lookup against the dict populated by walking the task dir.
|
| 88 |
-
"""
|
| 89 |
-
if file_id == "task.toml":
|
| 90 |
-
return task.task_toml_raw or None
|
| 91 |
-
if file_id == "instruction.md":
|
| 92 |
-
return task.files.get("instruction.md") or task.instruction_inline
|
| 93 |
-
return task.files.get(file_id)
|
| 94 |
-
|
| 95 |
-
|
| 96 |
-
def _build_file_tree(task: HarborTask) -> tuple[list[tuple[str, str]], dict[str, str]]:
|
| 97 |
-
"""Render a file-explorer-style choice list for a task.
|
| 98 |
-
|
| 99 |
-
Walks every file discovered under the task dir (via `task.files`) and groups
|
| 100 |
-
them by their first path segment. Order: Overview → top-level files
|
| 101 |
-
(task.toml, instruction.md, anything else) → folders alphabetically with
|
| 102 |
-
their children alphabetically. No hardcoded allowlist — what's on disk is
|
| 103 |
-
what gets shown.
|
| 104 |
-
|
| 105 |
-
Returns:
|
| 106 |
-
choices: list of (label, value) tuples for `gr.Radio`. Labels use unicode
|
| 107 |
-
tree glyphs (📂, ├─, └─) so the radio reads like a file tree.
|
| 108 |
-
folder_redirects: maps each folder pseudo-id to its first present child
|
| 109 |
-
file_id so clicking a folder header opens its first file.
|
| 110 |
-
"""
|
| 111 |
-
choices: list[tuple[str, str]] = [("ⓘ Overview", _OVERVIEW)]
|
| 112 |
-
redirects: dict[str, str] = {}
|
| 113 |
-
|
| 114 |
-
# Bucket every discovered file by top-level dir ("" = at task root)
|
| 115 |
-
top_level: list[str] = []
|
| 116 |
-
by_folder: dict[str, list[str]] = {}
|
| 117 |
-
for path in sorted(task.files):
|
| 118 |
-
if "/" in path:
|
| 119 |
-
folder = path.split("/", 1)[0]
|
| 120 |
-
by_folder.setdefault(folder, []).append(path)
|
| 121 |
-
else:
|
| 122 |
-
top_level.append(path)
|
| 123 |
-
|
| 124 |
-
# Top-level files: task.toml first, then instruction.md (with inline
|
| 125 |
-
# fallback), then anything else alphabetically. Order is presentational.
|
| 126 |
-
if (task.task_toml_raw or "").strip():
|
| 127 |
-
choices.append(("📄 task.toml", "task.toml"))
|
| 128 |
-
if (task.files.get("instruction.md") or task.instruction_inline):
|
| 129 |
-
choices.append(("📄 instruction.md", "instruction.md"))
|
| 130 |
-
for path in sorted(top_level):
|
| 131 |
-
if path in ("task.toml", "instruction.md"):
|
| 132 |
-
continue # already added
|
| 133 |
-
choices.append((f"📄 {path}", path))
|
| 134 |
-
|
| 135 |
-
# Folders alphabetically (environment / solution / tests / ...) with
|
| 136 |
-
# children alphabetical within each.
|
| 137 |
-
for folder in sorted(by_folder):
|
| 138 |
-
children = sorted(by_folder[folder])
|
| 139 |
-
if not children:
|
| 140 |
-
continue
|
| 141 |
-
folder_id = f"{_FOLDER_PREFIX}{folder}"
|
| 142 |
-
choices.append((f"📂 {folder}/", folder_id))
|
| 143 |
-
redirects[folder_id] = children[0] # folder header → first child
|
| 144 |
-
for i, full_id in enumerate(children):
|
| 145 |
-
# Show the path *inside* the folder (handles nested subdirs too)
|
| 146 |
-
basename = full_id[len(folder) + 1:] # strip "<folder>/"
|
| 147 |
-
glyph = "└─" if i == len(children) - 1 else "├─"
|
| 148 |
-
choices.append((f" {glyph} {basename}", full_id))
|
| 149 |
-
|
| 150 |
-
return choices, redirects
|
| 151 |
|
| 152 |
|
| 153 |
# ---------------------------------------------------------------------------
|
| 154 |
-
#
|
| 155 |
# ---------------------------------------------------------------------------
|
| 156 |
|
| 157 |
|
| 158 |
-
|
| 159 |
-
|
| 160 |
|
| 161 |
-
Outputs (in order):
|
| 162 |
-
status_md, source_state, root_state, all_tasks_state, folder_redirects_state,
|
| 163 |
-
task_search_value, task_radio_update,
|
| 164 |
-
file_radio_update, markdown_update, code_update
|
| 165 |
-
"""
|
| 166 |
-
if not uri or not uri.strip():
|
| 167 |
-
return _empty_state("Enter a dataset URI to begin.")
|
| 168 |
|
| 169 |
try:
|
| 170 |
source = parse_dataset_uri(uri)
|
| 171 |
except ValueError as exc:
|
| 172 |
-
|
| 173 |
-
|
| 174 |
try:
|
| 175 |
-
|
| 176 |
-
|
| 177 |
-
|
| 178 |
-
|
| 179 |
-
|
| 180 |
-
|
| 181 |
-
|
| 182 |
-
|
| 183 |
-
|
| 184 |
-
|
| 185 |
-
|
| 186 |
-
|
| 187 |
-
|
| 188 |
-
|
| 189 |
-
|
| 190 |
-
|
| 191 |
-
|
| 192 |
-
|
| 193 |
-
|
| 194 |
-
|
| 195 |
-
|
| 196 |
-
|
| 197 |
-
|
| 198 |
-
|
| 199 |
-
|
| 200 |
-
|
| 201 |
-
|
| 202 |
-
gr.update(choices=tasks, value=first, label=f"Tasks ({len(tasks)})"),
|
| 203 |
-
gr.update(choices=file_choices, value=_OVERVIEW, label="Files"),
|
| 204 |
-
md_html,
|
| 205 |
-
code_update,
|
| 206 |
-
)
|
| 207 |
-
|
| 208 |
-
|
| 209 |
-
def select_task_action(task_id: str, root: str):
|
| 210 |
-
"""Switch task → repopulate file tree, render the Overview."""
|
| 211 |
-
if not task_id or not root:
|
| 212 |
-
return (
|
| 213 |
-
{},
|
| 214 |
-
gr.update(choices=[], value=None, label="Files"),
|
| 215 |
-
"Pick a task from the list.",
|
| 216 |
-
gr.update(value="", visible=False),
|
| 217 |
-
)
|
| 218 |
try:
|
| 219 |
-
|
| 220 |
-
except
|
| 221 |
-
|
| 222 |
-
return (
|
| 223 |
-
{},
|
| 224 |
-
gr.update(choices=[], value=None, label="Files"),
|
| 225 |
-
f"❌ {exc}",
|
| 226 |
-
gr.update(value="", visible=False),
|
| 227 |
-
)
|
| 228 |
-
file_choices, redirects = _build_file_tree(task)
|
| 229 |
-
md_html, code_update = _render_file(task, _OVERVIEW)
|
| 230 |
-
return (
|
| 231 |
-
redirects,
|
| 232 |
-
gr.update(choices=file_choices, value=_OVERVIEW, label="Files"),
|
| 233 |
-
md_html,
|
| 234 |
-
code_update,
|
| 235 |
-
)
|
| 236 |
-
|
| 237 |
-
|
| 238 |
-
def select_file_action(file_id: str, root: str, task_id: str, folder_redirects: dict):
|
| 239 |
-
"""Switch file inside a task → render its content.
|
| 240 |
-
|
| 241 |
-
Folder pseudo-ids are routed to their first child via `folder_redirects`.
|
| 242 |
-
The file_tree radio's value is also updated so the user sees which file
|
| 243 |
-
was opened.
|
| 244 |
-
"""
|
| 245 |
-
if not file_id or not root or not task_id:
|
| 246 |
-
return ("Pick a task first.", gr.update(value="", visible=False), gr.update())
|
| 247 |
-
|
| 248 |
-
# Folder header click → redirect to first child + update radio selection
|
| 249 |
-
redirect = (folder_redirects or {}).get(file_id)
|
| 250 |
-
if redirect is not None:
|
| 251 |
-
file_id = redirect
|
| 252 |
-
radio_update = gr.update(value=file_id)
|
| 253 |
-
else:
|
| 254 |
-
radio_update = gr.update() # no-op for normal file clicks
|
| 255 |
-
|
| 256 |
try:
|
| 257 |
-
|
| 258 |
-
|
| 259 |
-
|
| 260 |
-
|
| 261 |
-
|
| 262 |
-
|
| 263 |
-
|
| 264 |
-
|
| 265 |
-
|
| 266 |
-
|
| 267 |
-
|
| 268 |
-
|
| 269 |
-
|
| 270 |
-
|
| 271 |
-
|
| 272 |
-
|
| 273 |
-
|
| 274 |
-
|
| 275 |
-
|
| 276 |
-
|
| 277 |
-
|
| 278 |
-
|
| 279 |
-
|
| 280 |
-
|
| 281 |
-
|
| 282 |
-
|
| 283 |
-
|
| 284 |
-
|
| 285 |
-
|
| 286 |
-
|
| 287 |
-
|
| 288 |
-
|
| 289 |
-
|
| 290 |
-
|
| 291 |
-
|
| 292 |
-
|
| 293 |
-
|
| 294 |
-
|
| 295 |
-
|
| 296 |
-
|
| 297 |
-
|
| 298 |
-
|
| 299 |
-
|
| 300 |
-
[], # all_tasks_state
|
| 301 |
-
{}, # folder_redirects_state
|
| 302 |
-
"", # task search clear
|
| 303 |
-
gr.update(choices=[], value=None, label="Tasks"),
|
| 304 |
-
gr.update(choices=[], value=None, label="Files"),
|
| 305 |
-
"Pick a task from the list once a dataset is loaded.",
|
| 306 |
-
gr.update(value="", visible=False),
|
| 307 |
-
)
|
| 308 |
-
|
| 309 |
-
|
| 310 |
-
def _render_file(task: HarborTask, file_id: str):
|
| 311 |
-
"""Return (markdown_html, code_update) for the right-panel content area.
|
| 312 |
-
|
| 313 |
-
Exactly ONE of the two panels is visible at a time:
|
| 314 |
-
- Overview + .md files → markdown panel
|
| 315 |
-
- everything else → code panel with language=auto
|
| 316 |
-
"""
|
| 317 |
-
if file_id == _OVERVIEW:
|
| 318 |
-
return (_overview_markdown(task), gr.update(value="", visible=False))
|
| 319 |
-
|
| 320 |
-
content = _read_task_file(task, file_id)
|
| 321 |
-
if content is None:
|
| 322 |
-
return (
|
| 323 |
-
f"_(no `{file_id}` in this task)_",
|
| 324 |
-
gr.update(value="", visible=False),
|
| 325 |
-
)
|
| 326 |
-
|
| 327 |
-
if file_id.endswith(".md"):
|
| 328 |
-
# Render Markdown for instruction.md (no code box)
|
| 329 |
-
return (content, gr.update(value="", visible=False))
|
| 330 |
-
|
| 331 |
-
lang = _file_language(file_id)
|
| 332 |
-
return (
|
| 333 |
-
"", # markdown empty
|
| 334 |
-
gr.update(value=content, language=lang, visible=True, label=file_id),
|
| 335 |
-
)
|
| 336 |
-
|
| 337 |
-
|
| 338 |
-
def _overview_markdown(task: HarborTask) -> str:
|
| 339 |
-
"""Render the task's metadata as a clean markdown table."""
|
| 340 |
-
rows: list[tuple[str, str]] = []
|
| 341 |
-
rows.append(("task id", f"`{task.id}`"))
|
| 342 |
-
if task.name:
|
| 343 |
-
rows.append(("name", f"`{task.name}`"))
|
| 344 |
-
if task.version:
|
| 345 |
-
rows.append(("spec version", task.version))
|
| 346 |
-
if task.description:
|
| 347 |
-
rows.append(("description", task.description))
|
| 348 |
-
if task.difficulty:
|
| 349 |
-
rows.append(("difficulty", task.difficulty))
|
| 350 |
-
if task.category:
|
| 351 |
-
rows.append(("category", task.category))
|
| 352 |
-
if task.keywords:
|
| 353 |
-
rows.append(("keywords", ", ".join(f"`{k}`" for k in task.keywords)))
|
| 354 |
-
if task.agent_timeout_sec is not None:
|
| 355 |
-
rows.append(("agent timeout", f"{task.agent_timeout_sec}s"))
|
| 356 |
-
if task.verifier_timeout_sec is not None:
|
| 357 |
-
rows.append(("verifier timeout", f"{task.verifier_timeout_sec}s"))
|
| 358 |
-
|
| 359 |
-
md = ["| Field | Value |", "|---|---|"]
|
| 360 |
-
for k, v in rows:
|
| 361 |
-
md.append(f"| **{k}** | {v} |")
|
| 362 |
-
|
| 363 |
-
if task.repo2env:
|
| 364 |
-
md.append("\n### `[metadata.repo2env]` extension (Repo2RLEnv)\n")
|
| 365 |
-
md.append("| Field | Value |")
|
| 366 |
-
md.append("|---|---|")
|
| 367 |
-
for k, v in sorted(task.repo2env.items()):
|
| 368 |
-
if isinstance(v, dict):
|
| 369 |
-
md.append(f"| **{k}** | _(nested — see below)_ |")
|
| 370 |
-
for kk, vv in sorted(v.items()):
|
| 371 |
-
md.append(f"| `{kk}` | `{_short(vv)}` |")
|
| 372 |
-
else:
|
| 373 |
-
md.append(f"| **{k}** | `{_short(v)}` |")
|
| 374 |
-
|
| 375 |
-
return "\n".join(md)
|
| 376 |
-
|
| 377 |
-
|
| 378 |
-
def _short(v) -> str:
|
| 379 |
-
"""Truncate long values for the metadata table cells."""
|
| 380 |
-
if isinstance(v, list):
|
| 381 |
-
return ", ".join(str(x) for x in v)
|
| 382 |
-
s = str(v)
|
| 383 |
-
return s if len(s) < 110 else s[:107] + "…"
|
| 384 |
|
| 385 |
|
| 386 |
# ---------------------------------------------------------------------------
|
| 387 |
-
# UI
|
| 388 |
# ---------------------------------------------------------------------------
|
| 389 |
|
| 390 |
-
|
| 391 |
-
|
| 392 |
-
|
| 393 |
-
Browse [Harbor](https://www.harborframework.com/) task spec datasets — Hugging Face, GitHub, Harbor registry, or local."""
|
| 394 |
-
|
| 395 |
-
|
| 396 |
-
_FOOTER_MD = """<sub>Built with [Gradio](https://www.gradio.app/) · Harbor framework [docs](https://www.harborframework.com/)</sub>"""
|
| 397 |
-
|
| 398 |
-
|
| 399 |
-
# A small set of popular / known-working datasets surfaced as one-click examples.
|
| 400 |
-
_EXAMPLES: list[tuple[str, str]] = [
|
| 401 |
-
("cookbook/test (Harbor)", "harbor://cookbook/test"),
|
| 402 |
-
("SWE-Atlas QnA (Harbor)", "harbor://scale-ai/swe-atlas-qna"),
|
| 403 |
-
("SWE-Bench Pro (Harbor)", "harbor://cais/swebenchpro"),
|
| 404 |
-
("Click PRs (HF / Repo2RLEnv)", "AdithyaSK/click-r2e-v082post1"),
|
| 405 |
-
("Click PRs (GitHub demo)", "https://github.com/adithya-s-k/harbor-tasks-demo"),
|
| 406 |
-
]
|
| 407 |
-
|
| 408 |
-
|
| 409 |
-
# Minimal monochrome aesthetic — file-explorer feel for both task list and file tree.
|
| 410 |
-
_CUSTOM_CSS = """
|
| 411 |
-
.gradio-container { font-family: ui-sans-serif, system-ui, -apple-system, sans-serif; }
|
| 412 |
-
h1, h2, h3 { font-weight: 600; }
|
| 413 |
-
|
| 414 |
-
button.primary { background: #111 !important; color: white !important; border: 1px solid #111 !important; }
|
| 415 |
-
button.primary:hover { background: #333 !important; }
|
| 416 |
-
|
| 417 |
-
/* Task list — scrollable + monospace + ellipsis on long IDs */
|
| 418 |
-
#task-list .wrap { max-height: 65vh; overflow-y: auto; padding-right: 4px; }
|
| 419 |
-
#task-list label,
|
| 420 |
-
#task-list label > span {
|
| 421 |
-
font-family: ui-monospace, Menlo, Consolas, monospace;
|
| 422 |
-
font-size: 11.5px;
|
| 423 |
-
font-weight: 500;
|
| 424 |
-
white-space: nowrap;
|
| 425 |
-
overflow: hidden;
|
| 426 |
-
text-overflow: ellipsis;
|
| 427 |
-
display: block;
|
| 428 |
-
max-width: 100%;
|
| 429 |
-
}
|
| 430 |
-
|
| 431 |
-
/* File tree — also monospace, slightly larger, preserve indent whitespace */
|
| 432 |
-
#file-tree .wrap { max-height: 55vh; overflow-y: auto; }
|
| 433 |
-
#file-tree label,
|
| 434 |
-
#file-tree label > span {
|
| 435 |
-
font-family: ui-monospace, Menlo, Consolas, monospace;
|
| 436 |
-
font-size: 12.5px;
|
| 437 |
-
white-space: pre;
|
| 438 |
-
}
|
| 439 |
-
|
| 440 |
-
#task-search input { font-family: ui-monospace, Menlo, Consolas, monospace; font-size: 12px; }
|
| 441 |
-
|
| 442 |
-
footer { display: none !important; }
|
| 443 |
-
"""
|
| 444 |
-
|
| 445 |
-
|
| 446 |
-
def build_ui() -> gr.Blocks:
|
| 447 |
-
with gr.Blocks(title="Harbor Visualiser") as demo:
|
| 448 |
-
gr.Markdown(_INTRO_MD)
|
| 449 |
-
|
| 450 |
-
with gr.Row():
|
| 451 |
-
uri_input = gr.Textbox(
|
| 452 |
-
label="Dataset",
|
| 453 |
-
placeholder="owner/name | gh://owner/repo | harbor://org/name | https://github.com/owner/repo",
|
| 454 |
-
lines=1,
|
| 455 |
-
scale=8,
|
| 456 |
-
)
|
| 457 |
-
load_btn = gr.Button("Load", variant="primary", scale=1, min_width=80)
|
| 458 |
-
|
| 459 |
-
# Quick-access popular examples
|
| 460 |
-
with gr.Row():
|
| 461 |
-
example_btns: list[gr.Button] = []
|
| 462 |
-
for label, _ in _EXAMPLES:
|
| 463 |
-
example_btns.append(gr.Button(label, size="sm", variant="secondary"))
|
| 464 |
-
|
| 465 |
-
status = gr.Markdown("Enter a dataset URI to begin.")
|
| 466 |
-
|
| 467 |
-
# Hidden state for the dispatch handlers
|
| 468 |
-
source_state = gr.State("")
|
| 469 |
-
root_state = gr.State("")
|
| 470 |
-
all_tasks_state = gr.State([]) # full unfiltered list for the search box
|
| 471 |
-
folder_redirects_state = gr.State({}) # folder pseudo-id → first child file_id
|
| 472 |
-
|
| 473 |
-
# ─── 3-column file-explorer layout ────────────────────────────────
|
| 474 |
-
with gr.Row():
|
| 475 |
-
# Column 1 — Tasks (scrollable + searchable)
|
| 476 |
-
with gr.Column(scale=2, min_width=240):
|
| 477 |
-
task_search = gr.Textbox(
|
| 478 |
-
label="Filter tasks",
|
| 479 |
-
placeholder="type to filter…",
|
| 480 |
-
lines=1,
|
| 481 |
-
elem_id="task-search",
|
| 482 |
-
)
|
| 483 |
-
task_list = gr.Radio(
|
| 484 |
-
choices=[],
|
| 485 |
-
label="Tasks",
|
| 486 |
-
value=None,
|
| 487 |
-
interactive=True,
|
| 488 |
-
elem_id="task-list",
|
| 489 |
-
)
|
| 490 |
-
# Column 2 — File tree for the selected task
|
| 491 |
-
with gr.Column(scale=2, min_width=200):
|
| 492 |
-
file_tree = gr.Radio(
|
| 493 |
-
choices=[],
|
| 494 |
-
label="Files",
|
| 495 |
-
value=None,
|
| 496 |
-
interactive=True,
|
| 497 |
-
elem_id="file-tree",
|
| 498 |
-
)
|
| 499 |
-
# Column 3 — content viewer (markdown OR code, mutually exclusive)
|
| 500 |
-
with gr.Column(scale=6):
|
| 501 |
-
content_md = gr.Markdown("Pick a task from the list once a dataset is loaded.")
|
| 502 |
-
content_code = gr.Code(
|
| 503 |
-
value="",
|
| 504 |
-
language="shell",
|
| 505 |
-
label="",
|
| 506 |
-
interactive=False,
|
| 507 |
-
visible=False,
|
| 508 |
-
)
|
| 509 |
-
|
| 510 |
-
gr.Markdown(_FOOTER_MD)
|
| 511 |
-
|
| 512 |
-
# --- Event wiring ---------------------------------------------------
|
| 513 |
-
|
| 514 |
-
load_outputs = [
|
| 515 |
-
status,
|
| 516 |
-
source_state,
|
| 517 |
-
root_state,
|
| 518 |
-
all_tasks_state,
|
| 519 |
-
folder_redirects_state,
|
| 520 |
-
task_search,
|
| 521 |
-
task_list,
|
| 522 |
-
file_tree,
|
| 523 |
-
content_md,
|
| 524 |
-
content_code,
|
| 525 |
-
]
|
| 526 |
-
|
| 527 |
-
load_btn.click(
|
| 528 |
-
fn=load_dataset_action,
|
| 529 |
-
inputs=[uri_input],
|
| 530 |
-
outputs=load_outputs,
|
| 531 |
-
)
|
| 532 |
-
uri_input.submit(
|
| 533 |
-
fn=load_dataset_action,
|
| 534 |
-
inputs=[uri_input],
|
| 535 |
-
outputs=load_outputs,
|
| 536 |
-
)
|
| 537 |
-
|
| 538 |
-
task_list.change(
|
| 539 |
-
fn=select_task_action,
|
| 540 |
-
inputs=[task_list, root_state],
|
| 541 |
-
outputs=[folder_redirects_state, file_tree, content_md, content_code],
|
| 542 |
-
)
|
| 543 |
-
|
| 544 |
-
file_tree.change(
|
| 545 |
-
fn=select_file_action,
|
| 546 |
-
inputs=[file_tree, root_state, task_list, folder_redirects_state],
|
| 547 |
-
outputs=[content_md, content_code, file_tree],
|
| 548 |
-
)
|
| 549 |
-
|
| 550 |
-
task_search.change(
|
| 551 |
-
fn=filter_tasks_action,
|
| 552 |
-
inputs=[task_search, all_tasks_state, root_state],
|
| 553 |
-
outputs=[task_list],
|
| 554 |
-
)
|
| 555 |
-
|
| 556 |
-
# Example buttons → set input + auto-load
|
| 557 |
-
for btn, (_, uri_value) in zip(example_btns, _EXAMPLES, strict=True):
|
| 558 |
-
btn.click(fn=lambda u=uri_value: u, outputs=uri_input).then(
|
| 559 |
-
fn=load_dataset_action,
|
| 560 |
-
inputs=[uri_input],
|
| 561 |
-
outputs=load_outputs,
|
| 562 |
-
)
|
| 563 |
-
|
| 564 |
-
# On page load: read ?dataset= → prefill → auto-load if non-empty
|
| 565 |
-
demo.load(fn=init_from_url, inputs=None, outputs=uri_input).then(
|
| 566 |
-
fn=lambda u: load_dataset_action(u) if u else _empty_state("Enter a dataset URI to begin."),
|
| 567 |
-
inputs=[uri_input],
|
| 568 |
-
outputs=load_outputs,
|
| 569 |
-
)
|
| 570 |
-
|
| 571 |
-
return demo
|
| 572 |
|
| 573 |
|
| 574 |
-
|
| 575 |
-
theme = gr.themes.Monochrome(
|
| 576 |
-
radius_size=gr.themes.sizes.radius_sm,
|
| 577 |
-
spacing_size=gr.themes.sizes.spacing_md,
|
| 578 |
-
text_size=gr.themes.sizes.text_md,
|
| 579 |
-
)
|
| 580 |
-
demo = build_ui()
|
| 581 |
-
demo.queue(default_concurrency_limit=4).launch(
|
| 582 |
-
server_name="0.0.0.0",
|
| 583 |
-
theme=theme,
|
| 584 |
-
css=_CUSTOM_CSS,
|
| 585 |
-
)
@@ -1,17 +1,20 @@
+"""Harbor Visualiser — FastAPI backend + Harbor Hub UI.
+
+Serves a single-page "Harbor Hub" themed UI (static/) plus a JSON API that
+reuses the existing loader/parser:
+
+    GET /                     → the SPA (static/index.html)
+    GET /api/hub/datasets     → live list of Harbor-tagged HF datasets
+    GET /api/hub/count?id=    → task count for one Hub dataset (memoised)
+    GET /api/dataset?uri=     → fetch a dataset, return its task ids + meta
+    GET /api/task?uri=&task=  → one task's parsed spec (files + metadata)
+    GET /healthz
 
 Run locally:
     pip install -r requirements.txt
+    uvicorn app:app --reload --port 7860 # → http://127.0.0.1:7860
 
+On a Hugging Face Docker Space it runs via the Dockerfile (uvicorn :7860).
 """
 
 from __future__ import annotations
@@ -19,567 +22,141 @@ from __future__ import annotations
 import logging
 from pathlib import Path
 
+from fastapi import FastAPI, HTTPException, Query
+from fastapi.responses import FileResponse, JSONResponse
+from fastapi.staticfiles import StaticFiles
 
+from viewer import fetch_dataset, fetch_hf_task, list_tasks, load_task, parse_dataset_uri
+from viewer.hub import count_tasks, list_harbor_datasets, list_hf_tasks
 
 logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(name)s %(message)s")
 logger = logging.getLogger("harbor-visualiser")
 
+HERE = Path(__file__).resolve().parent
+STATIC = HERE / "static"
 
+app = FastAPI(title="Harbor Visualiser", docs_url="/api/docs")
 
 
 # ---------------------------------------------------------------------------
+# API
 # ---------------------------------------------------------------------------
 
+@app.get("/api/hub/datasets")
+def api_hub_datasets(
+    q: str | None = Query(None, description="substring filter on dataset id"),
+    sort: str = Query("downloads"),
+    limit: int = Query(500, ge=1, le=2000),
+) -> JSONResponse:
+    """Live list of Harbor-tagged datasets on the HF Hub (no stale cache)."""
+    try:
+        ds = list_harbor_datasets(query=q, sort=sort, limit=limit)
+    except Exception as exc:  # noqa: BLE001
+        raise HTTPException(502, f"HF Hub listing failed: {exc}") from exc
+    return JSONResponse({"datasets": [d.as_dict() for d in ds], "count": len(ds)})
+
 
+@app.get("/api/hub/count")
+def api_hub_count(id: str = Query(..., description="dataset id, e.g. owner/name")) -> JSONResponse:
+    """Task count for a single Hub dataset (one cheap list_repo_files call)."""
+    return JSONResponse({"id": id, "tasks": count_tasks(id)})
 
 
+@app.get("/api/dataset")
+def api_dataset(
+    uri: str = Query(..., description="owner/name | hf:// | gh:// | harbor:// | local path"),
+    refresh: int = Query(0, description="1 = force re-fetch (bypass cache)"),
+) -> JSONResponse:
+    """Fetch a dataset and return its task ids + source metadata."""
     try:
         source = parse_dataset_uri(uri)
     except ValueError as exc:
+        raise HTTPException(400, str(exc)) from exc
     try:
+        if source.kind == "hf":
+            # List task ids via the Hub API — no download. Critical for large
+            # datasets (2k+ tasks) which would otherwise snapshot the whole repo.
+            tasks = list_hf_tasks(source.ident, source.revision)
+        else:
+            root = fetch_dataset(source, force=bool(refresh))
+            tasks = list_tasks(root)
+    except Exception as exc:  # noqa: BLE001
+        raise HTTPException(502, f"fetch failed: {exc}") from exc
+    return JSONResponse({
+        "uri": uri,
+        "display": source.display,
+        "kind": source.kind,
+        "ident": source.ident,
+        "revision": source.revision,
+        "tasks": tasks,
+        "count": len(tasks),
+    })
+
+
+@app.get("/api/task")
+def api_task(
+    uri: str = Query(...),
+    task: str = Query(..., description="task id (directory name)"),
+    refresh: int = Query(0),
+) -> JSONResponse:
+    """Return one task's full parsed spec — metadata + every file."""
     try:
+        source = parse_dataset_uri(uri)
+    except ValueError as exc:
+        raise HTTPException(400, str(exc)) from exc
     try:
+        if source.kind == "hf":
+            # Pull just this one task's files, not the entire dataset.
+            root = fetch_hf_task(source, task, force=bool(refresh))
+        else:
+            root = fetch_dataset(source, force=bool(refresh))
+        t = load_task(root, task)
+    except FileNotFoundError as exc:
+        raise HTTPException(404, str(exc)) from exc
+    except Exception as exc:  # noqa: BLE001
+        raise HTTPException(502, f"load failed: {exc}") from exc
+    return JSONResponse({
+        "id": t.id,
+        "name": t.name,
+        "org": t.org,
+        "version": t.version,
+        "description": t.description,
+        "instruction_inline": t.instruction_inline,
+        "difficulty": t.difficulty,
+        "category": t.category,
+        "keywords": t.keywords,
+        "agent_timeout_sec": t.agent_timeout_sec,
+        "verifier_timeout_sec": t.verifier_timeout_sec,
+        "repo2env": t.repo2env,
+        "task_toml_raw": t.task_toml_raw,
+        "files": t.files,
+    })
+
+
+@app.get("/api/config")
+def api_config() -> JSONResponse:
+    """Runtime config for the UI. On a Hugging Face Space, $SPACE_HOST is the
+    public app host (e.g. owner-name.hf.space) — we surface it so the deep-link
+    / badge examples show the real Space URL instead of localhost."""
+    import os
+    return JSONResponse({
+        "space_host": os.environ.get("SPACE_HOST") or None,
+        "space_id": os.environ.get("SPACE_ID") or None,
+    })
+
+
+@app.get("/healthz")
+def healthz() -> dict:
+    return {"ok": True}
|
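The `/api/dataset` and `/api/task` handlers above turn failures into HTTP statuses in two stages: URI parsing, then fetching or loading. The sketch below restates that mapping as a plain function so it can be checked without a running server. The name `status_for` and its `stage` / `endpoint` parameters are hypothetical and exist only for this illustration; the 500 case relies on FastAPI's default handling of exceptions that no handler catches.

```python
def status_for(exc: Exception, *, stage: str, endpoint: str) -> int:
    """Return the HTTP status the handlers in this diff would produce.

    stage    -- "parse" (parse_dataset_uri) or "load" (fetch/list/load calls)
    endpoint -- "/api/dataset" or "/api/task"
    """
    if stage == "parse":
        # Only ValueError is caught around parse_dataset_uri; anything else
        # propagates and FastAPI answers with a generic 500.
        return 400 if isinstance(exc, ValueError) else 500
    # Load stage: only /api/task has a dedicated FileNotFoundError branch.
    if endpoint == "/api/task" and isinstance(exc, FileNotFoundError):
        return 404
    # Every other exception is reported as an upstream failure.
    return 502
```

Note that a missing dataset path on `/api/dataset` therefore surfaces as 502 rather than 404, because that handler has no `FileNotFoundError` branch.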
| 151 |
|
| 152 |
|
| 153 |
# ---------------------------------------------------------------------------
|
| 154 |
+
# UI (static SPA)
|
| 155 |
# ---------------------------------------------------------------------------
|
| 156 |
|
| 157 |
+
@app.get("/")
|
| 158 |
+
def index() -> FileResponse:
|
| 159 |
+
return FileResponse(STATIC / "index.html", media_type="text/html")
|
| 160 |
|
| 161 |
|
| 162 |
+
app.mount("/static", StaticFiles(directory=str(STATIC)), name="static")
|
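For anyone scripting against the routes added in `app.py`, the query strings can be built with the standard library alone. This is a minimal sketch, not part of the commit: `api_url` is a hypothetical helper, and the base URL assumes the container's port 7860 from the Dockerfile `CMD`. It drops empty parameters and percent-encodes values, which is roughly what the front-end's `qs()` helper does with `encodeURIComponent`.

```python
from urllib.parse import quote, urlencode


def api_url(base: str, path: str, **params) -> str:
    """Build a URL for one of the API routes, skipping None/empty params."""
    kept = {k: v for k, v in params.items() if v is not None and v != ""}
    # quote_via=quote encodes spaces as %20 and, with urlencode's default
    # safe="", also encodes "/" and ":" inside values.
    query = urlencode(kept, quote_via=quote)
    return f"{base.rstrip('/')}{path}" + (f"?{query}" if query else "")


# Example: the task endpoint for one task of an hf:// dataset.
print(api_url("http://localhost:7860", "/api/task",
              uri="hf://owner/name", task="task 1", refresh=None))
```

Passing `refresh=1` adds `&refresh=1`, which the handlers read as "bypass the cache" for non-Hub sources.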
requirements.txt
CHANGED
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
|
|
|
|
| 2 |
huggingface_hub>=0.27.0
|
| 3 |
harbor>=0.6.0
|
| 4 |
-
|
|
|
|
| 1 |
+
fastapi>=0.115
|
| 2 |
+
uvicorn[standard]>=0.30
|
| 3 |
huggingface_hub>=0.27.0
|
| 4 |
harbor>=0.6.0
|
|
|
static/app.js
ADDED
|
@@ -0,0 +1,515 @@
|
| 1 |
+
/* Harbor Hub — SPA frontend. Vanilla JS, hash-routed, talks to the FastAPI API. */
|
| 2 |
+
'use strict';
|
| 3 |
+
|
| 4 |
+
const APP = document.getElementById('app');
|
| 5 |
+
|
| 6 |
+
/* ── tiny helpers ─────────────────────────────────── */
|
| 7 |
+
const esc = (s) => String(s == null ? '' : s).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/"/g, '&quot;');
|
| 8 |
+
const fmtNum = (n) => (n == null || n < 0) ? '—' : n.toLocaleString();
|
| 9 |
+
const enc = encodeURIComponent;
|
| 10 |
+
const qs = (o) => Object.entries(o).filter(([, v]) => v != null && v !== '').map(([k, v]) => `${k}=${enc(v)}`).join('&');
|
| 11 |
+
|
| 12 |
+
async function api(path) {
|
| 13 |
+
const r = await fetch(path);
|
| 14 |
+
if (!r.ok) {
|
| 15 |
+
let msg = `${r.status}`;
|
| 16 |
+
try { msg = (await r.json()).detail || msg; } catch {}
|
| 17 |
+
throw new Error(msg);
|
| 18 |
+
}
|
| 19 |
+
return r.json();
|
| 20 |
+
}
|
| 21 |
+
|
| 22 |
+
const ICON = {
|
| 23 |
+
copy: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><rect x="9" y="9" width="11" height="11" rx="2"/><path d="M5 15V5a2 2 0 0 1 2-2h10"/></svg>',
|
| 24 |
+
check: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5"><path d="M20 6L9 17l-5-5"/></svg>',
|
| 25 |
+
search: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><circle cx="11" cy="11" r="7"/><path d="M21 21l-4-4"/></svg>',
|
| 26 |
+
file: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M14 2H6a2 2 0 0 0-2 2v16a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V8z"/><path d="M14 2v6h6"/></svg>',
|
| 27 |
+
dir: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M3 7a2 2 0 0 1 2-2h4l2 3h8a2 2 0 0 1 2 2v7a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2z"/></svg>',
|
| 28 |
+
info: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><circle cx="12" cy="12" r="9"/><path d="M12 16v-4M12 8h.01"/></svg>',
|
| 29 |
+
back: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M15 18l-6-6 6-6"/></svg>',
|
| 30 |
+
next: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M9 18l6-6-6-6"/></svg>',
|
| 31 |
+
term: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M4 17l6-6-6-6M12 19h8"/></svg>',
|
| 32 |
+
panel: '<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><rect x="3" y="4" width="18" height="16" rx="2"/><path d="M9 4v16"/></svg>',
|
| 33 |
+
};
|
| 34 |
+
|
| 35 |
+
function copyButton(text, cls = 'copy') {
|
| 36 |
+
const b = document.createElement('button');
|
| 37 |
+
b.className = cls; b.innerHTML = ICON.copy; b.title = 'Copy';
|
| 38 |
+
b.onclick = (e) => {
|
| 39 |
+
e.stopPropagation(); e.preventDefault();
|
| 40 |
+
navigator.clipboard.writeText(text).then(() => {
|
| 41 |
+
b.innerHTML = ICON.check; b.classList.add('copied');
|
| 42 |
+
setTimeout(() => { b.innerHTML = ICON.copy; b.classList.remove('copied'); }, 1100);
|
| 43 |
+
});
|
| 44 |
+
};
|
| 45 |
+
return b;
|
| 46 |
+
}
|
| 47 |
+
|
| 48 |
+
/* ── theme ────────────────────────────────────────── */
|
| 49 |
+
function applyTheme(mode) {
|
| 50 |
+
const sys = window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
|
| 51 |
+
document.documentElement.setAttribute('data-theme', mode === 'system' ? sys : mode);
|
| 52 |
+
document.querySelectorAll('#theme-toggle button').forEach(b =>
|
| 53 |
+
b.classList.toggle('active', b.dataset.mode === mode));
|
| 54 |
+
}
|
| 55 |
+
(function initTheme() {
|
| 56 |
+
let mode = localStorage.getItem('hh-theme') || 'dark';
|
| 57 |
+
applyTheme(mode);
|
| 58 |
+
document.getElementById('theme-toggle').addEventListener('click', (e) => {
|
| 59 |
+
const b = e.target.closest('button'); if (!b) return;
|
| 60 |
+
mode = b.dataset.mode; localStorage.setItem('hh-theme', mode); applyTheme(mode);
|
| 61 |
+
});
|
| 62 |
+
window.matchMedia('(prefers-color-scheme: dark)').addEventListener('change', () => {
|
| 63 |
+
if ((localStorage.getItem('hh-theme') || 'dark') === 'system') applyTheme('system');
|
| 64 |
+
});
|
| 65 |
+
})();
|
| 66 |
+
|
| 67 |
+
/* ── data row with lazy task count ────────────────── */
|
| 68 |
+
function datasetRow(id, count) {
|
| 69 |
+
const row = document.createElement('div');
|
| 70 |
+
row.className = 'row';
|
| 71 |
+
row.onclick = () => { location.hash = `dataset?uri=${enc(id)}`; };
|
| 72 |
+
const name = document.createElement('span'); name.className = 'name'; name.textContent = id;
|
| 73 |
+
row.appendChild(name);
|
| 74 |
+
row.appendChild(copyButton(id));
|
| 75 |
+
const t = document.createElement('span'); t.className = 'tasks';
|
| 76 |
+
if (count == null) { t.innerHTML = '<span class="spin">···</span>'; t.dataset.lazy = id; }
|
| 77 |
+
else t.textContent = fmtNum(count);
|
| 78 |
+
row.appendChild(t);
|
| 79 |
+
return row;
|
| 80 |
+
}
|
| 81 |
+
|
| 82 |
+
// Fill in lazy counts for visible rows, throttled.
|
| 83 |
+
async function fillCounts(container) {
|
| 84 |
+
const pending = [...container.querySelectorAll('.tasks[data-lazy]')];
|
| 85 |
+
let i = 0;
|
| 86 |
+
const worker = async () => {
|
| 87 |
+
while (i < pending.length) {
|
| 88 |
+
const cell = pending[i++]; const id = cell.dataset.lazy; delete cell.dataset.lazy;
|
| 89 |
+
try { const r = await api(`/api/hub/count?id=${enc(id)}`); cell.textContent = fmtNum(r.tasks); }
|
| 90 |
+
catch { cell.textContent = '—'; }
|
| 91 |
+
}
|
| 92 |
+
};
|
| 93 |
+
await Promise.all([worker(), worker(), worker(), worker()]); // 4 in parallel
|
| 94 |
+
}
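`fillCounts` above throttles the per-dataset count requests by starting four workers that pull from one shared cursor until the list is exhausted. The same pattern can be sketched in Python with `asyncio`, which may help when reasoning about it outside the browser. This is an illustrative port, not code from the commit: `fill_counts` and the injected `fetch_count` coroutine are hypothetical names, and a failed lookup is recorded as `None` where the JS writes a dash.

```python
import asyncio


async def fill_counts(ids, fetch_count, workers=4):
    """Resolve a count for every id with at most `workers` requests in flight."""
    results = {}
    pending = iter(ids)  # shared cursor: each id is claimed by exactly one worker

    async def worker():
        for item in pending:
            try:
                results[item] = await fetch_count(item)
            except Exception:
                results[item] = None  # mirror the UI's placeholder on failure

    await asyncio.gather(*(worker() for _ in range(workers)))
    return results
```

Because all workers run on one event loop, advancing the shared iterator never races; a worker only yields control at the `await`.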
|
| 95 |
+
|
| 96 |
+
/* ── routes ───────────────────────────────────────── */
|
| 97 |
+
function setActiveNav(name) {
|
| 98 |
+
document.querySelectorAll('.nav .links a').forEach(a => a.classList.toggle('active', a.dataset.nav === name));
|
| 99 |
+
}
|
| 100 |
+
|
| 101 |
+
async function renderHome() {
|
| 102 |
+
setActiveNav('home');
|
| 103 |
+
// Resolve the public base URL: on a HF Space this is the real .hf.space host,
|
| 104 |
+
// so deep-link / badge examples don't show localhost.
|
| 105 |
+
let origin = location.origin;
|
| 106 |
+
try { const cfg = await api('/api/config'); if (cfg.space_host) origin = `https://${cfg.space_host}`; } catch {}
|
| 107 |
+
const badgeUrl = 'https://img.shields.io/badge/%F0%9F%A4%97%20Harbor%20Visualiser-View%20Tasks-ffd21e';
|
| 108 |
+
const deepLink = `${origin}/?dataset=YOUR_DATASET_ID`;
|
| 109 |
+
const badgeMd = `[](${deepLink})`;
|
| 110 |
+
APP.innerHTML = `
|
| 111 |
+
<div class="hero">
|
| 112 |
+
<div class="mark">🤗</div>
|
| 113 |
+
<h1><span class="hf">Hugging Face</span> Harbor Visualiser</h1>
|
| 114 |
+
<p>Visualise <a href="https://www.harborframework.com" target="_blank" rel="noopener" class="hl">Harbor ↗</a> task-spec datasets <strong style="color:var(--text)">straight from the Hugging Face Hub</strong> — metadata, instructions, oracle patches, tests & Dockerfiles. Also works with GitHub repos and local paths. No bulk download, always the latest.</p>
|
| 115 |
+
</div>
|
| 116 |
+
<div class="search" id="load-box">
|
| 117 |
+
${ICON.search}
|
| 118 |
+
<input id="load-input" placeholder="Load any dataset — owner/name · hf:// · gh://owner/repo · harbor://org/name · /local/path" />
|
| 119 |
+
<span class="kbd">↵</span>
|
| 120 |
+
</div>
|
| 121 |
+
<div style="display:flex;align-items:center;justify-content:space-between;margin:26px 0 12px">
|
| 122 |
+
<h2 style="margin:0">Harbor datasets on the Hub</h2>
|
| 123 |
+
<span class="faint" id="hub-status">loading…</span>
|
| 124 |
+
</div>
|
| 125 |
+
<div class="card" id="hub-table">
|
| 126 |
+
<div class="thead"><span>Dataset</span><span class="col-tasks">Tasks</span></div>
|
| 127 |
+
<div class="loading"><span class="spinner"></span>fetching from huggingface.co/datasets?other=harbor</div>
|
| 128 |
+
</div>
|
| 129 |
+
<div class="center"><a class="btn" href="#/datasets">View all datasets →</a></div>
|
| 130 |
+
|
| 131 |
+
<div class="howto">
|
| 132 |
+
<h2>Link your dataset to the visualiser</h2>
|
| 133 |
+
<div class="steps">
|
| 134 |
+
<div class="step">
|
| 135 |
+
<h3>Deep-link any dataset</h3>
|
| 136 |
+
<p>Append <code>?dataset=&lt;owner&gt;/&lt;name&gt;</code> to open straight into a dataset's tasks — handy from a dataset card or docs.</p>
|
| 137 |
+
<div class="snippet"><code id="snip-link">${esc(origin)}/?dataset=&lt;owner&gt;/&lt;name&gt;</code><span id="copy-link"></span></div>
|
| 138 |
+
</div>
|
| 139 |
+
<div class="step">
|
| 140 |
+
<h3>Add a badge to your dataset card</h3>
|
| 141 |
+
<p>Paste this Markdown into your dataset README so a 🤗 badge always links here:</p>
|
| 142 |
+
<span class="badge-preview"><span class="l">🤗 Harbor Visualiser</span><span class="r">View Tasks</span></span>
|
| 143 |
+
<div class="snippet"><code id="snip-badge">${esc(badgeMd)}</code><span id="copy-badge"></span></div>
|
| 144 |
+
</div>
|
| 145 |
+
</div>
|
| 146 |
+
</div>
|
| 147 |
+
|
| 148 |
+
<div class="footer">
|
| 149 |
+
A read-only visualiser for <a href="https://www.harborframework.com" target="_blank" rel="noopener" class="hl">Harbor</a>
|
| 150 |
+
task-spec datasets — the format used by Harbor for agent evaluation & RL environments.
|
| 151 |
+
Runs on Hugging Face Spaces · not affiliated with the Harbor project.
|
| 152 |
+
</div>
|
| 153 |
+
`;
|
| 154 |
+
document.getElementById('copy-link').appendChild(copyButton(`${origin}/?dataset=<owner>/<name>`));
|
| 155 |
+
document.getElementById('copy-badge').appendChild(copyButton(badgeMd));
|
| 156 |
+
const input = document.getElementById('load-input');
|
| 157 |
+
input.addEventListener('keydown', (e) => {
|
| 158 |
+
if (e.key === 'Enter' && input.value.trim()) location.hash = `dataset?uri=${enc(input.value.trim())}`;
|
| 159 |
+
});
|
| 160 |
+
|
| 161 |
+
try {
|
| 162 |
+
const { datasets } = await api('/api/hub/datasets?sort=downloads&limit=12');
|
| 163 |
+
const card = document.getElementById('hub-table');
|
| 164 |
+
card.innerHTML = '<div class="thead"><span>Dataset</span><span class="col-tasks">Tasks</span></div>';
|
| 165 |
+
datasets.slice(0, 8).forEach(d => card.appendChild(datasetRow(d.id, null)));
|
| 166 |
+
document.getElementById('hub-status').textContent = `${datasets.length}+ datasets`;
|
| 167 |
+
fillCounts(card);
|
| 168 |
+
} catch (e) {
|
| 169 |
+
document.getElementById('hub-table').innerHTML = `<div class="errbox">Couldn't reach the Hub: ${esc(e.message)}</div>`;
|
| 170 |
+
document.getElementById('hub-status').textContent = '';
|
| 171 |
+
}
|
| 172 |
+
}
|
| 173 |
+
|
| 174 |
+
let _hubCache = null;
|
| 175 |
+
async function renderDatasets(params) {
|
| 176 |
+
setActiveNav('datasets');
|
| 177 |
+
const sort = params.get('sort') || 'downloads';
|
| 178 |
+
APP.innerHTML = `
|
| 179 |
+
<div class="page">
|
| 180 |
+
<h1>Datasets</h1>
|
| 181 |
+
<p class="muted" style="margin:-8px 0 20px;font-size:13.5px">Search across every <strong style="color:var(--text)">Harbor-tagged dataset on the Hugging Face Hub</strong> — the live <code style="background:var(--panel-2);padding:1px 6px;border-radius:4px">other=harbor</code> filter.</p>
|
| 182 |
+
<div class="search">
|
| 183 |
+
${ICON.search}
|
| 184 |
+
<input id="ds-search" placeholder="Search Harbor datasets on the Hub…" autofocus />
|
| 185 |
+
<select id="ds-sort">
|
| 186 |
+
<option value="downloads">Most downloads</option>
|
| 187 |
+
<option value="likes">Most likes</option>
|
| 188 |
+
<option value="lastModified">Recently updated</option>
|
| 189 |
+
</select>
|
| 190 |
+
<span class="kbd">⌘K</span>
|
| 191 |
+
</div>
|
| 192 |
+
<div class="card" id="ds-table"><div class="loading"><span class="spinner"></span>loading…</div></div>
|
| 193 |
+
<div class="hint" style="margin-top:18px">
|
| 194 |
+
<span class="ic">${ICON.info}</span>
|
| 195 |
+
<span><strong style="color:var(--text)">Want your dataset to show up here?</strong> Add the <code>harbor</code> tag to your dataset card's metadata (<code>tags: [harbor]</code> in the README front-matter) and it'll appear in this list automatically.</span>
|
| 196 |
+
</div>
|
| 197 |
+
</div>`;
|
| 198 |
+
const tbl = document.getElementById('ds-table');
|
| 199 |
+
const search = document.getElementById('ds-search');
|
| 200 |
+
const sortSel = document.getElementById('ds-sort'); sortSel.value = sort;
|
| 201 |
+
|
| 202 |
+
async function load() {
|
| 203 |
+
tbl.innerHTML = '<div class="loading"><span class="spinner"></span>loading…</div>';
|
| 204 |
+
try {
|
| 205 |
+
const { datasets } = await api(`/api/hub/datasets?${qs({ sort: sortSel.value, limit: 1000 })}`);
|
| 206 |
+
_hubCache = datasets; draw(datasets);
|
| 207 |
+
} catch (e) { tbl.innerHTML = `<div class="errbox">${esc(e.message)}</div>`; }
|
| 208 |
+
}
|
| 209 |
+
function draw(list) {
|
| 210 |
+
tbl.innerHTML = '<div class="thead"><span>Dataset</span><span class="col-tasks">Tasks</span></div>';
|
| 211 |
+
if (!list.length) { tbl.innerHTML += '<div class="empty">no matching datasets</div>'; return; }
|
| 212 |
+
list.slice(0, 300).forEach(d => tbl.appendChild(datasetRow(d.id, null)));
|
| 213 |
+
if (list.length > 300) tbl.innerHTML += `<div class="empty">showing 300 of ${list.length} — refine your search</div>`;
|
| 214 |
+
fillCounts(tbl);
|
| 215 |
+
}
|
| 216 |
+
let t;
|
| 217 |
+
search.addEventListener('input', () => {
|
| 218 |
+
clearTimeout(t);
|
| 219 |
+
t = setTimeout(() => {
|
| 220 |
+
const q = search.value.trim().toLowerCase();
|
| 221 |
+
draw(q ? _hubCache.filter(d => d.id.toLowerCase().includes(q)) : _hubCache);
|
| 222 |
+
}, 120);
|
| 223 |
+
});
|
| 224 |
+
sortSel.addEventListener('change', load);
|
| 225 |
+
await load();
|
| 226 |
+
}
|
| 227 |
+
|
| 228 |
+
async function renderDataset(params) {
|
| 229 |
+
setActiveNav(null);
|
| 230 |
+
const uri = params.get('uri');
|
| 231 |
+
APP.innerHTML = `
|
| 232 |
+
<div class="page">
|
| 233 |
+
<div class="crumb"><a href="#/datasets">Datasets</a><span class="sep">/</span><span>${esc(uri)}</span></div>
|
| 234 |
+
<div class="loading"><span class="spinner"></span>fetching <b>${esc(uri)}</b> …
|
| 235 |
+
<span class="sub">Loading the Harbor spec — this can take a few seconds to a minute for large datasets (the more tasks, the longer the listing).</span>
|
| 236 |
+
</div>
|
| 237 |
+
</div>`;
|
| 238 |
+
let data;
|
| 239 |
+
try { data = await api(`/api/dataset?uri=${enc(uri)}`); }
|
| 240 |
+
catch (e) { APP.querySelector('.page').innerHTML = `<div class="crumb"><a href="#/datasets">Datasets</a></div><div class="errbox">Failed to load <b>${esc(uri)}</b>: ${esc(e.message)}</div>`; return; }
|
| 241 |
+
|
| 242 |
+
const page = APP.querySelector('.page');
|
| 243 |
+
page.innerHTML = `
|
| 244 |
+
<div class="crumb"><a href="#/datasets">Datasets</a><span class="sep">/</span><span>${esc(data.display)}</span>
|
| 245 |
+
<span class="pill">${data.count} tasks</span>
|
| 246 |
+
<button class="btn" id="refresh" style="margin-left:auto;padding:5px 11px;font-size:12px">↻ refresh</button>
|
| 247 |
+
</div>
|
| 248 |
+
<div class="search"><span style="color:var(--faint)">${ICON.search}</span>
|
| 249 |
+
<input id="task-search" placeholder="Search ${data.count} tasks…" autofocus /></div>
|
| 250 |
+
<div class="card tasklist" id="tasks"></div>`;
|
| 251 |
+
const tasksCard = document.getElementById('tasks');
|
| 252 |
+
const tsearch = document.getElementById('task-search');
|
| 253 |
+
function draw(list) {
|
| 254 |
+
tasksCard.innerHTML = '<div class="thead"><span>Task</span></div>';
|
| 255 |
+
if (!list.length) { tasksCard.innerHTML += '<div class="empty">no matching tasks</div>'; return; }
|
| 256 |
+
list.slice(0, 500).forEach(tid => {
|
| 257 |
+
const row = document.createElement('div'); row.className = 'row';
|
| 258 |
+
row.onclick = () => { location.hash = `task?${qs({ uri, task: tid })}`; };
|
| 259 |
+
row.innerHTML = `<span class="name">${esc(tid)}</span>`;
|
| 260 |
+
row.appendChild(copyButton(tid));
|
| 261 |
+
tasksCard.appendChild(row);
|
| 262 |
+
});
|
| 263 |
+
if (list.length > 500) tasksCard.innerHTML += `<div class="empty">showing 500 of ${list.length} — refine your search</div>`;
|
| 264 |
+
}
|
| 265 |
+
draw(data.tasks);
|
| 266 |
+
let t;
|
| 267 |
+
tsearch.addEventListener('input', () => {
|
| 268 |
+
clearTimeout(t);
|
| 269 |
+
t = setTimeout(() => {
|
| 270 |
+
const q = tsearch.value.trim().toLowerCase();
|
| 271 |
+
draw(q ? data.tasks.filter(x => x.toLowerCase().includes(q)) : data.tasks);
|
| 272 |
+
}, 100);
|
| 273 |
+
});
|
| 274 |
+
document.getElementById('refresh').onclick = async () => {
|
| 275 |
+
page.querySelector('.crumb').insertAdjacentHTML('beforeend', ' <span class="faint">refreshing…</span>');
|
| 276 |
+
try { const fresh = await api(`/api/dataset?${qs({ uri, refresh: 1 })}`); data.tasks = fresh.tasks; draw(fresh.tasks); }
|
| 277 |
+
catch (e) { alert('refresh failed: ' + e.message); }
|
| 278 |
+
location.reload();
|
| 279 |
+
};
|
| 280 |
+
}
|
| 281 |
+
|
| 282 |
+
/* ── task viewer (file tree + content) ────────────── */
|
| 283 |
+
const LANG = { toml: 'ini', diff: 'diff', patch: 'diff', sh: 'bash', bash: 'bash', py: 'python', json: 'json', yaml: 'yaml', yml: 'yaml', md: 'markdown', js: 'javascript', ts: 'typescript', html: 'xml', css: 'css' };
|
| 284 |
+
function langFor(path) {
|
| 285 |
+
if (path.endsWith('Dockerfile')) return 'dockerfile';
|
| 286 |
+
const ext = path.split('.').pop().toLowerCase();
|
| 287 |
+
return LANG[ext] || 'plaintext';
|
| 288 |
+
}
|
| 289 |
+
|
| 290 |
+
function harborCmd(kind, ident, taskId) {
|
| 291 |
+
if (kind === 'gh') return `harbor run --task-git-url https://github.com/${ident}.git -i ${taskId} -a oracle`;
|
| 292 |
+
if (kind === 'local') return `harbor run -p ${ident} -i ${taskId} -a oracle`;
|
| 293 |
+
// hf: pull from the Hub, then run the single task with the oracle agent
|
| 294 |
+
const dir = ident.split('/').pop();
|
| 295 |
+
return `huggingface-cli download ${ident} --repo-type dataset --local-dir ${dir} && harbor run -p ${dir} -i ${taskId} -a oracle`;
|
| 296 |
+
}
|
| 297 |
+
|
| 298 |
+
let _taskSiblings = { uri: null, tasks: [], ident: null, kind: null };
|
| 299 |
+
async function renderTask(params) {
|
| 300 |
+
setActiveNav(null);
|
| 301 |
+
const uri = params.get('uri');
|
| 302 |
+
let task = params.get('task');
|
| 303 |
+
let initialFile = params.get('f');
|
| 304 |
+
|
| 305 |
+
APP.innerHTML = `<div class="page"><div class="loading"><span class="spinner"></span>loading task…
|
| 306 |
+
<span class="sub">Fetching this task's files from the Hub — usually a second or two.</span>
|
| 307 |
+
</div></div>`;
|
| 308 |
+
|
| 309 |
+
// Sibling task list (for the side panel) + canonical ident/kind (run command).
|
| 310 |
+
// Cached per-uri so flipping between tasks doesn't refetch the list.
|
| 311 |
+
if (_taskSiblings.uri !== uri) {
|
| 312 |
+
try {
|
| 313 |
+
const ds = await api(`/api/dataset?uri=${enc(uri)}`);
|
| 314 |
+
_taskSiblings = { uri, tasks: ds.tasks || [], ident: ds.ident, kind: ds.kind };
|
| 315 |
+
} catch { _taskSiblings = { uri, tasks: [], ident: uri, kind: 'hf' }; }
|
| 316 |
+
}
|
| 317 |
+
const siblings = _taskSiblings.tasks;
|
| 318 |
+
const ident = _taskSiblings.ident || uri;
|
| 319 |
+
const kind = _taskSiblings.kind || 'hf';
|
| 320 |
+
|
| 321 |
+
const page = APP.querySelector('.page');
|
| 322 |
+
const collapsed = localStorage.getItem('hh-tasks-collapsed') === '1';
|
| 323 |
+
page.innerHTML = `
|
| 324 |
+
<div class="crumb">
|
| 325 |
+
<button class="nav-btn ghost" id="toggle-tasks" title="Toggle task list">${ICON.panel}</button>
|
| 326 |
+
<a href="#dataset?uri=${enc(uri)}">${esc(ident)}</a>
|
| 327 |
+
<span class="sep">/</span><span id="crumb-task">${esc(task)}</span>
|
| 328 |
+
<span id="crumb-diff"></span>
|
| 329 |
+
<span class="pos" id="crumb-pos" style="margin-left:auto"></span>
|
| 330 |
+
</div>
|
| 331 |
+
<div class="runbar">
|
| 332 |
+
<span class="lbl">${ICON.term}</span>
|
| 333 |
+
<code id="run-cmd"></code>
|
| 334 |
+
<span id="run-copy"></span>
|
| 335 |
+
</div>
|
| 336 |
+
<div class="taskview${collapsed ? ' collapsed' : ''}" id="taskview">
|
| 337 |
+
<div class="tasks-panel" id="tasks-panel">
|
| 338 |
+
<div class="tp-head">Tasks <span class="faint">${siblings.length}</span></div>
|
| 339 |
+
<div class="tp-search">${ICON.search}<input id="tp-search" placeholder="Filter tasks…" /></div>
|
| 340 |
+
<div class="tp-list" id="tp-list"></div>
|
| 341 |
+
</div>
|
| 342 |
+
<div class="tree" id="tree"></div>
|
| 343 |
+
<div class="content" id="content"></div>
|
| 344 |
+
</div>`;
|
| 345 |
+
|
| 346 |
+
const taskview = document.getElementById('taskview');
|
| 347 |
+
const tpList = document.getElementById('tp-list');
|
| 348 |
+
const tree = document.getElementById('tree');
|
| 349 |
+
const content = document.getElementById('content');
|
| 350 |
+
const runbar = page.querySelector('.runbar');
|
| 351 |
+
const runCode = document.getElementById('run-cmd');
|
| 352 |
+
const runCopyHolder = document.getElementById('run-copy');
|
| 353 |
+
|
| 354 |
+
document.getElementById('toggle-tasks').onclick = () => {
|
| 355 |
+
taskview.classList.toggle('collapsed');
|
| 356 |
+
localStorage.setItem('hh-tasks-collapsed', taskview.classList.contains('collapsed') ? '1' : '0');
|
| 357 |
+
};
|
| 358 |
+
|
| 359 |
+
// ── tasks side panel ──
|
| 360 |
+
function drawPanel(filter = '') {
|
| 361 |
+
tpList.innerHTML = '';
|
| 362 |
+
const q = filter.trim().toLowerCase();
|
| 363 |
+
const list = q ? siblings.filter(s => s.toLowerCase().includes(q)) : siblings;
|
| 364 |
+
list.slice(0, 1000).forEach(tid => {
|
| 365 |
+
const r = document.createElement('div');
|
| 366 |
+
r.className = 'tp-item' + (tid === task ? ' active' : '');
|
| 367 |
+
r.textContent = tid; r.title = tid; r.dataset.tid = tid;
|
| 368 |
+
r.onclick = () => { if (tid !== task) loadDetail(tid, null); };
|
| 369 |
+
tpList.appendChild(r);
|
| 370 |
+
});
|
| 371 |
+
if (list.length > 1000) {
|
| 372 |
+
const m = document.createElement('div'); m.className = 'empty'; m.textContent = `showing 1000 of ${list.length} — filter to narrow`;
|
| 373 |
+
tpList.appendChild(m);
|
| 374 |
+
}
|
| 375 |
+
}
|
| 376 |
+
drawPanel();
|
| 377 |
+
const tps = document.getElementById('tp-search');
|
| 378 |
+
let ft;
|
| 379 |
+
tps.addEventListener('input', () => { clearTimeout(ft); ft = setTimeout(() => drawPanel(tps.value), 100); });
|
| 380 |
+
|
| 381 |
+
function syncPanelActive(tid) {
|
| 382 |
+
tpList.querySelectorAll('.tp-item').forEach(n => n.classList.toggle('active', n.dataset.tid === tid));
|
| 383 |
+
const a = tpList.querySelector('.tp-item.active'); if (a) a.scrollIntoView({ block: 'nearest' });
|
| 384 |
+
}
|
| 385 |
+
|
| 386 |
+
// ── load one task's detail into the tree + content (no full re-render) ──
|
| 387 |
+
async function loadDetail(tid, wantFile) {
|
| 388 |
+
task = tid;
|
| 389 |
+
syncPanelActive(tid);
|
| 390 |
+
document.getElementById('crumb-task').textContent = tid;
|
| 391 |
+
const i = siblings.indexOf(tid);
|
| 392 |
+
document.getElementById('crumb-pos').textContent = i >= 0 ? `${i + 1} / ${siblings.length}` : '';
|
| 393 |
+
history.replaceState(null, '', '#' + `task?${qs({ uri, task: tid })}`);
|
| 394 |
+
|
| 395 |
+
const cmd = harborCmd(kind, ident, tid);
|
| 396 |
+
runCode.textContent = cmd;
|
| 397 |
+
runCopyHolder.innerHTML = '';
|
| 398 |
+
const rc = copyButton(cmd);
|
| 399 |
+
rc.addEventListener('click', () => { runbar.classList.add('copied'); setTimeout(() => runbar.classList.remove('copied'), 1100); });
|
| 400 |
+
runCopyHolder.appendChild(rc);
|
| 401 |
+
|
| 402 |
+
tree.innerHTML = '';
|
| 403 |
+
content.innerHTML = `<div class="loading"><span class="spinner"></span>loading task…</div>`;
|
| 404 |
+
let t;
|
| 405 |
+
try { t = await api(`/api/task?${qs({ uri, task: tid })}`); }
|
| 406 |
+
catch (e) { content.innerHTML = `<div class="errbox">${esc(e.message)}</div>`; return; }
|
| 407 |
+
if (task !== tid) return; // a newer click superseded this fetch
|
| 408 |
+
|
| 409 |
+
document.getElementById('crumb-diff').innerHTML = t.difficulty ? `<span class="pill">${esc(t.difficulty)}</span>` : '';
|
| 410 |
+
buildDetail(t, wantFile);
|
| 411 |
+
}
|
| 412 |
+
|
| 413 |
+
function buildDetail(t, wantFile) {
|
| 414 |
+
const files = t.files || {};
|
| 415 |
+
const paths = Object.keys(files).sort();
|
| 416 |
+
tree.innerHTML = `<div class="thead2">${esc(t.id)}</div>`;
|
| 417 |
+
|
| 418 |
+
function node(label, indent, type, onClick, active) {
|
| 419 |
+
const n = document.createElement('div');
|
| 420 |
+
n.className = 'tnode' + (type === 'dir' ? ' dir' : '') + (active ? ' active' : '');
|
| 421 |
+
n.style.paddingLeft = (14 + indent * 16) + 'px';
|
| 422 |
+
n.innerHTML = (type === 'dir' ? ICON.dir : type === 'info' ? ICON.info : ICON.file) + `<span>${esc(label)}</span>`;
|
| 423 |
+
if (onClick) n.onclick = onClick;
|
| 424 |
+
return n;
|
| 425 |
+
}
|
| 426 |
+
const nodes = {};
|
| 427 |
+
function setHashFile(f) { return `task?${qs({ uri, task: t.id, f })}`; }
|
| 428 |
+
function select(id) {
|
| 429 |
+
Object.values(nodes).forEach(n => n.classList.remove('active'));
|
| 430 |
+
if (nodes[id]) nodes[id].classList.add('active');
|
| 431 |
+
if (id === '__overview__') showOverview(); else showFile(id);
|
| 432 |
+
}
|
| 433 |
+
const ov = node('Overview', 0, 'info', () => { history.replaceState(null, '', '#' + setHashFile('__overview__')); select('__overview__'); });
|
| 434 |
+
nodes['__overview__'] = ov; tree.appendChild(ov);
|
| 435 |
+
const groups = {}; const top = [];
|
| 436 |
+
paths.forEach(p => { if (p.includes('/')) { const f = p.split('/')[0]; (groups[f] = groups[f] || []).push(p); } else top.push(p); });
|
| 437 |
+
top.forEach(p => { const n = node(p, 0, 'file', () => { history.replaceState(null, '', '#' + setHashFile(p)); select(p); }); nodes[p] = n; tree.appendChild(n); });
|
| 438 |
+
Object.keys(groups).sort().forEach(folder => {
|
| 439 |
+
tree.appendChild(node(folder + '/', 0, 'dir'));
|
| 440 |
+
groups[folder].sort().forEach(p => { const n = node(p.split('/').slice(1).join('/'), 1, 'file', () => { history.replaceState(null, '', '#' + setHashFile(p)); select(p); }); nodes[p] = n; tree.appendChild(n); });
|
| 441 |
+
});
|
| 442 |
+
|
| 443 |
+
function showOverview() {
|
| 444 |
+
const rows = [];
|
| 445 |
+
const add = (k, v) => { if (v != null && v !== '' && !(Array.isArray(v) && !v.length)) rows.push([k, v]); };
|
| 446 |
+
add('Task id', t.id); add('Name', t.name); add('Org', t.org); add('Version', t.version);
|
| 447 |
+
add('Difficulty', t.difficulty); add('Category', t.category);
|
| 448 |
+
add('Agent timeout', t.agent_timeout_sec != null ? t.agent_timeout_sec + 's' : null);
|
| 449 |
+
add('Verifier timeout', t.verifier_timeout_sec != null ? t.verifier_timeout_sec + 's' : null);
|
| 450 |
+
let html = `<div class="fhead"><span class="path">${ICON.info} Overview</span></div>`;
|
| 451 |
+
if (t.description) html += `<div class="md">${marked.parse(t.description)}</div>`;
|
| 452 |
+
html += '<table class="kv">';
|
| 453 |
+
rows.forEach(([k, v]) => html += `<tr><td>${esc(k)}</td><td>${esc(v)}</td></tr>`);
|
| 454 |
+
if (t.keywords && t.keywords.length) html += `<tr><td>Keywords</td><td>${t.keywords.map(k => `<span class="kw">${esc(k)}</span>`).join('')}</td></tr>`;
|
| 455 |
+
if (t.repo2env) html += `<tr><td>repo2env</td><td><pre style="margin:0;padding:0;background:none">${esc(JSON.stringify(t.repo2env, null, 2))}</pre></td></tr>`;
|
| 456 |
+
html += '</table>';
|
| 457 |
+
const instr = files['instruction.md'] || t.instruction_inline;
|
| 458 |
+
if (instr) html += `<div class="fhead"><span class="path">${ICON.file} instruction.md</span></div><div class="md">${marked.parse(instr)}</div>`;
|
| 459 |
+
content.innerHTML = html;
|
| 460 |
+
}
|
| 461 |
+
function showFile(path) {
|
| 462 |
+
const body = files[path] != null ? files[path] : (path === 'task.toml' ? (t.task_toml_raw || '') : '');
|
| 463 |
+
const fhead = document.createElement('div'); fhead.className = 'fhead';
|
| 464 |
+
fhead.innerHTML = `<span class="path">${ICON.file} ${esc(path)}</span>`;
|
| 465 |
+
fhead.appendChild(copyButton(body));
|
| 466 |
+
content.innerHTML = '';
|
| 467 |
+
content.appendChild(fhead);
|
| 468 |
+
if (path.endsWith('.md')) {
|
| 469 |
+
const d = document.createElement('div'); d.className = 'md'; d.innerHTML = marked.parse(body); content.appendChild(d);
|
| 470 |
+
} else {
|
| 471 |
+
const pre = document.createElement('pre'); const code = document.createElement('code');
|
| 472 |
+
code.className = 'language-' + langFor(path); code.textContent = body;
|
| 473 |
+
pre.appendChild(code); content.appendChild(pre);
|
| 474 |
+
try { hljs.highlightElement(code); } catch {}
|
| 475 |
+
}
|
| 476 |
+
content.scrollTop = 0;
|
| 477 |
+
}
|
| 478 |
+
|
| 479 |
+
select(wantFile && (nodes[wantFile] || wantFile === '__overview__') ? wantFile : '__overview__');
|
| 480 |
+
}
|
| 481 |
+
|
| 482 |
+
await loadDetail(task, initialFile);
|
| 483 |
+
}
|
| 484 |
+
|
| 485 |
+
/* ── router ───────────────────────────────────────── */
|
| 486 |
+
function router() {
|
| 487 |
+
const raw = location.hash.slice(1) || '/';
|
| 488 |
+
const [route, query] = raw.split('?');
|
| 489 |
+
const params = new URLSearchParams(query || '');
|
| 490 |
+
window.scrollTo(0, 0);
|
| 491 |
+
if (route === '/' || route === '' || route === 'home') return renderHome();
|
| 492 |
+
if (route === '/datasets' || route === 'datasets') return renderDatasets(params);
|
| 493 |
+
if (route === 'dataset') return renderDataset(params);
|
| 494 |
+
if (route === 'task') return renderTask(params);
|
| 495 |
+
renderHome();
|
| 496 |
+
}
|
| 497 |
+
|
| 498 |
+
// ⌘K focuses search on datasets page (and jumps there otherwise)
|
| 499 |
+
document.addEventListener('keydown', (e) => {
|
| 500 |
+
if ((e.metaKey || e.ctrlKey) && e.key === 'k') {
|
| 501 |
+
e.preventDefault();
|
| 502 |
+
const s = document.getElementById('ds-search') || document.getElementById('load-input');
|
| 503 |
+
if (s) s.focus(); else location.hash = '/datasets';
|
| 504 |
+
}
|
| 505 |
+
});
|
| 506 |
+
|
| 507 |
+
// ?dataset= / ?d= prefill (legacy Gradio-style deep link) → dataset view
|
| 508 |
+
(function prefill() {
|
| 509 |
+
const p = new URLSearchParams(location.search);
|
| 510 |
+
const d = p.get('dataset') || p.get('d');
|
| 511 |
+
if (d && !location.hash) { location.hash = `dataset?uri=${enc(d)}`; }
|
| 512 |
+
})();
|
| 513 |
+
|
| 514 |
+
window.addEventListener('hashchange', router);
|
| 515 |
+
router();
|
static/index.html
ADDED
|
@@ -0,0 +1,35 @@
|
| 1 |
+
<!DOCTYPE html>
|
| 2 |
+
<html lang="en" data-theme="dark">
|
| 3 |
+
<head>
|
| 4 |
+
<meta charset="utf-8">
|
| 5 |
+
<meta name="viewport" content="width=device-width, initial-scale=1">
|
| 6 |
+
<title>Hugging Face Harbor Visualiser — browse Harbor task-spec datasets</title>
|
| 7 |
+
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>🤗</text></svg>">
|
| 8 |
+
<link rel="preconnect" href="https://fonts.googleapis.com">
|
| 9 |
+
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
| 10 |
+
<link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
|
| 11 |
+
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/github-dark.min.css">
|
| 12 |
+
<link rel="stylesheet" href="/static/style.css">
|
| 13 |
+
</head>
|
| 14 |
+
<body>
|
| 15 |
+
<nav class="nav">
|
| 16 |
+
<a class="brand" href="#/"><span class="logo">🤗</span> Harbor Visualiser</a>
|
| 17 |
+
<div class="links">
|
| 18 |
+
<a href="#/" data-nav="home">Home</a>
|
| 19 |
+
<a href="#/datasets" data-nav="datasets">Datasets</a>
|
| 20 |
+
<a href="https://www.harborframework.com" target="_blank" rel="noopener" class="ext">Harbor ↗</a>
|
| 21 |
+
</div>
|
| 22 |
+
<div class="spacer"></div>
|
| 23 |
+
<div class="theme-toggle" id="theme-toggle">
|
| 24 |
+
<button data-mode="light" title="Light"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><circle cx="12" cy="12" r="4"/><path d="M12 2v2M12 20v2M4.9 4.9l1.4 1.4M17.7 17.7l1.4 1.4M2 12h2M20 12h2M4.9 19.1l1.4-1.4M17.7 6.3l1.4-1.4"/></svg></button>
|
| 25 |
+
<button data-mode="system" title="System"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><rect x="2" y="3" width="20" height="14" rx="2"/><path d="M8 21h8M12 17v4"/></svg></button>
|
| 26 |
+
<button data-mode="dark" title="Dark"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M21 12.8A9 9 0 1 1 11.2 3a7 7 0 0 0 9.8 9.8z"/></svg></button>
|
| 27 |
+
</div>
|
| 28 |
+
</nav>
|
| 29 |
+
<main id="app" class="wrap"></main>
|
| 30 |
+
|
| 31 |
+
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js"></script>
|
| 32 |
+
<script src="https://cdnjs.cloudflare.com/ajax/libs/marked/12.0.0/marked.min.js"></script>
|
| 33 |
+
<script src="/static/app.js"></script>
|
| 34 |
+
</body>
|
| 35 |
+
</html>
|
static/style.css
ADDED
|
@@ -0,0 +1,285 @@
|
| 1 |
+
/* Hugging Face Harbor Visualiser — Hugging Face themed dark/light. */
|
| 2 |
+
:root {
|
| 3 |
+
--bg: #ffffff; --panel: #f9fafb; --panel-2: #f1f3f5;
|
| 4 |
+
--border: #e5e7eb; --border-strong: #d4d7dd;
|
| 5 |
+
--text: #1b1b1f; --muted: #5b6270; --faint: #99a0ad;
|
| 6 |
+
--accent: #e88b00; --accent-soft: rgba(255,157,0,.12);
|
| 7 |
+
--hf-yellow: #ffd21e; --hf-orange: #ff9d00;
|
| 8 |
+
--ok: #16a34a; --warn: #d97706; --err: #dc2626;
|
| 9 |
+
--hover: #f1f3f5;
|
| 10 |
+
--mono: 'JetBrains Mono', ui-monospace, 'SF Mono', SFMono-Regular, Menlo, Consolas, monospace;
|
| 11 |
+
--radius: 10px; --nav-h: 56px; --maxw: 1180px;
|
| 12 |
+
}
|
| 13 |
+
:root[data-theme="dark"] {
|
| 14 |
+
--bg: #0b0d12; --panel: #11141b; --panel-2: #1a1e27;
|
| 15 |
+
--border: #232834; --border-strong: #323847;
|
| 16 |
+
--text: #f3f4f6; --muted: #9aa1ad; --faint: #646b78;
|
| 17 |
+
--accent: #ffae45; --accent-soft: rgba(255,174,69,.13);
|
| 18 |
+
--hf-yellow: #ffd21e; --hf-orange: #ff9d00;
|
| 19 |
+
--ok: #4ade80; --warn: #fbbf24; --err: #f87171;
|
| 20 |
+
--hover: #181c24;
|
| 21 |
+
}
|
| 22 |
+
* { box-sizing: border-box; }
|
| 23 |
+
html, body { margin: 0; padding: 0; overflow-x: hidden; max-width: 100%; }
|
| 24 |
+
body {
|
| 25 |
+
background: var(--bg); color: var(--text);
|
| 26 |
+
font-family: var(--mono); font-size: 14px; line-height: 1.55;
|
| 27 |
+
-webkit-font-smoothing: antialiased;
|
| 28 |
+
}
|
| 29 |
+
a { color: inherit; text-decoration: none; }
|
| 30 |
+
button { font-family: inherit; cursor: pointer; }
|
| 31 |
+
code, pre { font-family: var(--mono); }
|
| 32 |
+
::selection { background: var(--accent-soft); }
|
| 33 |
+
|
| 34 |
+
/* ── nav ───────────────────────────────────────────── */
|
| 35 |
+
.nav {
|
| 36 |
+
position: sticky; top: 0; z-index: 50;
|
| 37 |
+
height: var(--nav-h); display: flex; align-items: center; gap: 22px;
|
| 38 |
+
padding: 0 22px; background: color-mix(in srgb, var(--bg) 86%, transparent);
|
| 39 |
+
backdrop-filter: blur(10px); border-bottom: 1px solid var(--border);
|
| 40 |
+
}
|
| 41 |
+
.nav .brand { display: flex; align-items: center; gap: 9px; font-weight: 700; font-size: 15px; letter-spacing: -.2px; }
|
| 42 |
+
.nav .brand:hover { color: var(--text); }
|
| 43 |
+
.nav .brand .logo { font-size: 20px; line-height: 1; filter: saturate(1.15); }
|
| 44 |
+
.nav .brand .tag { font-size: 9px; font-weight: 700; letter-spacing: .6px; text-transform: uppercase;
|
| 45 |
+
color: #1b1b1f; background: var(--hf-yellow); padding: 2px 6px; border-radius: 5px; }
|
| 46 |
+
.nav .links { display: flex; gap: 18px; }
|
| 47 |
+
.nav .links a { color: var(--muted); font-size: 13px; }
|
| 48 |
+
.nav .links a:hover, .nav .links a.active { color: var(--text); }
|
| 49 |
+
.nav .spacer { flex: 1; }
|
| 50 |
+
.theme-toggle { display: flex; border: 1px solid var(--border); border-radius: 8px; overflow: hidden; }
|
| 51 |
+
.theme-toggle button {
|
| 52 |
+
background: transparent; border: 0; color: var(--faint);
|
| 53 |
+
padding: 6px 9px; display: grid; place-items: center; line-height: 0;
|
| 54 |
+
}
|
| 55 |
+
.theme-toggle button:hover { color: var(--text); background: var(--hover); }
|
| 56 |
+
.theme-toggle button.active { color: var(--text); background: var(--panel-2); }
|
| 57 |
+
.theme-toggle svg { width: 15px; height: 15px; }
|
| 58 |
+
|
| 59 |
+
/* ── layout ────────────────────────────────────────── */
|
| 60 |
+
.wrap { max-width: var(--maxw); margin: 0 auto; padding: 0 22px; }
|
| 61 |
+
.page { padding: 34px 0 80px; }
|
| 62 |
+
h1 { font-size: 30px; font-weight: 700; letter-spacing: -.5px; margin: 0 0 18px; }
|
| 63 |
+
h2 { font-size: 18px; font-weight: 600; margin: 0 0 12px; }
|
| 64 |
+
.muted { color: var(--muted); }
|
| 65 |
+
.faint { color: var(--faint); }
|
| 66 |
+
|
| 67 |
+
/* thin Hugging Face accent strip at the very top */
|
| 68 |
+
body::before { content: ""; display: block; height: 3px;
|
| 69 |
+
background: linear-gradient(90deg, var(--hf-yellow), var(--hf-orange)); }
|
| 70 |
+
|
| 71 |
+
/* hero */
|
| 72 |
+
.hero { text-align: center; padding: 60px 0 30px; }
|
| 73 |
+
.hero .mark { font-size: 56px; line-height: 1; margin-bottom: 10px; }
|
| 74 |
+
.hero h1 { font-size: 46px; margin: 0 0 14px; letter-spacing: -1.4px; }
|
| 75 |
+
.hero h1 .hf { background: linear-gradient(90deg, var(--hf-orange), var(--hf-yellow));
|
| 76 |
+
-webkit-background-clip: text; background-clip: text; -webkit-text-fill-color: transparent; }
|
| 77 |
+
.hero p { color: var(--muted); font-size: 14.5px; margin: 0 auto; max-width: 640px; line-height: 1.6; }
|
| 78 |
+
|
| 79 |
+
/* how-to / embed instructions */
|
| 80 |
+
.howto { margin: 46px 0 0; }
|
| 81 |
+
.howto h2 { font-size: 15px; margin: 0 0 14px; }
|
| 82 |
+
.howto .steps { display: grid; gap: 14px; grid-template-columns: 1fr 1fr; }
|
| 83 |
+
.howto .step { min-width: 0; border: 1px solid var(--border); border-radius: var(--radius); background: var(--panel); padding: 16px 18px; }
|
| 84 |
+
.howto .step h3 { font-size: 13px; margin: 0 0 4px; font-weight: 600; }
|
| 85 |
+
.howto .step p { color: var(--muted); font-size: 12.5px; margin: 0 0 12px; line-height: 1.55; }
|
| 86 |
+
.snippet { display: flex; align-items: center; gap: 10px; background: var(--panel-2);
|
| 87 |
+
border: 1px solid var(--border); border-radius: 8px; padding: 9px 12px; max-width: 100%; overflow: hidden; }
|
| 88 |
+
.snippet code { font-size: 12px; color: var(--text); overflow-x: auto; white-space: nowrap; flex: 1; min-width: 0; }
|
| 89 |
+
.snippet code::-webkit-scrollbar { height: 5px; }
|
| 90 |
+
.snippet code::-webkit-scrollbar-thumb { background: var(--border-strong); border-radius: 3px; }
|
| 91 |
+
.snippet .copy { color: var(--faint); line-height: 0; flex: none; }
|
| 92 |
+
.snippet .copy:hover { color: var(--accent); }
|
| 93 |
+
.snippet .copy svg { width: 14px; height: 14px; }
|
| 94 |
+
.howto .badge-preview { display: inline-flex; align-items: stretch; font-size: 11px; font-weight: 700;
|
| 95 |
+
border-radius: 5px; overflow: hidden; margin-bottom: 11px; }
|
| 96 |
+
.howto .badge-preview .l { background: #555; color: #fff; padding: 3px 8px; }
|
| 97 |
+
.howto .badge-preview .r { background: var(--hf-yellow); color: #1b1b1f; padding: 3px 8px; }
|
| 98 |
+
@media (max-width: 700px) { .howto .steps { grid-template-columns: 1fr; } }
|
| 99 |
+
/* accent link + nav external link */
|
| 100 |
+
.hl { color: var(--accent); }
|
| 101 |
+
.hl:hover { text-decoration: underline; }
|
| 102 |
+
.nav .links a.ext { color: var(--faint); }
|
| 103 |
+
.nav .links a.ext:hover { color: var(--accent); }
|
| 104 |
+
/* footer attribution */
|
| 105 |
+
.footer { max-width: 620px; margin: 56px auto 0; text-align: center; color: var(--faint);
|
| 106 |
+
font-size: 12px; line-height: 1.7; border-top: 1px solid var(--border); padding-top: 22px; }
|
| 107 |
+
|
| 108 |
+
/* ── table card ────────────────────────────────────── */
|
| 109 |
+
.card { border: 1px solid var(--border); border-radius: var(--radius); overflow: hidden; background: var(--panel); }
|
| 110 |
+
.thead { display: flex; padding: 11px 18px; border-bottom: 1px solid var(--border);
|
| 111 |
+
font-size: 11px; letter-spacing: .8px; text-transform: uppercase; color: var(--faint); }
|
| 112 |
+
.thead .col-tasks { margin-left: auto; }
|
| 113 |
+
.row {
|
| 114 |
+
display: flex; align-items: center; gap: 8px; padding: 13px 18px;
|
| 115 |
+
border-bottom: 1px solid var(--border); cursor: pointer; transition: background .08s;
|
| 116 |
+
}
|
| 117 |
+
.row:last-child { border-bottom: 0; }
|
| 118 |
+
.row:hover { background: var(--hover); }
|
| 119 |
+
.row .name { font-size: 13.5px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; }
|
| 120 |
+
.row .copy { color: var(--faint); opacity: 0; transition: opacity .1s; line-height: 0; }
|
| 121 |
+
.row:hover .copy { opacity: 1; }
|
| 122 |
+
.row .copy:hover { color: var(--text); }
|
| 123 |
+
.row .copy svg { width: 13px; height: 13px; }
|
| 124 |
+
.row .tasks { margin-left: auto; color: var(--muted); font-variant-numeric: tabular-nums; font-size: 13px; padding-left: 14px; }
|
| 125 |
+
.row .tasks .spin { color: var(--faint); }
|
| 126 |
+
|
| 127 |
+
/* search */
|
| 128 |
+
.search { display: flex; align-items: center; gap: 10px; border: 1px solid var(--border);
|
| 129 |
+
border-radius: var(--radius); padding: 11px 16px; background: var(--panel); margin-bottom: 18px; }
|
| 130 |
+
.search:focus-within { border-color: var(--border-strong); }
|
| 131 |
+
.search svg { width: 16px; height: 16px; color: var(--faint); flex: none; }
|
| 132 |
+
.search input { flex: 1; background: transparent; border: 0; outline: 0; color: var(--text); font-family: var(--mono); font-size: 14px; }
|
| 133 |
+
.search input::placeholder { color: var(--faint); }
|
| 134 |
+
.search .kbd { color: var(--faint); font-size: 11px; border: 1px solid var(--border); border-radius: 5px; padding: 2px 6px; }
|
| 135 |
+
.search select { background: var(--panel-2); color: var(--muted); border: 1px solid var(--border); border-radius: 6px; padding: 5px 8px; font-family: var(--mono); font-size: 12px; }
|
| 136 |
+
|
| 137 |
+
/* buttons */
|
| 138 |
+
.btn { display: inline-flex; align-items: center; gap: 7px; background: var(--panel-2);
|
| 139 |
+
border: 1px solid var(--border); border-radius: 8px; color: var(--text);
|
| 140 |
+
padding: 9px 16px; font-size: 13px; transition: background .1s, border-color .1s; }
|
| 141 |
+
.btn:hover { background: var(--hover); border-color: var(--border-strong); }
|
| 142 |
+
.center { text-align: center; margin-top: 26px; }
|
| 143 |
+
|
| 144 |
+
/* pills / badges */
|
| 145 |
+
.pill { display: inline-flex; align-items: center; gap: 5px; font-size: 11px; padding: 2px 8px;
|
| 146 |
+
border-radius: 999px; border: 1px solid var(--border); color: var(--muted); background: var(--panel-2); }
|
| 147 |
+
.pill.ok { color: var(--ok); border-color: color-mix(in srgb, var(--ok) 35%, var(--border)); }
|
| 148 |
+
|
| 149 |
+
/* breadcrumb */
|
| 150 |
+
.crumb { display: flex; align-items: center; gap: 8px; color: var(--muted); font-size: 13px; margin-bottom: 16px; flex-wrap: wrap; }
|
| 151 |
+
.crumb a:hover { color: var(--text); }
|
| 152 |
+
.crumb .sep { color: var(--faint); }
|
| 153 |
+
|
| 154 |
+
/* ── dataset / task viewer (split) ─────────────────── */
|
| 155 |
+
.viewer { display: grid; grid-template-columns: 300px 1fr; gap: 0; border: 1px solid var(--border);
|
| 156 |
+
border-radius: var(--radius); overflow: hidden; min-height: 70vh; }
|
| 157 |
+
.tree { border-right: 1px solid var(--border); background: var(--panel); overflow: auto; max-height: 80vh; }
|
| 158 |
+
.tree .thead2 { padding: 11px 16px; font-size: 11px; letter-spacing: .8px; text-transform: uppercase;
|
| 159 |
+
color: var(--faint); border-bottom: 1px solid var(--border); position: sticky; top: 0; background: var(--panel); }
|
| 160 |
+
.tnode { display: flex; align-items: center; gap: 7px; padding: 6px 14px; cursor: pointer; font-size: 13px;
|
| 161 |
+
color: var(--muted); white-space: nowrap; overflow: hidden; text-overflow: ellipsis; }
|
| 162 |
+
.tnode:hover { background: var(--hover); color: var(--text); }
|
| 163 |
+
.tnode.active { background: var(--accent-soft); color: var(--text); }
|
| 164 |
+
.tnode.dir { color: var(--text); }
|
| 165 |
+
.tnode .ind { display: inline-block; }
|
| 166 |
+
.tnode svg { width: 14px; height: 14px; flex: none; color: var(--faint); }
|
| 167 |
+
.tnode.active svg { color: var(--accent); }
|
| 168 |
+
|
| 169 |
+
/* task master-detail: [tasks panel | file tree | content] */
|
| 170 |
+
.taskview { display: grid; grid-template-columns: 250px 230px 1fr; border: 1px solid var(--border);
|
| 171 |
+
border-radius: var(--radius); overflow: hidden; height: calc(100vh - 200px); min-height: 460px;
|
| 172 |
+
transition: grid-template-columns .18s ease; }
|
| 173 |
+
.taskview.collapsed { grid-template-columns: 0 230px 1fr; }
|
| 174 |
+
.taskview.collapsed .tasks-panel { opacity: 0; pointer-events: none; }
|
| 175 |
+
.tasks-panel { display: flex; flex-direction: column; min-width: 0; border-right: 1px solid var(--border);
|
| 176 |
+
background: var(--panel-2); overflow: hidden; transition: opacity .15s ease; }
|
| 177 |
+
.tasks-panel .tp-head { padding: 11px 14px; font-size: 11px; letter-spacing: .8px; text-transform: uppercase;
|
| 178 |
+
color: var(--faint); border-bottom: 1px solid var(--border); flex: none; }
|
| 179 |
+
.tasks-panel .tp-search { display: flex; align-items: center; gap: 7px; padding: 8px 12px; border-bottom: 1px solid var(--border); flex: none; }
|
| 180 |
+
.tasks-panel .tp-search svg { width: 14px; height: 14px; color: var(--faint); flex: none; }
|
| 181 |
+
.tasks-panel .tp-search input { flex: 1; min-width: 0; background: transparent; border: 0; outline: 0; color: var(--text); font-family: var(--mono); font-size: 12.5px; }
|
| 182 |
+
.tasks-panel .tp-search input::placeholder { color: var(--faint); }
|
| 183 |
+
.tp-list { overflow: auto; flex: 1; }
|
| 184 |
+
.tp-item { padding: 8px 14px; font-size: 12.5px; color: var(--muted); cursor: pointer;
|
| 185 |
+
white-space: nowrap; overflow: hidden; text-overflow: ellipsis; border-left: 2px solid transparent; }
|
| 186 |
+
.tp-item:hover { background: var(--hover); color: var(--text); }
|
| 187 |
+
.tp-item.active { background: var(--accent-soft); color: var(--text); border-left-color: var(--hf-orange); }
|
| 188 |
+
.tp-list .empty { padding: 14px; font-size: 11.5px; color: var(--faint); }
|
| 189 |
+
|
| 190 |
+
.content { overflow: auto; max-height: 80vh; background: var(--bg); }
|
| 191 |
+
.taskview .tree, .taskview .content { max-height: none; }
|
| 192 |
+
.content .fhead { display: flex; align-items: center; gap: 10px; padding: 10px 16px;
|
| 193 |
+
border-bottom: 1px solid var(--border); position: sticky; top: 0; background: var(--bg); z-index: 2; }
|
| 194 |
+
.content .fhead .path { display: inline-flex; align-items: center; gap: 7px; font-size: 13px; color: var(--muted); min-width: 0; }
|
| 195 |
+
.content .fhead svg { width: 15px; height: 15px; flex: none; color: var(--faint); }
|
| 196 |
+
.content .fhead .copy { margin-left: auto; }
|
| 197 |
+
.content pre { margin: 0; padding: 16px; overflow: auto; font-size: 12.5px; line-height: 1.6; }
|
| 198 |
+
.content pre code { background: transparent !important; padding: 0 !important; }
|
| 199 |
+
.content .md { padding: 18px 22px; }
|
| 200 |
+
.content .md h1 { font-size: 22px; } .content .md h2 { font-size: 17px; } .content .md pre { background: var(--panel-2); border-radius: 8px; }
|
| 201 |
+
.content .md code { background: var(--panel-2); padding: 1px 5px; border-radius: 4px; font-size: 12.5px; }
|
| 202 |
+
|
| 203 |
+
/* overview */
|
| 204 |
+
.kv { width: 100%; border-collapse: collapse; }
|
| 205 |
+
.kv td { padding: 9px 16px; border-bottom: 1px solid var(--border); vertical-align: top; font-size: 13px; }
|
| 206 |
+
.kv td:first-child { color: var(--muted); width: 180px; white-space: nowrap; }
|
| 207 |
+
.kw { display: inline-block; font-size: 11px; padding: 2px 8px; border: 1px solid var(--border); border-radius: 999px; margin: 0 4px 4px 0; color: var(--muted); }
|
| 208 |
+
|
| 209 |
+
/* task list (within dataset) — scrolls internally, not the whole page */
|
| 210 |
+
.tasklist { max-height: calc(100vh - 240px); overflow-y: auto; }
|
| 211 |
+
.tasklist .thead { position: sticky; top: 0; background: var(--panel); z-index: 1; }
|
| 212 |
+
.tasklist .row .name { font-size: 13px; }
|
| 213 |
+
.tasklist .row .tasks { font-size: 12px; }
|
| 214 |
+
|
| 215 |
+
/* states */
|
| 216 |
+
.loading, .empty, .errbox { padding: 50px 20px; text-align: center; color: var(--muted); }
|
| 217 |
+
.errbox { color: var(--err); }
|
| 218 |
+
.spinner { display: inline-block; width: 16px; height: 16px; border: 2px solid var(--border); border-top-color: var(--accent);
|
| 219 |
+
border-radius: 50%; animation: spin .7s linear infinite; vertical-align: -3px; margin-right: 8px; }
|
| 220 |
+
@keyframes spin { to { transform: rotate(360deg); } }
|
| 221 |
+
.copied { color: var(--ok) !important; }
|
| 222 |
+
|
| 223 |
+
/* code section (publish-style) */
|
| 224 |
+
.codeblock { border: 1px solid var(--border); border-radius: var(--radius); background: var(--panel); margin: 14px 0; overflow: hidden; }
|
| 225 |
+
.codeblock .chead { display: flex; padding: 9px 14px; border-bottom: 1px solid var(--border); font-size: 11px; letter-spacing: .6px; text-transform: uppercase; color: var(--faint); }
|
| 226 |
+
.codeblock .chead .copy { margin-left: auto; }
|
| 227 |
+
.codeblock pre { margin: 0; padding: 14px; font-size: 12.5px; overflow: auto; }
|
| 228 |
+
|
| 229 |
+
/* run-this-task command — styled like a terminal block */
|
| 230 |
+
.runbar { display: flex; align-items: center; gap: 10px; margin: 0 0 14px;
|
| 231 |
+
background: #0d1117; border: 1px solid var(--border-strong); border-radius: var(--radius);
|
| 232 |
+
padding: 11px 14px; box-shadow: inset 0 0 0 1px rgba(255,255,255,.02); }
|
| 233 |
+
:root[data-theme="light"] .runbar { background: #1b1f27; }
|
| 234 |
+
.runbar .lbl { display: inline-flex; align-items: center; flex: none; color: #4ade80; line-height: 0; }
|
| 235 |
+
.runbar .lbl svg { width: 15px; height: 15px; }
|
| 236 |
+
.runbar code { font-family: var(--mono); font-size: 12.5px; color: #d7dce4; white-space: nowrap;
|
| 237 |
+
overflow-x: auto; flex: 1; min-width: 0; }
|
| 238 |
+
.runbar code::before { content: "$ "; color: #6b7280; }
|
| 239 |
+
.runbar code::-webkit-scrollbar { height: 5px; }
|
| 240 |
+
.runbar code::-webkit-scrollbar-thumb { background: #2b313b; border-radius: 3px; }
|
| 241 |
+
.runbar .copy { flex: none; color: #6b7280; line-height: 0; }
|
| 242 |
+
.runbar .copy:hover { color: var(--hf-yellow); }
|
| 243 |
+
.runbar .copy svg { width: 15px; height: 15px; }
|
| 244 |
+
.runbar.copied .copy { color: #4ade80; }
|
| 245 |
+
|
| 246 |
+
/* generic icon button (crumb toggle etc.) */
|
| 247 |
+
.nav-btn { display: grid; place-items: center; width: 30px; height: 30px; flex: none;
|
| 248 |
+
border: 1px solid var(--border); border-radius: 7px; background: var(--panel-2); color: var(--muted); }
|
| 249 |
+
.nav-btn:hover:not(:disabled) { color: var(--text); border-color: var(--border-strong); }
|
| 250 |
+
.nav-btn:disabled { opacity: .35; cursor: default; }
|
| 251 |
+
.nav-btn svg { width: 15px; height: 15px; }
|
| 252 |
+
.nav-btn.ghost { background: transparent; }
|
| 253 |
+
.crumb .pos { color: var(--faint); font-size: 11px; font-variant-numeric: tabular-nums; flex: none; }
|
| 254 |
+
|
| 255 |
+
/* inline hint / note (loading warnings, tag instruction) */
|
| 256 |
+
.hint { display: flex; align-items: flex-start; gap: 9px; color: var(--muted); font-size: 12.5px;
|
| 257 |
+
line-height: 1.55; background: var(--panel); border: 1px solid var(--border);
|
| 258 |
+
border-radius: var(--radius); padding: 13px 15px; }
|
| 259 |
+
.hint .ic { color: var(--hf-orange); flex: none; line-height: 0; margin-top: 1px; }
|
| 260 |
+
.hint code { background: var(--panel-2); padding: 1px 6px; border-radius: 4px; color: var(--text); font-size: 12px; }
|
| 261 |
+
.loading .sub { display: block; margin-top: 8px; font-size: 12px; color: var(--faint); }
|
| 262 |
+
|
| 263 |
+
/* responsive */
|
| 264 |
+
@media (max-width: 760px) {
|
| 265 |
+
.nav { gap: 12px; padding: 0 14px; }
|
| 266 |
+
.nav .brand { font-size: 13px; }
|
| 267 |
+
.wrap { padding: 0 14px; }
|
| 268 |
+
.hero h1 { font-size: 32px; }
|
| 269 |
+
.hero .mark { font-size: 46px; }
|
| 270 |
+
.viewer { grid-template-columns: 1fr; }
|
| 271 |
+
.tree { max-height: 200px; border-right: 0; border-bottom: 1px solid var(--border); }
|
| 272 |
+
.content { max-height: none; }
|
| 273 |
+
.nav .links { gap: 12px; }
|
| 274 |
+
.nav .links a:not(.ext) { display: none; }
|
| 275 |
+
.crumb { gap: 6px; }
|
| 276 |
+
.kv td:first-child { width: auto; }
|
| 277 |
+
|
| 278 |
+
/* task view stacks: tasks panel (toggleable) over tree over content */
|
| 279 |
+
.taskview { grid-template-columns: 1fr; height: auto; }
|
| 280 |
+
.taskview.collapsed { grid-template-columns: 1fr; }
|
| 281 |
+
.tasks-panel { max-height: 220px; border-right: 0; border-bottom: 1px solid var(--border); }
|
| 282 |
+
.taskview.collapsed .tasks-panel { display: none; }
|
| 283 |
+
.taskview .tree { max-height: 200px; border-right: 0; border-bottom: 1px solid var(--border); }
|
| 284 |
+
.taskview .content { max-height: 70vh; }
|
| 285 |
+
}
|
viewer/__init__.py
CHANGED
|
@@ -1,12 +1,13 @@
|
|
| 1 |
"""Harbor Visualiser — load + parse Harbor task spec datasets."""
|
| 2 |
|
| 3 |
-
from viewer.load import DatasetSource, fetch_dataset, parse_dataset_uri
|
| 4 |
from viewer.parse import HarborTask, list_tasks, load_task
|
| 5 |
|
| 6 |
__all__ = [
|
| 7 |
"DatasetSource",
|
| 8 |
"HarborTask",
|
| 9 |
"fetch_dataset",
|
|
|
|
| 10 |
"list_tasks",
|
| 11 |
"load_task",
|
| 12 |
"parse_dataset_uri",
|
|
|
|
| 1 |
"""Harbor Visualiser — load + parse Harbor task spec datasets."""
|
| 2 |
|
| 3 |
+
from viewer.load import DatasetSource, fetch_dataset, fetch_hf_task, parse_dataset_uri
|
| 4 |
from viewer.parse import HarborTask, list_tasks, load_task
|
| 5 |
|
| 6 |
__all__ = [
|
| 7 |
"DatasetSource",
|
| 8 |
"HarborTask",
|
| 9 |
"fetch_dataset",
|
| 10 |
+
"fetch_hf_task",
|
| 11 |
"list_tasks",
|
| 12 |
"load_task",
|
| 13 |
"parse_dataset_uri",
|
viewer/hub.py
ADDED
|
@@ -0,0 +1,121 @@
|
| 1 |
+
"""Discover Harbor task-spec datasets on the Hugging Face Hub.
|
| 2 |
+
|
| 3 |
+
Harbor datasets are tagged `harbor` on the Hub — the same filter as
|
| 4 |
+
https://huggingface.co/datasets?other=harbor . This module lists them (fast,
|
| 5 |
+
no per-dataset round-trips) and computes per-dataset task counts on demand
|
| 6 |
+
(a shallow `list_repo_tree` listing, memoised).
|
| 7 |
+
|
| 8 |
+
All listing is done live against the Hub so the UI always reflects the latest
|
| 9 |
+
published datasets (no stale snapshot).
|
| 10 |
+
"""
|
| 11 |
+
|
| 12 |
+
from __future__ import annotations
|
| 13 |
+
|
| 14 |
+
import logging
|
| 15 |
+
import os
|
| 16 |
+
import time
|
| 17 |
+
from dataclasses import dataclass
|
| 18 |
+
|
| 19 |
+
logger = logging.getLogger(__name__)
|
| 20 |
+
|
| 21 |
+
_HARBOR_TAG = "harbor"
|
| 22 |
+
|
| 23 |
+
|
| 24 |
+
@dataclass(slots=True)
|
| 25 |
+
class HubDataset:
|
| 26 |
+
id: str
|
| 27 |
+
downloads: int = 0
|
| 28 |
+
likes: int = 0
|
| 29 |
+
updated: str | None = None
|
| 30 |
+
private: bool = False
|
| 31 |
+
|
| 32 |
+
def as_dict(self) -> dict:
|
| 33 |
+
return {
|
| 34 |
+
"id": self.id,
|
| 35 |
+
"downloads": self.downloads,
|
| 36 |
+
"likes": self.likes,
|
| 37 |
+
"updated": self.updated,
|
| 38 |
+
"private": self.private,
|
| 39 |
+
}
|
| 40 |
+
|
| 41 |
+
|
| 42 |
+
def _token() -> str | None:
|
| 43 |
+
return os.environ.get("HF_TOKEN") or None
|
| 44 |
+
|
| 45 |
+
|
| 46 |
+
def list_harbor_datasets(query: str | None = None, sort: str = "downloads",
|
| 47 |
+
limit: int = 500) -> list[HubDataset]:
|
| 48 |
+
"""List datasets tagged `harbor` on the Hub. Always live (no caching).
|
| 49 |
+
|
| 50 |
+
`sort` ∈ {downloads, likes, lastModified, trendingScore}. `query` filters by
|
| 51 |
+
substring on the dataset id (server-side search)."""
|
| 52 |
+
from huggingface_hub import HfApi
|
| 53 |
+
|
| 54 |
+
api = HfApi(token=_token())
|
| 55 |
+
# `filter=` matches the `other:harbor` tag used by the Hub UI.
|
| 56 |
+
kwargs: dict = {"filter": _HARBOR_TAG, "limit": limit}
|
| 57 |
+
if sort in ("downloads", "likes", "lastModified", "trendingScore"):
|
| 58 |
+
kwargs["sort"] = sort
|
| 59 |
+
if query:
|
| 60 |
+
kwargs["search"] = query
|
| 61 |
+
out: list[HubDataset] = []
|
| 62 |
+
for d in api.list_datasets(**kwargs):
|
| 63 |
+
lm = getattr(d, "last_modified", None)
|
| 64 |
+
out.append(HubDataset(
|
| 65 |
+
id=d.id,
|
| 66 |
+
downloads=int(getattr(d, "downloads", 0) or 0),
|
| 67 |
+
likes=int(getattr(d, "likes", 0) or 0),
|
| 68 |
+
updated=lm.isoformat() if lm else None,
|
| 69 |
+
private=bool(getattr(d, "private", False)),
|
| 70 |
+
))
|
| 71 |
+
return out
|
| 72 |
+
|
| 73 |
+
|
| 74 |
+
# task-id memo: {(id, rev): (ids, ts)} — derived from a shallow tree listing,
|
| 75 |
+
# never a download. Short TTL so freshly-pushed tasks still surface.
|
| 76 |
+
_TASKS_CACHE: dict[tuple[str, str], tuple[list[str], float]] = {}
|
| 77 |
+
_TASKS_TTL = 120.0 # seconds
|
| 78 |
+
|
| 79 |
+
|
| 80 |
+
def _is_dir(entry) -> bool:
|
| 81 |
+
return entry.__class__.__name__ == "RepoFolder"
|
| 82 |
+
|
| 83 |
+
|
| 84 |
+
def list_hf_tasks(dataset_id: str, revision: str | None = None, *, ttl: float = _TASKS_TTL) -> list[str]:
|
| 85 |
+
"""Task ids in a Hub dataset WITHOUT downloading it.
|
| 86 |
+
|
| 87 |
+
Uses *shallow* tree listings so even 2k-task datasets resolve in one or two API calls
|
| 88 |
+
instead of walking every file: if a top-level `tasks/` folder exists we list
|
| 89 |
+
its immediate children (Repo2RLEnv's nested layout); otherwise we treat the
|
| 90 |
+
top-level folders as flat task dirs. This is the fix for huge datasets that
|
| 91 |
+
used to hang while the whole repo was enumerated/downloaded."""
|
| 92 |
+
key = (dataset_id, revision or "head")
|
| 93 |
+
now = time.time()
|
| 94 |
+
hit = _TASKS_CACHE.get(key)
|
| 95 |
+
if hit and (now - hit[1]) < ttl:
|
| 96 |
+
return hit[0]
|
| 97 |
+
|
| 98 |
+
from huggingface_hub import HfApi
|
| 99 |
+
|
| 100 |
+
api = HfApi(token=_token())
|
| 101 |
+
root = list(api.list_repo_tree(dataset_id, repo_type="dataset", revision=revision, recursive=False))
|
| 102 |
+
names = {e.path: e for e in root}
|
| 103 |
+
|
| 104 |
+
if "tasks" in names and _is_dir(names["tasks"]):
|
| 105 |
+
sub = api.list_repo_tree(dataset_id, "tasks", repo_type="dataset", revision=revision, recursive=False)
|
| 106 |
+
ids = sorted(e.path.split("/")[-1] for e in sub if _is_dir(e))
|
| 107 |
+
else:
|
| 108 |
+
# flat layout: top-level folders are the tasks (skip dotfiles/README/etc.)
|
| 109 |
+
ids = sorted(e.path for e in root if _is_dir(e) and not e.path.startswith("."))
|
| 110 |
+
|
| 111 |
+
_TASKS_CACHE[key] = (ids, now)
|
| 112 |
+
return ids
|
| 113 |
+
|
| 114 |
+
|
| 115 |
+
def count_tasks(dataset_id: str) -> int:
|
| 116 |
+
"""Number of Harbor tasks in a Hub dataset (shallow listing, memoised)."""
|
| 117 |
+
try:
|
| 118 |
+
return len(list_hf_tasks(dataset_id))
|
| 119 |
+
except Exception as exc: # noqa: BLE001
|
| 120 |
+
logger.warning("count_tasks(%s) failed: %s", dataset_id, exc)
|
| 121 |
+
return -1
|
viewer/load.py
CHANGED
|
@@ -139,16 +139,21 @@ def _fetch_hf(source: DatasetSource, force: bool) -> Path:
|
|
| 139 |
from huggingface_hub import snapshot_download
|
| 140 |
|
| 141 |
target = CACHE_ROOT / source.cache_key
|
| 142 |
-
|
| 143 |
-
|
| 144 |
return target
|
| 145 |
|
| 146 |
-
if target.exists():
|
| 147 |
-
shutil.rmtree(target)
|
| 148 |
target.mkdir(parents=True, exist_ok=True)
|
| 149 |
# Public datasets work without a token; private ones rely on $HF_TOKEN
|
| 150 |
# being set in the Space's secrets.
|
| 151 |
token = os.environ.get("HF_TOKEN") or None
|
| 152 |
snapshot_download(
|
| 153 |
repo_id=source.ident,
|
| 154 |
repo_type="dataset",
|
|
@@ -159,6 +164,52 @@ def _fetch_hf(source: DatasetSource, force: bool) -> Path:
|
|
| 159 |
return target
|
| 160 |
|
| 161 |
|
| 162 |
def _fetch_harbor(source: DatasetSource, force: bool) -> Path:
|
| 163 |
"""Shell out to `harbor datasets download` to fetch a Harbor-registry dataset.
|
| 164 |
|
|
|
|
| 139 |
from huggingface_hub import snapshot_download
|
| 140 |
|
| 141 |
target = CACHE_ROOT / source.cache_key
|
| 142 |
+
# Pinned revisions (tag/commit) are immutable → caching is always safe.
|
| 143 |
+
# Unpinned ("head") datasets MUST re-sync every load so we never show stale
|
| 144 |
+
# data — snapshot_download is etag-aware, so re-syncing only pulls files
|
| 145 |
+
# that actually changed (cheap). This is the fix for "doesn't show latest".
|
| 146 |
+
pinned = source.revision is not None
|
| 147 |
+
if not force and pinned and target.exists() and any(target.iterdir()):
|
| 148 |
+
logger.info("hf cache hit (pinned %s): %s", source.revision, target)
|
| 149 |
return target
|
| 150 |
|
|
|
|
|
|
|
| 151 |
target.mkdir(parents=True, exist_ok=True)
|
| 152 |
# Public datasets work without a token; private ones rely on $HF_TOKEN
|
| 153 |
# being set in the Space's secrets.
|
| 154 |
token = os.environ.get("HF_TOKEN") or None
|
| 155 |
+
logger.info("hf %s: %s@%s", "fetch" if pinned else "re-sync",
|
| 156 |
+
source.ident, source.revision or "head")
|
| 157 |
snapshot_download(
|
| 158 |
repo_id=source.ident,
|
| 159 |
repo_type="dataset",
|
|
|
|
| 164 |
return target
|
| 165 |
|
| 166 |
|
| 167 |
+
def fetch_hf_task(source: DatasetSource, task_id: str, *, force: bool = False) -> Path:
|
| 168 |
+
"""Download ONLY one task's files from an HF dataset (not the whole repo).
|
| 169 |
+
|
| 170 |
+
Snapshot-downloading a 2k-task dataset just to open one task is the slowness
|
| 171 |
+
the user hit; even `snapshot_download(allow_patterns=...)` still walks the
|
| 172 |
+
entire repo tree first. Instead we list just this task's subtree (one scoped
|
| 173 |
+
API call) and `hf_hub_download` each file. A handful of small files, no
|
| 174 |
+
full-repo walk. Files accumulate under one per-dataset cache dir so
|
| 175 |
+
revisiting is free. Returns a root that `load_task(root, task_id)` resolves
|
| 176 |
+
for either flat or nested layout.
|
| 177 |
+
"""
|
| 178 |
+
from huggingface_hub import HfApi, hf_hub_download
|
| 179 |
+
|
| 180 |
+
target = CACHE_ROOT / f"{source.cache_key}__bytask"
|
| 181 |
+
target.mkdir(parents=True, exist_ok=True)
|
| 182 |
+
token = os.environ.get("HF_TOKEN") or None
|
| 183 |
+
api = HfApi(token=token)
|
| 184 |
+
logger.info("hf per-task fetch: %s :: %s", source.ident, task_id)
|
| 185 |
+
|
| 186 |
+
# Resolve the task's directory in the repo: nested (`tasks/<id>`) first, then flat.
|
| 187 |
+
# `list_repo_tree` is a generator, so the 404 for a non-existent prefix only
|
| 188 |
+
# fires while iterating — force it inside the try (via list()) so we fall
|
| 189 |
+
# through to the other layout instead of bubbling the error up.
|
| 190 |
+
files: list[str] = []
|
| 191 |
+
for prefix in (f"tasks/{task_id}", task_id):
|
| 192 |
+
try:
|
| 193 |
+
entries = list(api.list_repo_tree(
|
| 194 |
+
source.ident, prefix, repo_type="dataset",
|
| 195 |
+
revision=source.revision, recursive=True,
|
| 196 |
+
))
|
| 197 |
+
except Exception: # noqa: BLE001 — path doesn't exist in this layout
|
| 198 |
+
continue
|
| 199 |
+
files = [e.path for e in entries if getattr(e, "size", None) is not None]
|
| 200 |
+
if files:
|
| 201 |
+
break
|
| 202 |
+
if not files:
|
| 203 |
+
raise FileNotFoundError(f"task {task_id!r} not found in {source.ident}")
|
| 204 |
+
|
| 205 |
+
for f in files:
|
| 206 |
+
hf_hub_download(
|
| 207 |
+
repo_id=source.ident, repo_type="dataset", revision=source.revision,
|
| 208 |
+
filename=f, local_dir=str(target), token=token,
|
| 209 |
+
)
|
| 210 |
+
return target
|
| 211 |
+
|
| 212 |
+
|
| 213 |
def _fetch_harbor(source: DatasetSource, force: bool) -> Path:
|
| 214 |
"""Shell out to `harbor datasets download` to fetch a Harbor-registry dataset.
|
| 215 |
|