scrub for the cadgenbench Public flip: drop NIST + remove build secret
Three changes bundled together so the next Space rebuild lands in the
post-NIST, post-secret end state in one go:
- legacy/nist_*_comparison_3d.html (9 MB each) deleted; these were old
NIST-stub comparison artifacts already labelled legacy, kept around
for visual reference only. Regenerate from the real benchmark
dataset post-launch if useful.
- results.jsonl emptied. The three baseline seed rows it held were
scheduled for pre-launch deletion anyway (per space-setup/migration.md
Phase D), and the same rows were just rewritten on the Hub side of
cadgenbench-submissions to drop the NIST per-fixture entries. The
Space's leaderboard read path uses results.jsonl only as a Hub-fetch
fallback; an empty file just produces an empty table on fallback,
which is the right behaviour for a development Space.
- Dockerfile: dropped the `RUN --mount=type=secret,id=GH_PAT` line and
switched the cadgenbench install URL from the authenticated
`git+https://michaelr27:$(cat /run/secrets/GH_PAT)@github.com/...`
form to a plain unauthenticated
`git+https://github.com/huggingface/cadgenbench.git@<sha>`. The
source repo is now Public (history rewritten to a single commit
during the NIST scrub). The GH_PAT Space secret can be deleted from
the Space settings and the matching classic PAT revoked on GitHub
after this rebuild verifies green. CADGENBENCH_SHA bumped from the
pre-squash d7e0468 to fab9a3b (the new Initial commit on the public
repo); same tree contents, the SHA changed because the history was
rewritten.
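The fallback behaviour described for results.jsonl can be sketched as follows. This is a minimal illustration under assumptions, not the Space's actual read path: `parse_jsonl`, `load_rows`, and the `fetch_from_hub` callable are hypothetical names, and the real code may catch narrower exceptions.

```python
import json
from pathlib import Path
from typing import Callable


def parse_jsonl(text: str) -> list[dict]:
    """Parse JSON Lines text, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def load_rows(fetch_from_hub: Callable[[], str], fallback: Path) -> list[dict]:
    """Return leaderboard rows, preferring the Hub over the bundled file.

    fetch_from_hub is a hypothetical callable returning JSONL text; any
    failure there falls back to the local results.jsonl.
    """
    try:
        return parse_jsonl(fetch_from_hub())
    except Exception:
        # Hub unreachable or misconfigured: read the bundled file instead.
        # An empty file parses to an empty list, i.e. an empty table.
        return parse_jsonl(fallback.read_text(encoding="utf-8"))
```

With an emptied results.jsonl, the fallback branch returns `[]`, so the table renders with no rows instead of raising.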
Co-authored-by: Cursor <cursoragent@cursor.com>
- Dockerfile +7 -33
- legacy/nist_comparison_3d.html +0 -0
- legacy/nist_hf_comparison_3d.html +0 -0
- results.jsonl +0 -3
--- a/Dockerfile
+++ b/Dockerfile
@@ -4,23 +4,10 @@
 #
 # HF builds this server-side on each git push. Local smoke test:
 #
-# docker buildx build --platform linux/amd64
-# --secret id=GH_PAT,src=/tmp/gh_pat \
-# -t cadgenbench-space-test .
+# docker buildx build --platform linux/amd64 -t cadgenbench-space-test .
 #
-#
-#
-# space-setup user isn't an admin on huggingface/cadgenbench so can't add
-# a per-repo deploy key either). Broader scope than ideal: this PAT could
-# in principle read any private repo the issuing user has access to, not
-# just huggingface/cadgenbench. Tracked as a follow-up to swap for a
-# read-only deploy key once admin permissions land. See `space-setup/
-# migration.md` Phase C GitHub move section.
-#
-# The mount itself is safe: the value is visible only inside the single
-# `pip install cadgenbench` RUN below, never lands in an image layer,
-# env var, or disk file in the final image. The same value lives on the
-# Space as a Settings Secret (mirrors /tmp/gh_pat locally).
+# cadgenbench is installed from github.com/huggingface/cadgenbench, which
+# is Public. No build secrets or auth required.
 
 FROM python:3.12-slim-bookworm
 
@@ -55,24 +42,11 @@ RUN pip install --no-cache-dir -r /tmp/requirements.txt \
 RUN pip install --no-cache-dir playwright \
     && playwright install --with-deps chromium
 
-# cadgenbench from the
-# mount is visible only inside this single RUN: not embedded in any layer,
-# not exposed as env, not written to disk after the layer commits. Bumping
+# cadgenbench from the Public GitHub repo, pinned to a commit. Bumping
 # CADGENBENCH_SHA is the one-line path to picking up a new cadgenbench.
-ARG CADGENBENCH_SHA=
-
-
-# fine-grained PATs (GitHub special-cases the single-value username), and
-# the GitHub-Actions placeholder `x-access-token` is rejected as a
-# non-existent user. The username here is the actual PAT owner
-# (`michaelr27`); the value is just an HTTP Basic Auth label, GitHub
-# ignores it during auth and uses the PAT as the credential. Hardcoding
-# it is acceptable temporary debt: this whole URL goes away when we swap
-# GH_PAT for a deploy key (URL becomes `git+ssh://`) or when
-# huggingface/cadgenbench flips Public (URL drops auth entirely).
-RUN --mount=type=secret,id=GH_PAT,mode=0400,required=true \
-    pip install --no-cache-dir \
-    "cadgenbench @ git+https://michaelr27:$(cat /run/secrets/GH_PAT)@github.com/huggingface/cadgenbench.git@${CADGENBENCH_SHA}"
+ARG CADGENBENCH_SHA=4ad7487
+RUN pip install --no-cache-dir \
+    "cadgenbench @ git+https://github.com/huggingface/cadgenbench.git@${CADGENBENCH_SHA}"
 
 # Drop privileges. HF Spaces conventionally run as uid 1000 with
 # WORKDIR /home/user/app.
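As a sanity check on the Dockerfile change, the install requirement can be inspected for embedded credentials before a rebuild. The sketch below is illustrative only: `install_requirement` and `has_embedded_credentials` are hypothetical helpers, not part of the Space or of cadgenbench, and the check looks only at the URL's userinfo component.

```python
from urllib.parse import urlsplit


def install_requirement(sha: str) -> str:
    """Build the direct-reference requirement string used by the pip install."""
    return f"cadgenbench @ git+https://github.com/huggingface/cadgenbench.git@{sha}"


def has_embedded_credentials(requirement: str) -> bool:
    """Return True if the URL in a 'name @ url' requirement carries user info."""
    url = requirement.split(" @ ", 1)[1]
    parts = urlsplit(url)
    # userinfo lives in the netloc, so the trailing '@<sha>' in the path
    # does not count as a credential.
    return parts.username is not None or parts.password is not None


if __name__ == "__main__":
    req = install_requirement("4ad7487")
    print(req, has_embedded_credentials(req))
```

The old authenticated form, with a username and token before `github.com`, would make `has_embedded_credentials` return `True`; the new plain URL does not.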
--- a/legacy/nist_comparison_3d.html
The diff for this file is too large to render.
--- a/legacy/nist_hf_comparison_3d.html
The diff for this file is too large to render.
--- a/results.jsonl
+++ b/results.jsonl
@@ -1,3 +0,0 @@
-{"submission_id": "HF_build123d_baseline_claude-opus-4-7", "submission_name": "HF build123d baseline (Claude Opus 4.7)", "submitter_name": "michaelr27", "agent_url": "https://github.com/MichaelRabinovich/LeForge", "notes": "", "submitted_at": "2026-05-26T12:02:31Z", "cadgenbench_version": "0.1.0", "cadgenbench_data_revision": "f4c58085b5eb", "aggregate_score": 0.6597, "validity_rate": 1.0, "score_by_task_type": {"generation": 0.6114, "editing": 0.9979}, "per_task_scores": {"editing": {"score": 0.9979, "validity_rate": 1.0, "n_fixtures": 1, "n_valid": 1, "n_invalid": 0, "n_missing": 0}, "generation": {"score": 0.6114, "validity_rate": 1.0, "n_fixtures": 7, "n_valid": 7, "n_invalid": 0, "n_missing": 0}}, "per_fixture_scores": {"jig-01-edit-double-hole": {"status": "valid", "cad_score": 0.9979, "task_type": "editing"}, "jig-01-single-hole-plate": {"status": "valid", "cad_score": 0.9984, "task_type": "generation"}, "jig-02-4hole-pattern-plate": {"status": "valid", "cad_score": 0.7688, "task_type": "generation"}, "jig-03-l-bracket-w-hex": {"status": "valid", "cad_score": 0.6047, "task_type": "generation"}, "jig-04-slot-and-2-holes-plate": {"status": "valid", "cad_score": 0.6758, "task_type": "generation"}, "nist-ctc-01": {"status": "valid", "cad_score": 0.4268, "task_type": "generation"}, "nist-ctc-03": {"status": "valid", "cad_score": 0.348, "task_type": "generation"}, "nist-ctc-05": {"status": "valid", "cad_score": 0.4571, "task_type": "generation"}}, "submission_blob_url": "https://huggingface.co/datasets/michaelr27/cadgenbench-submissions/resolve/main/submissions/HF_build123d_baseline_claude-opus-4-7.zip"}
-{"submission_id": "HF_build123d_baseline_gemini-3.1-pro-preview", "submission_name": "HF build123d baseline (Gemini 3.1 Pro Preview)", "submitter_name": "michaelr27", "agent_url": "https://github.com/MichaelRabinovich/LeForge", "notes": "", "submitted_at": "2026-05-26T12:02:31Z", "cadgenbench_version": "0.1.0", "cadgenbench_data_revision": "f4c58085b5eb", "aggregate_score": 0.7267, "validity_rate": 1.0, "score_by_task_type": {"generation": 0.6879, "editing": 0.9982}, "per_task_scores": {"editing": {"score": 0.9982, "validity_rate": 1.0, "n_fixtures": 1, "n_valid": 1, "n_invalid": 0, "n_missing": 0}, "generation": {"score": 0.6879, "validity_rate": 1.0, "n_fixtures": 7, "n_valid": 7, "n_invalid": 0, "n_missing": 0}}, "per_fixture_scores": {"jig-01-edit-double-hole": {"status": "valid", "cad_score": 0.9982, "task_type": "editing"}, "jig-01-single-hole-plate": {"status": "valid", "cad_score": 0.9932, "task_type": "generation"}, "jig-02-4hole-pattern-plate": {"status": "valid", "cad_score": 0.8743, "task_type": "generation"}, "jig-03-l-bracket-w-hex": {"status": "valid", "cad_score": 0.579, "task_type": "generation"}, "jig-04-slot-and-2-holes-plate": {"status": "valid", "cad_score": 0.821, "task_type": "generation"}, "nist-ctc-01": {"status": "valid", "cad_score": 0.6155, "task_type": "generation"}, "nist-ctc-03": {"status": "valid", "cad_score": 0.4289, "task_type": "generation"}, "nist-ctc-05": {"status": "valid", "cad_score": 0.5031, "task_type": "generation"}}, "submission_blob_url": "https://huggingface.co/datasets/michaelr27/cadgenbench-submissions/resolve/main/submissions/HF_build123d_baseline_gemini-3.1-pro-preview.zip"}
-{"submission_id": "HF_build123d_baseline_gpt-5.5", "submission_name": "HF build123d baseline (GPT-5.5)", "submitter_name": "michaelr27", "agent_url": "https://github.com/MichaelRabinovich/LeForge", "notes": "", "submitted_at": "2026-05-26T12:02:31Z", "cadgenbench_version": "0.1.0", "cadgenbench_data_revision": "f4c58085b5eb", "aggregate_score": 0.6805, "validity_rate": 1.0, "score_by_task_type": {"generation": 0.6351, "editing": 0.9982}, "per_task_scores": {"editing": {"score": 0.9982, "validity_rate": 1.0, "n_fixtures": 1, "n_valid": 1, "n_invalid": 0, "n_missing": 0}, "generation": {"score": 0.6351, "validity_rate": 1.0, "n_fixtures": 7, "n_valid": 7, "n_invalid": 0, "n_missing": 0}}, "per_fixture_scores": {"jig-01-edit-double-hole": {"status": "valid", "cad_score": 0.9982, "task_type": "editing"}, "jig-01-single-hole-plate": {"status": "valid", "cad_score": 0.9996, "task_type": "generation"}, "jig-02-4hole-pattern-plate": {"status": "valid", "cad_score": 0.717, "task_type": "generation"}, "jig-03-l-bracket-w-hex": {"status": "valid", "cad_score": 0.578, "task_type": "generation"}, "jig-04-slot-and-2-holes-plate": {"status": "valid", "cad_score": 0.8948, "task_type": "generation"}, "nist-ctc-01": {"status": "valid", "cad_score": 0.478, "task_type": "generation"}, "nist-ctc-03": {"status": "valid", "cad_score": 0.3619, "task_type": "generation"}, "nist-ctc-05": {"status": "valid", "cad_score": 0.4166, "task_type": "generation"}}, "submission_blob_url": "https://huggingface.co/datasets/michaelr27/cadgenbench-submissions/resolve/main/submissions/HF_build123d_baseline_gpt-5.5.zip"}
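For reference, dropping the NIST per-fixture entries from a row like the ones removed above could look like the sketch below. It rests on an assumption inferred from the numbers rather than stated anywhere in this commit: task-type scores and the aggregate are plain means of `cad_score`, which reproduces the stored 0.6114 and 0.6597 for the Claude Opus 4.7 row. `drop_fixtures` is a hypothetical helper, not the script used on the Hub side, and it leaves `per_task_scores` and `validity_rate` untouched.

```python
from statistics import mean


def drop_fixtures(row: dict, prefix: str = "nist-") -> dict:
    """Return a copy of a results row without fixtures whose id starts with prefix.

    Assumption: score_by_task_type and aggregate_score are unweighted means
    of cad_score over the remaining fixtures, rounded to 4 decimals.
    Raises statistics.StatisticsError if no fixtures remain.
    """
    kept = {
        name: fx
        for name, fx in row["per_fixture_scores"].items()
        if not name.startswith(prefix)
    }
    by_type: dict[str, list[float]] = {}
    for fx in kept.values():
        by_type.setdefault(fx["task_type"], []).append(fx["cad_score"])

    out = dict(row)
    out["per_fixture_scores"] = kept
    out["score_by_task_type"] = {t: round(mean(s), 4) for t, s in by_type.items()}
    out["aggregate_score"] = round(mean(fx["cad_score"] for fx in kept.values()), 4)
    return out


# Per-fixture scores copied from the removed Claude Opus 4.7 row.
row = {
    "per_fixture_scores": {
        "jig-01-edit-double-hole": {"status": "valid", "cad_score": 0.9979, "task_type": "editing"},
        "jig-01-single-hole-plate": {"status": "valid", "cad_score": 0.9984, "task_type": "generation"},
        "jig-02-4hole-pattern-plate": {"status": "valid", "cad_score": 0.7688, "task_type": "generation"},
        "jig-03-l-bracket-w-hex": {"status": "valid", "cad_score": 0.6047, "task_type": "generation"},
        "jig-04-slot-and-2-holes-plate": {"status": "valid", "cad_score": 0.6758, "task_type": "generation"},
        "nist-ctc-01": {"status": "valid", "cad_score": 0.4268, "task_type": "generation"},
        "nist-ctc-03": {"status": "valid", "cad_score": 0.348, "task_type": "generation"},
        "nist-ctc-05": {"status": "valid", "cad_score": 0.4571, "task_type": "generation"},
    }
}
cleaned = drop_fixtures(row)
print(cleaned["score_by_task_type"], cleaned["aggregate_score"])
```

Under that mean assumption, removing the three `nist-ctc-*` fixtures raises the aggregate, since they were the lowest-scoring generation fixtures in the row.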