---
title: bucket-sqlite-probe-docker
emoji: 🐳
colorFrom: red
colorTo: gray
sdk: docker
app_port: 7860
pinned: false
tags:
- bucket
- sqlite
- probe
- reference
---
# Bucket mount × SQLite probe: Docker SDK Space (UID 1000)
**This is the failing half of a matched pair.** The Gradio SDK half is at [`davanstrien/bucket-sqlite-probe-gradio`](https://huggingface.co/spaces/davanstrien/bucket-sqlite-probe-gradio), and all of its probes pass. This Space runs the *same probe code* against the *same bucket*, but inside a Docker SDK container that follows the official [`spaces-sdks-docker#permissions`](https://huggingface.co/docs/hub/spaces-sdks-docker#permissions) guidance: it creates a `user` account with `uid=1000` and switches to it via `USER user` in the Dockerfile. Its write probes fail with `Permission denied` / `unable to open database file`.
The Dockerfile is intentionally as close to the official permissions example as possible. The only deviations are the `CMD` (which runs `app.py`) and an explicit `chown` / `chmod 777` step on `/data`, included to prove that build-time permissions are overridden by the runtime mount.
## What it demonstrates
With bucket `davanstrien/search-v2-chroma` attached R/W at `/data`:
- The container runs as `uid=1000(user)`, exactly as `spaces-sdks-docker#permissions` recommends.
- `/data` is mounted by `hf-mount` with `idmapped,user_id=0,group_id=0,default_permissions`. That id-mapping pins the mount's writable UID to **0 (root)**, not the container's UID 1000.
- `ls -lan /data` shows `drwxr-xr-x 3 65534 65534` (nobody:nogroup, mode 755). The `chmod 777` in the Dockerfile is silently overridden: the runtime mount replaces the build-time directory entirely.
- Write probes all fail:
```
FAIL touch /data/docker_probe_touch: PermissionError: [Errno 13] Permission denied: '/data/docker_probe_touch'
FAIL sqlite3 connect + CREATE + INSERT: OperationalError: unable to open database file
FAIL sqlite3 journal_mode=DELETE: OperationalError: unable to open database file
FAIL fcntl.flock LOCK_EX|LOCK_NB: PermissionError: [Errno 13] Permission denied: '/data/docker_probe.lock'
```
- The control probe (`touch $HOME/app/control_probe` on the container's build-time writable dir) succeeds, so the container is healthy and UID 1000 can write *somewhere*, just not to the bucket mount.
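The probes listed above can be approximated with a short, standard-library-only script. This is a minimal sketch, not the Space's actual `app.py`: the function name `run_probes`, the probe file names, and the result format are assumptions chosen to mirror the log lines shown.

```python
import fcntl
import os
import sqlite3


def run_probes(directory):
    """Run four write probes in `directory`.

    Returns a list of (label, ok, detail) tuples; `detail` holds the
    exception type and message when a probe fails.
    """
    results = []

    def record(label, probe):
        try:
            probe()
            results.append((label, True, ""))
        except Exception as exc:  # keep going so every probe is reported
            results.append((label, False, f"{type(exc).__name__}: {exc}"))

    def touch():
        # Equivalent of `touch`: create the file if it does not exist.
        with open(os.path.join(directory, "probe_touch"), "a"):
            pass

    def sqlite_write():
        conn = sqlite3.connect(os.path.join(directory, "probe.db"))
        try:
            conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (1)")
            conn.commit()
        finally:
            conn.close()

    def sqlite_journal_delete():
        conn = sqlite3.connect(os.path.join(directory, "probe_journal.db"))
        try:
            # Rollback-journal mode, which creates a sidecar file on write.
            conn.execute("PRAGMA journal_mode=DELETE")
            conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
            conn.commit()
        finally:
            conn.close()

    def flock():
        with open(os.path.join(directory, "probe.lock"), "w") as fh:
            # Non-blocking exclusive lock, released straight away.
            fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl.flock(fh, fcntl.LOCK_UN)

    record("touch", touch)
    record("sqlite3 connect + CREATE + INSERT", sqlite_write)
    record("sqlite3 journal_mode=DELETE", sqlite_journal_delete)
    record("fcntl.flock LOCK_EX|LOCK_NB", flock)
    return results
```

Calling `run_probes("/data")` as the container user and printing each tuple gives one pass/fail line per probe; running it against a directory the user owns serves as the control.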
## Why this matters
1. The Docker Spaces docs explicitly tell you to run as UID 1000. There's no note that this is incompatible with Storage Bucket mounts.
2. The Storage Buckets blog post ([buckets as working layer](https://huggingface.co/blog/davanstrien/buckets-as-working-layer)) implies buckets are a drop-in R/W volume for Spaces.
3. These two bits of guidance are silently incompatible today because the FUSE mount is provisioned with `user_id=0,group_id=0`. A root container sees the mount as writable; a UID 1000 container does not.
4. Any tool that keeps an embedded on-disk database (SQLite itself, SQLite-backed ChromaDB, persistent DuckDB, LMDB, RocksDB) and is built on a Docker SDK Space following the official permissions guidance will fail to open its database on a bucket mount. `huggingface-datasets-search-v2` hit this; [trackio](https://github.com/gradio-app/trackio) doesn't because Gradio SDK Spaces run as root.
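To check whether a given container is in this situation, one option is to compare the process UID with the `user_id` option recorded for the mount. The sketch below parses text in the `/proc/self/mounts` format. The helper names are hypothetical, and the sample line is illustrative: it is assembled from the options quoted earlier, not copied from a real Space.

```python
def mount_options(mounts_text, mount_point):
    """Return the options of the last entry mounted at `mount_point`, or None.

    Flag-style options such as "rw" map to True; "key=value" options map
    to the value string. Escaped characters in mount points (e.g. spaces
    written as \\040) are not handled in this sketch.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        opts = {}
        for item in fields[3].split(","):
            key, sep, value = item.partition("=")
            opts[key] = value if sep else True
        found = opts  # a later entry shadows an earlier one
    return found


def owner_mismatch(opts, uid):
    """True if the mount records a `user_id` that differs from `uid`."""
    user_id = opts.get("user_id")
    return user_id is not None and int(user_id) != uid


# Hypothetical sample line, for illustration only.
SAMPLE = "hf-mount /data fuse rw,user_id=0,group_id=0,default_permissions 0 0\n"
sample_opts = mount_options(SAMPLE, "/data")
```

In a running container, the same helpers could be fed the contents of `/proc/self/mounts` and `os.getuid()`; with the sample line, `owner_mismatch(sample_opts, 1000)` is true and `owner_mismatch(sample_opts, 0)` is false.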
## The fix (almost certainly)
The mount provisioning layer should either:
1. Mount with `user_id=1000,group_id=1000` for Docker SDK Spaces (the conventional UID), or
2. Mount with the Space's runtime UID dynamically (cleanest), or
3. Chmod the mount root to `0777` at provisioning time (hacky but works for any UID).
All three are infra-side changes. The container user can't fix this: any `chown` / `chmod` in the Dockerfile is overridden when the mount is attached at runtime.
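Until such a change lands, an application can at least detect the condition at startup and stop with a clear message instead of a bare `unable to open database file`. The following is a minimal sketch; `ensure_writable` is a hypothetical helper, not part of any Hugging Face library, and it only reports the problem without working around it.

```python
import os
import tempfile


def ensure_writable(directory):
    """Raise RuntimeError with context if `directory` cannot be written to."""
    try:
        # Attempt a real write: the temporary file is created inside
        # `directory` and deleted again when the context manager exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError as exc:
        raise RuntimeError(
            f"{directory} is not writable by uid={os.getuid()} "
            f"({type(exc).__name__}: {exc})"
        ) from exc
```

Calling `ensure_writable("/data")` before opening any database turns the failure into a single, explicit error that names the path and the UID involved.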
## Reproducing
1. Fork or duplicate this Space.
2. Create or choose a Storage Bucket you own.
3. Attach it R/W at `/data` via **Space settings → Volumes** (UI) or via:
```python
from huggingface_hub import HfApi, Volume  # requires huggingface_hub >= 1.9.1

# Attach the bucket read/write at /data on your copy of this Space.
HfApi().set_space_volumes(
    "your-namespace/bucket-sqlite-probe-docker",
    volumes=[
        Volume(type="bucket", source="your-namespace/your-bucket", mount_path="/data"),
    ],
)
```
4. Restart. Probe output appears in startup logs and at the Space URL.
## Related
- Matched Gradio SDK probe Space (passing): https://huggingface.co/spaces/davanstrien/bucket-sqlite-probe-gradio
- `gradio-app/trackio#465`: trackio's PR switching to the bucket backend (works because Gradio SDK ⇒ root)
- `huggingface/huggingface_hub#4054`: `set_space_volumes` payload fix (required for this Space to attach the bucket)
- Docker Spaces permissions docs: https://huggingface.co/docs/hub/spaces-sdks-docker#permissions
- Blog post: [Buckets as working layer](https://huggingface.co/blog/davanstrien/buckets-as-working-layer)
---
Throwaway investigative Space. Kept public as a reference example. Do not rely on it in production.