| # 🏗️ DocGenie Architecture & Dependency Resolution |
|
|
| ## 📦 Package Structure |
|
|
| ``` |
| docgenie/                          ← Root monorepo |
| ├── docgenie/                      ← Core package (importable) |
| │   ├── __init__.py |
| │   ├── generation/                ← Used by API |
| │   │   ├── pipeline_01/ |
| │   │   │   └── claude_batching.py ← ClaudeBatchedClient |
| │   │   ├── pipeline_03/ |
| │   │   ├── pipeline_04/ |
| │   │   └── utils/ |
| │   ├── evaluation/ |
| │   └── utils/ |
| │ |
| ├── api/                           ← API Service (imports docgenie.*) |
| │   ├── main.py                    from docgenie import ENV |
| │   ├── worker.py                  from docgenie.generation.pipeline_01... |
| │   ├── utils.py                   from docgenie.generation... |
| │   └── requirements.txt           Extra: Redis, Supabase, Google |
| │ |
| ├── handwriting_service/           ← GPU Service (NO docgenie imports!) |
| │   ├── main.py                    ← Self-contained |
| │   ├── inference.py               ← No external deps |
| │   └── models.py |
| │ |
| └── WordStylist/                   ← Model code (used by handwriting) |
| ``` |
|
|
| ## 🔗 Dependency Graph |
|
|
| ``` |
| ┌────────────────────────────────────────────────────────────┐ |
| │                        API Service                         │ |
| │  ┌──────────────────────────────────────────────────────┐  │ |
| │  │ api/main.py                                          │  │ |
| │  │     ↓ imports                                        │  │ |
| │  │ api/utils.py (call_claude_api_direct)                │  │ |
| │  └──────────────────────────────────────────────────────┘  │ |
| │                                                            │ |
| │  ┌──────────────────────────────────────────────────────┐  │ |
| │  │ api/worker.py                                        │  │ |
| │  │     ↓ imports                                        │  │ |
| │  │ from docgenie.generation.pipeline_01.claude_batching │  │ |
| │  │ from docgenie.generation.constants                   │  │ |
| │  │ from docgenie.generation.pipeline_03_process_response│  │ |
| │  │ from docgenie.generation.pipeline_04_render_pdf...   │  │ |
| │  │ from docgenie import ENV                             │  │ |
| │  └──────────────────────────────────────────────────────┘  │ |
| │                             ↓                              │ |
| │                          REQUIRES                          │ |
| │  ┌──────────────────────────────────────────────────────┐  │ |
| │  │                  docgenie/ package                   │  │ |
| │  │              (entire generation module)              │  │ |
| │  └──────────────────────────────────────────────────────┘  │ |
| └────────────────────────────────────────────────────────────┘ |
| |
| ┌────────────────────────────────────────────────────────────┐ |
| │                    Handwriting Service                     │ |
| │  ┌──────────────────────────────────────────────────────┐  │ |
| │  │ handwriting_service/main.py                          │  │ |
| │  │     ↓ imports                                        │  │ |
| │  │ from handwriting_service.inference import ...        │  │ |
| │  │ from handwriting_service.models import ...           │  │ |
| │  └──────────────────────────────────────────────────────┘  │ |
| │                             ↓                              │ |
| │                          REQUIRES                          │ |
| │  ┌──────────────────────────────────────────────────────┐  │ |
| │  │                  WordStylist/ model                  │  │ |
| │  │                (diffusion model code)                │  │ |
| │  └──────────────────────────────────────────────────────┘  │ |
| │                                                            │ |
| │  ✓ NO docgenie imports - completely independent!           │ |
| └────────────────────────────────────────────────────────────┘ |
| ``` |
|
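The claim that the handwriting service never imports `docgenie` can be checked mechanically. The sketch below is one possible way to do that with the standard `ast` module; the function names and the sample source strings (including the placeholder symbol `X`) are illustrative assumptions, not code from the repository.

```python
import ast


def imported_top_level_packages(source: str) -> set[str]:
    """Return the top-level package names imported by a Python source string."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped
            names.add(node.module.split(".")[0])
    return names


def imports_package(source: str, package: str) -> bool:
    """True if the source imports `package` or any of its submodules."""
    return package in imported_top_level_packages(source)


# Hypothetical snippets standing in for api/worker.py and handwriting_service/main.py
api_worker = "from docgenie.generation.constants import X\nfrom docgenie import ENV\n"
handwriting_main = "from handwriting_service.inference import HandwritingGenerator\n"

print(imports_package(api_worker, "docgenie"))
print(imports_package(handwriting_main, "docgenie"))
```

In practice the same function could be run over every `.py` file under `handwriting_service/` as a CI check, since the source is parsed only and never executed.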
|
| ## 🐳 Docker Build Strategy |
|
|
| ### ❌ What Doesn't Work |
|
|
| ```dockerfile |
| # ❌ WRONG: copies only the api/ folder |
| FROM python:3.11 |
| WORKDIR /app/api |
| # The docgenie package is never copied into the image |
| COPY api/ /app/api/ |
| RUN pip install -r requirements.txt |
| # Startup fails: ModuleNotFoundError: No module named 'docgenie' |
| CMD ["uvicorn", "main:app"] |
| ``` |
|
|
| ### ✅ What Works |
|
|
| ```dockerfile |
| # ✅ CORRECT: Copy entire monorepo |
| FROM python:3.11 |
| WORKDIR /app |
| |
| # Copy everything |
| COPY . . |
| |
| # Install docgenie as an editable package so docgenie.* is importable |
| RUN pip install -e . |
| |
| # Install API requirements |
| RUN pip install -r api/requirements.txt |
| |
| # Run from api/ so uvicorn finds main:app; docgenie imports still resolve |
| WORKDIR /app/api |
| CMD ["uvicorn", "main:app"] |
| ``` |
|
|
| ## 🚢 Deployment Strategy Comparison |
|
|
| ### Option 1: Separate Deployments (❌ Won't Work) |
|
|
| ``` |
| API Deployment: |
| ├── api/ folder only |
| └── ❌ Missing docgenie package → ImportError |
| |
| Handwriting Deployment: |
| ├── handwriting_service/ folder |
| └── WordStylist/ |
| ``` |
|
|
| **Problem:** The API image contains only `api/`, so every `docgenie.*` import fails with an ImportError. |
|
|
| ### Option 2: Monorepo Deployment (✅ Works) |
|
|
| ``` |
| API Deployment: |
| ├── docgenie/ package (core) |
| ├── api/ service (imports docgenie) |
| ├── setup.py |
| └── requirements.txt |
| |
| Handwriting Deployment: |
| ├── handwriting_service/ |
| └── WordStylist/ |
| ``` |
|
|
| **Solution:** Deploy the entire repo for the API, and deploy the handwriting service standalone. |
|
|
| ## 📁 File Structure in Containers |
|
|
| ### API Container (Railway/EC2) |
| ``` |
| /app/ |
| ├── docgenie/          ← Installed as Python package |
| │   ├── __init__.py |
| │   ├── generation/ |
| │   └── utils/ |
| ├── api/               ← Working directory |
| │   ├── main.py |
| │   ├── worker.py |
| │   └── utils.py |
| ├── setup.py |
| └── pyproject.toml |
| |
| Python can import: |
| ✓ from docgenie.generation.pipeline_01 import ... |
| ✓ from docgenie import ENV |
| ``` |
|
|
| ### Handwriting Container (RunPod) |
| ``` |
| /app/ |
| ├── handwriting_service/ |
| │   ├── main.py        ← No docgenie imports! |
| │   ├── inference.py |
| │   └── models.py |
| └── WordStylist/       ← Model code |
|     ├── ldm/ |
|     └── wordstylist_inference.py |
| |
| Python can import: |
| ✓ from handwriting_service.inference import ... |
| ✓ No docgenie dependencies needed! |
| ``` |
|
|
| ## 🎯 Import Resolution Flow |
|
|
| ### API Service Import Chain |
|
|
| 1. **FastAPI starts:** |
| ```bash |
| uvicorn main:app |
| ``` |
|
|
| 2. **main.py imports utils:** |
| ```python |
| from api.utils import call_claude_api_direct |
| ``` |
|
|
| 3. **utils.py imports docgenie:** |
| ```python |
| from docgenie.generation.pipeline_01.claude_batching import ClaudeBatchedClient |
| ``` |
|
|
| 4. **Python looks for docgenie:** |
| - Checks each entry on `sys.path` |
| - Finds `/app`, the source tree that `pip install -e .` registered |
| - Loads `/app/docgenie/__init__.py` |
| - ✓ Import succeeds! |
|
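To see where Python would actually load a package from, the lookup described above can be reproduced with `importlib.util.find_spec`. This is a minimal sketch; `locate_package` is a hypothetical helper name, and it is meant for top-level package names only.

```python
import importlib.util


def locate_package(name: str):
    """Return the file a top-level package would load from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# A standard-library package always resolves
print(locate_package("json"))

# Inside the API container this is expected to point under /app;
# anywhere docgenie is not installed it prints None
print(locate_package("docgenie"))
```

A `None` result for `docgenie` inside the API container would suggest the `pip install -e .` step did not run or ran in a different environment.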
|
| ### Handwriting Service Import Chain |
|
|
| 1. **FastAPI starts:** |
| ```bash |
| uvicorn main:app |
| ``` |
|
|
| 2. **main.py imports local modules:** |
| ```python |
| from handwriting_service.inference import HandwritingGenerator |
| ``` |
|
|
| 3. **inference.py imports WordStylist:** |
| ```python |
| import sys |
| from pathlib import Path |
| sys.path.insert(0, str(Path(__file__).parent.parent / "WordStylist")) |
| from ldm.models.diffusion.ddpm import LatentDiffusion |
| ``` |
|
|
| 4. **Python loads local modules:** |
| - No dependency on the docgenie package |
| - ✓ Completely self-contained! |
|
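The `sys.path.insert` step is what lets `inference.py` import model code that is not an installed package. The sketch below reproduces the mechanism with a throwaway directory; `vendored_models` and `toy_model` are made-up stand-ins for `WordStylist/` and its modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the WordStylist/ directory: a folder holding a top-level module
    vendor_dir = Path(tmp) / "vendored_models"
    vendor_dir.mkdir()
    (vendor_dir / "toy_model.py").write_text("NAME = 'toy diffusion model'\n")

    # Prepend the directory so its modules can be imported by bare name
    sys.path.insert(0, str(vendor_dir))
    importlib.invalidate_caches()
    try:
        toy_model = importlib.import_module("toy_model")
        print(toy_model.NAME)
    finally:
        # Undo the path change so the demo leaves sys.path as it found it
        sys.path.remove(str(vendor_dir))
```

Inserting at index 0 gives the vendored directory priority over installed packages with the same name, which is convenient for bundled model code but can shadow other modules unexpectedly.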
|
| ## 🔍 Verifying Imports |
|
|
| ### Test API Imports |
| ```bash |
| # Inside API container |
| python3 -c "from docgenie.generation.pipeline_01.claude_batching import ClaudeBatchedClient; print('✓ Import works')" |
| ``` |
|
|
| ### Test Handwriting Imports |
| ```bash |
| # Inside handwriting container |
| python3 -c "from handwriting_service.inference import HandwritingGenerator; print('✓ Import works')" |
| ``` |
|
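When several modules need checking at once, the one-liners above can be folded into a small smoke test. The following is a sketch only; `check_imports` is a hypothetical helper, and the module list should be adjusted to whatever the container is expected to provide.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: error message} for the ones that fail."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            failures[name] = str(exc)
    return failures


# Module path taken from the import chain above; "json" is a control that always imports
required = ["json", "docgenie.generation.pipeline_01.claude_batching"]

failures = check_imports(required)
for name, error in failures.items():
    print(f"FAILED {name}: {error}")
```

Only `ImportError` (and its subclass `ModuleNotFoundError`) is caught here, so a module that imports but crashes for another reason, such as a missing environment variable, still surfaces as a traceback.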
|
| ## 💡 Key Insights |
|
|
| 1. **API needs monorepo:** The API image must contain the whole repository so the `docgenie/` package is available |
| 2. **Handwriting is independent:** Can deploy just `handwriting_service/` + `WordStylist/` |
| 3. **Docker layer caching:** Install docgenie package first, then API requirements |
| 4. **Working directory matters:** Set WORKDIR to `/app/api` for the API service so `uvicorn main:app` finds `main.py` |
| 5. **Python package installation:** `pip install -e .` makes `docgenie` importable from any working directory |
|
|
| ## 📊 Deployment Size Comparison |
|
|
| | Deployment | Size | Contents | |
| |------------|------|----------| |
| | API (Railway) | ~2GB | Python 3.11 + docgenie + API deps + Playwright | |
| | Worker (Railway) | ~2GB | Same as API (shares image) | |
| | Handwriting (RunPod) | ~8GB | CUDA 11.8 + PyTorch + Diffusers + WordStylist | |
|
|
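| **Total:** ~12GB across the three deployments. The API and Worker share one image, so only ~10GB of distinct images are built, and each image is cached independently. |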
|
|
| ## ✅ Checklist for Successful Deployment |
|
|
| - [ ] Dockerfile copies **entire monorepo** for API |
| - [ ] `pip install -e .` runs before API requirements |
| - [ ] WORKDIR set to /app/api for runtime |
| - [ ] Handwriting Dockerfile copies only handwriting_service/ + WordStylist/ |
| - [ ] .dockerignore excludes data/ folders (too large) |
| - [ ] Environment variables set in Railway/EC2 |
| - [ ] Redis URL points to Upstash |
| - [ ] HANDWRITING_SERVICE_URL points to RunPod endpoint |
| |
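The environment-variable items in the checklist can be verified at startup rather than discovered at the first failed request. This is a minimal sketch: `HANDWRITING_SERVICE_URL` comes from the checklist, while `REDIS_URL` is an assumed variable name and `missing_env_vars` is a hypothetical helper.

```python
import os

# HANDWRITING_SERVICE_URL is named in the checklist; REDIS_URL is an assumed name
REQUIRED_ENV_VARS = ("REDIS_URL", "HANDWRITING_SERVICE_URL")


def missing_env_vars(required=REQUIRED_ENV_VARS, environ=None):
    """Return the names in `required` that are unset or blank in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.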
| ## 🎉 Result |
| |
| ``` |
| ✓ API can import from docgenie package |
| ✓ Worker can use ClaudeBatchedClient |
| ✓ Handwriting service runs independently |
| ✓ All services communicate via HTTP |
| ✓ No more ImportError! |
| ``` |
| |