MeshLex Research
Table of Contents
- Overview
- Current Status
- Repo Contents
- Core Hypothesis
- Model Architecture
- Experimental Results
- Data
- Quick Start
- Timeline
- License
Overview
A research project exploring whether 3D triangle meshes possess a finite, reusable "vocabulary" of local topological patterns, analogous to how BPE tokens form a vocabulary for natural language.
Instead of generating meshes face-by-face, MeshLex learns a codebook of ~4096 topology-aware patches (each covering 20-50 faces) and generates meshes by selecting, deforming, and assembling patches from this codebook. A 4000-face mesh becomes ~130 tokens, roughly 3x more compact than the state of the art (FACE, ICML 2026: ~400 tokens).
| | MeshMosaic | FreeMesh | FACE | MeshLex |
|---|---|---|---|---|
| Approach | Divide-and-conquer | BPE on coordinates | One-face-one-token | Topology patch codebook |
| Still per-face generation? | Yes | Yes | Yes | No |
| Has codebook? | No | Yes (coordinate-level) | No | Yes (topology-level) |
| Compression (4K faces) | N/A | ~300 tokens | ~400 tokens | ~130 tokens |
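The token counts in the table follow from simple arithmetic on patch size. The sketch below assumes each patch is encoded as exactly one codebook token (an assumption; deformation parameters or separators could add more) and uses a hypothetical helper, `patch_token_count`, to show how the 20-50 faces-per-patch range maps to tokens for a 4000-face mesh.

```python
import math

def patch_token_count(num_faces: int, faces_per_patch: float) -> int:
    """Tokens needed if every patch becomes exactly one codebook token (assumption)."""
    return math.ceil(num_faces / faces_per_patch)

NUM_FACES = 4000
for faces_per_patch in (20, 31, 35, 50):
    tokens = patch_token_count(NUM_FACES, faces_per_patch)
    print(f"{faces_per_patch:>2} faces/patch -> {tokens:>3} tokens")

# Ratios against the per-mesh token counts quoted in the table above.
MESHLEX_TOKENS = 130
for name, tokens in {"FreeMesh": 300, "FACE": 400}.items():
    print(f"{name}: {tokens / MESHLEX_TOKENS:.1f}x more tokens than MeshLex")
```

Under this assumption, ~130 tokens corresponds to an average of about 31 faces per patch, which sits inside the stated 20-50 range.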
Current Status
Feasibility validation COMPLETE: 4/4 experiments STRONG GO. Ready for formal experiment design.
| # | Experiment | Status | Result |
|---|---|---|---|
| 1 | A-stage × 5-Category | Done | STRONG GO (ratio 1.145x, util 46%) |
| 2 | A-stage × LVIS-Wide | Done | STRONG GO (ratio 1.019x, util 95.3%) |
| 3 | B-stage × 5-Category | Done | STRONG GO (ratio 1.185x, util 47%) |
| 4 | B-stage × LVIS-Wide | Done | STRONG GO (ratio 1.019x, util 94.9%) |
Key findings:
- More categories = dramatically better generalization: LVIS-Wide (1156 cat) ratio 1.019x vs 5-cat 1.145x, util 95% vs 46%
- Best result (Exp4): Same-cat CD 211.6, Cross-cat CD 215.8, a near-zero generalization gap
- SimVQ collapse fix successful: utilization 0.46% → 99%+ (217x improvement)
- B-stage multi-token KV decoder effective: reconstruction CD reduced 6.2%
Repo Contents
This HuggingFace repo stores checkpoints and processed datasets for reproducibility.
Checkpoints
| Experiment | Path | Description |
|---|---|---|
| Exp1 A-stage × 5cat | checkpoints/exp1_A_5cat/ | checkpoint_final.pt + training_history.json |
| Exp2 A-stage × LVIS-Wide | checkpoints/exp2_A_lvis_wide/ | checkpoint_final.pt + training_history.json |
| Exp3 B-stage × 5cat | checkpoints/exp3_B_5cat/ | checkpoint_final.pt + training_history.json |
| Exp4 B-stage × LVIS-Wide | checkpoints/exp4_B_lvis_wide/ | checkpoint_final.pt + training_history.json |
Data
| File / Directory | Size | Contents |
|---|---|---|
| data/meshlex_data.tar.gz | ~1.2 GB | All processed data in one archive (recommended) |
| data/patches/ | ~1.1 GB | NPZ patch files (5cat + LVIS-Wide splits) |
| data/meshes/ | ~931 MB | Preprocessed decimated OBJ files (5,497 meshes) |
| data/objaverse/ | ~2 MB | Download manifests |
The tar.gz archive contains patches, meshes, and manifests. Download and extract it to skip all preprocessing.
Core Hypothesis
Mesh local topology is low-entropy and universal across object categories. A finite codebook of ~4096 topology prototypes, combined with continuous deformation parameters, can reconstruct arbitrary meshes with high fidelity.
Model Architecture
The full model is a VQ-VAE with three modules:
Objaverse-LVIS GLB → Decimation (pyfqmr) → Normalize [-1,1]
→ METIS Patch Segmentation (~35 faces/patch)
→ PCA-aligned local coordinates
→ Face features (15-dim: vertices + normal + angles)
→ SAGEConv GNN Encoder → 128-dim embedding
→ SimVQ Codebook (K=4096, learnable reparameterization)
→ Cross-attention MLP Decoder → Reconstructed vertices
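The face-feature step in the pipeline above can be illustrated with a short NumPy sketch. The exact layout of the 15 dimensions is not spelled out here, so the breakdown below (9 vertex coordinates + 3 normal components + 3 interior angles) is an assumption that happens to add up to 15, and `face_features` is a hypothetical helper rather than the repository's function.

```python
import numpy as np

def face_features(vertices: np.ndarray, faces: np.ndarray) -> np.ndarray:
    """Return an (F, 15) array: 9 vertex coords + 3 normal + 3 interior angles per face.

    The 9 + 3 + 3 layout is an assumed decomposition of the 15-dim feature.
    """
    tri = vertices[faces].astype(float)          # (F, 3, 3) corner coordinates
    a, b, c = tri[:, 0], tri[:, 1], tri[:, 2]

    # Unit face normal from the cross product of two edges.
    normal = np.cross(b - a, c - a)
    normal = normal / (np.linalg.norm(normal, axis=1, keepdims=True) + 1e-12)

    def angle(p, q, r):
        """Interior angle at corner p, between edges p->q and p->r."""
        u, v = q - p, r - p
        denom = np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1) + 1e-12
        return np.arccos(np.clip(np.sum(u * v, axis=1) / denom, -1.0, 1.0))

    angles = np.stack([angle(a, b, c), angle(b, c, a), angle(c, a, b)], axis=1)
    return np.concatenate([tri.reshape(len(faces), 9), normal, angles], axis=1)

# Usage: a single right triangle in the z = 0 plane.
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
tris = np.array([[0, 1, 2]])
print(face_features(verts, tris).shape)
```

In the described pipeline these features would be computed after PCA alignment, so the vertex coordinates are expressed in each patch's local frame.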
- PatchEncoder: 4-layer SAGEConv GNN + global mean pooling → 128-dim z
- SimVQ Codebook: Frozen base C + learnable linear W, effective codebook CW = W(C). All 4096 entries share W's gradient, so no code is ever forgotten
- PatchDecoder: Cross-attention with learnable vertex queries → per-vertex xyz coordinates
- A-stage: Single KV token decoder (baseline)
- B-stage: 4 KV tokens decoder (improved reconstruction, resumed from A-stage)
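The SimVQ lookup described above can be sketched in a few lines of NumPy. This is a forward-only illustration, not the repository's implementation: the row-vector convention `C @ W`, the identity initialization of `W`, and the `quantize` helper are all assumptions, and training details (straight-through estimator, commitment loss, optimizer) are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
K, D = 4096, 128                      # codebook size and embedding dim from the text

C = rng.standard_normal((K, D))       # frozen base codebook (never updated)
W = np.eye(D)                         # learnable linear map; identity init is an assumption

def quantize(z: np.ndarray, C: np.ndarray, W: np.ndarray):
    """Map encoder outputs z (N, D) to their nearest entries of the effective codebook C @ W."""
    codebook = C @ W                  # (K, D); every entry depends on the same W
    # Squared Euclidean distances via ||z||^2 - 2 z.c + ||c||^2
    d2 = (z ** 2).sum(axis=1, keepdims=True) - 2.0 * z @ codebook.T + (codebook ** 2).sum(axis=1)
    idx = d2.argmin(axis=1)
    return idx, codebook[idx]

z = rng.standard_normal((8, D))       # stand-in for 8 patch embeddings
idx, z_q = quantize(z, C, W)
print(idx.shape, z_q.shape)
```

Because every effective entry is a function of the single matrix `W`, a gradient step on `W` moves all 4096 entries at once, including those not selected in the current batch. That shared update is the property the bullet above credits with preventing codebook collapse.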
Experimental Results
| Experiment | Scale | Stage | CD Ratio | Util (same) | Util (cross) | Decision |
|---|---|---|---|---|---|---|
| Exp1 | 5 categories | A (1 KV token) | 1.145x | 46.0% | 47.0% | STRONG GO |
| Exp3 | 5 categories | B (4 KV tokens) | 1.185x | 47.1% | 47.3% | STRONG GO |
| Exp2 | 1156 categories | A (1 KV token) | 1.019x | 95.3% | 83.6% | STRONG GO |
| Exp4 | 1156 categories | B (4 KV tokens) | 1.019x | 94.9% | 82.8% | STRONG GO |
CD Ratio = Cross-category CD / Same-category CD. Closer to 1.0 = better generalization. Target: < 1.2x.
Scaling from 5 to 1156 categories causes CD ratio to drop from 1.145x to 1.019x (near-perfect generalization) and utilization to surge from 46% to 95% (nearly full codebook activation).
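For readers who want to recompute these metrics, here is a minimal sketch of the three quantities involved. The Chamfer distance shown is one common symmetric, squared-distance variant; the exact definition and scaling used by `scripts/evaluate.py` are not stated here, so treat `chamfer_distance` and `codebook_utilization` as illustrative helpers rather than the project's code.

```python
import numpy as np

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer distance: mean squared nearest-neighbour distance in both directions."""
    d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)   # (|P|, |Q|) pairwise squared distances
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

def cd_ratio(same_cat_cd: float, cross_cat_cd: float) -> float:
    """Cross-category CD divided by same-category CD, as defined above."""
    return cross_cat_cd / same_cat_cd

def codebook_utilization(code_indices, codebook_size: int) -> float:
    """Fraction of codebook entries selected at least once."""
    return len({int(i) for i in code_indices}) / codebook_size

ratio = cd_ratio(211.6, 215.8)   # Exp4 same-cat and cross-cat CD reported in this README
print(f"CD ratio: {ratio:.4f}, meets < 1.2 target: {ratio < 1.2}")
```

A ratio close to 1.0 means patches from unseen categories reconstruct almost as well as patches from seen ones, which is the generalization claim the table is making.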
Data
Training data sourced from Objaverse-LVIS (Allen AI).
- 5-Category: chair, table, airplane, car, lamp β used for initial validation
- LVIS-Wide: 1156 categories from Objaverse-LVIS, 10 objects per category
  - seen_train: 188,696 patches (1046 categories)
  - seen_test: 45,441 patches (same 1046 categories, held-out objects)
  - unseen: 12,655 patches (110 held-out categories, never seen during training)
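The three-way split above can be reproduced in spirit with a category-level hold-out followed by an object-level hold-out. The sketch below is a hypothetical illustration on toy data: `split_by_category`, the random seed, and the 0.2 test fraction are assumptions (the patch counts above imply roughly 19% of seen patches are held out), not the procedure actually used to build the dataset.

```python
import random

def split_by_category(objects_by_category, num_unseen, test_fraction=0.2, seed=0):
    """Hold out whole categories as `unseen`, then hold out objects within the rest."""
    rng = random.Random(seed)
    categories = sorted(objects_by_category)
    rng.shuffle(categories)
    unseen_cats, seen_cats = categories[:num_unseen], categories[num_unseen:]

    splits = {"seen_train": [], "seen_test": [], "unseen": []}
    for cat in unseen_cats:
        splits["unseen"].extend(objects_by_category[cat])
    for cat in seen_cats:
        objs = list(objects_by_category[cat])
        rng.shuffle(objs)
        n_test = max(1, round(len(objs) * test_fraction))
        splits["seen_test"].extend(objs[:n_test])    # held-out objects, seen categories
        splits["seen_train"].extend(objs[n_test:])
    return splits

# Toy data mirroring the stated scale: 1156 categories, 10 objects each.
toy = {f"cat_{c:04d}": [f"cat_{c:04d}/obj_{i}" for i in range(10)] for c in range(1156)}
splits = split_by_category(toy, num_unseen=110)
print({name: len(items) for name, items in splits.items()})
```

Splitting at the object level before patch extraction keeps all patches of one mesh in the same split, which avoids leakage between seen_train and seen_test.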
Quick Start
# Clone the code repo
git clone https://github.com/Pthahnix/MeshLex-Research.git
cd MeshLex-Research
# Install dependencies
pip install -r requirements.txt
pip install torch-geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv \
-f https://data.pyg.org/whl/torch-2.4.0+cu124.html
# Download processed data from this HF repo
pip install huggingface_hub
python -c "
from huggingface_hub import hf_hub_download
hf_hub_download('Pthahnix/MeshLex-Research', 'data/meshlex_data.tar.gz', repo_type='model', local_dir='.')
"
tar xzf data/meshlex_data.tar.gz -C data/
# Download checkpoints
python -c "
from huggingface_hub import snapshot_download
snapshot_download('Pthahnix/MeshLex-Research', allow_patterns='checkpoints/*', repo_type='model', local_dir='.')
"
mv checkpoints data/checkpoints
# Run evaluation on Exp4 (best model)
PYTHONPATH=. python scripts/evaluate.py \
--checkpoint data/checkpoints/exp4_B_lvis_wide/checkpoint_final.pt \
--same_cat_dirs data/patches/lvis_wide/seen_test \
--cross_cat_dirs data/patches/lvis_wide/unseen \
--output results/eval_results.json
# Run unit tests
python -m pytest tests/ -v
Timeline
- Day 1 (2026-03-06): Project inception, gap analysis, idea generation, experiment design
- Day 2 (2026-03-07): Full codebase implementation (14 tasks), unit tests, initial experiment
- Day 3 (2026-03-08): Diagnosed codebook collapse, fixed SimVQ, Exp1 → STRONG GO
- Day 4 (2026-03-09): Exp2 + Exp3 completed → STRONG GO. Key finding: more categories = better generalization
- Day 5 (2026-03-13): Pod reset recovery, expanded LVIS-Wide (1156 cat), retrained Exp2, trained Exp4 → all STRONG GO
- Day 6 (2026-03-14): Final comparison report + visualizations. Full dataset + checkpoints backed up to HuggingFace
License
Apache-2.0