MeshLex Research

MeshLex: Learning a Topology-aware Patch Vocabulary for Compositional Mesh Generation


Table of Contents

  1. Overview
  2. Current Status
  3. Repo Contents
  4. Core Hypothesis
  5. Model Architecture
  6. Experimental Results
  7. Data
  8. Quick Start
  9. Timeline
  10. License

Overview

A research project exploring whether 3D triangle meshes possess a finite, reusable "vocabulary" of local topological patterns, analogous to how BPE tokens form a vocabulary for natural language.

Instead of generating meshes face-by-face, MeshLex learns a codebook of ~4096 topology-aware patches (each covering 20-50 faces) and generates meshes by selecting, deforming, and assembling patches from this codebook. A 4000-face mesh becomes ~130 tokens, roughly 3x more compact than the state of the art (FACE, ICML 2026: ~400 tokens).
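The token counts quoted above can be sanity-checked with a little arithmetic. The sketch below assumes one token per patch, which this README does not state explicitly, and uses only the figures given in the paragraph above; `patch_tokens` is a hypothetical helper, not part of the repo.

```python
import math

def patch_tokens(num_faces: int, faces_per_patch: float) -> int:
    """Number of patch tokens if every patch covers `faces_per_patch` faces."""
    return math.ceil(num_faces / faces_per_patch)

NUM_FACES = 4000

# Patch sizes quoted in the text are 20-50 faces, which bounds the token count.
most_tokens = patch_tokens(NUM_FACES, 20)    # smallest patches
fewest_tokens = patch_tokens(NUM_FACES, 50)  # largest patches

# Token counts quoted in the text for a 4000-face mesh.
MESHLEX_TOKENS = 130
FACE_TOKENS = 400

implied_faces_per_patch = NUM_FACES / MESHLEX_TOKENS  # average patch size implied by ~130 tokens
compression_vs_face = FACE_TOKENS / MESHLEX_TOKENS    # ratio of the two quoted token counts

print(fewest_tokens, most_tokens)
print(round(implied_faces_per_patch, 1), round(compression_vs_face, 2))
```

Under these assumptions the quoted ~130 tokens falls inside the 80-200 range implied by the patch sizes.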

Aspect | MeshMosaic | FreeMesh | FACE | MeshLex
Approach | Divide-and-conquer | BPE on coordinates | One-face-one-token | Topology patch codebook
Still per-face generation? | Yes | Yes | Yes | No
Has codebook? | No | Yes (coordinate-level) | No | Yes (topology-level)
Compression (4K faces) | N/A | ~300 tokens | ~400 tokens | ~130 tokens

Current Status

Feasibility validation COMPLETE: 4/4 experiments STRONG GO. Ready for formal experiment design.

# | Experiment | Status | Result
1 | A-stage × 5-Category | Done | STRONG GO (ratio 1.145x, util 46%)
2 | A-stage × LVIS-Wide | Done | STRONG GO (ratio 1.019x, util 95.3%)
3 | B-stage × 5-Category | Done | STRONG GO (ratio 1.185x, util 47%)
4 | B-stage × LVIS-Wide | Done | STRONG GO (ratio 1.019x, util 94.9%)

Key findings:

  • More categories = dramatically better generalization: LVIS-Wide (1156 cat) ratio 1.019x vs 5-cat 1.145x, util 95% vs 46%
  • Best result (Exp4): Same-cat CD 211.6, Cross-cat CD 215.8, a near-zero generalization gap
  • SimVQ collapse fix successful: utilization 0.46% → 99%+ (217x improvement)
  • B-stage multi-token KV decoder effective: reconstruction CD reduced 6.2%
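
Codebook utilization is not defined in this README. A common reading, assumed here, is the fraction of codebook entries selected at least once over an evaluation set. The sketch below uses a hypothetical helper `codebook_utilization` and toy indices, then checks that the quoted 0.46% and 217x figures are consistent with "99%+".

```python
def codebook_utilization(code_indices, codebook_size):
    """Fraction of codebook entries selected at least once (assumed definition)."""
    return len(set(code_indices)) / codebook_size

# Toy example: 6 patches quantized against a codebook of 8 entries.
toy_indices = [0, 0, 5, 5, 5, 7]
toy_util = codebook_utilization(toy_indices, codebook_size=8)  # 3 of 8 codes used

# Figures quoted above: 0.46% utilization before the fix, 217x improvement after.
util_before_pct = 0.46
implied_after_pct = util_before_pct * 217

print(toy_util, round(implied_after_pct, 2))
```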

Repo Contents

This HuggingFace repo stores checkpoints and processed datasets for reproducibility.

Checkpoints

Experiment | Path | Description
Exp1 A-stage × 5cat | checkpoints/exp1_A_5cat/ | checkpoint_final.pt + training_history.json
Exp2 A-stage × LVIS-Wide | checkpoints/exp2_A_lvis_wide/ | checkpoint_final.pt + training_history.json
Exp3 B-stage × 5cat | checkpoints/exp3_B_5cat/ | checkpoint_final.pt + training_history.json
Exp4 B-stage × LVIS-Wide | checkpoints/exp4_B_lvis_wide/ | checkpoint_final.pt + training_history.json

Data

File / Directory | Size | Contents
data/meshlex_data.tar.gz | ~1.2 GB | All processed data in one archive (recommended)
data/patches/ | ~1.1 GB | NPZ patch files (5cat + LVIS-Wide splits)
data/meshes/ | ~931 MB | Preprocessed decimated OBJ files (5,497 meshes)
data/objaverse/ | ~2 MB | Download manifests

The tar.gz archive contains patches, meshes, and manifests. Download and extract it to skip all preprocessing.

Core Hypothesis

Mesh local topology is low-entropy and universal across object categories. A finite codebook of ~4096 topology prototypes, combined with continuous deformation parameters, can reconstruct arbitrary meshes with high fidelity.

Model Architecture

The full model is a VQ-VAE with three modules:

Objaverse-LVIS GLB → Decimation (pyfqmr) → Normalize [-1,1]
    → METIS Patch Segmentation (~35 faces/patch)
    → PCA-aligned local coordinates
    → Face features (15-dim: vertices + normal + angles)
    → SAGEConv GNN Encoder → 128-dim embedding
    → SimVQ Codebook (K=4096, learnable reparameterization)
    → Cross-attention MLP Decoder → Reconstructed vertices
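
The 15-dim face feature in the pipeline above is described only as "vertices + normal + angles". One layout that adds up to 15 is 9 vertex coordinates, a 3-dim unit normal, and 3 interior angles. The sketch below assumes that layout and ordering; `face_features` is a hypothetical helper, and the repo's actual feature extraction may differ. It also assumes a non-degenerate triangle.

```python
import math

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def norm(a):
    return math.sqrt(dot(a, a))

def interior_angle(p, q, r):
    """Angle at vertex p between edges p->q and p->r, in radians."""
    u, v = sub(q, p), sub(r, p)
    cos_t = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against rounding error

def face_features(v0, v1, v2):
    """Assumed layout: 9 vertex coords + 3 unit-normal coords + 3 angles = 15."""
    n = cross(sub(v1, v0), sub(v2, v0))
    length = norm(n)  # zero for a degenerate triangle, which is not handled here
    normal = [c / length for c in n]
    angles = [interior_angle(v0, v1, v2),
              interior_angle(v1, v2, v0),
              interior_angle(v2, v0, v1)]
    return [*v0, *v1, *v2, *normal, *angles]

# Right triangle in the xy-plane.
feat = face_features([0, 0, 0], [1, 0, 0], [0, 1, 0])
print(len(feat), feat[9:12])
```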
  • PatchEncoder: 4-layer SAGEConv GNN + global mean pooling → 128-dim z
  • SimVQ Codebook: Frozen base C + learnable linear W, effective codebook CW = W(C). All 4096 entries share W's gradient, so no code is ever forgotten
  • PatchDecoder: Cross-attention with learnable vertex queries → per-vertex xyz coordinates
  • A-stage: Single KV token decoder (baseline)
  • B-stage: 4 KV tokens decoder (improved reconstruction, resumed from A-stage)
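
The SimVQ bullet above can be illustrated with a minimal sketch of the reparameterization: a frozen base codebook C, a single linear map W, and nearest-neighbour lookup in the effective codebook W(C). This is written in plain Python to stay dependency-free and is not the repo's implementation. The class name `SimVQSketch`, the identity initialization of W, and the tiny sizes are all assumptions, and the gradient step that would update W during training is omitted.

```python
import random

def matvec(W, x):
    """Apply the linear map W (list of rows) to vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

class SimVQSketch:
    def __init__(self, num_codes, dim, seed=0):
        rng = random.Random(seed)
        # Frozen base codebook C: initialized once and never updated.
        self.base = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                     for _ in range(num_codes)]
        # The only trainable part: a dim x dim linear map W (identity init assumed).
        self.W = [[1.0 if i == j else 0.0 for j in range(dim)]
                  for i in range(dim)]

    def effective_codebook(self):
        """Every entry passes through the same W, so one update to W moves all codes."""
        return [matvec(self.W, c) for c in self.base]

    def quantize(self, z):
        """Return (index, code) of the effective code nearest to z."""
        codes = self.effective_codebook()
        idx = min(range(len(codes)), key=lambda k: sq_dist(z, codes[k]))
        return idx, codes[idx]

vq = SimVQSketch(num_codes=16, dim=4)
idx, code = vq.quantize(vq.base[3])  # with identity W, a base code maps to itself

# Changing W once changes all 16 effective codes, including ones never selected.
vq.W = [[2.0 * w for w in row] for row in vq.W]
doubled = vq.effective_codebook()
print(idx, len(doubled))
```

Because unused entries still move whenever W is updated, they stay close to the encoder's output distribution, which is the intuition behind the "no code is ever forgotten" claim.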

Experimental Results

Experiment | Scale | Stage | CD Ratio | Util (same) | Util (cross) | Decision
Exp1 | 5 categories | A (1 KV token) | 1.145x | 46.0% | 47.0% | ✅ STRONG GO
Exp3 | 5 categories | B (4 KV tokens) | 1.185x | 47.1% | 47.3% | ✅ STRONG GO
Exp2 | 1156 categories | A (1 KV token) | 1.019x | 95.3% | 83.6% | ✅ STRONG GO
Exp4 | 1156 categories | B (4 KV tokens) | 1.019x | 94.9% | 82.8% | ✅ STRONG GO

CD Ratio = Cross-category CD / Same-category CD. Closer to 1.0 = better generalization. Target: < 1.2x.
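
As a worked instance of this definition, the sketch below plugs in the Exp4 CD values reported under Key findings (same-category 211.6, cross-category 215.8) and compares the result with the 1.2x target. `cd_ratio` is a hypothetical helper; the repo's evaluation script may compute the ratio from unrounded values, so the last digit can differ from the table.

```python
def cd_ratio(cross_cat_cd: float, same_cat_cd: float) -> float:
    """Cross-category CD divided by same-category CD; 1.0 means no generalization gap."""
    return cross_cat_cd / same_cat_cd

TARGET_RATIO = 1.2  # target quoted in this README

# Exp4 values as reported under Key findings.
ratio = cd_ratio(cross_cat_cd=215.8, same_cat_cd=211.6)
gap_percent = (ratio - 1.0) * 100.0
meets_target = ratio < TARGET_RATIO

print(f"ratio={ratio:.4f}x gap={gap_percent:.2f}% meets_target={meets_target}")
```

The result is about 1.02x, in line with the reported 1.019x up to rounding of the inputs.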

Scaling from 5 to 1156 categories lowers the CD ratio from 1.145x to 1.019x (near-perfect generalization) and raises utilization from 46% to 95% (nearly full codebook activation).

Data

Training data sourced from Objaverse-LVIS (Allen AI).

  • 5-Category: chair, table, airplane, car, lamp (used for initial validation)
  • LVIS-Wide: 1156 categories from Objaverse-LVIS, 10 objects per category
    • seen_train: 188,696 patches (1046 categories)
    • seen_test: 45,441 patches (same 1046 categories, held-out objects)
    • unseen: 12,655 patches (110 held-out categories, never seen during training)
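
The LVIS-Wide split holds out whole categories, so unseen categories never appear in training. The sketch below shows one way such a category-level holdout could be made and checks the counts quoted above. The `split_categories` helper, the seed, and the placeholder category names are hypothetical and are not the repo's actual split procedure.

```python
import random

def split_categories(categories, num_unseen, seed=0):
    """Hold out `num_unseen` whole categories; return (seen, unseen) as sets."""
    shuffled = sorted(categories)  # sort first so the split is reproducible
    random.Random(seed).shuffle(shuffled)
    return set(shuffled[num_unseen:]), set(shuffled[:num_unseen])

# Placeholder names standing in for the 1156 Objaverse-LVIS categories.
categories = [f"category_{i:04d}" for i in range(1156)]
seen, unseen = split_categories(categories, num_unseen=110)

# Patch counts quoted in the list above.
patch_counts = {"seen_train": 188_696, "seen_test": 45_441, "unseen": 12_655}
total_patches = sum(patch_counts.values())

print(len(seen), len(unseen), total_patches)
```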

Quick Start

# Clone the code repo
git clone https://github.com/Pthahnix/MeshLex-Research.git
cd MeshLex-Research

# Install dependencies
pip install -r requirements.txt
pip install torch-geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv \
    -f https://data.pyg.org/whl/torch-2.4.0+cu124.html

# Download processed data from this HF repo
pip install huggingface_hub
python -c "
from huggingface_hub import hf_hub_download
hf_hub_download('Pthahnix/MeshLex-Research', 'data/meshlex_data.tar.gz', repo_type='model', local_dir='.')
"
tar xzf data/meshlex_data.tar.gz -C data/

# Download checkpoints
python -c "
from huggingface_hub import snapshot_download
snapshot_download('Pthahnix/MeshLex-Research', allow_patterns='checkpoints/*', repo_type='model', local_dir='.')
"
mv checkpoints data/checkpoints

# Run evaluation on Exp4 (best model)
PYTHONPATH=. python scripts/evaluate.py \
  --checkpoint data/checkpoints/exp4_B_lvis_wide/checkpoint_final.pt \
  --same_cat_dirs data/patches/lvis_wide/seen_test \
  --cross_cat_dirs data/patches/lvis_wide/unseen \
  --output results/eval_results.json

# Run unit tests
python -m pytest tests/ -v
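
Each checkpoint directory also ships a training_history.json. Its schema is not documented in this README, so the sketch below makes no assumption about its keys and only reports the top-level structure. `describe_json` is a hypothetical helper, and the path assumes the checkpoints were moved to data/checkpoints as in the steps above.

```python
import json
from pathlib import Path

def describe_json(path):
    """Return a short, schema-agnostic description of a JSON file."""
    data = json.loads(Path(path).read_text())
    if isinstance(data, dict):
        return {"type": "dict", "keys": sorted(data)}
    if isinstance(data, list):
        return {"type": "list", "length": len(data)}
    return {"type": type(data).__name__}

history_path = Path("data/checkpoints/exp4_B_lvis_wide/training_history.json")
if history_path.exists():  # skip quietly when the checkpoints are not downloaded
    print(describe_json(history_path))
```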

Timeline

  • Day 1 (2026-03-06): Project inception, gap analysis, idea generation, experiment design
  • Day 2 (2026-03-07): Full codebase implementation (14 tasks), unit tests, initial experiment
  • Day 3 (2026-03-08): Diagnosed codebook collapse, fixed SimVQ, Exp1: STRONG GO
  • Day 4 (2026-03-09): Exp2 + Exp3 completed: STRONG GO. Key finding: more categories = better generalization
  • Day 5 (2026-03-13): Pod reset recovery, expanded LVIS-Wide (1156 cat), retrained Exp2, trained Exp4: all STRONG GO
  • Day 6 (2026-03-14): Final comparison report + visualizations. Full dataset + checkpoints backed up to HuggingFace

License

Apache-2.0
