bumpy-plantain
A LoRA adapter on FLUX.2 Klein (4B) for predicting tactile / haptic signatures from monocular RGB images. Tests whether a visually pretrained image generator carries non-visual material priors that can be surfaced through the instruction-tuning recipe of Image Generators are Generalist Vision Learners (Gabeur et al., 2026; arXiv:2604.20329).
Thesis
Vision Banana demonstrates the recipe within visual perception; bumpy-plantain extends the claim beyond it. If the recipe can surface tactile structure — surface roughness, compliance, friction, hardness — from a backbone trained only on RGB pixels, then the latent priors of generative pretraining are not strictly visual: they encompass physical material properties accessible to direct contact sensing but not to vision per se. The instruction-tuning recipe becomes a probe of cross-modal grounding inherited from the base model's training distribution.
Method
Input: an RGB image of an object or surface. Output: a tactile signature image with surface roughness, compliance, and friction encoded as RGB via a Hamiltonian-path bijection through the color cube — the same bijection used by Vision Banana for metric depth, applied to scalar tactile fields. Training pairs are drawn from vision–tactile datasets in which each visual frame is paired with sensor readings from contact-based tactile sensors (e.g., GelSight, BioTac).
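Neither this card nor the cited paper text here specifies how the Hamiltonian path is constructed, so the following is only an illustrative sketch: a boustrophedon ("snake") ordering of a quantized RGB cube is one valid Hamiltonian path, and walking it gives a bijection between scalar values in [0, 1] and colors in which nearby scalars map to nearby colors. All names (`snake_path`, `encode`, `decode`) and the 16-level lattice are hypothetical choices, not the bijection Vision Banana actually uses.

```python
import numpy as np

def snake_path(n: int) -> np.ndarray:
    """Boustrophedon Hamiltonian path through an n x n x n color lattice.

    Consecutive points differ by exactly one step along one axis, so
    adjacent scalar values encode to perceptually adjacent colors.
    """
    path, gdir, bdir = [], 1, 1
    for r in range(n):
        for g in (range(n) if gdir == 1 else range(n - 1, -1, -1)):
            for b in (range(n) if bdir == 1 else range(n - 1, -1, -1)):
                path.append((r, g, b))
            bdir = -bdir  # reverse blue sweep each row
        gdir = -gdir      # reverse green sweep each slab
    return np.asarray(path)

def encode(field: np.ndarray, path: np.ndarray) -> np.ndarray:
    """Map a scalar field in [0, 1] to uint8 RGB via position along the path."""
    idx = np.rint(np.clip(field, 0.0, 1.0) * (len(path) - 1)).astype(int)
    scale = 255 // int(path.max())  # lattice 0..n-1 stretched to 0..255
    return (path[idx] * scale).astype(np.uint8)

def decode(rgb: np.ndarray, path: np.ndarray) -> np.ndarray:
    """Invert encode: look up each color's position along the path."""
    scale = 255 // int(path.max())
    pos = {tuple(p): i for i, p in enumerate(path)}
    flat = rgb.reshape(-1, 3) // scale
    idx = np.array([pos[tuple(c)] for c in flat]).reshape(rgb.shape[:-1])
    return idx / (len(path) - 1)
```

With a 16-level lattice the path has 4096 steps, so a scalar round-trips through RGB with quantization error below 1/4095; a production encoding would presumably use a finer lattice, and each tactile field (roughness, compliance, friction) would be encoded as its own signature image or channel, per whichever convention the released weights adopt.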
Status
Placeholder. Weights and training data forthcoming.
License
Apache 2.0 — matches base FLUX.2 Klein 4B.
References
- Gabeur, Long, Peng, et al. Image Generators are Generalist Vision Learners. arXiv:2604.20329 (2026).
- Black Forest Labs. FLUX.2 Klein. https://bfl.ai/models/flux-2-klein (2025).