bumpy-plantain

A LoRA adapter on FLUX.2 Klein (4B) for predicting tactile/haptic signatures from monocular RGB images. It tests whether a visually pretrained image generator carries non-visual material priors that can be surfaced through the instruction-tuning recipe of Image Generators are Generalist Vision Learners (Gabeur et al., 2026; arXiv:2604.20329).

Thesis

Vision Banana demonstrates the recipe within visual perception; bumpy-plantain extends the claim beyond it. If the recipe can surface tactile structure (surface roughness, compliance, friction, hardness) from a backbone trained only on RGB pixels, then the latent priors of generative pretraining are not strictly visual: they encompass physical material properties accessible to direct contact sensing but not to vision per se. The instruction-tuning recipe becomes a probe of the cross-modal grounding inherited from the base model's training distribution.

Method

Input: an RGB image of an object or surface. Output: a tactile signature image in which surface roughness, compliance, and friction are encoded as RGB via a Hamiltonian-path bijection through the color cube, the same bijection Vision Banana uses for metric depth, here applied to scalar tactile fields. Training pairs are drawn from vision–tactile datasets in which each visual frame is paired with readings from contact-based tactile sensors (e.g., GelSight, BioTac). A sketch of one such bijection follows.
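The bijection itself is not detailed in this card. Below is a minimal sketch, assuming a snake-order (boustrophedon) Hamiltonian path through the 256×256×256 color lattice, so that nearby scalar values land on adjacent colors and the encoding is exactly invertible. The function names, lattice resolution, and choice of path are illustrative assumptions, not the released recipe.

```python
import numpy as np

N = 256  # per-channel resolution; the path visits all N**3 colors once

def scalar_to_rgb(x: np.ndarray) -> np.ndarray:
    """Encode scalars in [0, 1] as RGB along a snake-order Hamiltonian
    path through the color lattice: consecutive path indices differ by
    one step in exactly one channel, so nearby values get nearby colors.
    (Hypothetical stand-in for the paper's bijection.)"""
    idx = np.rint(np.clip(x, 0.0, 1.0) * (N**3 - 1)).astype(np.int64)
    r, rem = np.divmod(idx, N * N)
    g_t, b_t = np.divmod(rem, N)                  # traversal coordinates
    g = np.where(r % 2 == 0, g_t, N - 1 - g_t)    # reverse scan on odd planes
    b = np.where(g_t % 2 == 0, b_t, N - 1 - b_t)  # reverse scan on odd rows
    return np.stack([r, g, b], axis=-1).astype(np.uint8)

def rgb_to_scalar(rgb: np.ndarray) -> np.ndarray:
    """Invert the bijection, recovering the scalar field from RGB."""
    r, g, b = (rgb[..., i].astype(np.int64) for i in range(3))
    g_t = np.where(r % 2 == 0, g, N - 1 - g)
    b_t = np.where(g_t % 2 == 0, b, N - 1 - b)
    idx = r * N * N + g_t * N + b_t
    return idx / (N**3 - 1)
```

Under these assumptions, a training target for a single field is `scalar_to_rgb(roughness_map)` applied to a normalized roughness map; whether the three tactile fields are tiled into one signature image or emitted as separate targets is not specified here.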

Status

Placeholder. Weights and training data forthcoming.

License

Apache 2.0, matching the base FLUX.2 Klein 4B.

References

Gabeur et al., 2026. Image Generators are Generalist Vision Learners. arXiv:2604.20329.