---
license: cc-by-nd-4.0
language:
- en
library_name: pytorch
tags:
- eeg
- biosignal
- mamba
- state-space-model
- cross-attention
- foundation-model
- self-supervised
- masked-modeling
- lejepa
- topology-invariant
- neuroscience
datasets:
- TUEG
- TUAB
- APAVA
- TDBrain
- MoBI
- SEED-V
- Mumtaz2016
- MODMA
metrics:
- balanced_accuracy
- roc_auc
- pr_auc
- r2
- pearson_r
- cohen_kappa
thumbnail: https://raw.githubusercontent.com/pulp-bio/BioFoundation/refs/heads/main/docs/model/logo/LuMamba_logo.png
model-index:
- name: LuMamba-Tiny (LeJEPA-reconstruction pre-training)
results:
- task:
type: time-series-classification
name: EEG Abnormality Detection
dataset:
type: TUAB
name: TUH EEG Abnormal Corpus (TUAB)
metrics:
- type: balanced_accuracy
value: 80.99
name: Balanced Accuracy (%)
- type: roc_auc
value: 0.883
name: AUROC
- type: pr_auc
value: 0.892
name: AUC-PR
- task:
type: time-series-classification
name: Alzheimer's Disease Detection
dataset:
type: APAVA
name: APAVA
metrics:
- type: roc_auc
value: 0.955
name: AUROC
- type: pr_auc
value: 0.970
name: AUC-PR
- task:
type: time-series-classification
name: Parkinson's Disease Detection
dataset:
type: TDBrain
name: TDBrain
metrics:
- type: roc_auc
value: 0.961
name: AUROC
- type: pr_auc
value: 0.960
name: AUC-PR
- task:
type: time-series-classification
name: Major Depressive Disorder Detection
dataset:
type: Mumtaz2016
name: Mumtaz2016
metrics:
- type: roc_auc
value: 0.931
name: AUROC
- type: pr_auc
value: 0.952
name: AUC-PR
- name: LuMamba-Tiny (Reconstruction-only pre-training)
results:
- task:
type: time-series-classification
name: EEG Slowing Event and Seizure Detection
dataset:
type: TUSL
name: TUH EEG Slowing Corpus (TUSL)
metrics:
- type: roc_auc
value: 0.708
name: AUROC
- type: pr_auc
value: 0.289
name: AUC-PR
- task:
type: time-series-classification
name: EEG Artifact Detection
dataset:
type: TUAR
name: TUH EEG Artifact Corpus (TUAR)
metrics:
- type: roc_auc
value: 0.914
name: AUROC
- type: pr_auc
value: 0.510
name: AUC-PR
- task:
type: time-series-classification
name: Gait Prediction Regression
dataset:
type: MoBI
name: MoBI
metrics:
- type: r2
value: 0.116
name: R-squared
- type: rmse
value: 0.1482
name: Root Mean Squared Error
- task:
type: time-series-classification
name: 5-class Emotion Detection
dataset:
type: SEED-V
name: SEED-V
metrics:
- type: balanced_accuracy
value: 35.0
name: Balanced Accuracy (%)
- type: cohen_kappa
value: 0.191
name: Cohen's Kappa
- task:
type: time-series-classification
name: Major Depressive Disorder Detection
dataset:
type: MODMA
name: MODMA
metrics:
- type: balanced_accuracy
value: 59.5
name: Balanced Accuracy (%)
- type: roc_auc
value: 0.448
name: AUROC
- type: pr_auc
value: 0.420
name: AUC-PR
---
<div align="center">
<img src="https://raw.githubusercontent.com/pulp-bio/BioFoundation/refs/heads/main/docs/model/logo/LuMamba_logo.png" alt="LuMamba Logo" width="800"/>
<h1>LuMamba: Latent Unified Mamba for Electrode
Topology-Invariant and Efficient EEG Modeling</h1>
</div>
<p align="center">
<a href="https://github.com/pulp-bio/BioFoundation">
<img src="https://img.shields.io/github/stars/pulp-bio/BioFoundation?color=ccf" alt="Github">
</a>
<a href="https://creativecommons.org/licenses/by-nd/4.0/">
<img src="https://img.shields.io/badge/License-CC_BY--ND_4.0-lightgrey.svg" alt="License">
</a>
<a href="https://arxiv.org/abs/2603.19100">
<img src="https://img.shields.io/badge/arXiv-2603.19100-b31b1b.svg" alt="Paper">
</a>
</p>
**LuMamba** (Latent Unified Mamba) is an **EEG foundation model** built on efficient **Mamba state-space learning**, capable of handling **heterogeneous channel topologies**.
LuMamba addresses varying channel layouts with **LUNA channel unification**, projecting a given EEG channel layout to a **fixed latent topology**, and overcomes the quadratic complexity of transformers with **FEMBA**'s efficient **bidirectional Mamba encoder**.
---
## 🔒 License & Usage Policy (Weights)
**Weights license:** The released model weights are licensed under **Creative Commons Attribution–NoDerivatives 4.0 (CC BY-ND 4.0)**. This section summarizes the practical implications for users. *This is not legal advice; please read the full license text.*
### ✅ You may
- **Use** and **redistribute** the **unmodified** LuMamba weights (including in commercial settings) **with proper attribution** to the LuMamba authors.
- **Fine-tune / adapt** the weights **for your internal use** (research or production) **without redistributing** the modified weights.
- **Publish your code, configs, logs, and papers** describing experiments with LuMamba (please cite the paper).
### 🚫 You may not
- **Share, host, or redistribute any modified weights** (including LoRA/adapter/delta checkpoints or pruned/quantized variants). Any parameter set that encodes an adaptation is considered a derivative and cannot be shared under CC BY-ND 4.0.
- **Imply endorsement** by the LuMamba authors for any derivative or evaluation without our written permission.
- **Use the LuMamba name** in a way that suggests your modified model is an official LuMamba release.
### 🤝 How to contribute improvements (PR-gated releases)
We welcome community improvements via a **pull-request (PR)** workflow. If you believe your improvements should become an **official LuMamba release**:
1. **Open a PR** in the [BioFoundation repository](https://github.com/pulp-bio/BioFoundation) describing the change (architecture/head/training recipe, datasets, preprocessing, compute).
2. Include **reproducibility artifacts**: configs, seeds, scripts, environment details, training/validation logs, and the **evaluation protocol** (e.g., TUAB/TUAR/TUSL) with exact splits.
3. Provide **comprehensive results** (AUROC/AUPR/BA, FLOPs, memory) vs. the baselines reported in the LuMamba paper.
4. After **maintainer review**, approved changes will be **retrained/validated** and, if accepted, **released by the maintainers** as a new **official LuMamba** checkpoint under **CC BY-ND 4.0**.
> Rationale: CC BY-ND protects users from fragmented, lower-quality “LuMamba variants,” while still enabling internal fine-tuning and a path for the community to upstream improvements through review.
---
## 🔎 Model Summary
- **Goal:** Efficient and topology-agnostic EEG modeling with linear complexity in sequence length.
- **Core idea:** A **Channel-Unification Module** uses **learned queries** (Q) with **cross-attention** to map any set of channels to a fixed latent space; **bidirectional Mamba blocks** then operate on that latent sequence.
- **Pre-training data:** TUEG, **>21,000 hours** of raw EEG; downstream subjects removed to avoid leakage.
- **Downstream tasks:** **TUAB** (abnormal), **TUAR** (artifacts), **TUSL** (slowing), **SEED-V** (emotion; unseen 62-ch montage), **APAVA** (Alzheimer's disease; unseen 16-ch layout), **TDBrain** (Parkinson's disease; unseen 26-ch layout), **Mumtaz2016** and **MODMA** (depression), **MoBI** (gait regression)
---
## 🚀 Model Variants
The model is currently released in a Tiny variant, with the following parameters:
| Variant | Parameters | FEMBA parameters | LUNA parameters |
|---------|------------|------------------|-----------------|
| LuMamba_tiny | 4.1M | `num_blocks` = 2, `exp` = 2 | `num_queries` = 6, `embed_dim` = 64 |

Larger model sizes can be attained by increasing the number of bi-Mamba blocks `num_blocks` (e.g., 8 bi-Mamba blocks yields 15M parameters).
---
## 📊 Results
- **TUAB (abnormal vs normal):** 80.99 % Bal. Acc., 0.883 AUROC, 0.892 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **TUSL (slowing event vs. seizure detection)**: 0.708 AUROC, 0.289 AUPR (LuMamba-Tiny, pre-trained with reconstruction-only).
- **TUAR (artifact detection)**: 0.914 AUROC, 0.510 AUPR (LuMamba-Tiny, pre-trained with reconstruction-only).
- **APAVA (Alzheimer's detection)**: 0.955 AUROC, 0.970 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **TDBrain (Parkinson's detection)**: 0.961 AUROC, 0.960 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **Mumtaz2016 (depression detection)**: 72.5 % Bal. Acc., 0.931 AUROC, 0.952 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **SEED-V (5-class emotion detection)**: 35.0 % Bal. Acc., 0.191 Cohen's Kappa (LuMamba-Tiny, pre-trained with reconstruction-only).
- **MoBI (gait prediction)**: 0.116 R-squared, 0.148 RMSE (LuMamba-Tiny, pre-trained with reconstruction-only).
- **MODMA (full 128-channel set)**: 59.47 % Bal. Acc., 0.448 AUROC, 0.420 AUPR (LuMamba-Tiny, pre-trained with reconstruction-only).
- **MODMA (reduced 13-channel subset)**: 59.09 % Bal. Acc., 0.522 AUROC, 0.415 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
**Efficiency:** Up to **377× fewer FLOPs** than transformer-based baselines, and support for up to **500× longer** EEG windows, thanks to the efficient FEMBA bi-Mamba encoder.
---
## 🧠 Intended Use & Limitations
**Intended use.** Research on EEG representation learning & classification (abnormality, artifacts, slowing, emotion), especially when **montages vary** or **channel counts are high**.
**Limitations.**
- **Not a medical device.** Do **not** use for clinical decisions without proper validation & regulatory clearance.
- **Unseen topologies:** Zero-shot transfer to **very different/dense** layouts (e.g., SEED-V) can underperform SOTA despite positive scaling; consider augmenting pre-training montage diversity and spatial encodings.
- **Distribution shifts:** Performance varies across cohorts, devices, and label protocols; validate locally and consider domain adaptation.
---
## 🏗️ Architecture & Training
**LUNA Tokenizer & features.** EEG is segmented into patches; temporal features are extracted by a 1D convolution with GroupNorm and GELU; **frequency features** (FFT magnitude/phase → MLP) are added; 3D electrode coordinates are encoded via **NeRF-style sinusoids → MLP** (positional encoding).
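The NeRF-style coordinate encoding above can be sketched as follows. This is an illustrative snippet, not the repository implementation; the function name and the number of frequency bands are assumptions.

```python
import torch

def nerf_positional_encoding(coords: torch.Tensor, num_bands: int = 4) -> torch.Tensor:
    """Encode 3D electrode coordinates with NeRF-style sinusoids.

    coords: (C, 3) tensor of electrode positions.
    Returns a (C, 3 * 2 * num_bands) tensor of sin/cos features,
    typically passed through a small MLP afterwards.
    """
    freqs = 2.0 ** torch.arange(num_bands)                   # frequencies 1, 2, 4, ...
    angles = coords.unsqueeze(-1) * freqs                    # (C, 3, num_bands)
    feats = torch.cat([angles.sin(), angles.cos()], dim=-1)  # (C, 3, 2 * num_bands)
    return feats.flatten(start_dim=1)                        # (C, 6 * num_bands)

coords = torch.rand(19, 3)             # e.g., a 19-channel 10-20 montage
pe = nerf_positional_encoding(coords)
print(pe.shape)                        # torch.Size([19, 24])
```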
**LUNA Channel-Unification Module.** **Q learned queries** cross-attend to **channel-wise patch features** to produce a **fixed Q×E latent** per patch; FFN + Transformer layers refine the query tokens. Complexity is **O(Q·C)** (linear in channels).
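A minimal sketch of the learned-query cross-attention idea, using standard PyTorch attention in place of the repository's module (class and parameter names are illustrative; defaults mirror the Tiny variant's `num_queries` = 6, `embed_dim` = 64):

```python
import torch
import torch.nn as nn

class ChannelUnification(nn.Module):
    """Map C channel features to a fixed set of Q latent tokens via
    cross-attention with learned queries (LUNA-style sketch)."""

    def __init__(self, num_queries: int = 6, embed_dim: int = 64, num_heads: int = 4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, embed_dim))
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(embed_dim, 4 * embed_dim), nn.GELU(),
            nn.Linear(4 * embed_dim, embed_dim),
        )

    def forward(self, channel_feats: torch.Tensor) -> torch.Tensor:
        # channel_feats: (batch, C, embed_dim) — C may vary between montages.
        q = self.queries.unsqueeze(0).expand(channel_feats.size(0), -1, -1)
        latent, _ = self.attn(q, channel_feats, channel_feats)  # cost O(Q·C)
        return latent + self.ffn(latent)  # (batch, Q, embed_dim), independent of C

unify = ChannelUnification()
for C in (16, 62, 128):                 # different electrode counts
    out = unify(torch.randn(2, C, 64))
    print(out.shape)                    # always torch.Size([2, 6, 64])
```

Because the output shape depends only on Q and the embedding dimension, the same downstream encoder serves every montage.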
**FEMBA Bi-Mamba Temporal encoder.** **Mamba blocks** process the embeddings in separate forward and backward streams.
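The forward/backward scheme can be sketched with any causal sequence mixer standing in for a Mamba block (a real implementation would use a Mamba SSM layer); here a GRU is used purely as a placeholder.

```python
import torch
import torch.nn as nn

class BidirectionalBlock(nn.Module):
    """Run a causal sequence mixer over the latent sequence in separate
    forward and backward streams and merge them (FEMBA-style sketch;
    the GRUs are placeholders for Mamba blocks)."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.fwd = nn.GRU(dim, dim, batch_first=True)
        self.bwd = nn.GRU(dim, dim, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        f, _ = self.fwd(x)
        b, _ = self.bwd(torch.flip(x, dims=[1]))  # backward stream
        return f + torch.flip(b, dims=[1])        # re-align, then merge

block = BidirectionalBlock()
y = block(torch.randn(2, 100, 64))
print(y.shape)  # torch.Size([2, 100, 64])
```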
**Pre-training objectives.** **Masked-patch reconstruction** is used to reconstruct masked tokens. In parallel, the **LeJEPA loss** encourages an isotropic Gaussian embedding distribution to minimize downstream prediction risk.
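The masked-reconstruction objective can be sketched as follows; masking strategy (zeroing), mask ratio, and the toy model are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

def masked_reconstruction_loss(model: nn.Module, patches: torch.Tensor,
                               mask_ratio: float = 0.5) -> torch.Tensor:
    """Masked-patch reconstruction sketch: corrupt a random subset of
    patch embeddings and score MSE only on the masked positions.
    `model` is any sequence-to-sequence encoder/decoder."""
    mask = torch.rand(patches.shape[:2]) < mask_ratio  # (batch, seq) boolean mask
    corrupted = patches.clone()
    corrupted[mask] = 0.0                              # simple zero-masking
    recon = model(corrupted)
    return ((recon - patches)[mask] ** 2).mean()       # loss on masked tokens only

toy = nn.Linear(64, 64)  # stand-in for the encoder/decoder stack
loss = masked_reconstruction_loss(toy, torch.randn(2, 100, 64))
print(loss.dim())  # 0 — a scalar training loss
```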
---
## 🔧 How to Use
LuMamba weights are organized by pre-training configuration:
- **`Reconstruction-only`** → variants pre-trained with masked reconstruction exclusively
- **`LeJEPA-reconstruction`** → variants pre-trained with a balanced mixture of masked reconstruction and LeJEPA losses. Variants exist for two different LeJEPA hyperparameters: 128 and 300 projection slices.
- **`LeJEPA-only`** → variant pre-trained with LeJEPA exclusively.
All variants are pre-trained on TUEG.
LuMamba experiments are categorized by two Hydra configurations, in `BioFoundation/config/experiments`:
- **`LuMamba_finetune.yaml`** → configuration for fine-tuning experiments.
- **`LuMamba_pretrain.yaml`** → configuration for pre-training experiments.
---
## 🔧 Fine-tuning — General Checklist
0. **Install & read data prep**: clone the [BioFoundation repo](https://github.com/pulp-bio/BioFoundation), set up the environment as described there, then open `make_datasets/README.md` for dataset-specific notes (naming, expected folder layout, and common pitfalls).
1. **Point to weights**: set `pretrained_safetensors_path: /path/to/LuMamba_*.safetensors` in the experiment YAML.
2. **Preprocess data**: acquire fine-tuning dataset and follow preprocessing protocol (see guide in `/make_datasets/README.md`) to generate `train/test/val.h5` files.
3. **Update data module of `LuMamba_finetune.yaml` config**:
- **TUH datasets (TUAB/TUSL/TUAR)** → change `_target_` in `/data_module:` to `datasets.tuh_dataset.TUH_Dataset`.
- **Other** → change `/data_module:_target_` to the corresponding dataset.py file in `BioFoundation/datasets` (e.g., for the TDBrain dataset use `_target_:datasets.tdbrain_dataset.TDBrain_Dataset`).
- **HDF5 file location** → change `/data_module:hdf5_file` for `train`, `test`, and `val` with the path to the corresponding HDF5 data split file.
4. **Task settings**:
- **Task type**: override with `/task:finetune_task_LUNA` for classification and `/task:finetune_regression_task_LuMamba` for regression tasks
- **Classification type**: set `classification_type` (`bc`, `mcc`) and `model.num_classes` to match your downstream task. In a regression scenario, `mcc` is used and `model.num_classes` gives the number of output features.
- **Classifier choice**: set `/model:classifier_option` (`mamba` for FEMBA classifier, `linear` for single-layer linear classifier, `null` for default LUNA classifier)
- Configuration file includes further `#CHANGEME` tags and instructions for a working example.
5. **Env vars**: export `DATA_PATH` (dataset root) and `CHECKPOINT_DIR` (artifacts).
6. **Trainer/optimizer**: adjust `gpus/devices`, `batch_size`, `max_epochs`, LR/scheduler if needed.
7. **I/O**: set `io.base_output_path` and confirm `io.checkpoint_dirpath` exists.
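Pulling steps 1–4 together, the overrides might look like the following sketch. Key names come from the checklist above, but the exact nesting may differ; check `LuMamba_finetune.yaml` and its `#CHANGEME` tags for the authoritative structure.

```yaml
# Illustrative fragment only — not a complete, verified config.
pretrained_safetensors_path: /path/to/LuMamba_tiny.safetensors
data_module:
  _target_: datasets.tuh_dataset.TUH_Dataset   # TUAB/TUSL/TUAR
  hdf5_file:
    train: ${oc.env:DATA_PATH}/TUAB/train.h5
    val: ${oc.env:DATA_PATH}/TUAB/val.h5
    test: ${oc.env:DATA_PATH}/TUAB/test.h5
classification_type: bc        # binary classification (abnormal vs. normal)
model:
  num_classes: 2
  classifier_option: mamba     # FEMBA classifier head
```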
To launch fine-tuning (Hydra):
```bash
python -u run_train.py +experiment=LuMamba_finetune
```
---
## ⚖️ Responsible AI, Risks & Biases
- **Clinical safety:** research-only; human oversight required.
- **Bias & drift:** montage/device/population differences can induce shifts; validate and monitor.
- **Artifacts & rare events:** robustness varies; use QC and task-appropriate preprocessing.
---
## 🔗 Sources
- **Code:** https://github.com/pulp-bio/BioFoundation
- **Paper:** LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling (arXiv:2603.19100)
---
## 📜 Citation
If you use LuMamba, please cite:
```bibtex
@misc{broustail2026lumambalatentunifiedmamba,
title={LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling},
author={Danaé Broustail and Anna Tegon and Thorir Mar Ingolfsson and Yawei Li and Luca Benini},
year={2026},
eprint={2603.19100},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2603.19100},
}
```
---
## 🛠️ Maintenance & Contact
- **Issues & support:** please open a GitHub issue in the BioFoundation repository.
---
## 🔗 Related Models
- **[LUNA](https://huggingface.co/PulpBio/LUNA)** — Transformer-based topology-agnostic EEG foundation model (NeurIPS 2025). Source of the channel-unification cross-attention module that LuMamba reuses.
- **[FEMBA](https://huggingface.co/PulpBio/FEMBA)** — Bidirectional Mamba foundation model for EEG. Source of the linear-complexity temporal backbone that LuMamba reuses.
- **[TinyMyo](https://huggingface.co/PulpBio/TinyMyo)** — Tiny foundation model for flexible EMG signal processing at the edge.
## 🗒️ Changelog
- **v1.0:** Initial release of LuMamba model card with task-specific checkpoints and instructions.