---
license: cc-by-nd-4.0
language:
- en
library_name: pytorch
tags:
- eeg
- biosignal
- mamba
- state-space-model
- cross-attention
- foundation-model
- self-supervised
- masked-modeling
- lejepa
- topology-invariant
- neuroscience
datasets:
- TUEG
- TUAB
- APAVA
- TDBrain
- MoBI
- SEED-V
- Mumtaz2016
- MODMA
metrics:
- balanced_accuracy
- roc_auc
- pr_auc
- r2
- pearson_r
- cohen_kappa
thumbnail: https://raw.githubusercontent.com/pulp-bio/BioFoundation/refs/heads/main/docs/model/logo/LuMamba_logo.png
model-index:
  - name: LuMamba-Tiny (LeJEPA-reconstruction pre-training)
    results:
      - task:
          type: time-series-classification
          name: EEG Abnormality Detection
        dataset:
          type: TUAB
          name: TUH EEG Abnormal Corpus (TUAB)
        metrics:
          - type: balanced_accuracy
            value: 80.99
            name: Balanced Accuracy (%)
          - type: roc_auc
            value: 0.883
            name: AUROC
          - type: pr_auc
            value: 0.892
            name: AUC-PR
      - task:
          type: time-series-classification
          name: Alzheimer's Disease Detection
        dataset:
          type: APAVA
          name: APAVA
        metrics:
          - type: roc_auc
            value: 0.955
            name: AUROC
          - type: pr_auc
            value: 0.970
            name: AUC-PR
      - task:
          type: time-series-classification
          name: Parkinson's Disease Detection
        dataset:
          type: TDBrain
          name: TDBrain
        metrics:
          - type: roc_auc
            value: 0.961
            name: AUROC
          - type: pr_auc
            value: 0.960
            name: AUC-PR
      - task:
          type: time-series-classification
          name: Major Depressive Disorder Detection
        dataset:
          type: Mumtaz2016
          name: Mumtaz2016
        metrics:
          - type: roc_auc
            value: 0.931
            name: AUROC
          - type: pr_auc
            value: 0.952
            name: AUC-PR
  - name: LuMamba-Tiny (Reconstruction-only pre-training)
    results:
      - task:
          type: time-series-classification
          name: EEG Slowing Event and Seizure Detection
        dataset:
          type: TUSL
          name: TUH EEG Slowing Corpus (TUSL)
        metrics:
          - type: roc_auc
            value: 0.708
            name: AUROC
          - type: pr_auc
            value: 0.289
            name: AUC-PR
      - task:
          type: time-series-classification
          name: EEG Artifact Detection
        dataset:
          type: TUAR
          name: TUH EEG Artifact Corpus (TUAR)
        metrics:
          - type: roc_auc
            value: 0.914
            name: AUROC
          - type: pr_auc
            value: 0.510
            name: AUC-PR
      - task:
          type: time-series-classification
          name: Gait Prediction Regression
        dataset:
          type: MoBI
          name: MoBI
        metrics:
          - type: r2
            value: 0.116
            name: R-squared
          - type: rmse
            value: 0.1482
            name: Root Mean Squared Error
      - task:
          type: time-series-classification
          name: 5-class Emotion Detection
        dataset:
          type: SEED-V
          name: SEED-V
        metrics:
          - type: balanced_accuracy
            value: 35.0
            name: Balanced Accuracy (%)
          - type: cohen_kappa
            value: 0.191
            name: Cohen's Kappa
      - task:
          type: time-series-classification
          name: Major Depressive Disorder Detection
        dataset:
          type: MODMA
          name: MODMA
        metrics:
          - type: balanced_accuracy
            value: 59.5
            name: Balanced Accuracy (%)
          - type: roc_auc
            value: 0.448
            name: AUROC
          - type: pr_auc
            value: 0.420
            name: AUC-PR
---
<div align="center">
  <img src="https://raw.githubusercontent.com/pulp-bio/BioFoundation/refs/heads/main/docs/model/logo/LuMamba_logo.png" alt="LuMamba Logo" width="800"/>
  <h1>LuMamba: Latent Unified Mamba for Electrode
Topology-Invariant and Efficient EEG Modeling</h1>
</div>
<p align="center">
  <a href="https://github.com/pulp-bio/BioFoundation">
    <img src ="https://img.shields.io/github/stars/pulp-bio/BioFoundation?color=ccf" alt="Github">
  </a>
  <a href="https://creativecommons.org/licenses/by-nd/4.0/">
    <img src="https://img.shields.io/badge/License-CC_BY--ND_4.0-lightgrey.svg" alt="License">
  </a>
  <a href="https://arxiv.org/abs/2603.19100">
    <img src="https://img.shields.io/badge/arXiv-2603.19100-b31b1b.svg" alt="Paper">
  </a>
</p>


**LuMamba** (Latent Unified Mamba) is an **EEG foundation model** built on efficient **Mamba state-space learning**, capable of handling **heterogeneous channel topologies**.
LuMamba addresses varying channel layouts with **LUNA channel unification**, projecting a given EEG channel layout to a **fixed latent topology**, and overcomes the quadratic complexity of transformers with **FEMBA**'s efficient **bidirectional Mamba encoder**.

---

## 🔒 License & Usage Policy (Weights)

**Weights license:** The released model weights are licensed under **Creative Commons Attribution–NoDerivatives 4.0 (CC BY-ND 4.0)**. This section summarizes the practical implications for users. *This is not legal advice; please read the full license text.*

### ✅ You may
- **Use** and **redistribute** the **unmodified** LuMamba weights (including in commercial settings) **with proper attribution** to the LuMamba authors.
- **Fine-tune / adapt** the weights **for your internal use** (research or production) **without redistributing** the modified weights.
- **Publish your code, configs, logs, and papers** describing experiments with LuMamba (please cite the paper).

### 🚫 You may not
- **Share, host, or redistribute any modified weights** (including LoRA/adapter/delta checkpoints or pruned/quantized variants). Any parameter set that encodes an adaptation is considered a derivative and cannot be shared under CC BY-ND 4.0.
- **Imply endorsement** by the LuMamba authors for any derivative or evaluation without our written permission.
- **Use the LuMamba name** in a way that suggests your modified model is an official LuMamba release.

### 🤝 How to contribute improvements (PR-gated releases)
We welcome community improvements via a **pull-request (PR)** workflow. If you believe your improvements should become an **official LuMamba release**:
1. **Open a PR** in the [BioFoundation repository](https://github.com/pulp-bio/BioFoundation) describing the change (architecture/head/training recipe, datasets, preprocessing, compute).
2. Include **reproducibility artifacts**: configs, seeds, scripts, environment details, training/validation logs, and the **evaluation protocol** (e.g., TUAB/TUAR/TUSL) with exact splits.
3. Provide **comprehensive results** (AUROC/AUPR/BA, FLOPs, memory) vs. the baselines reported in the LuMamba paper.
4. After **maintainer review**, approved changes will be **retrained/validated** and, if accepted, **released by the maintainers** as a new **official LuMamba** checkpoint under **CC BY-ND 4.0**.

> Rationale: CC BY-ND protects users from fragmented, lower-quality “LuMamba variants,” while still enabling internal fine-tuning and a path for the community to upstream improvements through review.

---

## 🔎 Model Summary

- **Goal:** Efficient and topology-agnostic EEG modeling with linear complexity in sequence length.
- **Core idea:** A **Channel-Unification Module** uses **learned queries** (Q) with **cross-attention** to map any set of channels to a fixed latent space; **bidirectional Mamba blocks** then operate on that latent sequence.
- **Pre-training data:** TUEG, **>21,000 hours** of raw EEG; downstream subjects removed to avoid leakage.
- **Downstream tasks:** **TUAB** (abnormal), **TUAR** (artifacts), **TUSL** (slowing), **SEED-V** (emotion; unseen 62-ch montage), **APAVA** (Alzheimer's disease; unseen 16-ch layout), **TDBrain** (Parkinson's disease; unseen 26-ch layout).

---

## 🚀 Model Variants

The model is currently released as a Tiny variant with the following parameters:

| Variant      | Parameters | FEMBA parameters               | LUNA parameters                       |
|--------------|------------|--------------------------------|---------------------------------------|
| LuMamba_tiny | 4.1M       | `num_blocks` = 2, `exp` = 2    | `num_queries` = 6, `embed_dim` = 64   |

Larger models can be obtained by increasing the number of bi-Mamba blocks `num_blocks` (e.g., 8 bi-Mamba blocks yield a 15M-parameter model).

---

## 📊 Results

- **TUAB (abnormal vs normal):** 80.99 % Bal. Acc., 0.883 AUROC, 0.892 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **TUSL (slowing event vs. seizure detection)**: 0.708 AUROC, 0.289 AUPR (LuMamba-Tiny, pre-trained with reconstruction-only).
- **TUAR (artifact detection)**: 0.914 AUROC, 0.510 AUPR (LuMamba-Tiny, pre-trained with reconstruction-only).
- **APAVA (Alzheimer's detection)**: 0.955 AUROC, 0.970 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **TDBrain (Parkinson's detection)**: 0.961 AUROC, 0.960 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **Mumtaz2016 (Depression detection)**: 72.5 % Bal. Acc., 0.931 AUROC, 0.952 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
- **SEED-V (5-class emotion detection)**: 35.0 % Bal. Acc., 0.191 Cohen's Kappa (LuMamba-Tiny, pre-trained with reconstruction-only).
- **MoBI (gait prediction)**: 0.116 R-squared, 0.148 RMSE (LuMamba-Tiny, pre-trained with reconstruction-only).
- **MODMA (full 128-channel set)**: 59.47 % Bal. Acc., 0.448 AUROC, 0.420 AUPR (LuMamba-Tiny, pre-trained with reconstruction-only).
- **MODMA (reduced 13-channel subset)**: 59.09 % Bal. Acc., 0.522 AUROC, 0.415 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).


**Efficiency:** Up to **377× fewer FLOPs** than transformer-based baselines, with support for up to **500× longer** EEG windows, thanks to the efficient FEMBA bi-Mamba encoder.

---

## 🧠 Intended Use & Limitations

**Intended use.** Research on EEG representation learning & classification (abnormality, artifacts, slowing, emotion), especially when **montages vary** or **channel counts are high**.

**Limitations.**
- **Not a medical device.** Do **not** use for clinical decisions without proper validation & regulatory clearance.  
- **Unseen topologies:** Zero-shot transfer to **very different/dense** layouts (e.g., SEED-V) can underperform SOTA despite positive scaling; consider augmenting pre-training montage diversity and spatial encodings.
- **Distribution shifts:** Performance varies across cohorts, devices, and label protocols; validate locally and consider domain adaptation.

---

## 🏗️ Architecture & Training

**LUNA Tokenizer & features.** EEG is segmented into patches; temporal features come from a 1D convolution with GroupNorm and GELU; **frequency features** (FFT magnitude/phase → MLP) are added; 3D electrode coordinates are encoded via **NeRF-style sinusoids → MLP** (positional encoding).
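The frequency branch can be sketched in a few lines of NumPy. The function name, patch length, and output layout below are illustrative assumptions, not the model's actual feature pipeline (which feeds these values into an MLP):

```python
import numpy as np

def patch_frequency_features(patch: np.ndarray) -> np.ndarray:
    """Illustrative frequency features for one EEG patch (1-D signal):
    FFT magnitude and phase, concatenated. Names and shapes are
    hypothetical; the released model passes such features to an MLP."""
    spectrum = np.fft.rfft(patch)          # one-sided spectrum
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    return np.concatenate([magnitude, phase])

patch = np.random.randn(64)                # one 64-sample patch, one channel
feats = patch_frequency_features(patch)
# rfft of 64 samples yields 33 bins -> 66 features (magnitude + phase)
```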

**LUNA Channel-Unification Module.** **Q learned queries** cross-attend to **channel-wise patch features** to produce a **fixed Q×E latent** per patch; FFN + Transformer layers refine the query tokens. Complexity is **O(Q·C)** (linear in channels).
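A minimal NumPy sketch of this cross-attention step, assuming a single head and illustrative dimensions (the real module is multi-head and refines the query tokens with FFN and Transformer layers):

```python
import numpy as np

rng = np.random.default_rng(0)
C, Q, E = 19, 6, 64                          # channels, learned queries, embed dim
queries = rng.standard_normal((Q, E))        # learned query tokens (fixed count)
channel_feats = rng.standard_normal((C, E))  # per-channel patch features

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Q query tokens attend over C channel tokens: cost is O(Q*C) per patch,
# i.e., linear in the channel count C.
attn = softmax(queries @ channel_feats.T / np.sqrt(E))  # (Q, C)
latent = attn @ channel_feats                           # (Q, E), fixed size

assert latent.shape == (Q, E)  # same shape regardless of how many channels C
```

Because the output is always Q×E, any montage (19, 62, 128 channels, ...) maps to the same latent shape for the downstream encoder.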

**FEMBA Bi-Mamba Temporal encoder.** **Mamba blocks** process the embeddings in separate forward and backward streams.
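The bidirectional pattern can be illustrated with a toy linear state-space recurrence standing in for a real Mamba block (the actual blocks use selective state spaces, not this fixed decay):

```python
import numpy as np

def ssm_scan(x: np.ndarray, a: float = 0.9) -> np.ndarray:
    """Toy 1-D linear recurrence h_t = a*h_{t-1} + x_t over axis 0,
    standing in for a Mamba block (illustration only)."""
    h = np.zeros_like(x[0])
    out = []
    for t in range(len(x)):
        h = a * h + x[t]
        out.append(h.copy())
    return np.stack(out)

def bidirectional(x: np.ndarray) -> np.ndarray:
    fwd = ssm_scan(x)              # causal, left-to-right stream
    bwd = ssm_scan(x[::-1])[::-1]  # same scan on the time-reversed sequence
    return fwd + bwd               # merge the two streams

x = np.random.randn(128, 8)        # (time, features) latent sequence
y = bidirectional(x)
assert y.shape == x.shape
```

Each stream is a linear-time scan, so the combined cost stays linear in sequence length, unlike quadratic self-attention.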

**Pre-training objectives.** A **masked-patch reconstruction** loss recovers masked tokens from their context. In parallel, the **LeJEPA loss** encourages an isotropic Gaussian embedding distribution to minimize downstream prediction risk.
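A hedged sketch of the combined objective: plain MSE on masked positions, plus a simple covariance-to-identity penalty standing in for the LeJEPA regularizer (the actual LeJEPA loss uses sketched isotropic-Gaussian tests, and the equal weighting here is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
T, E = 32, 64
tokens = rng.standard_normal((T, E))                # target patch embeddings
mask = rng.random(T) < 0.5                          # ~50% of patches masked
pred = tokens + 0.1 * rng.standard_normal((T, E))   # toy decoder output

# Masked-patch reconstruction: MSE computed only on masked positions.
recon_loss = np.mean((pred[mask] - tokens[mask]) ** 2)

# Stand-in isotropy penalty: push the empirical embedding covariance
# toward the identity (NOT the actual LeJEPA formulation).
z = pred - pred.mean(axis=0)
cov = z.T @ z / T
iso_penalty = np.mean((cov - np.eye(E)) ** 2)

loss = recon_loss + iso_penalty                     # illustrative weighting
```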

---

## 🔧 How to Use

LuMamba weights are organized by pre-training configuration:

- **`Reconstruction-only`** → variants pre-trained with masked reconstruction exclusively  
- **`LeJEPA-reconstruction`** → variants pre-trained with a balanced mixture of masked reconstruction and LeJEPA losses; two variants exist, using 128 or 300 LeJEPA projection slices.
- **`LeJEPA-only`** → variant pre-trained with LeJEPA exclusively.

All variants are pre-trained on TUEG.

LuMamba experiments are organized into two Hydra configurations, in `BioFoundation/config/experiments`:
- **`LuMamba_finetune.yaml`** → configuration for fine-tuning experiments.
- **`LuMamba_pretrain.yaml`** → configuration for pre-training experiments.

---

## 🔧 Fine-tuning — General Checklist

0. **Install & read data prep**: clone the [BioFoundation repo](https://github.com/pulp-bio/BioFoundation), set up the environment as described there, then open `make_datasets/README.md` for dataset-specific notes (naming, expected folder layout, and common pitfalls).
1. **Point to weights**: set `pretrained_safetensors_path: /path/to/LuMamba_*.safetensors` in the experiment YAML.
2. **Preprocess data**: acquire fine-tuning dataset and follow preprocessing protocol (see guide in `/make_datasets/README.md`) to generate `train/test/val.h5` files.
3. **Update data module of `LuMamba_finetune.yaml` config**:
    - **TUH datasets (TUAB/TUSL/TUAR)** → change `_target_` in `/data_module:` to `datasets.tuh_dataset.TUH_Dataset`.
    - **Other datasets** → change `/data_module:_target_` to the corresponding dataset class in `BioFoundation/datasets` (e.g., for the TDBrain dataset use `_target_: datasets.tdbrain_dataset.TDBrain_Dataset`).
    - **HDF5 file location** → set `/data_module:hdf5_file` for `train`, `test`, and `val` to the path of the corresponding HDF5 data split file.
4. **Task settings**: 
    - **Task type**: override with `/task:finetune_task_LUNA` for classification and `/task:finetune_regression_task_LuMamba` for regression tasks
    - **Classification type**: set `classification_type` (`bc`, `mcc`) and `model.num_classes` to match your downstream task. In a regression scenario, `mcc` is used and `model.num_classes` gives the number of output features.
    - **Classifier choice**: set `/model:classifier_option` (`mamba` for the FEMBA classifier, `linear` for a single-layer linear classifier, `null` for the default LUNA classifier)
    - Configuration file includes further `#CHANGEME` tags and instructions for a working example.
5. **Env vars**: export `DATA_PATH` (dataset root) and `CHECKPOINT_DIR` (artifacts).
6. **Trainer/optimizer**: adjust `gpus/devices`, `batch_size`, `max_epochs`, LR/scheduler if needed.
7. **I/O**: set `io.base_output_path` and confirm `io.checkpoint_dirpath` exists.
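As an illustration only, the overrides from the checklist might look like the fragment below; key names are taken from the steps above, paths are placeholders, and the authoritative schema (with its `#CHANGEME` tags) is `LuMamba_finetune.yaml` in the repository:

```yaml
# Illustrative fragment only -- consult LuMamba_finetune.yaml in the repo
# for the exact schema and working examples.
pretrained_safetensors_path: /path/to/LuMamba_tiny.safetensors
classification_type: bc   # binary classification (e.g., TUAB)
model:
  num_classes: 2
```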


To launch fine-tuning (Hydra):

```bash
python -u run_train.py +experiment=LuMamba_finetune
```

---

## ⚖️ Responsible AI, Risks & Biases

- **Clinical safety:** research-only; human oversight required.  
- **Bias & drift:** montage/device/population differences can induce shifts; validate and monitor.  
- **Artifacts & rare events:** robustness varies; use QC and task-appropriate preprocessing.

---

## 🔗 Sources

- **Code:** https://github.com/pulp-bio/BioFoundation  
- **Paper:** LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling (arxiv:2603.19100)

---

## 📜 Citation

If you use LuMamba, please cite:

```bibtex
@misc{broustail2026lumambalatentunifiedmamba,
      title={LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling}, 
      author={Danaé Broustail and Anna Tegon and Thorir Mar Ingolfsson and Yawei Li and Luca Benini},
      year={2026},
      eprint={2603.19100},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2603.19100}, 
}
```

---

## 🛠️ Maintenance & Contact

- **Issues & support:** please open a GitHub issue in the BioFoundation repository.

---

## 🔗 Related Models

- **[LUNA](https://huggingface.co/PulpBio/LUNA)** — Transformer-based topology-agnostic EEG foundation model (NeurIPS 2025). Source of the channel-unification cross-attention module that LuMamba reuses.
- **[FEMBA](https://huggingface.co/PulpBio/FEMBA)** — Bidirectional Mamba foundation model for EEG. Source of the linear-complexity temporal backbone that LuMamba reuses.
- **[TinyMyo](https://huggingface.co/PulpBio/TinyMyo)** — Tiny foundation model for flexible EMG signal processing at the edge.
  
## 🗒️ Changelog

- **v1.0:** Initial release of LuMamba model card with task-specific checkpoints and instructions.