--- license: bsd-3-clause library_name: braindecode pipeline_tag: feature-extraction tags: - eeg - biosignal - pytorch - neuroscience - braindecode - foundation-model - transformer --- # REVE **R**\ epresentation for **E**\ EG with **V**\ ersatile **E**\ mbeddings (REVE) from El Ouahidi et al. (2025) [reve]. > **Architecture-only repository.** Documents the > `braindecode.models.REVE` class. **No pretrained weights are > distributed here.** Instantiate the model and train it on your own > data. ## Quick start ```bash pip install braindecode ``` ```python from braindecode.models import REVE model = REVE( n_chans=22, sfreq=250, input_window_seconds=4.0, n_outputs=4, ) ``` The signal-shape arguments above are illustrative defaults — adjust to match your recording. ## Documentation - Full API reference: - Interactive browser (live instantiation, parameter counts): - Source on GitHub: ## Architecture ![REVE architecture](https://brain-bzh.github.io/reve/static/images/architecture.png) ## Parameters | Parameter | Type | Description | |---|---|---| | `embed_dim` | int, default=512 | Embedding dimension. Use 512 for REVE-Base, 1250 for REVE-Large. | | `depth` | int, default=22 | Number of Transformer layers. | | `heads` | int, default=8 | Number of attention heads. | | `head_dim` | int, default=64 | Dimension per attention head. | | `mlp_dim_ratio` | float, default=2.66 | FFN hidden dimension ratio: `mlp_dim = embed_dim × mlp_dim_ratio`. | | `use_geglu` | bool, default=True | Use GEGLU activation (recommended) or standard GELU. | | `freqs` | int, default=4 | Number of frequencies for Fourier positional embedding. | | `patch_size` | int, default=200 | Temporal patch size in samples (200 samples = 1 second at 200 Hz). | | `patch_overlap` | int, default=20 | Overlap between patches in samples. | | `attention_pooling` | bool, default=False | Pooling strategy for aggregating transformer outputs before classification. If `False` (default), all tokens are flattened into a single vector of size `(n_chans x n_patches x embed_dim)`, which is then passed through LayerNorm and a linear classifier. If `True`, uses attention-based pooling with a learnable query token that attends to all encoder outputs, producing a single embedding of size `embed_dim`. Attention pooling is more parameter-efficient for long sequences and variable-length inputs. | ## References 1. El Ouahidi, Y., Lys, J., Thölke, P., Farrugia, N., Pasdeloup, B., Gripon, V., Jerbi, K. & Lioi, G. (2025). REVE: A Foundation Model for EEG - Adapting to Any Setup with Large-Scale Pretraining on 25,000 Subjects. The Thirty-Ninth Annual Conference on Neural Information Processing Systems. https://openreview.net/forum?id=ZeFMtRBy4Z 2. Défossez, A., Caucheteux, C., Rapin, J., Kabeli, O., & King, J. R. (2023). Decoding speech perception from non-invasive brain recordings. Nature Machine Intelligence, 5(10), 1097-1107. ## Citation Cite the original architecture paper (see *References* above) and braindecode: ```bibtex @article{aristimunha2025braindecode, title = {Braindecode: a deep learning library for raw electrophysiological data}, author = {Aristimunha, Bruno and others}, journal = {Zenodo}, year = {2025}, doi = {10.5281/zenodo.17699192}, } ``` ## License BSD-3-Clause for the model code (matching braindecode). Pretraining-derived weights, if you fine-tune from a checkpoint, inherit the licence of that checkpoint and its training corpus.