---
license: bsd-3-clause
library_name: braindecode
pipeline_tag: feature-extraction
tags:
  - eeg
  - biosignal
  - pytorch
  - neuroscience
  - braindecode
  - convolutional
---

# BrainModule

BrainModule, also known as SimpleConv, is the convolutional encoder from brainmagick (see *References* below).

> **Architecture-only repository.** Documents the
> `braindecode.models.BrainModule` class. **No pretrained weights are
> distributed here.** Instantiate the model and train it on your own
> data.

## Quick start

```bash
pip install braindecode
```

```python
from braindecode.models import BrainModule

model = BrainModule(
    n_chans=22,
    sfreq=250,
    input_window_seconds=4.0,
    n_outputs=4,
)
```

The signal-shape arguments above (`n_chans`, `sfreq`,
`input_window_seconds`, `n_outputs`) are illustrative values, not
requirements; adjust them to match your recording.
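
To help choose these values, the sketch below computes the input shape implied by the quick-start arguments. It assumes the usual braindecode convention that inputs are laid out as `(batch, n_chans, n_times)` with `n_times = sfreq * input_window_seconds`. The helper `expected_input_shape` is hypothetical and not part of braindecode.

```python
def expected_input_shape(batch_size, n_chans, sfreq, input_window_seconds):
    """Return the (batch, n_chans, n_times) shape implied by the signal arguments.

    Hypothetical helper: assumes n_times = sfreq * input_window_seconds.
    """
    n_times = int(sfreq * input_window_seconds)
    return (batch_size, n_chans, n_times)


# Values taken from the quick-start snippet, with an arbitrary batch size of 8.
print(expected_input_shape(8, n_chans=22, sfreq=250, input_window_seconds=4.0))
# (8, 22, 1000)
```

If your recording is sampled at a different rate or cut into windows of a different length, `n_times` changes accordingly.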

## Documentation
- Full API reference: <https://braindecode.org/stable/generated/braindecode.models.BrainModule.html>
- Interactive browser (live instantiation, parameter counts):
  <https://huggingface.co/spaces/braindecode/model-explorer>
- Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/brainmodule.py#L25>


## Architecture

![BrainModule architecture](../_static/model/simpleconv.png)


## Parameters

| Parameter | Type | Description |
|---|---|---|
| `hidden_dim` | int, default=320 | Hidden dimension for convolutional layers. Input is projected to this dimension before the convolutional blocks. |
| `depth` | int, default=10 | Number of convolutional blocks. Each block contains a dilated convolution with batch normalization and activation, followed by a residual connection. |
| `kernel_size` | int, default=3 | Convolutional kernel size. Must be odd for proper padding with dilation. |
| `growth` | float, default=1.0 | Channel size multiplier: hidden_dim * (growth ** layer_index). Values > 1.0 grow channels deeper; < 1.0 shrink them. Note: growth != 1.0 disables residual connections between layers with different channel sizes. |
| `dilation_growth` | int, default=2 | Dilation multiplier per layer (e.g., 2 means the dilation doubles each layer), so the receptive field grows exponentially with depth. Requires odd kernel_size. |
| `dilation_period` | int, default=5 | Reset dilation to 1 every N layers. Prevents dilation from growing too large and maintains local connectivity. |
| `conv_drop_prob` | float, default=0.0 | Dropout probability for convolutional layers. |
| `dropout_input` | float, default=0.0 | Dropout probability applied to model input only. |
| `batch_norm` | bool, default=True | If True, apply batch normalization after each convolution. |
| `activation` | type[nn.Module], default=nn.GELU | Activation function class to use (e.g., nn.GELU, nn.ReLU, nn.ELU). |
| `n_subjects` | int, default=200 | Number of unique subjects (for subject-specific pathways). Only used if subject_dim > 0. |
| `subject_dim` | int, default=0 | Dimension of subject embeddings. If 0, no subject-specific features. If > 0, adds subject embeddings to the input before encoding. |
| `subject_layers` | bool, default=False | If True, apply subject-specific linear transformations to input channels. Each subject has its own weight matrix. Requires subject_dim > 0. |
| `subject_layers_dim` | str, default="input" | Where to apply subject layers: "input" or "hidden". |
| `subject_layers_id` | bool, default=False | If True, initialize subject layers as identity matrices. |
| `embedding_scale` | float, default=1.0 | Scaling factor for subject embeddings learning rate. |
| `n_fft` | int, optional | FFT size for STFT processing. If None, no STFT is applied. If specified, applies spectrogram transform before encoding. |
| `fft_complex` | bool, default=True | If True, keep complex spectrogram. If False, use power spectrogram. Only used when n_fft is not None. |
| `channel_dropout_prob` | float, default=0.0 | Probability of dropping each channel during training (0.0 to 1.0). If 0.0, no channel dropout is applied. |
| `channel_dropout_type` | str, optional | If specified together with chs_info, only channels of this type are dropped (e.g., 'eeg', 'ref', 'eog'). If None and channel_dropout_prob > 0, any channel may be dropped. |
| `glu` | int, default=2 | If > 0, applies Gated Linear Units (GLU) every N convolutional layers. GLUs gate intermediate representations for more expressivity. If 0, no GLU is applied. |
| `glu_context` | int, default=1 | Context window size for GLU gates. If > 0, uses contextual information from neighboring time steps for gating. Requires glu > 0. |
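
The interaction between `depth`, `kernel_size`, `dilation_growth`, and `dilation_period` is easier to see with numbers. The sketch below is a minimal illustration based only on the descriptions in the table, not the library's implementation. It assumes stride-1 convolutions, a dilation that resets to 1 at the start of each period, and "same" padding of `kernel_size // 2 * dilation`. The function names are hypothetical, and the receptive-field estimate ignores any extra context added by the GLU layers.

```python
def dilation_schedule(depth=10, dilation_growth=2, dilation_period=5):
    """Return the dilation assumed for each convolutional block."""
    dilations = []
    dilation = 1
    for layer_index in range(depth):
        # Reset to 1 at the start of every period (if a period is set).
        if dilation_period and layer_index % dilation_period == 0:
            dilation = 1
        dilations.append(dilation)
        dilation *= dilation_growth
    return dilations


def same_padding(kernel_size, dilation):
    """Per-side padding that preserves the time length for an odd kernel."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd for symmetric 'same' padding")
    return kernel_size // 2 * dilation


def receptive_field(dilations, kernel_size=3):
    """Receptive field, in samples, of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


dilations = dilation_schedule()
print(dilations)                   # [1, 2, 4, 8, 16, 1, 2, 4, 8, 16]
print(receptive_field(dilations))  # 125
```

Under these assumptions, the default settings cover 125 samples, or half a second at 250 Hz. Without the periodic reset (`dilation_period=0` in this sketch), the last block alone would use a dilation of 512.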


## References

1. Défossez, A., Caucheteux, C., Rapin, J., Kabeli, O., & King, J. R. (2023). Decoding speech perception from non-invasive brain recordings. Nature Machine Intelligence, 5(10), 1097-1107.


## Citation

Cite the original architecture paper (see *References* above) and braindecode:

```bibtex
@article{aristimunha2025braindecode,
  title   = {Braindecode: a deep learning library for raw electrophysiological data},
  author  = {Aristimunha, Bruno and others},
  journal = {Zenodo},
  year    = {2025},
  doi     = {10.5281/zenodo.17699192},
}
```

## License

BSD-3-Clause for the model code (matching braindecode).
If you fine-tune from a pretrained checkpoint, the resulting weights
inherit the license of that checkpoint and its training corpus.