---
license: bsd-3-clause
library_name: braindecode
pipeline_tag: feature-extraction
tags:
- eeg
- biosignal
- pytorch
- neuroscience
- braindecode
- convolutional
---

# BrainModule

BrainModule from [brainmagick], also known as SimpleConv.

> **Architecture-only repository.** Documents the
> `braindecode.models.BrainModule` class. **No pretrained weights are
> distributed here.** Instantiate the model and train it on your own
> data.

## Quick start

```bash
pip install braindecode
```

```python
from braindecode.models import BrainModule

model = BrainModule(
    n_chans=22,
    sfreq=250,
    input_window_seconds=4.0,
    n_outputs=4,
)
```

The signal-shape arguments above are illustrative defaults; adjust them to match your recording.

## Documentation

- Full API reference:
- Interactive browser (live instantiation, parameter counts):
- Source on GitHub:

## Architecture

![BrainModule architecture](../_static/model/simpleconv.png)

## Parameters

| Parameter | Type | Description |
|---|---|---|
| `hidden_dim` | int, default=320 | Hidden dimension for convolutional layers. Input is projected to this dimension before the convolutional blocks. |
| `depth` | int, default=10 | Number of convolutional blocks. Each block contains a dilated convolution with batch normalization and activation, followed by a residual connection. |
| `kernel_size` | int, default=3 | Convolutional kernel size. Must be odd for proper padding with dilation. |
| `growth` | float, default=1.0 | Channel size multiplier: hidden_dim * (growth ** layer_index). Values > 1.0 grow channels deeper; < 1.0 shrink them. Note: growth != 1.0 disables residual connections between layers with different channel sizes. |
| `dilation_growth` | int, default=2 | Dilation multiplier per layer (e.g., 2 means dilation doubles each layer). Grows the receptive field exponentially with depth. Requires odd kernel_size. |
| `dilation_period` | int, default=5 | Reset dilation to 1 every N layers. Prevents dilation from growing too large and maintains local connectivity. |
| `conv_drop_prob` | float, default=0.0 | Dropout probability for convolutional layers. |
| `dropout_input` | float, default=0.0 | Dropout probability applied to model input only. |
| `batch_norm` | bool, default=True | If True, apply batch normalization after each convolution. |
| `activation` | type[nn.Module], default=nn.GELU | Activation function class to use (e.g., nn.GELU, nn.ReLU, nn.ELU). |
| `n_subjects` | int, default=200 | Number of unique subjects (for subject-specific pathways). Only used if subject_dim > 0. |
| `subject_dim` | int, default=0 | Dimension of subject embeddings. If 0, no subject-specific features. If > 0, adds subject embeddings to the input before encoding. |
| `subject_layers` | bool, default=False | If True, apply subject-specific linear transformations to input channels. Each subject has its own weight matrix. Requires subject_dim > 0. |
| `subject_layers_dim` | str, default="input" | Where to apply subject layers: "input" or "hidden". |
| `subject_layers_id` | bool, default=False | If True, initialize subject layers as identity matrices. |
| `embedding_scale` | float, default=1.0 | Scaling factor for subject embeddings learning rate. |
| `n_fft` | int, optional | FFT size for STFT processing. If None, no STFT is applied. If specified, applies spectrogram transform before encoding. |
| `fft_complex` | bool, default=True | If True, keep complex spectrogram. If False, use power spectrogram. Only used when n_fft is not None. |
| `channel_dropout_prob` | float, default=0.0 | Probability of dropping each channel during training (0.0 to 1.0). If 0.0, no channel dropout is applied. |
| `channel_dropout_type` | str, optional | If specified with chs_info, only drop channels of this type (e.g., 'eeg', 'ref', 'eog'). If None and channel_dropout_prob > 0, any channel may be dropped. |
| `glu` | int, default=2 | If > 0, applies Gated Linear Units (GLU) every N convolutional layers. GLUs gate intermediate representations for more expressivity. If 0, no GLU is applied. |
| `glu_context` | int, default=1 | Context window size for GLU gates. If > 0, uses contextual information from neighboring time steps for gating. Requires glu > 0. |

## References

1. Défossez, A., Caucheteux, C., Rapin, J., Kabeli, O., & King, J. R. (2023). Decoding speech perception from non-invasive brain recordings. Nature Machine Intelligence, 5(10), 1097-1107.

## Citation

Cite the original architecture paper (see *References* above) and braindecode:

```bibtex
@article{aristimunha2025braindecode,
  title   = {Braindecode: a deep learning library for raw electrophysiological data},
  author  = {Aristimunha, Bruno and others},
  journal = {Zenodo},
  year    = {2025},
  doi     = {10.5281/zenodo.17699192},
}
```

## License

BSD-3-Clause for the model code (matching braindecode). If you fine-tune from a pretrained checkpoint, the resulting weights inherit the licence of that checkpoint and its training corpus.
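
## Dilation schedule sketch

The `kernel_size`, `dilation_growth`, `dilation_period`, and `growth` entries in the Parameters table interact, so a small worked example may help when choosing values. The sketch below is plain Python and does not call braindecode; every function name in it is hypothetical, and it encodes one reading of the table rather than the library's actual implementation. It assumes stride-1 convolutions, a dilation that resets to 1 at every multiple of `dilation_period`, padding of `(kernel_size // 2) * dilation`, and rounding of channel sizes to the nearest integer. It also ignores any extra temporal context contributed by GLU layers (`glu_context`), so the receptive field it reports is only a lower-bound estimate under these assumptions.

```python
def dilation_schedule(depth=10, dilation_growth=2, dilation_period=5):
    """Dilation per block, ASSUMING a reset to 1 every `dilation_period` blocks."""
    dilations = []
    dilation = 1
    for layer_index in range(depth):
        if dilation_period and layer_index % dilation_period == 0:
            dilation = 1  # periodic reset described in the table
        dilations.append(dilation)
        dilation *= dilation_growth
    return dilations


def same_padding(kernel_size, dilation):
    """Padding per side, ASSUMED to be (kernel_size // 2) * dilation."""
    return (kernel_size // 2) * dilation


def conv_output_length(length, kernel_size, dilation, padding):
    """Output length of a stride-1 1D convolution."""
    return length + 2 * padding - dilation * (kernel_size - 1)


def receptive_field(dilations, kernel_size=3):
    """Receptive field (in samples) of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def channel_sizes(hidden_dim=320, depth=10, growth=1.0):
    """hidden_dim * growth ** layer_index, rounded (rounding is an assumption)."""
    return [int(round(hidden_dim * growth ** i)) for i in range(depth)]


if __name__ == "__main__":
    dilations = dilation_schedule()
    rf = receptive_field(dilations, kernel_size=3)
    print("dilations:", dilations)
    print("receptive field:", rf, "samples =", rf / 250, "s at sfreq=250")

    # Odd kernels keep the time axis unchanged; even kernels do not.
    n_times = 1000  # 4.0 s at 250 Hz, as in the quick start
    for k in (3, 4):
        out = conv_output_length(n_times, k, 2, same_padding(k, 2))
        print(f"kernel_size={k}: {n_times} -> {out} samples")
```

With the defaults from the table, this schedule is `[1, 2, 4, 8, 16, 1, 2, 4, 8, 16]`, which gives an estimated receptive field of 125 samples (0.5 s at 250 Hz). The length check also illustrates why `kernel_size` is required to be odd: under the assumed padding rule an even kernel lengthens the output by `dilation` samples per layer, which would break a residual addition with the block's input.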