@@ -14,13 +14,12 @@ tags:
 # MSVTNet
-MSVTNet model from Liu K et al (2024) from .
-> **Architecture-only repository.** This repo documents the
 > `braindecode.models.MSVTNet` class. **No pretrained weights are
-> distributed here** — instantiate the model and train it on your own
-> data, or fine-tune from a published foundation-model checkpoint
-> separately.
 ## Quick start
@@ -39,148 +38,49 @@ model = MSVTNet(
 )
 ```
-The signal-shape arguments above are example defaults — adjust them
-to match your recording.
 ## Documentation
-- Full API reference (parameters, references, architecture figure):
-  <https://braindecode.org/stable/generated/braindecode.models.MSVTNet.html>
-- Interactive browser with live instantiation:
   <https://huggingface.co/spaces/braindecode/model-explorer>
 - Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/msvtnet.py#L13>
-## Architecture description
-The block below is the rendered class docstring (parameters,
-references, architecture figure where available).
-<div class='bd-doc'><main>
-<p>MSVTNet model from Liu K et al (2024) from [msvt2024]_.</p>
-<span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#5cb85c;color:white;font-size:11px;font-weight:600;margin-right:4px;">Convolution</span><span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#6c757d;color:white;font-size:11px;font-weight:600;margin-right:4px;">Recurrent</span><span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#56B4E9;color:white;font-size:11px;font-weight:600;margin-right:4px;">Attention/Transformer</span>
- This model implements a multi-scale convolutional transformer network
- for EEG signal classification, as described in [msvt2024]_.
- .. figure:: https://raw.githubusercontent.com/SheepTAO/MSVTNet/refs/heads/main/MSVTNet_Arch.png
-    :align: center
-    :alt: MSVTNet Architecture
- Parameters
- ----------
- n_filters_list : list[int], optional
-     List of filter numbers for each TSConv block, by default (9, 9, 9, 9).
- conv1_kernels_size : list[int], optional
-     List of kernel sizes for the first convolution in each TSConv block,
-     by default (15, 31, 63, 125).
- conv2_kernel_size : int, optional
-     Kernel size for the second convolution in TSConv blocks, by default 15.
- depth_multiplier : int, optional
-     Depth multiplier for depthwise convolution, by default 2.
- pool1_size : int, optional
-     Pooling size for the first pooling layer in TSConv blocks, by default 8.
- pool2_size : int, optional
-     Pooling size for the second pooling layer in TSConv blocks, by default 7.
- drop_prob : float, optional
-     Dropout probability for convolutional layers, by default 0.3.
- num_heads : int, optional
-     Number of attention heads in the transformer encoder, by default 8.
- ffn_expansion_factor : float, optional
-     Ratio to compute feedforward dimension in the transformer, by default 1.
- att_drop_prob : float, optional
-     Dropout probability for the transformer, by default 0.5.
- num_layers : int, optional
-     Number of transformer encoder layers, by default 2.
- activation : Type[nn.Module], optional
-     Activation function class to use, by default nn.ELU.
- return_features : bool, optional
-     Whether to return predictions from branch classifiers, by default False.
- Notes
- -----
- This implementation is not guaranteed to be correct, has not been checked
- by original authors, only reimplemented based on the original code [msvt2024code]_.
- References
- ----------
- .. [msvt2024] Liu, K., et al. (2024). MSVTNet: Multi-Scale Vision
-    Transformer Neural Network for EEG-Based Motor Imagery Decoding.
-    IEEE Journal of Biomedical and Health Informatics.
- .. [msvt2024code] Liu, K., et al. (2024). MSVTNet: Multi-Scale Vision
-    Transformer Neural Network for EEG-Based Motor Imagery Decoding.
-    Source Code: https://github.com/SheepTAO/MSVTNet
- .. rubric:: Hugging Face Hub integration
- When the optional ``huggingface_hub`` package is installed, all models
- automatically gain the ability to be pushed to and loaded from the
- Hugging Face Hub. Install with::
-     pip install braindecode[hub]
- **Pushing a model to the Hub:**
- .. code::
-     from braindecode.models import MSVTNet
-     # Train your model
-     model = MSVTNet(n_chans=22, n_outputs=4, n_times=1000)
-     # ... training code ...
-     # Push to the Hub
-     model.push_to_hub(
-         repo_id="username/my-msvtnet-model",
-         commit_message="Initial model upload",
-     )
- **Loading a model from the Hub:**
- .. code::
-     from braindecode.models import MSVTNet
-     # Load pretrained model
-     model = MSVTNet.from_pretrained("username/my-msvtnet-model")
-     # Load with a different number of outputs (head is rebuilt automatically)
-     model = MSVTNet.from_pretrained("username/my-msvtnet-model", n_outputs=4)
- **Extracting features and replacing the head:**
- .. code::
-     import torch
-     x = torch.randn(1, model.n_chans, model.n_times)
-     # Extract encoder features (consistent dict across all models)
-     out = model(x, return_features=True)
-     features = out["features"]
-     # Replace the classification head
-     model.reset_head(n_outputs=10)
- **Saving and restoring full configuration:**
- .. code::
-     import json
-     config = model.get_config()            # all __init__ params
-     with open("config.json", "w") as f:
-         json.dump(config, f)
-     model2 = MSVTNet.from_config(config)    # reconstruct (no weights)
- All model parameters (both EEG-specific and model-specific such as
- dropout rates, activation functions, number of filters) are automatically
- saved to the Hub and restored when loading.
- See :ref:`load-pretrained-models` for a complete tutorial.</main>
-</div>
 ## Citation
-Please cite both the original paper for this architecture (see the
-*References* section above) and braindecode:
 ```bibtex
 @article{aristimunha2025braindecode,

 # MSVTNet
+MSVTNet model from Liu K. et al. (2024) [msvt2024].
+> **Architecture-only repository.** Documents the
 > `braindecode.models.MSVTNet` class. **No pretrained weights are
+> distributed here.** Instantiate the model and train it on your own
+> data.
 ## Quick start
 )
 ```
+The signal-shape arguments above are illustrative defaults — adjust to
+match your recording.
 ## Documentation
+- Full API reference: <https://braindecode.org/stable/generated/braindecode.models.MSVTNet.html>
+- Interactive browser (live instantiation, parameter counts):
   <https://huggingface.co/spaces/braindecode/model-explorer>
 - Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/msvtnet.py#L13>
+## Architecture
+![MSVTNet architecture](https://raw.githubusercontent.com/SheepTAO/MSVTNet/refs/heads/main/MSVTNet_Arch.png)
+## Parameters
+| Parameter | Type | Description |
+|---|---|---|
+| `n_filters_list` | list[int], optional | List of filter numbers for each TSConv block, by default (9, 9, 9, 9). |
+| `conv1_kernels_size` | list[int], optional | List of kernel sizes for the first convolution in each TSConv block, by default (15, 31, 63, 125). |
+| `conv2_kernel_size` | int, optional | Kernel size for the second convolution in TSConv blocks, by default 15. |
+| `depth_multiplier` | int, optional | Depth multiplier for depthwise convolution, by default 2. |
+| `pool1_size` | int, optional | Pooling size for the first pooling layer in TSConv blocks, by default 8. |
+| `pool2_size` | int, optional | Pooling size for the second pooling layer in TSConv blocks, by default 7. |
+| `drop_prob` | float, optional | Dropout probability for convolutional layers, by default 0.3. |
+| `num_heads` | int, optional | Number of attention heads in the transformer encoder, by default 8. |
+| `ffn_expansion_factor` | float, optional | Ratio to compute feedforward dimension in the transformer, by default 1. |
+| `att_drop_prob` | float, optional | Dropout probability for the transformer, by default 0.5. |
+| `num_layers` | int, optional | Number of transformer encoder layers, by default 2. |
+| `activation` | Type[nn.Module], optional | Activation function class to use, by default nn.ELU. |
+| `return_features` | bool, optional | Whether to return predictions from branch classifiers, by default False. |
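
The defaults in the table can be collected into a plain dictionary and sanity-checked before they are passed to the model. The sketch below is illustrative only: `MSVTNET_DEFAULTS`, `check_msvtnet_kwargs`, and `build_msvtnet` are hypothetical helpers, not part of braindecode, and the rule that `n_filters_list` and `conv1_kernels_size` must have equal length is an assumption inferred from both being described as per-TSConv-block lists. `activation` is left out so the checks run without PyTorch installed.

```python
# Defaults copied from the parameter table above (activation omitted on purpose).
MSVTNET_DEFAULTS = {
    "n_filters_list": (9, 9, 9, 9),
    "conv1_kernels_size": (15, 31, 63, 125),
    "conv2_kernel_size": 15,
    "depth_multiplier": 2,
    "pool1_size": 8,
    "pool2_size": 7,
    "drop_prob": 0.3,
    "num_heads": 8,
    "ffn_expansion_factor": 1,
    "att_drop_prob": 0.5,
    "num_layers": 2,
    "return_features": False,
}


def check_msvtnet_kwargs(overrides=None):
    """Merge overrides into the documented defaults and sanity-check them.

    Hypothetical helper for illustration; not a braindecode function.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(MSVTNET_DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown MSVTNet parameters: {sorted(unknown)}")
    kwargs = {**MSVTNET_DEFAULTS, **overrides}

    # Assumption: one filter count and one first-conv kernel size per TSConv block.
    if len(kwargs["n_filters_list"]) != len(kwargs["conv1_kernels_size"]):
        raise ValueError(
            "n_filters_list and conv1_kernels_size should have the same length"
        )
    # Dropout probabilities must be valid probabilities.
    for name in ("drop_prob", "att_drop_prob"):
        if not 0.0 <= kwargs[name] <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return kwargs


def build_msvtnet(n_chans, n_outputs, n_times, **overrides):
    """Instantiate the model with checked keyword arguments."""
    # Imported lazily so the checks above work without braindecode installed.
    from braindecode.models import MSVTNet

    return MSVTNet(
        n_chans=n_chans,
        n_outputs=n_outputs,
        n_times=n_times,
        **check_msvtnet_kwargs(overrides),
    )
```

With braindecode installed, a call such as `build_msvtnet(n_chans=22, n_outputs=4, n_times=1000, num_layers=3)` would override a single default while keeping the rest as documented; the signal-shape values here are placeholders to replace with those of your recording.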
+## References
+1. Liu, K., et al. (2024). MSVTNet: Multi-Scale Vision Transformer Neural Network for EEG-Based Motor Imagery Decoding. IEEE Journal of Biomedical and Health Informatics.
+2. Liu, K., et al. (2024). MSVTNet: Multi-Scale Vision Transformer Neural Network for EEG-Based Motor Imagery Decoding. Source Code: https://github.com/SheepTAO/MSVTNet
 ## Citation
+Cite the original architecture paper (see *References* above) and braindecode:
 ```bibtex
 @article{aristimunha2025braindecode,