EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers
Abstract
EquiformerV3 advances SE(3)-equivariant graph neural networks in efficiency, expressivity, and generality via an optimized implementation, improved architectural components, and novel activation functions for accurate 3D atomistic modeling.
As SE(3)-equivariant graph neural networks mature into a core tool for 3D atomistic modeling, improving their efficiency, expressivity, and physical consistency has become a central challenge for large-scale applications. In this work, we introduce EquiformerV3, the third generation of the SE(3)-equivariant graph attention Transformer, designed to advance all three dimensions: efficiency, expressivity, and generality. Building on EquiformerV2, we make three key advances. First, we optimize the software implementation, achieving a 1.75× speedup. Second, we introduce simple and effective modifications to EquiformerV2, including equivariant merged layer normalization, improved feedforward network hyper-parameters, and attention with a smooth radius cutoff. Third, we propose SwiGLU-S^2 activations to incorporate many-body interactions for better theoretical expressivity and to preserve strict equivariance while reducing the complexity of sampling S^2 grids. Together, SwiGLU-S^2 activations and smooth-cutoff attention enable accurate modeling of smoothly varying potential energy surfaces (PES), generalizing EquiformerV3 to tasks that require energy-conserving simulations and higher-order derivatives of the PES. With these improvements, EquiformerV3 trained with the auxiliary task of denoising non-equilibrium structures (DeNS) achieves state-of-the-art results on OC20, OMat24, and Matbench Discovery.
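The two generalization ingredients named above, smooth-cutoff attention and SwiGLU-style gating, build on standard constructions. Below is a minimal PyTorch sketch, not the paper's implementation: a Behler-style cosine envelope that damps edge attention weights to zero at the cutoff radius (so the PES and its derivatives stay continuous as neighbors enter or leave the neighbor list), and a plain scalar SwiGLU feedforward gate. The helper names (`cosine_cutoff`, `SwiGLU`) and the `r_cut=6.0` value are illustrative assumptions; the actual SwiGLU-S^2 activation operates on features sampled on S^2 grids, which this scalar sketch omits.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def cosine_cutoff(r: torch.Tensor, r_cut: float) -> torch.Tensor:
    # Behler-style cosine envelope: 1 at r = 0, 0 at r = r_cut, with zero
    # slope at the cutoff, so energies and forces stay continuous as
    # neighbors cross the cutoff radius. (Hypothetical helper; the
    # paper's exact envelope is not specified here.)
    x = torch.clamp(r / r_cut, max=1.0)
    return 0.5 * (torch.cos(math.pi * x) + 1.0)


class SwiGLU(nn.Module):
    # Standard SwiGLU gate (Shazeer, 2020). EquiformerV3's SwiGLU-S^2
    # applies this gating pattern to features sampled on S^2 grids;
    # this scalar version omits the spherical sampling entirely.
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden, bias=False)
        self.w_up = nn.Linear(dim, hidden, bias=False)
        self.w_down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


if __name__ == "__main__":
    # Toy edge set: attention weights damped by the smooth envelope, so a
    # neighbor's contribution vanishes smoothly at r_cut instead of
    # jumping to zero when it leaves the neighbor list.
    edge_dist = torch.tensor([1.0, 3.0, 5.9, 7.0])  # toy distances
    logits = torch.randn(4)
    alpha = torch.softmax(logits, dim=-1) * cosine_cutoff(edge_dist, r_cut=6.0)
    print(alpha)

    ffn = SwiGLU(dim=32, hidden=64)
    print(ffn(torch.randn(2, 32)).shape)  # torch.Size([2, 32])
```

The design point the abstract makes is that without such an envelope, attention weights change discontinuously when an atom crosses the cutoff, which breaks energy conservation and higher-order PES derivatives in molecular dynamics.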
Community
Introducing EquiformerV3 with state-of-the-art results on OC20, OMat24, and Matbench Discovery.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- A recipe for scalable attention-based MLIPs: unlocking long-range accuracy with all-to-all node attention (2026)
- VecFormer: Towards Efficient and Generalizable Graph Transformer with Graph Token Attention (2026)
- Affine-Scaled Attention: Towards Flexible and Stable Transformer Attention (2026)
- Rank-Factorized Implicit Neural Bias: Scaling Super-Resolution Transformer with FlashAttention (2026)
- $\lambda$-GELU: Learning Gating Hardness for Controlled ReLU-ization in Deep Networks (2026)
- Size Transferability of Graph Transformers with Convolutional Positional Encodings (2026)
- LassoFlexNet: Flexible Neural Architecture for Tabular Data (2026)