AASIST-L

AASIST-L is the lightweight variant of AASIST, an audio anti-spoofing (voice-deepfake detection) model introduced in "AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks" (Jung et al., ICASSP 2022). This repo uses the upstream clovaai/aasist AASIST-L checkpoint pretrained on ASVspoof2019 LA. The model takes a raw speech waveform and returns a score; higher means more bona fide.

Code: https://github.com/clovaai/aasist
Paper: https://arxiv.org/abs/2110.01200
Parameters: 85,306 (0.085 M)
Checkpoint: AASIST-L.pth

This repo is self-contained for inference: the network definition is in _net.py (identical to the full AASIST definition), and the exact wrapper used to produce the Arena scores is in aasist_l.py. AASIST-L shares the AASIST architecture but uses a narrower residual stack and smaller graph dimensions (~85k parameters vs. ~298k).

Architecture

AASIST operates directly on the raw waveform. A sinc-convolution front-end and a RawNet2-style residual encoder produce a spectro-temporal feature map. Heterogeneous stacking graph attention layers then model this map over spectral and temporal sub-graphs, and a learnable max/average readout feeds a 2-class output (bona fide vs. spoof). The Arena score is the bona-fide logit. The "-L" variant narrows the residual channels (…[32,24],[24,24]) and graph dims ([24,32]).
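
As a minimal sketch of the final step, the score can be read off the 2-class output by selecting the bona-fide column. The column index (1), the function name, and the example values are assumptions for illustration; this card does not state the output ordering, so check aasist_l.py before relying on it.

```python
# Assumption: column 1 of the 2-class output is the bona-fide class.
BONAFIDE_INDEX = 1


def bonafide_scores(logits):
    """Return the bona-fide logit from each row of 2-class logits."""
    return [row[BONAFIDE_INDEX] for row in logits]


# Two hypothetical utterances: the first leans bona fide, the second spoof.
example_logits = [[-1.2, 2.3], [0.8, -0.5]]
print(bonafide_scores(example_logits))  # [2.3, -0.5]
```

Because the score is a raw logit rather than a softmax probability, it is unbounded and only meaningful relative to a threshold or to other scores.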

Reproducing the Arena scores

Inference uses a deterministic window of the first 64600 samples (about 4 s at 16 kHz, no random crop), matching the upstream data_utils.pad() used at eval. Audio must be supplied as float32 mono at 16 kHz; the wrapper does not resample.
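
The fixed-length window can be sketched as follows. Truncation to the first 64600 samples follows from the text above; repeating shorter clips until they fill the window is an assumption about how the upstream helper treats short inputs, and fixed_window is a hypothetical name rather than a function from this repo.

```python
def fixed_window(samples, max_len=64600):
    """Return exactly max_len samples, deterministically.

    Long inputs keep their first max_len samples. Short inputs are
    repeated end to end and then cut (assumed upstream behaviour).
    """
    n = len(samples)
    if n == 0:
        raise ValueError("cannot window an empty waveform")
    if n >= max_len:
        return list(samples[:max_len])
    repeats = max_len // n + 1  # enough copies to cover max_len
    return (list(samples) * repeats)[:max_len]


# Tiny illustration with a 7-sample window instead of 64600.
print(fixed_window([0.1, 0.2, 0.3], max_len=7))
# [0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1]
```

Since no random offset is involved, the same file always yields the same window and therefore the same score.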

from aasist_l import AASIST_L

m = AASIST_L()
m.load()  # load the pretrained checkpoint
# wav: float32 mono waveform sampled at 16 kHz
scores = m.score_batch([wav], [16000])  # higher = more bona fide
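
Since the wrapper does not resample, it can help to validate files before scoring. The sketch below uses only the Python standard library to read a 16-bit PCM WAV, check that it is mono at 16 kHz, and scale it to floats in [-1, 1). load_mono_16k is a hypothetical helper, the scaling convention is an assumption, and the container type score_batch expects (list, NumPy array, or tensor) is not stated here, so convert the result as needed.

```python
import array
import sys
import wave


def load_mono_16k(path):
    """Read a 16-bit PCM mono 16 kHz WAV file as floats in [-1, 1)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wf.getframerate() != 16000:
            raise ValueError("expected a 16 kHz sample rate")
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        frames = wf.readframes(wf.getnframes())

    pcm = array.array("h")  # signed 16-bit integers
    pcm.frombytes(frames)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return [sample / 32768.0 for sample in pcm]
```

Files that fail these checks would need to be downmixed or resampled with a separate tool before being passed to the wrapper.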

Dataset	EER %	n_trials
ASVspoof2019_LA (in-domain)	0.99	71,237
ASVspoof2021_LA	13.15	181,566
ASVspoof2021_DF	15.96	611,829
InTheWild	44.45	31,779
CD-ADD	50.72	20,786

The in-domain ASVspoof2019 LA result (0.99% EER) reproduces the EER the paper reports for AASIST-L. AASIST-L matches the full AASIST closely with ~3.5× fewer parameters (~85k vs. ~298k).
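
For reference, an EER like those in the table can be estimated from per-trial scores with a simple threshold sweep. This is a minimal sketch and not the scoring script behind the table; compute_eer and the toy scores are illustrative, and official ASVspoof tooling may pick or interpolate the operating point differently.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Estimate the equal error rate for scores where higher = bona fide.

    A trial is accepted when its score is >= the threshold. The EER is
    taken where the false-rejection rate (bona fide rejected) and the
    false-acceptance rate (spoof accepted) are closest.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("need at least one score of each class")

    best_gap = float("inf")
    eer = None
    for t in sorted(set(bonafide_scores) | set(spoof_scores)):
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Toy example: one spoof trial scores above one bona-fide trial.
print(compute_eer([2.0, 3.0, 4.0, 5.0], [-1.0, 0.0, 1.0, 2.5]))  # 0.25
```

Multiply the result by 100 to get a percentage. The sweep is quadratic in the number of trials, so for evaluation sets of the sizes listed above a sorted, cumulative-count version would be preferable.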

License

MIT (inherited from clovaai/aasist; see LICENSE).

Maintainer

Maintained by Kirill Borodin (SpeechAntiSpoofingBenchmarks).

Email: kborodin.research@gmail.com
Telegram: @korallll_ai

