CAM++

Speaker Verification & Diarization: identify and distinguish speakers in audio.

CAM++ is a speaker embedding model used for speaker verification (is this the same person?) and speaker diarization (who spoke when?).

Quick Start

from funasr import AutoModel

# Speaker diarization as part of ASR pipeline
model = AutoModel(
    model="funasr/paraformer-zh",
    hub="hf",
    vad_model="funasr/fsmn-vad",
    punc_model="funasr/ct-punc",
    spk_model="funasr/campplus",
    device="cuda",
)
result = model.generate(input="meeting.wav")
# Each sentence has a speaker label
for sentence in result[0]["sentence_info"]:
    print(f"[Speaker {sentence['spk']}] {sentence['text']}")
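The per-sentence speaker labels can be turned into a "who spoke when" view by merging consecutive sentences from the same speaker. The sketch below is a minimal illustration, not part of FunASR: merge_speaker_turns is a hypothetical helper, and it assumes each sentence_info entry also carries start and end times (taken here to be milliseconds) alongside spk and text. The sample data is made up for the example.

```python
from typing import Dict, List


def merge_speaker_turns(sentence_info: List[Dict]) -> List[Dict]:
    """Merge consecutive sentences that share a speaker label into one turn.

    Hypothetical helper. Assumes each entry has "spk", "text", "start", "end".
    """
    turns: List[Dict] = []
    for sentence in sentence_info:
        if turns and turns[-1]["spk"] == sentence["spk"]:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["end"] = sentence["end"]
            turns[-1]["text"] += " " + sentence["text"]
        else:
            # Speaker changed (or first sentence): start a new turn.
            turns.append(
                {
                    "spk": sentence["spk"],
                    "start": sentence["start"],
                    "end": sentence["end"],
                    "text": sentence["text"],
                }
            )
    return turns


# Made-up stand-in for result[0]["sentence_info"]; times assumed to be in ms.
example_sentences = [
    {"spk": 0, "start": 0, "end": 1200, "text": "Hello."},
    {"spk": 0, "start": 1300, "end": 2500, "text": "Shall we begin?"},
    {"spk": 1, "start": 2600, "end": 3900, "text": "Yes, go ahead."},
]

turns = merge_speaker_turns(example_sentences)
for turn in turns:
    print(f"[Speaker {turn['spk']}] {turn['start']}-{turn['end']} ms: {turn['text']}")
```

With real output, pass result[0]["sentence_info"] in place of example_sentences, after checking that the timing keys are present in your FunASR version.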

Features

  • Speaker embedding extraction
  • Speaker verification (same/different speaker)
  • Speaker diarization (multi-speaker segmentation)
  • Works with FunASR pipeline via spk_model parameter
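Speaker verification with embeddings usually comes down to comparing two vectors. One common approach, shown here as a sketch rather than as this model's official scoring method, is cosine similarity with a decision threshold. The function names and the 0.5 threshold are assumptions for illustration; a real threshold should be tuned on held-out data. The short toy vectors stand in for the 192-dimensional embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(
    emb_a: Sequence[float], emb_b: Sequence[float], threshold: float = 0.5
) -> bool:
    """Accept the pair as one speaker if similarity reaches the threshold.

    The default threshold is an illustrative assumption, not a tuned value.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Toy vectors standing in for real speaker embeddings.
enrolled = [0.2, 0.9, -0.1, 0.4]
probe_close = [0.25, 0.85, -0.05, 0.35]
probe_far = [-0.9, 0.1, 0.8, -0.2]

print(same_speaker(enrolled, probe_close))
print(same_speaker(enrolled, probe_far))
```

Cosine similarity ignores vector length, so the decision depends only on the direction of the embeddings.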

Model Details

Property       | Value
Architecture   | CAM++ (Context-Aware Masking)
Embedding Dim  | 192
Sample Rate    | 16 kHz
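Because the model expects 16 kHz audio, it can help to check a file's sample rate before inference. The sketch below uses only Python's standard wave module. The helper names are hypothetical, and whether the FunASR pipeline resamples input on its own is not stated here, so treat this as a precaution rather than a requirement. It writes two short silent WAV files to a temporary directory so that it runs on its own.

```python
import os
import tempfile
import wave

EXPECTED_RATE = 16000  # sample rate listed under Model Details


def wav_sample_rate(path: str) -> int:
    """Return the sample rate (Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path: str, expected_rate: int = EXPECTED_RATE) -> bool:
    """True if the file's sample rate differs from the expected rate."""
    return wav_sample_rate(path) != expected_rate


def write_silence(path: str, rate: int, seconds: float = 0.1) -> None:
    """Write a short mono, 16-bit silent WAV file (demo data only)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(rate)
        wav_file.writeframes(b"\x00\x00" * int(rate * seconds))


with tempfile.TemporaryDirectory() as tmp_dir:
    ok_path = os.path.join(tmp_dir, "ok_16k.wav")
    other_path = os.path.join(tmp_dir, "other_44k.wav")
    write_silence(ok_path, 16000)
    write_silence(other_path, 44100)

    ok_needs_resampling = needs_resampling(ok_path)
    other_needs_resampling = needs_resampling(other_path)

print(ok_needs_resampling, other_needs_resampling)
```

The wave module reads uncompressed WAV only; other formats would need a different reader.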

Links
