How to use wolfnuker/leaf-embed-beir with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("feature-extraction", model="wolfnuker/leaf-embed-beir")
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("wolfnuker/leaf-embed-beir", dtype="auto")
How to use wolfnuker/leaf-embed-beir with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("wolfnuker/leaf-embed-beir")
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
A text embedding model trained with LEAF (Lightweight Embedding Alignment Framework) distillation, aiming for competitive performance on the BEIR benchmark.
This model was created by distilling knowledge from Snowflake/snowflake-arctic-embed-m-v1.5 (teacher) into a smaller, more efficient student architecture.
| Component | Details |
|---|---|
| Encoder | 8-layer BERT with 512 hidden size |
| Attention Heads | 8 |
| Output Dimension | 768 |
| Parameters | ~65M (vs 109M teacher) |
| Pooling | Mean pooling |
Teacher model: Snowflake/snowflake-arctic-embed-m-v1.5
import torch
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("wolfnuker/leaf-embed-beir")
model = AutoModel.from_pretrained("wolfnuker/leaf-embed-beir")
def mean_pooling(model_output, attention_mask):
    # Average token embeddings, counting only non-padding positions
    token_embeddings = model_output.last_hidden_state
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
# Example usage
sentences = ["This is an example sentence", "Each sentence is converted to a vector"]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**encoded)
embeddings = mean_pooling(outputs, encoded["attention_mask"])
embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
print(embeddings.shape) # [2, 768]
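To make the masking in `mean_pooling` concrete, here is a minimal pure-Python sketch of the same arithmetic on a single toy sequence. The helper name `masked_mean` and the toy values are illustrative assumptions, not part of the model's code.

```python
def masked_mean(token_vectors, attention_mask):
    """Average the token vectors whose mask is 1; padding (mask 0) is ignored."""
    dim = len(token_vectors[0])
    totals = [0.0] * dim
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            for i, value in enumerate(vec):
                totals[i] += value
    # Same guard as torch.clamp(..., min=1e-9): avoids division by zero
    count = max(sum(attention_mask), 1e-9)
    return [t / count for t in totals]

# Toy sequence: two real tokens followed by one padding token
tokens = [[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = masked_mean(tokens, mask)
print(pooled)  # [2.0, 4.0]
```

The padding vector has no effect on the result, which is why the attention mask is passed alongside the model output.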
Alternatively, with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("wolfnuker/leaf-embed-beir")
embeddings = model.encode(["This is an example sentence", "Each sentence is converted"])
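For retrieval, embeddings like these are typically compared by cosine similarity. The sketch below ranks documents against a query using hand-written toy vectors in place of real `model.encode` output; the function names and vectors are hypothetical and only illustrate the scoring step.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def rank_by_cosine(query_vec, doc_vecs):
    """Return (doc_index, cosine_score) pairs, best match first."""
    q = l2_normalize(query_vec)
    scores = []
    for idx, doc in enumerate(doc_vecs):
        d = l2_normalize(doc)
        scores.append((idx, sum(a * b for a, b in zip(q, d))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional vectors standing in for 768-dimensional embeddings
query = [1.0, 0.0, 1.0]
docs = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_by_cosine(query, docs)
print([idx for idx, _ in ranking])  # [1, 2, 0]
```

With real embeddings, the lists would be replaced by rows of the arrays returned from `model.encode`.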
| Dataset | NDCG@10 |
|---|---|
| NFCorpus | 0.0896 |
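For readers unfamiliar with the metric, the following is a minimal sketch of NDCG@10 for a single query, assuming linear gain (the relevance label itself) and a log2 rank discount. Some definitions use an exponential gain instead, and the relevance labels here are made up for illustration.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k relevance labels."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, all_relevances, k=10):
    """DCG of the system ranking divided by the DCG of an ideal ranking."""
    ideal = dcg_at_k(sorted(all_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical query: labels of the retrieved documents, in ranked order
ranked = [0, 2, 0, 1]
# All judged relevance labels for the same query
judged = [2, 1]
score = ndcg_at_k(ranked, judged)
print(round(score, 2))  # about 0.64
```

A benchmark score such as the one in the table is the mean of this per-query value over all test queries.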
Note: This is an initial baseline model. Performance will improve with:
| Parameter | Value |
|---|---|
| Learning Rate | 2e-5 → 2e-8 (cosine decay) |
| Batch Size | 320 (64 × 5 gradient accumulation) |
| Warmup Ratio | 10% |
| Mixed Precision | FP16 |
| Max Sequence Length | 256 |
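The learning-rate row can be read as a warmup phase followed by cosine decay. The sketch below is one plausible interpretation, assuming linear warmup from zero over the first 10% of steps and cosine decay from 2e-5 down to a floor of 2e-8; the function name and the total step count are hypothetical, and the actual training script may differ.

```python
import math

PEAK_LR = 2e-5       # from the table
FINAL_LR = 2e-8      # from the table
WARMUP_RATIO = 0.10  # from the table

def lr_at(step, total_steps):
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Assumption: linear warmup starting from zero
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return FINAL_LR + (PEAK_LR - FINAL_LR) * cosine

# Effective batch size: 64 samples per step, accumulated over 5 steps
effective_batch = 64 * 5
print(effective_batch)  # 320

total = 1000  # hypothetical number of optimizer steps
for step in (0, 100, 550, 1000):
    print(step, lr_at(step, total))
```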
LEAF uses L2 loss on normalized embeddings:
L = MSE(normalize(student_emb), normalize(teacher_emb))
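A minimal pure-Python sketch of this loss follows, assuming the MSE is averaged over every element of the batch. The function names are illustrative; a real training loop would compute the same quantity on tensor batches.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def leaf_style_loss(student_batch, teacher_batch):
    """MSE between L2-normalized student and teacher embeddings."""
    total, count = 0.0, 0
    for s, t in zip(student_batch, teacher_batch):
        s_n, t_n = l2_normalize(s), l2_normalize(t)
        total += sum((a - b) ** 2 for a, b in zip(s_n, t_n))
        count += len(s_n)
    return total / count

# Same direction, different magnitude: normalization removes the gap
print(leaf_style_loss([[3.0, 4.0]], [[6.0, 8.0]]))  # 0.0
# Orthogonal directions
print(leaf_style_loss([[1.0, 0.0]], [[0.0, 1.0]]))  # 1.0
```

Because both vectors have unit length, the squared distance between them equals 2 - 2·cos(student, teacher), so minimizing this loss pushes the student's cosine similarity to the teacher toward 1.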
If you use this model, please cite:
@misc{leaf-embed-beir,
  author = {RankSaga},
  title = {LEAF Embed BEIR: Text Embeddings via Distillation},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/wolfnuker/leaf-embed-beir}
}
License: Apache 2.0