WECHSEL-XLM-R-Dense — EViRAL v6

Cross-lingual dense retrieval model: Ede (Rhade) query → Vietnamese passage.
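As a rough illustration of what dense retrieval means here: the query and every passage are each mapped to one vector, and passages are ranked by vector similarity to the query. The sketch below uses toy NumPy vectors in place of real encoder outputs. The function name rank_passages and the choice of cosine similarity are assumptions for illustration; the scoring function this model was trained with is not stated in this card.

```python
import numpy as np

def rank_passages(query_vec, passage_vecs, top_k=3):
    """Return (indices, scores) of the top_k passages by cosine similarity."""
    # L2-normalise so that the dot product equals cosine similarity
    q = query_vec / np.linalg.norm(query_vec)
    p = passage_vecs / np.linalg.norm(passage_vecs, axis=1, keepdims=True)
    scores = p @ q
    # Sort passages from most to least similar and keep the first top_k
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

# Toy vectors standing in for an Ede query embedding and Vietnamese passage embeddings
query = np.array([1.0, 0.0, 0.0])
passages = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [2.0, 0.0, 0.0],   # same direction as the query
    [1.0, 1.0, 0.0],   # 45 degrees from the query
])
top_idx, top_scores = rank_passages(query, passages, top_k=2)
print(top_idx, top_scores)
```

With real data, query_vec and passage_vecs would come from the loaded encoder, and a large passage collection would normally be served from an approximate nearest-neighbour index rather than a full matrix product.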

How to load for continued fine-tuning

from huggingface_hub import hf_hub_download
import json
import numpy as np
import torch

REPO_ID = 'NIRVLab/ede-xlm-roberta-base'

def fetch(filename):
    # Download one file from the model repo (or reuse the locally cached copy)
    return hf_hub_download(REPO_ID, filename)

with open(fetch('vocab.json'), encoding='utf-8') as f:
    vocab = json.load(f)
with open(fetch('tokenizer_config.json'), encoding='utf-8') as f:
    tok_cfg = json.load(f)
wechsel_np = np.load(fetch('wechsel_embeddings.npy'))
state_dict = torch.load(fetch('align.pt'), map_location='cpu')

# Rebuild the encoder with the same code as the training notebook.
# make_encoder is defined there, not here; it also uses vocab, VOCAB_SIZE, etc. from the notebook.
encoder = make_encoder(wechsel_np)
encoder.load_state_dict(state_dict)

Training details

Backbone: xlm-roberta-base
WECHSEL embedding initialization: k=10 nearest neighbors, temperature τ=0.1
Bilingual dictionary: NIRVLab/rhade-vietnamese-mt
Pipeline: MLM (3 epochs) → cross-lingual alignment (2 epochs)
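To make the k and τ settings above concrete, here is a minimal sketch of a WECHSEL-style embedding initialization: each target-language token embedding is a softmax-weighted average of the embeddings of its k most similar source-language tokens, with τ as the softmax temperature. The function name wechsel_init and the toy inputs are hypothetical, and the similarity matrix is taken as given. In WECHSEL it is derived from static word embeddings aligned with a bilingual dictionary, a step not shown here. Whether the notebook for this model follows exactly these steps is an assumption.

```python
import numpy as np

def wechsel_init(source_emb, sim, k=10, temperature=0.1):
    """Initialise target token embeddings from source token embeddings.

    source_emb: (n_src, dim) embeddings of the source-language tokens.
    sim:        (n_tgt, n_src) similarity of each target token to each source token.
    Returns an (n_tgt, dim) array of target token embeddings.
    """
    k = min(k, sim.shape[1])
    # Indices of the k most similar source tokens for every target token
    nn = np.argpartition(-sim, k - 1, axis=1)[:, :k]
    nn_sim = np.take_along_axis(sim, nn, axis=1)
    # Temperature-scaled softmax over the k neighbours (max subtracted for stability)
    logits = nn_sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Weighted average of the neighbours' embeddings
    return np.einsum('tk,tkd->td', weights, source_emb[nn])

# Toy example: 3 source tokens with one-hot embeddings, 1 target token
toy_source = np.eye(3)
toy_sim = np.array([[1.0, 0.0, -1.0]])
toy_out = wechsel_init(toy_source, toy_sim, k=2, temperature=0.1)
print(toy_out)
```

A low temperature such as 0.1 concentrates almost all of the weight on the single closest source token, so in the toy example the result sits very near the first source embedding while the third, which is outside the top 2, contributes nothing.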
