comma-project/modernbert-sentembeddings

Sentence Similarity
sentence-transformers
Safetensors
modernbert
feature-extraction
dense
Generated from Trainer
dataset_size:99840
loss:MultipleNegativesRankingLoss
text-embeddings-inference

Instructions to use comma-project/modernbert-sentembeddings with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

  • Libraries
  • sentence-transformers

    How to use comma-project/modernbert-sentembeddings with sentence-transformers:

    from sentence_transformers import SentenceTransformer
    
    model = SentenceTransformer("comma-project/modernbert-sentembeddings")
    
    sentences = [
        "na. es. sui .s. siqs aut̃. It nrͣ hodie ꝙ ea demũ sit ma. ⁊ a. g̃iadħ de ni ma. ⁊ a. siadħ de nolib ubis ⁊ młrib e Uia ex ꝯcubinis filu nascũt᷑ uales ⁊ te nalib faluꝰ usdeimꝰ ⁊ de mr̃ib eoꝵ .s. qui dicãt᷑ nales. ⁊ ꝙͣtũ pocut eiꝰ relinqͥ. ł int̾ iuiuoꝵ. ⁊ ĩ ucima nol̃tate quoqͣ.tdari ⁊ postea ꝓseqũ teꝰ denorẽ ice. dicemꝰ quib ex cai lb̾i uales fi ant sus .i. redigãt᷑ in potatẽ ꝑentũ. ⁊ de h tͣctatu",
        "na. es. sui .s. siqs aut̃. It nrͣ hodie ꝙ ea demũ sit ma. ⁊ a. g̃iadħ de ni ma. ⁊ a. siadħ de nolib ubis ⁊ młrib e Uia ex ꝯcubinis filu nascũt᷑ uales ⁊ te nalib faluꝰ usdeimꝰ ⁊ de mr̃ib eoꝵ .s. qui dicãt᷑ nales. ⁊ ꝙͣtũ pocut eiꝰ relinqͥ. ł int̾ iuiuoꝵ. ⁊ ĩ ucima nol̃tate quoqͣ.tdari ⁊ postea ꝓseqũ teꝰ denorẽ ice. dicemꝰ quib ex cai lb̾i uales fi ant sus .i. redigãt᷑ in potatẽ ꝑentũ. ⁊ de h tͣctatu",
        "illius excubaret: ibidem ꝓ fide xp̃i aꝑsecutorib tradita est qi cum digna & eumenia: & eupe Ciuitate falare: passio scõtu graciliani. & felicissime iurg nis. Quoꝵ ora ꝓxp̃o contusi lapidib. dehinc gladio ꝑcusi optatam martytii suscepert̃ palmam. idus augusti.",
        "Et nos Poncius Ugo, Dei gracia Impuriarum comes predictus, promitimus vobis Raimundo Xetmario, nomine dicte domne Marchisie, predictam forciam deffendere ab omni homine qui a te directum accipere noluerit vel facere. Assigno etiam vobis et dono in feudum, in esmendam dicti careu, dictos V squillatas milii, annuatim accipiendas in festo Omnium Sanctorum, in omnibus nostris directis et taschis quas accipimus in stagno de Cils. Actum est hoc VII kalendas novembris anno Domini MºCCºLXXº octavo. Sig(+)num Raimundi Xetmarii predicti, qui hoc firmo et laudo. Sig(+)num Ponci Ugonis, Dei gracia comis Impuriarum predicti, qui hoc firmamus et laudamus. Testes huius rei sunt: Bernardus de Palaciolo de Villanova, et Berengarius de Lanciano, et Guilelmus Alferici et Simon de Trilia, milites."
    ]
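    # Encode each text into a dense embedding vector
    embeddings = model.encode(sentences)
    
    # Pairwise similarity scores between all texts
    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # torch.Size([4, 4])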
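The similarity matrix from the snippet above can be used to find which texts are closest to a given one. The following is a minimal sketch, not part of the model card: `rank_neighbors` is a hypothetical helper, and it assumes the matrix has first been converted to nested Python lists (for example with `similarities.tolist()`). A small hand-written matrix stands in for real model output so the example runs without downloading the model.

```python
def rank_neighbors(similarities, query_index):
    """Return (index, score) pairs for every other text, most similar first.

    `similarities` is assumed to be a square nested list of floats,
    e.g. the result of `similarities.tolist()` on the matrix above.
    """
    scores = similarities[query_index]
    # Skip the query itself, which always scores highest against itself.
    others = [(i, float(s)) for i, s in enumerate(scores) if i != query_index]
    return sorted(others, key=lambda pair: pair[1], reverse=True)


# Toy 3x3 matrix standing in for real model output (values are made up).
toy = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]

ranked = rank_neighbors(toy, query_index=0)
print(ranked)
```

With the four example texts, the first two are identical, so each should appear as the other's top-ranked neighbor.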
modernbert-sentembeddings
536 MB
  • 1 contributor
History: 2 commits
ponteineptique
Upload 10 files
2f67c47 verified 8 months ago
  • 1_Pooling: Upload 10 files, 8 months ago
  • .gitattributes (1.52 kB): initial commit, 8 months ago
  • README.md (30.9 kB): Upload 10 files, 8 months ago
  • config.json (1.09 kB): Upload 10 files, 8 months ago
  • config_sentence_transformers.json (283 Bytes): Upload 10 files, 8 months ago
  • model.safetensors (534 MB, xet): Upload 10 files, 8 months ago
  • modules.json (229 Bytes): Upload 10 files, 8 months ago
  • sentence_bert_config.json (58 Bytes): Upload 10 files, 8 months ago
  • special_tokens_map.json (971 Bytes): Upload 10 files, 8 months ago
  • tokenizer.json (2.1 MB): Upload 10 files, 8 months ago
  • tokenizer_config.json (1.53 kB): Upload 10 files, 8 months ago