Fine-tuning and multilingual capability

by adrien-alloreview - opened Dec 23, 2022

Dec 23, 2022

First, I find this embedding model very interesting.
I've always been frustrated that it was not possible to "explain" to an embedding model what the embedding is for. Thanks to your work, it now is.
Do you plan to make this model multilingual? And how could it be fine-tuned to be multilingual, or fine-tuned for a more specific task?

Thanks in advance

multi-train

NLP Group of The University of Hong Kong org Dec 28, 2022

Thank you very much for your interest!

We are considering making this model multilingual. Fine-tuning the model on a more specific task is straightforward: prepare your data in the format described at https://github.com/HKUNLP/instructor-embedding#training, store it as a JSON file named medi-data.json, then follow the README at https://github.com/HKUNLP/instructor-embedding#train-instructor to train the model.
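For anyone following along, here is a rough sketch of the data preparation step in Python. It is not taken from the repository: the helper names `make_example` and `write_medi_file` are hypothetical, the example texts are made up, and the field names (`query`, `pos`, `neg`, `task_id`, each text paired with its instruction) are an assumption about the format, so please check them against the README section linked above before training.

```python
import json
from pathlib import Path


def make_example(query_instruction, query, pos_instruction, pos,
                 neg_instruction, neg, task_id):
    """Build one training instance (hypothetical helper).

    ASSUMPTION: the field names and the [instruction, text] pairing are
    a guess at the format; verify them against the repository README.
    """
    return {
        "query": [query_instruction, query],
        "pos": [pos_instruction, pos],    # text that should match the query
        "neg": [neg_instruction, neg],    # text that should not match
        "task_id": task_id,               # ASSUMPTION: integer task label
    }


def write_medi_file(examples, out_dir="."):
    """Write the examples as a JSON list to <out_dir>/medi-data.json."""
    path = Path(out_dir) / "medi-data.json"
    with path.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-English text readable in the file
        json.dump(examples, f, ensure_ascii=False, indent=2)
    return path


# Made-up example data for illustration only.
examples = [
    make_example(
        "Represent the support question for retrieving relevant articles;",
        "How do I reset my password?",
        "Represent the help article for retrieval;",
        "To reset your password, open Settings and choose 'Reset password'.",
        "Represent the help article for retrieval;",
        "Invoices are sent by email at the start of each month.",
        task_id=0,
    ),
]

path = write_medi_file(examples)
print(f"Wrote {len(examples)} example(s) to {path}")
```

Once the file exists, training itself is whatever the README's train-instructor section describes; this sketch only covers writing the file.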

If you encounter any problem, feel free to leave your question here or contact me at hjsu@cs.hku.hk!
