BHAAV- A Text Corpus for Emotion Analysis from Hindi Stories (arXiv:1910.04073)
How to use krinal/BertWordPieceTokenizer-hi with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("token-classification", model="krinal/BertWordPieceTokenizer-hi")
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("krinal/BertWordPieceTokenizer-hi", dtype="auto")

# Load the tokenizer
from transformers import AutoTokenizer
hi_tokenizer = AutoTokenizer.from_pretrained('krinal/BertWordPieceTokenizer-hi')
hi_str = "आज का सूर्य देखो, कितना प्यारा, कितना शीतल है"
# encode text
encoded_str = hi_tokenizer.encode(hi_str)
# decode text
decoded_str = hi_tokenizer.decode(encoded_str)
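
To give a rough intuition for what a WordPiece tokenizer does when `encode` splits a word into subword pieces, here is a minimal sketch of greedy longest-match-first matching. It is an illustration only: the function `wordpiece_tokenize` and the vocabulary `toy_vocab` are hypothetical and are not taken from krinal/BertWordPieceTokenizer-hi, whose real vocabulary, normalization, and special tokens may differ.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", prefix="##"):
    """Greedy longest-match-first WordPiece split of a single word."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate substring until it is found in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the "##" prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word maps to unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary (assumption), NOT the model's real vocabulary.
toy_vocab = {"कित", "##ना", "प्या", "##रा", "है"}

tokens = [wordpiece_tokenize(w, toy_vocab) for w in "कितना प्यारा है".split()]
print(tokens)
```

With a real tokenizer, the list of pieces for a string can be inspected with `hi_tokenizer.tokenize(hi_str)`, and `hi_tokenizer.convert_ids_to_tokens(encoded_str)` shows the pieces behind the encoded IDs.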
@article{kumar2019bhaav,
title={BHAAV-A Text Corpus for Emotion Analysis from Hindi Stories},
author={Kumar, Yaman and Mahata, Debanjan and Aggarwal, Sagar and Chugh, Anmol and Maheshwari, Rajat and Shah, Rajiv Ratn},
journal={arXiv preprint arXiv:1910.04073},
year={2019}
}