dalat5 / src / train_tokeniser.py

Commit History

Refined (v5.3) update
3cf1937

crossroderick committed on

Refined (v5.1) update
33f8089

crossroderick committed on

Pre-v5 update for the tokeniser (training date pushed to the 25th)
794cf97

crossroderick committed on

Removed unnecessary imports
8dc2b55

crossroderick committed on

Removed NFD and StripAccents from the tokeniser training process
f93a822

crossroderick committed on

Addition of a new tokeniser (pre-v5)
178501c

crossroderick committed on