| | --- |
| | library_name: transformers |
| | license: cc-by-nc-sa-4.0 |
| | base_model: utter-project/mHuBERT-147 |
| | tags: |
| | - generated_from_trainer |
| | datasets: |
| | - asierhv/composite_corpus_eu_v2.1 |
| | language: |
| | - eu |
| | metrics: |
| | - wer |
| | - cer |
| | model-index: |
| | - name: hubert_for_basque |
| | results: [] |
| | --- |
| | |
| | <!-- This model card has been generated automatically according to the information the Trainer had access to. You |
| | should probably proofread and complete it, then remove this comment. --> |
| |
|
| | # hubert_for_basque |
| |
|
| | This model is a fine-tuned version of [utter-project/mHuBERT-147](https://huggingface.co/utter-project/mHuBERT-147) on the composite_corpus_eu_v2.1 dataset. |
| | |
| | ## Training procedure |
| | |
| | All the training and evaluation code is on https://github.com/ansuehu/mHubert-basque-ASR |
| | |
| | ### Training hyperparameters |
| | |
| | The following hyperparameters were used during training: |
| | - learning_rate: 0.0001 |
| | - train_batch_size: 64 |
| | - eval_batch_size: 8 |
| | - seed: 42 |
| | - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments |
| | - lr_scheduler_type: linear |
| | - lr_scheduler_warmup_steps: 1000 |
| | - num_epochs: 24 |
| | - mixed_precision_training: Native AMP |
| |
|
| | ### Framework versions |
| |
|
| | - Transformers 4.48.3 |
| | - Pytorch 2.5.1+cu124 |
| | - Datasets 3.3.2 |
| | - Tokenizers 0.21.0 |
| |
|
| | ### Sample predictions: |
| |
|
| | Test CV WER: 0.074 |
| |
|
| | Test CV CER: 0.013 |
| |
|
| | Sample predictions: |
| |
|
| | - Reference: honek garrantzi handia zuen ehun urteko gerran |
| | - Prediction: honek garrantzi handia zuen eun urteko gerran |
| |
|
| | - Reference: osasuna aurkari zuzena da eta beraz puntuek balio bikoitza dute |
| | - Prediction: osasuna aurkari zuzena da eta beraz puntuek balio bikoitza dute |
| |
|
| | - Reference: irungo familia boteretsu bat da olazabal familia |
| | - Prediction: irungo familia boteretsu bat da olazabal familia |
| |
|
| | - Reference: hezkuntzak prestatu zituen probak pisa eta antzekoak eredu |
| | - Prediction: hezkuntzak prestatu zituen probak pisa eta antzekoak eredu |
| |
|
| | - Reference: bestalde botilek abangoardiako diseinu orijinalak dituzte |
| | - Prediction: bestalde botillek abanbardiako diseinu originalak dituzte |
| |
|
| | -------------- |
| |
|
| | Test Parl WER: 0.068 |
| |
|
| | Test Parl CER: 0.018 |
| |
|
| | Sample predictions: |
| |
|
| | - Reference: por iñigo cabacas eskerrik asko eskerrik asko |
| | - Prediction: por inigo cabacas eskerrik asko eskerrik asko |
| |
|
| | - Reference: eta ikusita obra hau hamar urteetan bueltaka ibili dela eta ikusten da zaharkitutako |
| | - Prediction: eta ikusita obra hau hamar urteetan bueltaka ibili dela eta ikusten da zaharkitutako |
| |
|
| | - Reference: dena legearen garapen zuzena oztopatzeko helburuarekin ez dut nik esango ez eskatzaile guztiek |
| | - Prediction: dena legearen garapen zuzena oztopatzeko helburuarekin ez dut nik esango ez eskatzaile guztiek |
| |
|
| | - Reference: eginda da eginikoa da ea gaurko adostasunak |
| | - Prediction: eginda da eginekoa da ea gaurko adostasunak |
| |
|
| | - Reference: kontatu gabe eta udalen ordezkarien izenean izena joan gabe |
| | - Prediction: kontatu gabe eta udalen ordezkarien izenea izenean joan gabe |
| |
|
| | -------------- |
| |
|
| | Test OSLR WER: 0.204 |
| |
|
| | Test OSLR CER: 0.042 |
| |
|
| | Sample predictions: |
| | - Reference: new yorkeko aireportuan eskala egin genuen kaliforniara bidean |
| | - Prediction: new yyorkeko aireportua neskala egin genuen kaliforniara bidean |
| |
|
| | - Reference: janet jackson michael jackson abeslari ospetsuaren arreba da |
| | - Prediction: janez jason mikel jaxon abeslari ospetsuaren arreba da |
| |
|
| | - Reference: londreseko heathrow aireportua munduko handienetarikoena da |
| | - Prediction: londreseko hitrow aireportua munduko handienetarikoa da |
| |
|
| | - Reference: hamabietan izango da txupinazoa eta udaletxeko balkoitik botako dute urtero bezala |
| | - Prediction: hamabitan izango da txupinasoa eta udaletxeko palkoitik botako dute urtero bezala |
| |
|
| | - Reference: motorolaren telefono berria erostekotan nabil |
| | - Prediction: motrolaren telefono berria erostekotan nabil |
| |
|
| | ## How to use |
| |
|
| | ```python |
| | from transformers import AutoProcessor, AutoModelForCTC |
| | import torch |
| | from datasets import load_dataset |
| | |
| | # Load model and processor |
| | processor = AutoProcessor.from_pretrained("Ansu/mHubert_basque_ASR") |
| | model = AutoModelForCTC.from_pretrained("Ansu/mHubert_basque_ASR") |
| | |
| | # Load audio from dataset |
| | ds = load_dataset("asierhv/composite_corpus_eu_v2.1", split="test") |
| | audio_input = ds[0]["audio"] |
| | |
| | #Load audio from local file |
| | audio = AudioSegment.from_file('path/to/audio') |
| | audio = audio.set_frame_rate(16000) # Set frame rate to 16kHz |
| | |
| | # Convert to raw PCM audio data |
| | # Create a BytesIO object to simulate an in-memory file |
| | with io.BytesIO() as wav_file: |
| | # Export the audio to the in-memory file |
| | audio.export(wav_file, format='wav') |
| | # Seek to the beginning of the file before reading |
| | wav_file.seek(0) |
| | # Read the audio data as a NumPy array |
| | audio_input = wavfile.read(wav_file)[1] # read data from wave file |
| | |
| | # Process audio |
| | inputs = processor(audio_input, sampling_rate=16000, return_tensors="pt") |
| | with torch.no_grad(): |
| | logits = model(**inputs).logits |
| | |
| | # Decode output |
| | predicted_ids = torch.argmax(logits, dim=-1) |
| | transcription = processor.batch_decode(predicted_ids) |
| | print(transcription[0]) |
| | ``` |