OpenMOSS-Team/MOSS-Audio-Tokenizer


fdugyt committed on Feb 8

Commit e4e8324 · verified · 1 parent: 4a52ecf

Update modeling_moss_audio_tokenizer.py

Files changed (1)

modeling_moss_audio_tokenizer.py +1 -1

@@ -941,7 +941,7 @@ class MossAudioTokenizerPatchedPretransform(nn.Module):
         x = x.reshape(b, d, -1, h).permute(0, 1, 3, 2).reshape(b, d * h, -1)
         # We pad the input waveform to a multiple of `downsample_rate` before applying the encoder.
         # Use a ceil division to match that padding and avoid dropping the last (partially padded) frame.
-        output_lengths = (input_lengths + self.patch_size - 1) // self.patch_size
+        output_lengths = input_lengths // self.patch_size
         return x, output_lengths
     def decode(self, x, input_lengths):
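
The changed line replaces a ceiling division with a floor division when computing `output_lengths`, while the unchanged context comments still describe a ceil division. The sketch below is not taken from the repository. It only illustrates how the two expressions differ on plain integers, using the hypothetical helper names `ceil_frames` and `floor_frames` and an assumed `patch_size` of 4. The two agree whenever the input length is a multiple of `patch_size` and otherwise differ by exactly one frame.

```python
def ceil_frames(input_length: int, patch_size: int) -> int:
    """Removed expression: counts a trailing partial patch as a full frame."""
    return (input_length + patch_size - 1) // patch_size


def floor_frames(input_length: int, patch_size: int) -> int:
    """Added expression: counts only complete patches."""
    return input_length // patch_size


if __name__ == "__main__":
    patch_size = 4  # assumed value, for illustration only
    for n in (8, 9, 11, 12):
        print(n, ceil_frames(n, patch_size), floor_frames(n, patch_size))
```

Which variant is correct depends on whether `input_lengths` is measured before or after the waveform is padded to a multiple of the patch size, which this hunk alone does not show.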