I trained a custom model for some commands in Slovak and wanted to improve it with a user-specific verifier, but I ran into something that was unintuitive to me.
The model is trained by default on 3-second recordings. At a sample rate of 16000 Hz that means 48000 samples in the time dimension, which is converted into a feature of shape (28, 96). But the `predict_clip` function in the "Evaluate model" section, and later `train_custom_verifier`, work on just 80 ms recordings (chunks), and it actually works quite well.
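For reference, this is roughly how I am calling it. A minimal sketch assuming openWakeWord's `Model` API; the model filename is a placeholder:

```python
import numpy as np
from openwakeword.model import Model

# Load the custom Slovak command model (placeholder path)
model = Model(wakeword_models=["my_slovak_command.onnx"])

# predict_clip takes a whole recording (16-bit, 16 kHz WAV path or array)
# but internally steps through it in 1280-sample (80 ms) chunks.
predictions = model.predict_clip("positive_example.wav")
for frame_scores in predictions:
    print(frame_scores)  # one score dict per 80 ms frame
```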
How can this work technically if the model is trained on 3 s inputs (is there some padding?), and how can it work practically if most commands are much longer than 80 ms?
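The only way I can picture this working is if each 80 ms chunk only appends one new feature frame to an internal rolling buffer, while the classifier always scores the full (28, 96) window, zero-padded until 3 s of audio have streamed in. A hypothetical sketch of that pattern (the buffer size and frame shape are my guesses from the feature shape above, and `embed`/`score` stand in for the real feature extractor and classifier head):

```python
from collections import deque
import numpy as np

N_FRAMES, FEAT_DIM = 28, 96   # feature shape from training
CHUNK = 1280                  # 80 ms at 16 kHz

# Zero "padding" until 3 s of audio have streamed in
buffer = deque([np.zeros(FEAT_DIM, dtype=np.float32)] * N_FRAMES,
               maxlen=N_FRAMES)

def embed(chunk: np.ndarray) -> np.ndarray:
    """Placeholder for the melspectrogram + embedding stage that maps
    one 80 ms chunk to a single 96-dim feature frame."""
    return np.zeros(FEAT_DIM, dtype=np.float32)

def score(features: np.ndarray) -> float:
    """Placeholder for the trained classifier head; expects (28, 96)."""
    return 0.0

def process_chunk(chunk: np.ndarray) -> float:
    buffer.append(embed(chunk))      # only one new frame per 80 ms call
    return score(np.stack(buffer))   # full 3 s window scored every call
```

Is this roughly what happens under the hood?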