@juliohsu juliohsu commented May 8, 2025

Added padding logic to ensure consistent frame sizes across octaves when hop_length is not a power of 2. This prevents size-mismatch errors by padding shorter frames along the time axis to the maximum frame length before concatenation.
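The padding idea described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the helper name `pad_to_longest` and the `fill` parameter are hypothetical, and plain Python lists stand in for the tensors the library would actually use, so the snippet runs without dependencies. Each octave is assumed to be a list of frequency-bin rows, where every row holds that octave's time frames.

```python
def pad_to_longest(octaves, fill=0.0):
    """Pad every row to the longest time axis, then stack all rows.

    `octaves` is a list of octaves; each octave is a list of rows
    (one row per frequency bin), and each row is a list of time frames.
    Hypothetical helper for illustration only.
    """
    # Longest time axis found in any octave.
    max_frames = max(len(row) for octave in octaves for row in octave)

    stacked = []
    for octave in octaves:
        for row in octave:
            # Right-pad shorter rows so all rows share one length.
            padding = [fill] * (max_frames - len(row))
            stacked.append(list(row) + padding)
    return stacked


# Two octaves whose time axes differ by one frame.
octaves = [
    [[1.0, 2.0, 3.0]],
    [[4.0, 5.0], [6.0, 7.0]],
]
stacked = pad_to_longest(octaves)
```

After padding, every row has the same number of frames, so the octaves can be concatenated along the frequency axis without a size error.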

@gudgud96
Collaborator

Hey @juliohsu, just saw this. I helped with the VQT part some time ago.

Thanks for spotting the edge cases! Indeed there might be rounding issues if hop length is not a power of 2. I think there might be a way to make get_cqt_complex more robust towards this case (I am not sure), but I think your solution using interpolation should work too! LGTM.
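To illustrate the kind of rounding issue mentioned above, here is a toy model, not the library's actual code. It assumes that each octave halves both the signal length and the hop with integer division, and that the frame count is `n_samples // hop_length + 1`. The function name `frame_counts` and these formulas are assumptions made only for this sketch.

```python
def frame_counts(n_samples, hop_length, n_octaves):
    """Frame count per octave under a simplified downsampling model.

    Assumption: signal length and hop are both halved (floor division)
    from one octave to the next. Hypothetical helper for illustration.
    """
    counts = []
    for _ in range(n_octaves):
        counts.append(n_samples // hop_length + 1)
        n_samples //= 2
        hop_length //= 2  # loses precision once the hop becomes odd
    return counts


power_of_two = frame_counts(10000, 512, 5)
not_power_of_two = frame_counts(10000, 300, 5)
```

Under these assumptions, a hop of 512 halves cleanly at every octave and the frame counts stay equal, while a hop of 300 eventually becomes odd (75, then 37, then 18), and the truncation makes one octave come out a frame longer than the others.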

@juliohsu
Author

@gudgud96 do you think anything still needs to be tested or improved before this can be merged?

@gudgud96
Collaborator

@juliohsu I'd suggest (i) removing unnecessary comments and (ii) adding a test case under test_vqt.py that passes only if the interpolation line you proposed is included.

cc @KinWaiCheuk
