Quality Evaluation of Synthesized Phase Data

Background Information

Music source separation is the task of decomposing a piece of music into its individual components, most commonly bass, drums, vocals, and other. However the separation is performed, evaluating its quality is a necessary subtask. When training a deep network for music source separation, for example, we need to know how close the separated audio is to the expected outcome.

The ideal evaluation method would rely on human perception: having a large group of people listen to the source-separated audio and rate it. This, however, would be extremely time-consuming and expensive. Instead, researchers opt for objective evaluation methods, the most widely used metric being the signal-to-distortion ratio (SDR) [4].

SDR is a fickle metric whose scores can differ significantly depending on a variety of factors. For this project I focus on phase and how it affects an audio signal's SDR.

Project Overview

I selected a 7-second excerpt of a song from the MUSDB18 dataset [5] and separated it using three prominent models: Demucs [1], Hybrid Demucs (MDX) [2], and Spleeter [3]. For these experiments I use only the bass stem.

I removed the phase data from these separated signals and replaced it with phase from five different sources:

The ground-truth bass stem
The original mix
Phase estimated by the Griffin-Lim phase reconstruction algorithm
Phase of a MIDI replication of the ground-truth bass stem
No phase data
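
The phase replacement described above can be sketched as follows. This is a minimal illustration and not the code from the notebook: the helper names `stft` and `swap_phase`, the Hann window, and the frame and hop sizes are all assumptions chosen for the example. It keeps the magnitude of the separated stem and takes the phase from another signal of the same length; passing no phase source gives the zero-phase ("no phase data") case.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Hann-windowed STFT (hypothetical helper); shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def swap_phase(separated, phase_source=None, n_fft=1024, hop=256):
    """Combine the magnitude of `separated` with the phase of `phase_source`.

    Both signals are assumed to be 1-D arrays of equal length.
    With phase_source=None every bin gets zero phase.
    """
    magnitude = np.abs(stft(separated, n_fft, hop))
    if phase_source is None:
        return magnitude.astype(complex)
    phase = np.angle(stft(phase_source, n_fft, hop))
    return magnitude * np.exp(1j * phase)
```

The result is a complex spectrogram; an inverse STFT with matching window and hop settings is still needed to turn it back into audio. For the Griffin-Lim case, the phase would instead come from an iterative estimate computed from the magnitude alone.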

I calculated the SDR of the resulting 15 stems (three models, each combined with five phase sources) using the implementation of scale-invariant SDR provided in the nussl library. I also gathered audio quality ratings from a few listeners, who were asked to rate the recording quality of the 15 stems on a scale of 1 (Bad) to 5 (Excellent). Although the pool of respondents is too small to be more than anecdotal, it can still help identify discrepancies between objective and subjective evaluations of audio.
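
To make the metric concrete, here is a minimal sketch of scale-invariant SDR written from its definition in [4]. It is not nussl's implementation, which may differ in preprocessing and other details; the function name `si_sdr` and the `eps` stabilizer are assumptions for illustration. The estimate is projected onto the reference to obtain a scaled target, and the score is the ratio of target energy to residual energy in decibels.

```python
import numpy as np

def si_sdr(reference, estimate, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length 1-D signals."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    # Optimal scaling of the reference that best matches the estimate.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    residual = estimate - target
    # eps (an assumption of this sketch) avoids division by zero and log(0).
    return 10 * np.log10((np.sum(target**2) + eps) / (np.sum(residual**2) + eps))
```

Because the comparison is made sample by sample in the time domain, an estimate with the correct magnitude spectrum but shifted phase can score very poorly. A sine wave compared against a cosine of the same frequency, for instance, has almost no projection onto the reference, so nearly all of its energy counts as distortion.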

Resources

The code for these experiments can be found in this Google Colab notebook: https://colab.research.google.com/drive/18eQLeYDTLCn7xM6yV6keF5SnjPe7npMs?usp=sharing.

Within the notebook, I explain the steps I took to create the audio data, and a copy of the notebook is also available in this repository.

The audio for the different stem and phase combinations is available here: erumbold.github.io/synth-subjective-evals

References

[1] Défossez, A., Usunier, N., Bottou, L., and Bach, F. Music source separation in the waveform domain. arXiv preprint arXiv:1911.13254 (2019).
[2] Défossez, A. Hybrid spectrogram and waveform source separation. arXiv preprint arXiv:2111.03600 (2021).
[3] Hennequin, R., Khlif, A., Voituret, F., and Moussallam, M. Spleeter: a fast and efficient music source separation tool with pre-trained models. Journal of Open Source Software 5, 50 (2020), 2154.
[4] Le Roux, J., Wisdom, S., Erdogan, H., and Hershey, J. R. SDR – half-baked or well done? In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE (2019), pp. 626-630.
[5] Rafii, Z., Liutkus, A., Stöter, F. R., Mimilakis, S. I., and Bittner, R. The MUSDB18 corpus for music separation (2017).
