CUSSer - Culturally Sensitive Speech Emotion Recognition
We recommend using Anaconda to manage your Python environments. Install the requirements from the environment.yml file using the following command:
conda env create -f environment.yml
The central dataset we use is the CREMA-D dataset. The demographic information file VideoDemographics.csv
is already included in the repo; the audio files need to be downloaded separately, either from the CREMA-D repository using git-lfs or from Kaggle, and placed in a folder called AudioWAV
with this repository as the root.
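The expected layout can be sanity-checked with a small snippet. This is a minimal sketch, assuming the files sit exactly where described above and the script is run from the repository root:

from pathlib import Path

import pandas as pd

repo_root = Path(".")  # run from the root of this repository

# Demographic metadata shipped with the repo
demographics = pd.read_csv(repo_root / "VideoDemographics.csv")
print(demographics.head())

# CREMA-D audio files downloaded separately into AudioWAV/
wav_files = sorted((repo_root / "AudioWAV").glob("*.wav"))
print(f"Found {len(wav_files)} WAV files")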
(wav2vec_emotion_classifier.ipynb) - a pretrained wav2vec transformer with a classification head, fine-tuned on the emotion classification downstream task (CREMA-D, EmoDB, BanglaSER)
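As a rough illustration of this setup (not the notebook's exact configuration), a wav2vec 2.0 checkpoint with a classification head can be created via the Hugging Face transformers API. The base checkpoint name, the six CREMA-D labels, and the example file name below are assumptions:

import torch
import torchaudio
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

model_name = "facebook/wav2vec2-base"  # assumed base checkpoint
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)
# Randomly initialised classification head on top of the pretrained encoder; CREMA-D has six emotions
model = AutoModelForAudioClassification.from_pretrained(model_name, num_labels=6)

# Example CREMA-D utterance (ActorID_Sentence_Emotion_Level naming scheme)
waveform, sr = torchaudio.load("AudioWAV/1001_DFA_ANG_XX.wav")
waveform = torchaudio.functional.resample(waveform, sr, 16_000)

inputs = feature_extractor(waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print("Predicted class id:", logits.argmax(dim=-1).item())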
Required packages: transformers pytorch-lightning wandb interpret torch torchaudio librosa scikit-learn scipy matplotlib pandas tqdm
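If you are not using the conda environment, the same packages can be installed with pip (matplotlib covers matplotlib.pyplot):

pip install transformers pytorch-lightning wandb interpret torch torchaudio librosa scikit-learn scipy matplotlib pandas tqdm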
Final model checkpoints can be downloaded here.
Ready-made dataframes with black-box (BB) model predictions and Global Surrogate model predictions can be downloaded here.
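A minimal sketch of loading the downloaded artifacts; the file names below are placeholders for whatever the downloads provide, not names fixed by this repository:

import pandas as pd
import torch

# Hypothetical file names standing in for the downloaded artifacts
checkpoint = torch.load("final_model.ckpt", map_location="cpu")
predictions = pd.read_csv("predictions_with_surrogate.csv")
print(predictions.head())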