Fine-tuning Wav2Vec2 for Tamil Speech Recognition

This repository contains the Jupyter Notebook and resources for fine-tuning the Wav2Vec2 model for Tamil speech recognition using the Hugging Face Transformers library.

Introduction

Wav2Vec2 is a self-supervised speech model that learns from raw audio and can be fine-tuned for automatic speech recognition (ASR). This project adapts Wav2Vec2 to the Tamil language, using available datasets to improve recognition of spoken Tamil.

Requirements

To run this project, ensure you have the following installed:

Python 3.7 or higher
Jupyter Notebook
PyTorch
Transformers
Datasets
Librosa
Soundfile
CUDA

You can install the required packages using the following command:

pip install -r requirements.txt
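
After installing, it can help to confirm that the packages are actually importable before opening the notebook. The following is a minimal sketch using only the standard library; the import names in `REQUIRED_MODULES` are assumed from the package list above and are not taken from requirements.txt, and `find_missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Import names assumed for the packages listed in the Requirements section.
REQUIRED_MODULES = ["torch", "transformers", "datasets", "librosa", "soundfile"]


def find_missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that each module can be located; it does not verify versions or GPU availability.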

Dataset

We use the Tamil Speech Dataset for fine-tuning the model. The dataset consists of Tamil audio files paired with their transcriptions. Download the dataset and place it in an accessible directory. Refer to datapreprocessing.py for data preparation.
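
Fine-tuning Wav2Vec2 with a CTC head usually requires a character vocabulary built from the transcriptions. The sketch below shows one common way to build it. It is an illustration under assumptions, not the contents of datapreprocessing.py: the function name `build_vocab`, the sample sentences, and the `|`, `[UNK]` and `[PAD]` tokens are conventions borrowed from typical Hugging Face fine-tuning setups.

```python
def build_vocab(transcriptions):
    """Map every character seen in the transcriptions to an integer id."""
    chars = sorted(set("".join(transcriptions)))
    vocab = {ch: idx for idx, ch in enumerate(chars)}
    # Use "|" as the word delimiter instead of a literal space (assumed convention).
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")
    # Reserve ids for unknown characters and for CTC padding/blank.
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


if __name__ == "__main__":
    sample = ["வணக்கம் உலகம்", "நன்றி"]  # hypothetical transcriptions
    vocab = build_vocab(sample)
    print(f"{len(vocab)} tokens in vocabulary")
```

The resulting dictionary can be saved as JSON and used to create a CTC tokenizer. The sketch does not handle transcriptions that already contain the `|` character.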

Training

To fine-tune the Wav2Vec2 model, open the Jupyter Notebook (Finetune_wav2vec2_xlsr_tamil.ipynb) and follow the instructions in the notebook to run the training process.

Inference

After training, you can run inference using the code snippets provided in the Jupyter Notebook. Be sure to replace the paths with those of your own audio files.
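
For orientation, inference with a fine-tuned checkpoint typically looks like the sketch below. It is not copied from the notebook: the function name `transcribe` and both arguments are hypothetical, and it assumes the fine-tuned model and processor were saved to a local directory and that the audio should be resampled to 16 kHz.

```python
def transcribe(audio_path, model_dir):
    """Transcribe one audio file with a fine-tuned Wav2Vec2 CTC checkpoint."""
    # Heavy dependencies are imported lazily so this module loads without them.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_dir)
    model = Wav2Vec2ForCTC.from_pretrained(model_dir)
    model.eval()

    # Load the audio as mono and resample to 16 kHz (assumed training rate).
    speech, _ = librosa.load(audio_path, sr=16000)
    inputs = processor(speech, sampling_rate=16000, return_tensors="pt", padding=True)

    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy CTC decoding: pick the most likely token at each frame.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

A call would look like `transcribe("path/to/audio.wav", "path/to/finetuned-model")`, with both paths replaced by your own.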

Results

The performance of the model can be evaluated using standard metrics such as Word Error Rate (WER). The notebook contains sections on evaluating the model's performance.

pip install jiwer

import jiwer

reference = "God is great"  # Ground-truth transcript; replace with your own
hypothesis = "good is great"  # Model output; replace with your transcription

# Compute WER: word-level edit distance divided by the number of reference words
wer = jiwer.wer(reference, hypothesis)
print(f"Word Error Rate (WER): {wer:.2f}")
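
To make the metric concrete, here is a dependency-free sketch of what WER measures: the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference. The function name `word_error_rate` is hypothetical, and this is an illustration of the definition, not jiwer's implementation.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(f"{word_error_rate('God is great', 'good is great'):.2f}")
```

The comparison is case-sensitive, so "God" and "good" count as one substitution out of three reference words. Normalizing case and punctuation before scoring is a common choice, but it should be applied consistently to both strings.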

Acknowledgments

For further reference, see the Fairseq Wav2Vec2 project.
