📝 YouTube Audio Downloader & Transcriber
This project is a Python script that downloads audio from a YouTube video, transcribes it using OpenAI's Whisper, and saves the transcript as a text file. It's a simple and powerful tool for turning spoken content from YouTube into written form.
This script is intended for educational purposes only. It is designed to demonstrate how audio can be downloaded and transcribed using publicly available tools.
Users are responsible for ensuring that their use of this script complies with YouTube's Terms of Service and all applicable copyright laws. The developer assumes no liability for any misuse of the tool.
Downloading or transcribing copyrighted material without explicit permission from the content owner may violate copyright laws and YouTube's policies. Please consult a legal professional if you are unsure about your rights and obligations.
This project is licensed under the MIT License. See the LICENSE file for details.
🚀 Features
Downloads audio from any YouTube video.
Automatically transcribes the audio into text using OpenAI's Whisper.
Saves the transcript with a timestamped filename.
Deletes temporary audio files after processing.
📋 Requirements
Before you begin, ensure you have the following installed on your system:
Python 3.8 or later
FFmpeg (for audio processing)
To install FFmpeg, follow the instructions for your operating system:
Linux: sudo apt install ffmpeg
Mac: brew install ffmpeg
Windows: Download FFmpeg
🛠 Installation
Clone the Repository
    git clone https://github.com//youtube-audio-transcriber.git
    cd youtube-audio-transcriber
Install Python Dependencies
Use the provided requirements.txt file to set up dependencies:
    pip install -r requirements.txt
⚙️ Usage
Run the Script
To use the script, run it from your terminal:
    python app.py
Add the link to the YouTube video you wish to process as the command-line argument.
Example
    python app.py "https://www.youtube.com/watch?v=example_id"
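The command above takes the video link as a single positional argument. A minimal sketch of how such an argument could be read with argparse follows. The function name parse_args and the argument name url are illustrative assumptions, not necessarily what app.py uses.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; `url` is the single positional argument.

    The names here are hypothetical; app.py may read its argument differently.
    """
    parser = argparse.ArgumentParser(
        prog="app.py",
        description="Download audio from a video URL and transcribe it.",
    )
    parser.add_argument("url", help="link to the video to process")
    return parser.parse_args(argv)


# Passing an explicit list mimics: python app.py "https://www.youtube.com/watch?v=example_id"
args = parse_args(["https://www.youtube.com/watch?v=example_id"])
print(args.url)
```

Quoting the URL in the shell, as in the example command, keeps characters such as & and ? from being interpreted by the shell.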
What Happens:
The script downloads the audio from the video.
It transcribes the audio into text using Whisper.
It saves the transcript in a file whose name starts with transcript_, followed by a timestamp and the .txt extension.
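The steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the script's actual code: the helper names (build_ydl_options, download_audio, run_pipeline), the MP3 codec, and the use of a temporary directory for cleanup are all choices made for this example. yt_dlp is imported lazily so the orchestration logic can be read and tested without it.

```python
import os
import tempfile


def build_ydl_options(output_stem):
    """yt-dlp options: best audio stream, converted to MP3 by FFmpeg (assumed codec)."""
    return {
        "format": "bestaudio/best",
        "outtmpl": output_stem + ".%(ext)s",
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}
        ],
        "quiet": True,
    }


def download_audio(url, work_dir):
    """Download the audio track of `url` into `work_dir` and return the MP3 path."""
    import yt_dlp  # lazy import: only needed when a real download happens

    stem = os.path.join(work_dir, "audio")
    with yt_dlp.YoutubeDL(build_ydl_options(stem)) as ydl:
        ydl.download([url])
    # FFmpegExtractAudio replaces the downloaded file with <stem>.mp3
    return stem + ".mp3"


def run_pipeline(url, download, transcribe):
    """Download, transcribe, then discard the temporary audio.

    `download(url, work_dir)` returns an audio path and `transcribe(path)`
    returns text. The temporary directory is removed even if a step fails.
    """
    with tempfile.TemporaryDirectory() as work_dir:
        audio_path = download(url, work_dir)
        return transcribe(audio_path)
```

Passing the download and transcribe steps in as functions keeps the ordering and cleanup logic separate from the libraries that do the heavy lifting; a real run would call run_pipeline(url, download_audio, some_transcribe_function).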
📂 Output
The script saves the transcript file in the same directory as the script.
Example output file:
    transcript_1756128686.txt
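The number in the example filename looks like a Unix timestamp in whole seconds. Assuming that is the scheme, a filename of this shape could be produced as sketched below. The function names transcript_filename and save_transcript are hypothetical, and the directory parameter defaults to the current working directory, which may differ from the script's own directory.

```python
import time
from pathlib import Path


def transcript_filename(timestamp=None):
    """Build a name like transcript_1756128686.txt from a Unix timestamp.

    Assumption: the timestamp is seconds since the epoch, truncated to an integer.
    """
    if timestamp is None:
        timestamp = time.time()
    return f"transcript_{int(timestamp)}.txt"


def save_transcript(text, directory=".", timestamp=None):
    """Write `text` to a timestamped file in `directory` and return its path."""
    path = Path(directory) / transcript_filename(timestamp)
    path.write_text(text, encoding="utf-8")
    return path
```

To save next to the script rather than in the working directory, one option is to pass Path(__file__).parent as the directory.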
🧩 Customization
Model Size
By default, the script uses Whisper's base model. If you want to use a different model (e.g., small, medium, etc.), modify the model_size argument in the transcribe_audio() function:
    transcribe_audio(audio_file, model_size="small")
Refer to the Whisper documentation for details on model sizes.
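For orientation, a function with the transcribe_audio(audio_file, model_size=...) signature might look like the sketch below. The body is an assumption about how such a function could be written with the openai-whisper package, not a copy of the script. KNOWN_MODEL_SIZES is an illustrative subset of the names Whisper accepts; whisper.available_models() returns the full list.

```python
# Illustrative subset of Whisper model names; not the complete list.
KNOWN_MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def transcribe_audio(audio_file, model_size="base"):
    """Transcribe `audio_file` with the chosen Whisper model and return the text."""
    if model_size not in KNOWN_MODEL_SIZES:
        raise ValueError(
            f"Unknown model size {model_size!r}; expected one of {KNOWN_MODEL_SIZES}"
        )
    import whisper  # lazy import: provided by the openai-whisper package

    model = whisper.load_model(model_size)   # downloads the weights on first use
    result = model.transcribe(audio_file)    # returns a dict with a "text" entry
    return result["text"].strip()
```

Larger models are generally more accurate but slower and need more memory, so validating the name up front avoids a long wait before a typo surfaces.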
🔧 Troubleshooting
Common Issues
FFmpeg Not Found
Ensure FFmpeg is correctly installed and added to your system’s PATH.
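One quick way to confirm that FFmpeg is visible on PATH from Python is shutil.which from the standard library. The helper names below are illustrative and are not part of the script.

```python
import shutil


def find_ffmpeg():
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def require_ffmpeg():
    """Raise a clear error early instead of failing midway through processing."""
    path = find_ffmpeg()
    if path is None:
        raise RuntimeError(
            "FFmpeg was not found on PATH; install it and reopen your terminal."
        )
    return path


print(find_ffmpeg() or "ffmpeg not found on PATH")
```

If this reports that FFmpeg is missing right after an install, opening a new terminal often helps, since PATH changes do not reach shells that are already running.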
Whisper Not Working
Ensure that torch is installed and compatible with your hardware. For GPU support, install a GPU-compatible version of PyTorch from PyTorch.org.
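A short check like the one below can show whether PyTorch is importable and whether it sees a CUDA GPU. The function name pick_device is a hypothetical helper for this example.

```python
def pick_device():
    """Return "cuda" or "cpu" depending on what PyTorch reports, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
if device is None:
    print("PyTorch is not installed; Whisper cannot run.")
else:
    print(f"PyTorch is installed; Whisper would run on: {device}")
```

whisper.load_model accepts an optional device argument, so the value returned here could be passed along when loading the model.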
🌐 Supported Platforms
This script leverages yt-dlp, a powerful tool that supports downloading audio from numerous platforms. While primarily designed for YouTube, it can also work with the following platforms, provided the URLs are accessible:
YouTube
Vimeo
Facebook
Instagram
Twitter
Twitch (clips and past broadcasts)
SoundCloud
Dailymotion
And many more! (See the full list of supported sites: yt-dlp Supported Sites)
💡 Acknowledgments
yt-dlp for downloading YouTube audio.
OpenAI Whisper for speech-to-text transcription.
🖥 Contributing
Pull requests are welcome! If you’d like to contribute, please fork the repository and submit a PR with your changes.
📧 Contact
For questions or feedback, reach out to [[email protected]] or open an issue on the repository.