- Semantic Voice Activity Detection: Enhance the current VAD system with semantic understanding
- Component Refactoring:
  - Create comprehensive refactoring plan
  - Focus on consistency and reliability
  - Implement systematic component architecture
- Token Counter: Add functionality to track and display token usage
Voice recorder with transcription, translation, and language detection powered by OpenAI.
- Record audio from the selected input device.
- Automatic transcription using OpenAI.
- Automatic language detection.
- Automatic translation between selected languages using OpenAI (Language A <-> Language B).
- Text-to-Speech (TTS) synthesis of the translation using OpenAI.
- Adjustable models for transcription and TTS.
- Visual audio waveform display.
- Voice Activity Detection (VAD) option for recording.
- Basic settings management (API Key, device, models).
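The bidirectional translation feature (Language A <-> Language B) depends on knowing which of the two configured languages was just spoken. The sketch below shows one way that decision could be made from a detected language code. It is an illustration only: `LanguagePair`, `primaryTag`, and `pickTargetLanguage` are hypothetical names, and returning `null` for an unrecognized language is an assumption, not BabelTron's documented behavior.

```typescript
// Hypothetical helper; names and fallback behavior are assumptions,
// not taken from BabelTron's source.
type LanguagePair = { languageA: string; languageB: string };

// Reduce codes such as "en-US" or "EN" to a lowercase primary subtag ("en").
function primaryTag(code: string): string {
  return code.trim().toLowerCase().split(/[-_]/)[0] ?? "";
}

// Pick the translation target for a detected source language.
// Returns null when the detected language is neither A nor B,
// leaving the caller to decide what to do (skip, prompt, or use a default).
function pickTargetLanguage(
  detected: string,
  pair: LanguagePair
): string | null {
  const source = primaryTag(detected);
  if (source === primaryTag(pair.languageA)) return pair.languageB;
  if (source === primaryTag(pair.languageB)) return pair.languageA;
  return null;
}
```

With a pair of `en` and `es`, a detected `en-US` maps to `es`, a detected `es` maps to `en`, and anything else yields `null`.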
Prerequisites:
- Node.js (includes npm)
- Git
Steps:
- Clone the repository:
git clone <repository-url>
cd babeltron
- Install dependencies:
npm install
- Set your OpenAI API Key in the application's settings.
- Development Mode:
npm run dev
This will start the application with live reloading for the renderer process.
- Production Build:
- Build the application:
npm run build:prod-no-package && electron-builder --dir
- The installable application will be in the `
Contributions are welcome! Here's how you can contribute to BabelTron:
- Fork the repository on GitHub
- Clone your fork to your local machine
- Create a branch for your feature or bugfix (`git checkout -b feature/amazing-feature`)
- Make your changes and commit them with descriptive messages
- Push your branch to your fork on GitHub
- Open a Pull Request from your fork to the main repository
Please follow these guidelines:
- Use descriptive commit messages following the Conventional Commits format
- Include tests for new features
- Update documentation for any changes
- Maintain the existing code style
- Respect the 100% vibe coding requirement ✨
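For contributors unfamiliar with Conventional Commits, the header line has the shape `type(scope)!: description`, where the scope and `!` are optional. The sketch below checks a commit message against that shape. It is not part of this repository: `COMMIT_TYPES`, `HEADER_PATTERN`, and `isConventionalHeader` are hypothetical names, and the list of allowed types is an assumption based on the commonly used set.

```typescript
// Assumed set of commit types; the project may accept a different list.
const COMMIT_TYPES = [
  "feat", "fix", "docs", "style", "refactor",
  "perf", "test", "build", "ci", "chore", "revert",
];

// type, optional "(scope)", optional "!", then ": " and a non-empty description.
const HEADER_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})(\\([\\w./-]+\\))?!?: \\S.*$`
);

// Validate only the first line of the message; body and footers are ignored.
function isConventionalHeader(message: string): boolean {
  const header = message.split("\n")[0] ?? "";
  return HEADER_PATTERN.test(header);
}
```

Messages such as `feat(vad): add semantic detection` or `fix: handle missing API key` pass this check, while `Fixed stuff` does not.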
This project is licensed under the BSD Zero Clause License (0fucksGiven) - see the LICENSE file for details.
- OpenAI for providing the APIs that power this application
- Electron for making cross-platform desktop applications possible
- React and MUI for the user interface
- All contributors who have helped improve this project