
WhisperForge is a Python tool that leverages OpenAI's Whisper model to transcribe large audio files. It automatically splits files into manageable chunks, processes them, and combines the transcriptions into a single document. Ideal for handling lengthy recordings and generating clear, organized transcriptions.


WalksWithASwagger/whisperforge


WhisperForge v3.0.0 🌌

Transform audio into structured, intelligent content with AI-powered processing

WhisperForge is a Streamlit application that converts audio files into comprehensive content packages including transcripts, insights, articles, and social media posts. It now includes large file processing that supports files up to 2GB.

✨ Key Features

  • 🎙️ Audio Transcription - High-quality speech-to-text using OpenAI Whisper
  • 💡 Wisdom Extraction - AI-powered insights and key takeaways
  • 📋 Content Outline - Structured organization and flow
  • 📰 Article Generation - Complete written content from audio
  • 📱 Social Media Posts - Platform-optimized content
  • 📚 Notion Integration - Automatic publishing to Notion workspace
  • 📂 Knowledge Base - Add custom context from your files
  • 📝 Custom Prompts - Personalize AI output
  • 🚀 Large File Processing - Handle files up to 2GB with intelligent chunking
  • 🌊 Real-time Streaming - Watch content generate step-by-step
  • 🎨 Aurora Theme - Beautiful bioluminescent UI design
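
This README does not describe how the intelligent chunking for large files works. As a minimal sketch only, one common approach is to plan chunks by duration with a small overlap so that words at a boundary are not cut off. The function name `plan_chunks`, the 10-minute chunk length, and the 2-second overlap are illustrative assumptions, not WhisperForge's actual settings.

```python
from typing import List, Tuple


def plan_chunks(total_seconds: float,
                chunk_seconds: float = 600.0,
                overlap_seconds: float = 2.0) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds that cover the whole recording.

    Hypothetical helper: consecutive chunks overlap by `overlap_seconds`.
    """
    if chunk_seconds <= overlap_seconds:
        raise ValueError("chunk_seconds must be larger than overlap_seconds")

    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        chunks.append((start, end))
        if end >= total_seconds:
            break  # the last chunk reached the end of the recording
        start = end - overlap_seconds  # step back slightly to create overlap
    return chunks


if __name__ == "__main__":
    # A 25-minute recording split into 10-minute chunks
    for start, end in plan_chunks(1500.0):
        print(f"{start:7.1f}s -> {end:7.1f}s")
```

Each planned range could then be cut from the source audio and transcribed separately, with the transcripts joined in order.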

🏗️ Project Structure

whisperforge--prime/
├── app_simple.py          # Main Streamlit application (v3.0.0)
├── app.py                 # Redirect to main app
├── core/                  # Core functionality modules
│   ├── content_generation.py
│   ├── file_upload.py     # Enhanced large file processing
│   ├── supabase_integration.py
│   └── ...
├── prompts/               # Custom AI prompts
├── static/                # CSS, JS, and assets
├── tests/                 # Test suite
├── docs/                  # Documentation
└── requirements.txt       # Dependencies

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • Supabase account (for data storage)
  • OpenAI API key (for AI processing)

Installation

  1. Clone the repository

    git clone https://github.com/your-username/whisperforge.git
    cd whisperforge
  2. Set up virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Configure environment variables

    cp env.example .env
    # Edit .env with your API keys
  5. Run the application

    streamlit run app_simple.py

🔧 Configuration

Create a .env file with your API keys:

# Required
SUPABASE_URL=your_supabase_url
SUPABASE_ANON_KEY=your_supabase_anon_key
OPENAI_API_KEY=your_openai_api_key

# Optional
NOTION_API_KEY=your_notion_api_key
NOTION_DATABASE_ID=your_notion_database_id
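
A missing key is easier to diagnose at startup than in the middle of processing. The sketch below checks the variables listed above, assuming they have already been loaded into the process environment. The helper names `missing_required` and `notion_enabled` are hypothetical and are not part of WhisperForge.

```python
import os
from typing import List, Mapping

# Variable names taken from the configuration listing above
REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_ANON_KEY", "OPENAI_API_KEY")
NOTION_VARS = ("NOTION_API_KEY", "NOTION_DATABASE_ID")


def missing_required(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def notion_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Notion publishing needs both optional variables to be set."""
    return all(env.get(name, "").strip() for name in NOTION_VARS)


if __name__ == "__main__":
    missing = missing_required()
    if missing:
        print("Missing required settings: " + ", ".join(missing))
    else:
        print("Required settings found. Notion enabled:", notion_enabled())
```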

🎯 Usage

  1. Upload Audio - Supports MP3, WAV, M4A, and video files up to 2GB
  2. Choose Processing Mode - Standard (≤25MB) or Enhanced Large File (≤2GB)
  3. Watch Real-time Processing - See content generate step-by-step
  4. Review Results - Comprehensive content package with all outputs
  5. Auto-publish - Optional Notion integration for seamless publishing
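
The two processing modes in step 2 differ only by file size, so the choice can be expressed as a small function. This is a sketch under assumptions: `choose_mode` is a hypothetical name, and MB and GB are interpreted as binary units (1024-based), which the README does not specify.

```python
# Size limits from the Usage steps; binary units are an assumption
STANDARD_LIMIT_BYTES = 25 * 1024 * 1024       # 25MB
LARGE_LIMIT_BYTES = 2 * 1024 * 1024 * 1024    # 2GB


def choose_mode(size_bytes: int) -> str:
    """Return "standard" or "large" for a given upload size in bytes."""
    if size_bytes <= 0:
        raise ValueError("file is empty")
    if size_bytes <= STANDARD_LIMIT_BYTES:
        return "standard"
    if size_bytes <= LARGE_LIMIT_BYTES:
        return "large"
    raise ValueError("file exceeds the 2GB limit")


if __name__ == "__main__":
    print(choose_mode(10 * 1024 * 1024))   # a 10MB file
    print(choose_mode(300 * 1024 * 1024))  # a 300MB file
```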

🧪 Testing

Before running tests, make sure all dependencies are installed:

pip install -r requirements.txt

You can also use the helper script scripts/setup_test_env.sh to create a virtual environment with the required packages.

Run the test suite:

# Run all tests
pytest

# Run specific test categories
pytest -m unit          # Unit tests only
pytest -m integration   # Integration tests only
pytest tests/test_basic_functionality.py -v  # Specific test file

📚 Documentation

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • OpenAI for Whisper and GPT models
  • Supabase for backend infrastructure
  • Streamlit for the amazing web framework
  • The open-source community for inspiration and tools

WhisperForge v3.0.0 - Transform your audio into intelligent content 🌌

🎯 Architecture Overview

├── app_simple.py          # Main Streamlit application (v3.0.0)
├── app.py                 # Redirect to main app
├── core/
│   ├── streaming_pipeline.py    # Step-by-step content processing
│   ├── streaming_results.py     # Real-time content display
│   ├── content_generation.py    # AI content generation functions
│   ├── supabase_integration.py  # Database operations
│   ├── visible_thinking.py      # AI thinking bubbles
│   ├── session_manager.py       # User session handling
│   └── styling.py               # Aurora UI components
└── prompts/                # Default and custom AI prompts

🌊 Core Features

1. Real-Time Audio Processing

  • Upload audio files (MP3, WAV, M4A, FLAC, etc.)
  • Automatic transcription using OpenAI Whisper
  • Progressive content generation with live updates

2. Enhanced AI Content Pipeline

  1. Transcription - Speech-to-text conversion
  2. Wisdom Extraction - Key insights and takeaways
  3. Outline Creation - Structured content organization
  4. Article Generation - Complete written content
  5. Social Media - Platform-optimized posts
  6. 🌌 Notion Publishing - Auto-publish to Notion with beautiful formatting
  7. Database Storage - Persistent content library with Supabase
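
The pipeline steps above run in order, with each step building on earlier output. A minimal sketch of that idea uses a generator that yields each result as soon as it is ready, so a UI can display it step by step. The names `run_pipeline` and `Step` are hypothetical, and the lambda functions are stand-ins for real AI calls; this is not the code in core/streaming_pipeline.py.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# A step is a name plus a function that reads all earlier results
Step = Tuple[str, Callable[[Dict[str, str]], str]]


def run_pipeline(transcript: str, steps: List[Step]) -> Iterator[Tuple[str, str]]:
    """Run steps in order, yielding (step_name, output) after each one."""
    results: Dict[str, str] = {"transcript": transcript}
    for name, func in steps:
        output = func(results)   # each step can see everything produced so far
        results[name] = output
        yield name, output       # hand the result to the caller immediately


if __name__ == "__main__":
    # Stand-in step functions; a real step would call an AI model
    demo_steps: List[Step] = [
        ("wisdom", lambda r: "Key idea from: " + r["transcript"]),
        ("outline", lambda r: "Outline based on: " + r["wisdom"]),
    ]
    for step_name, text in run_pipeline("hello world", demo_steps):
        print(f"[{step_name}] {text}")
```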

3. Modern Aurora Interface

  • Bioluminescent 2025 design system
  • Real-time progress indicators
  • Animated content cards
  • Responsive Aurora color scheme

🔧 Technical Stack

  • Frontend: Streamlit with custom Aurora CSS
  • Backend: Supabase (PostgreSQL)
  • AI Models: OpenAI GPT-4
  • Audio Processing: OpenAI Whisper
  • Authentication: Supabase Auth + OAuth
  • Deployment: Streamlit Cloud ready

🚀 Getting Started

  1. Clone Repository

    git clone <repository-url>
    cd whisperforge--prime
  2. Install Dependencies

    python -m venv venv
    source venv/bin/activate  # or `venv\Scripts\activate` on Windows
    pip install -r requirements.txt
  3. Environment Setup - Create a .env file or set environment variables:

    # Required - Supabase Database
    SUPABASE_URL=your_supabase_url
    SUPABASE_ANON_KEY=your_supabase_anon_key
    SUPABASE_SERVICE_ROLE_KEY=your_service_role_key  # Optional for admin features
    
    # Required - AI Provider
    OPENAI_API_KEY=your_openai_key
    
    # Notion Integration - Auto-Publishing
    NOTION_API_KEY=your_notion_integration_token
    NOTION_DATABASE_ID=your_notion_database_id
    
    # Optional - OAuth & Integrations
    OAUTH_REDIRECT_URL=http://localhost:8501  # For OAuth flows
    
    # Optional - Security & Monitoring
    JWT_SECRET=your_jwt_secret_key
    SENTRY_DSN=your_sentry_dsn  # For error tracking
    
    # Optional - Development
    DEBUG=true
    LOG_LEVEL=INFO
    ENVIRONMENT=development  # or 'production'
  4. Run Application

    ./start_app.sh                 # development (default)
    ./start_app.sh production      # production mode

🎨 Aurora Design System

The WhisperForge UI uses a custom Aurora design system featuring:

  • Bioluminescent Effects: Glowing borders and animations
  • Gradient Backgrounds: Dynamic color transitions
  • Glass Morphism: Backdrop blur effects
  • Responsive Cards: Animated content containers
  • Progress Streams: Real-time processing indicators

📊 Database Schema

Core Tables

  • users - User accounts and settings
  • content - Generated content and metadata
  • prompts - Custom AI prompts
  • knowledge_base - User-uploaded files
  • api_keys - Encrypted API credentials

🔐 Security Features

  • Encrypted Storage: API keys and sensitive data
  • Session Management: Secure user sessions
  • Input Validation: File size and type restrictions
  • Rate Limiting: API usage controls
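
The README does not say how rate limiting is implemented. One simple technique is a sliding-window limiter, sketched below under that assumption. The class name `SlidingWindowLimiter` and the limits used in the example are illustrative only.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `window_seconds` period."""

    def __init__(self, max_calls: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock                      # injectable for testing
        self._calls: Deque[float] = deque()      # timestamps of allowed calls

    def allow(self) -> bool:
        """Record and permit the call if the window still has room."""
        now = self._clock()
        # Drop timestamps that have aged out of the window
        while self._calls and now - self._calls[0] >= self._window:
            self._calls.popleft()
        if len(self._calls) < self._max_calls:
            self._calls.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_calls=3, window_seconds=60.0)
    for attempt in range(5):
        print(attempt, "allowed" if limiter.allow() else "rejected")
```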

🛑 Current Known Issues

  1. Database Content Retrieval: 26 processed files are not displayed in the history (field name mismatches are under investigation)
  2. Real-time Streaming: Content appears, but not in true real time as it does in a chat interface such as Cursor
  3. Session Persistence: Authentication does not persist consistently across page refreshes
  4. Prompt Saving: Custom prompts are saved but do not load properly
  5. Thinking Bubbles: The AI thinking stream does not integrate smoothly

🔄 Debugging Tools

The content history page includes debug information:

  • Database connection status
  • Raw record samples
  • Session state inspection
  • Content structure analysis

📈 Roadmap

Immediate Fixes

  • Fix content history display issues
  • Implement true real-time streaming
  • Resolve session persistence
  • Debug prompt saving/loading

Enhancements

  • Batch audio processing
  • Export to multiple formats
  • Advanced AI model selection
  • Team collaboration features

💡 Contributing

This is currently a private project focused on creating the best audio-to-content transformation experience with a beautiful, modern interface.

📄 License

MIT License - See LICENSE file for details.


WhisperForge - Transforming audio into actionable insights with the beauty of Aurora. 🌌

🏗 Architecture (Simplified)

Session Management

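# Simple, reliable pattern
import streamlit as st

# get_supabase_client is assumed to be defined elsewhere in the project
# (for example in core/supabase_integration.py); it is not shown here.

# Default to "not authenticated" on the first run of a session
if 'authenticated' not in st.session_state:
    st.session_state.authenticated = False

@st.cache_resource  # Create the client once and reuse it across reruns
def init_supabase():
    return get_supabase_client()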

Database Pattern

  • Supabase Client: Cached with @st.cache_resource
  • User Data: Loaded fresh each session (not cached in session state)
  • Content Storage: Direct to database, no complex state management

Authentication Flow

  1. User enters credentials → Verify against Supabase
  2. Set simple session state flags → No tokens or complex persistence
  3. Load user preferences from database → Use @st.cache_data for performance
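
The first two steps of this flow can be sketched without Streamlit or Supabase by treating the session state as a plain mapping. In this sketch `sign_in` and `verify` are hypothetical names: `verify` stands in for the real credential check against Supabase, and the `user_email` key is an assumed flag alongside the `authenticated` flag shown earlier.

```python
from typing import Any, Callable, MutableMapping


def sign_in(session_state: MutableMapping[str, Any],
            verify: Callable[[str, str], bool],
            email: str,
            password: str) -> bool:
    """Verify credentials, then set simple session flags (no tokens stored)."""
    ok = bool(verify(email, password))
    session_state["authenticated"] = ok
    session_state["user_email"] = email if ok else None
    return ok


if __name__ == "__main__":
    # Stand-in verifier; a real one would query the auth backend
    def fake_verify(email: str, password: str) -> bool:
        return password == "correct-horse"

    state: dict = {"authenticated": False}
    print(sign_in(state, fake_verify, "user@example.com", "correct-horse"), state)
```

In a Streamlit app, `st.session_state` could be passed as the mapping, since it supports dictionary-style access.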
