
AI-powered tool to turn long videos into short, viral-ready clips. Combines transcription, speaker diarization, scene detection & 9:16 resizing — perfect for creators & smart automation.


alperensumeroglu/ai-clips-maker


🎬 ai-clips-maker

Created by Alperen Sümeroğlu — An AI-native video engine that turns long-form content into short, viral-ready clips with surgical precision.

ai-clips-maker is a smart, modular Python tool built for creators, educators, and developers. It transcribes speech, detects speakers, analyzes scenes, and crops around the key moments — creating ready-to-share vertical clips for TikTok, Reels, and Shorts with zero manual editing.



📦 Features

  • 🎞️ Auto-segment videos based on speech & scene shifts
  • 🧠 Word-level transcription using WhisperX
  • 🗣️ Speaker diarization (who spoke when) via Pyannote
  • 🪄 Face/body-aware cropping focused on active speaker
  • 📐 Output formats: 9:16 (vertical), 1:1 (square), 16:9 (wide)
  • 🔌 Modular and easily extensible pipeline
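The output formats listed above all reduce to picking a crop rectangle of the right shape inside the source frame. The following is a minimal sketch of that arithmetic, not the package's own implementation: `crop_window` and its parameters are hypothetical names, and the rounding to even dimensions is an assumption made because many video encoders expect even width and height.

```python
def crop_window(src_w, src_h, aspect_w, aspect_h, center_x=None):
    """Return (x, y, w, h) of the largest crop with the requested aspect ratio.

    Hypothetical helper for illustration; not part of ai-clips-maker.
    """
    # Start from the full source height and derive the matching width.
    h = src_h
    w = src_h * aspect_w // aspect_h
    if w > src_w:
        # Too wide for the frame: use the full width and derive the height.
        w = src_w
        h = src_w * aspect_h // aspect_w

    # Round down to even dimensions (assumption: the encoder requires them).
    w -= w % 2
    h -= h % 2

    # Center horizontally on the requested point (e.g. a speaker's face),
    # then clamp so the window stays inside the frame.
    if center_x is None:
        center_x = src_w / 2
    x = int(round(center_x - w / 2))
    x = max(0, min(x, src_w - w))

    # Center vertically.
    y = (src_h - h) // 2
    return x, y, w, h


vertical = crop_window(1920, 1080, 9, 16)   # 9:16 crop from a 1080p frame
square = crop_window(1920, 1080, 1, 1)      # 1:1 crop
wide = crop_window(1920, 1080, 16, 9)       # 16:9 keeps the whole frame
```

Passing a `center_x` near the frame edge still yields a valid window, because the horizontal offset is clamped to the frame bounds.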

🛠 Installation

# Install main package
pip install ai-clips-maker

# Install WhisperX from source
pip install git+https://github.com/m-bain/whisperx.git

# Install dependencies
# macOS
brew install libmagic ffmpeg

# Ubuntu/Debian
sudo apt install libmagic1 ffmpeg

🚀 Quickstart

from ai_clips_maker import Transcriber, ClipFinder, resize

# Step 1: Transcription
transcriber = Transcriber()
transcription = transcriber.transcribe(audio_file_path="/path/to/video.mp4")

# Step 2: Clip detection
clip_finder = ClipFinder()
clips = clip_finder.find_clips(transcription=transcription)
print(clips[0].start_time, clips[0].end_time)

# Step 3: Cropping & resizing
crops = resize(
    video_file_path="/path/to/video.mp4",
    pyannote_auth_token="your_huggingface_token",
    aspect_ratio=(9, 16)
)
print(crops.segments)

🔍 How It Works

  1. 🎧 Extracts audio from video
  2. ✍️ Transcribes speech using WhisperX
  3. 🧍 Identifies speakers with Pyannote
  4. 🎬 Detects scene changes & speaker shifts
  5. 🎯 Crops video around active speaker’s position
  6. 📤 Exports clips in desired format
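Steps 2 and 3 above produce two separate timelines: timestamped words and speaker turns. A common way to combine them, sketched here as an assumption rather than as the package's actual logic, is to give each word the speaker whose turn overlaps it the most. The tuple layouts and the name `assign_speakers` are hypothetical.

```python
def assign_speakers(words, turns):
    """Label each word with the speaker whose turn overlaps it the most.

    words: list of (text, start, end) tuples, times in seconds
    turns: list of (speaker, start, end) tuples, times in seconds
    Returns a list of (text, speaker); speaker is None when no turn overlaps.
    """
    labeled = []
    for text, w_start, w_end in words:
        best_speaker, best_overlap = None, 0.0
        for speaker, t_start, t_end in turns:
            # Length of the intersection of the two intervals (negative if disjoint).
            overlap = min(w_end, t_end) - max(w_start, t_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labeled.append((text, best_speaker))
    return labeled


words = [("hello", 0.0, 0.4), ("there", 0.5, 0.9), ("hi", 1.0, 1.3), ("um", 5.0, 5.2)]
turns = [("A", 0.0, 0.95), ("B", 0.95, 2.0)]
labeled = assign_speakers(words, turns)
```

Words that fall outside every turn keep `None` as their speaker, which lets a later stage decide whether to drop them or attach them to the nearest turn.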

⚙️ Tech Stack

| 🔧 Module | 🧠 Technology | 💡 Purpose |
|---|---|---|
| Transcription | WhisperX | Word-level speech-to-text with timestamps |
| Diarization | Pyannote.audio | Speaker segmentation (who spoke when) |
| Video Processing | OpenCV, PyAV | Frame-by-frame video control |
| Scene Detection | Scenedetect | Detects shot boundaries |
| ML Inference | PyTorch | Powering WhisperX & Pyannote models |
| Data Handling | NumPy, Pandas | Transcription & clip structuring |
| Media Utilities | ffmpeg, libmagic | Media decoding + type detection |
| Testing Framework | pytest | End-to-end and unit testing support |

All tools were selected for speed, flexibility, and production-grade stability.
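To make the "shot boundaries" entry concrete, here is a deliberately simplified sketch of the underlying idea: flag a cut wherever consecutive frames differ by more than a threshold. This is not the Scenedetect API or its algorithm, only a stand-in written with the standard library. Frames are assumed to be flat lists of grayscale values (0 to 255), and `detect_cuts` and the threshold of 30 are hypothetical choices.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute per-pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def detect_cuts(frames, threshold=30.0):
    """Return indices i where frame i starts a new shot.

    A cut is reported when the change from frame i-1 to frame i exceeds
    the threshold. Simplified illustration, not a production detector.
    """
    cuts = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            cuts.append(i)
    return cuts


# Five tiny 4x4 "frames": dark, dark, bright, bright, dark.
frames = [[10] * 16, [12] * 16, [200] * 16, [198] * 16, [20] * 16]
cuts = detect_cuts(frames)
```

A fixed threshold on raw pixel differences is sensitive to camera motion and lighting changes, which is one reason a dedicated library is used for this stage.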


🎯 Use Cases

  • 🎙 Podcasters clipping episodes into shareable highlights
  • 📚 Teachers summarizing lecture content
  • 📱 Social media teams repurposing YouTube for Reels
  • 🧠 Developers automating video workflows
  • 🚀 Startups building AI-based content tools

🧪 Tests

# Run test suite
pytest tests/

The suite covers all components: the transcriber, diarizer, clip detector, and resizer.


🗺 Roadmap

| Status | Feature | Note |
|---|---|---|
| ✅ | Core pipeline: Transcribe → Diarize → Detect | Implemented in v1.0 |
| ✅ | Speaker-aware video cropping | Production ready |
| 🚧 | Multi-language subtitle generation | Planned for Q2 2025 |
| 📌 | Auto-caption overlay | In design phase |
| 🧪 | Web UI (upload + preview clips) | Prototype in progress |
| 🧠 | HuggingFace or Streamlit live demo | On backlog |

🤝 Contribute

We welcome pull requests, ideas, and feedback.

# Fork the repo
git clone https://github.com/alperensumeroglu/ai-clips-maker.git
cd ai-clips-maker

# Create feature branch
git checkout -b feat/your-feature

# Make changes, commit, and push
git commit -am "Add feature"
git push origin feat/your-feature

Before contributing, please review the open issues and the coding style guide.


👤 Author

Alperen Sümeroğlu
Computer Engineer • Entrepreneur • World Explorer 🌍
15+ European countries explored ✈️

“Let your code tell your story — clean, powerful, and useful.”


🎧 Weekly Rewind Podcast

🎤 Weekly insights on AI, tech, and building globally — by Alperen Sümeroğlu.

🚀 What does it take to grow as a Computer Engineering student, build projects, and explore global innovation?

This project is part of a bigger journey I share in Weekly Rewind, my real-time documentary podcast series, where I reflect each week on coding breakthroughs, innovation insights, startup stories, and lessons from around the world.

💡 What is Weekly Rewind?

A behind-the-scenes look at real-world experiences, global insights, and hands-on learning. Each episode includes:

  • 🔹 Inside My Coding & Engineering Projects
  • 🔹 Startup Ideas & Entrepreneurial Lessons
  • 🔹 Trends in Tech & AI
  • 🔹 Innovation from 15+ Countries
  • 🔹 Guest Conversations with Builders & Engineers
  • 🔹 Productivity, Learning & Growth Strategies

“True learning isn’t in tutorials — it’s in building, exploring, and reflecting.”


📄 License

MIT License — Free for commercial and personal use.
© 2024 Alperen Sümeroğlu