Blind Clicks is a self-hosted AI agent development framework that gives you full control over your automation workflows. Inspired by OpenAI's Operator and Anthropic's Computer Use

aemal/blindclicks

Blind Clicks

AI Agent Framework for the Real World
Build smarter automations, even for legacy software with no APIs.

🚧 Note: This project is currently under active development and will be released as an open-source project soon. You can already star this repository to stay updated, and follow the build-in-public journey on YouTube. For more details, visit blindclicks.dev.

🖱️ What Is Blind Clicks?

Blind Clicks is a self-hosted AI agent development framework that gives you full control over your automation workflows. Inspired by OpenAI's Operator and Anthropic's Computer Use, it goes one step further:
It works blindly, by simulating actual user clicks and keystrokes at the OS level.

It's like macros on steroids, but with an LLM-powered fallback vision layer for when things don't go as planned.

⚙️ How It Works

Blind Clicks uses:

  • RobotJS to simulate OS-level mouse and keyboard actions
  • Dockerized Linux (via Webtop) to run applications in a browser-accessible container
  • LLMs (your choice: bring your own) for prompt understanding
  • Vision fallback (via OmniParser and others) to analyze screen metadata and adapt intelligently when clicks fail
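
The blind, coordinate-driven layer in the list above can be pictured as a small step runner. The sketch below is an illustration under assumptions, not the project's actual code: `MacroStep`, `InputDriver`, `RecordingDriver`, and `runSteps` are hypothetical names. In a real setup the driver would presumably wrap RobotJS calls such as `moveMouse`, `mouseClick`, `typeString`, and `keyTap`; a recording driver stands in here so the example runs without native dependencies.

```typescript
// Hypothetical step types for a blind (coordinate-based) macro.
type MacroStep =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "key"; key: string };

// Hypothetical abstraction over an OS-level input library.
// A production driver might delegate to RobotJS (assumption).
interface InputDriver {
  click(x: number, y: number): void;
  typeText(text: string): void;
  pressKey(key: string): void;
}

// Stand-in driver that records actions instead of touching the OS.
class RecordingDriver implements InputDriver {
  readonly log: string[] = [];
  click(x: number, y: number): void {
    this.log.push(`click ${x},${y}`);
  }
  typeText(text: string): void {
    this.log.push(`type ${text}`);
  }
  pressKey(key: string): void {
    this.log.push(`key ${key}`);
  }
}

// Replays the steps in order and returns how many were executed.
function runSteps(steps: MacroStep[], driver: InputDriver): number {
  for (const step of steps) {
    switch (step.kind) {
      case "click":
        driver.click(step.x, step.y);
        break;
      case "type":
        driver.typeText(step.text);
        break;
      case "key":
        driver.pressKey(step.key);
        break;
    }
  }
  return steps.length;
}

// Usage: click a field, type a file name, confirm with Enter.
const demoDriver = new RecordingDriver();
const executedCount = runSteps(
  [
    { kind: "click", x: 120, y: 240 },
    { kind: "type", text: "invoice.pdf" },
    { kind: "key", key: "enter" },
  ],
  demoDriver,
);
```

Keeping the driver behind an interface means the same step list can be replayed against a fake in tests and against real OS input inside the container.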

Automate boring, complex workflows, even in ancient software that still runs on Windows XP.

💡 Why It Matters

This isn't another browser automation tool.
No Puppeteer. No Playwright. No DOM required.
Just raw, deterministic automation, designed for:

  • Legacy systems with no APIs
  • On-prem software where you can't use cloud agents
  • Privacy-sensitive use cases (100% self-hosted)
  • Developers & integrators building custom AI agents for real-world, messy UIs

🧠 Smart Recovery with LLMs

When clicks fail or a UI changes unexpectedly, Blind Clicks captures the screen, analyzes it with tools like Microsoft's OmniParser, and adapts the workflow intelligently.
Fallback logic is promptable, explainable, and customizable.
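
One way to picture this recovery path is a click helper that first tries the recorded coordinates, checks whether the click took effect, and only then consults a screen parser. This is a minimal sketch under assumptions: `RecoveryDeps`, `ScreenElement`, and `clickWithRecovery` are hypothetical names, and `parseScreen` is a stand-in for capturing a screenshot and running a tool such as OmniParser over it, not that tool's real API.

```typescript
interface Point {
  x: number;
  y: number;
}

// Hypothetical shape of one element reported by a screen parser.
interface ScreenElement {
  label: string;
  center: Point;
}

// Hypothetical dependencies, injected so the logic stays testable.
interface RecoveryDeps {
  click(p: Point): void;
  // True when the expected post-click state is visible.
  verify(): boolean;
  // Stand-in for "capture the screen and parse it into elements".
  parseScreen(): ScreenElement[];
}

type ClickOutcome =
  | { status: "ok"; via: "blind" | "vision"; at: Point }
  | { status: "failed"; reason: string };

// Blind click first; fall back to the parsed screen only on failure.
function clickWithRecovery(
  targetLabel: string,
  recorded: Point,
  deps: RecoveryDeps,
): ClickOutcome {
  deps.click(recorded);
  if (deps.verify()) {
    return { status: "ok", via: "blind", at: recorded };
  }

  const wanted = targetLabel.toLowerCase();
  const match = deps.parseScreen().find((e) => e.label.toLowerCase() === wanted);
  if (match === undefined) {
    return { status: "failed", reason: `no element labelled "${targetLabel}" found` };
  }

  deps.click(match.center);
  return deps.verify()
    ? { status: "ok", via: "vision", at: match.center }
    : { status: "failed", reason: "click had no effect after vision fallback" };
}

// Usage: the button moved from its recorded position to (140, 260).
const movedButton: Point = { x: 140, y: 260 };
let uploadOpened = false;
const fakeDeps: RecoveryDeps = {
  click: (p) => {
    if (p.x === movedButton.x && p.y === movedButton.y) uploadOpened = true;
  },
  verify: () => uploadOpened,
  parseScreen: () => [{ label: "Upload", center: movedButton }],
};
const recoveryResult = clickWithRecovery("upload", { x: 100, y: 200 }, fakeDeps);
```

Returning a structured outcome, including a reason on failure, is one way to keep the fallback explainable.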

Human-in-the-loop? Absolutely.
Even your AI agent can call you if it gets stuck.
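
A human-in-the-loop handoff can be modelled as a workflow that pauses when a step needs input, notifies a person, and resumes once they answer. The sketch below is one possible shape, not the project's implementation: `PausableAgent`, `WorkflowStep`, `AgentState`, and the `notify` callback are all hypothetical.

```typescript
type AgentState =
  | { phase: "running"; stepIndex: number }
  | { phase: "waiting_for_human"; stepIndex: number; question: string }
  | { phase: "done" };

interface WorkflowStep {
  name: string;
  // When set, the step needs a value from a human (for example a 2FA code).
  askHuman?: string;
  run(humanInput?: string): void;
}

class PausableAgent {
  private state: AgentState = { phase: "running", stepIndex: 0 };
  private readonly steps: WorkflowStep[];
  private readonly notify: (question: string) => void;

  constructor(steps: WorkflowStep[], notify: (question: string) => void) {
    this.steps = steps;
    this.notify = notify;
  }

  // Runs steps until the workflow finishes or a step needs a human.
  advance(): AgentState {
    let current: AgentState = this.state;
    while (current.phase === "running") {
      const index = current.stepIndex;
      const step = this.steps[index];
      if (step === undefined) {
        current = { phase: "done" };
      } else if (step.askHuman !== undefined) {
        this.notify(step.askHuman);
        current = { phase: "waiting_for_human", stepIndex: index, question: step.askHuman };
      } else {
        step.run();
        current = { phase: "running", stepIndex: index + 1 };
      }
    }
    this.state = current;
    return current;
  }

  // Called when the human replies; completes the paused step and resumes.
  answer(input: string): AgentState {
    const current = this.state;
    if (current.phase !== "waiting_for_human") {
      throw new Error("agent is not waiting for human input");
    }
    const step = this.steps[current.stepIndex];
    if (step !== undefined) step.run(input);
    this.state = { phase: "running", stepIndex: current.stepIndex + 1 };
    return this.advance();
  }
}

// Usage: log in, pause for a 2FA code, then continue with the upload.
const agentEvents: string[] = [];
const agent = new PausableAgent(
  [
    { name: "login", run: () => { agentEvents.push("login"); } },
    { name: "2fa", askHuman: "Enter the 2FA code", run: (code) => { agentEvents.push(`2fa:${code}`); } },
    { name: "upload", run: () => { agentEvents.push("upload"); } },
  ],
  (question) => { agentEvents.push(`notify:${question}`); },
);
const pausedState = agent.advance();
const finalState = agent.answer("123456");
```

Modelling the pause as explicit state rather than a blocking call means the agent can wait for a reply without holding a thread open.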

🧾 Real-World Demo: Dativ Accounting System

Here's a real demo of an AI agent built using Blind Clicks, automating the accounting system Dativ (widely used in Germany).
This agent logs in, uploads invoices, parses and reconciles OCR data, and notifies the human when done. It even handles two-factor authentication with a human-in-the-loop fallback.

"If you've ever had to upload hundreds of invoices manually, you'll understand why this matters."
– Aemal

📺 Building in Public: Follow the Journey

Blind Clicks is being built live, every Sunday, in public.
Every bug. Every breakthrough. Every idea. All documented and shared.

▶️ Watch the full YouTube Playlist
Episodes include live coding, real debugging sessions, design choices, and honest reflections.

💬 "It's messy, fun, and very real. I lock myself in a room, hit record, and build this agent layer by layer."

Subscribe, follow the journey, and learn how AI agents are made for real-world tasks.

🛠 Built for Builders

Blind Clicks is a developer-first tool.
It's hackable, headed for an open-source release, and still early in its journey. Every Sunday, I build this live on my YouTube channel.

💬 "This project started as a way to help my wife automate a boring task. Now it's my playground for rethinking automation from the ground up."
– Aemal Sayer, Creator of Blind Clicks

🔐 100% GDPR & AI Act Compliant

Run it offline. Use your own LLM. Lock it behind a firewall.
Blind Clicks is privacy-first by design: perfect for European governments, hospitals, or any company that says "no" to OpenAI.

👨‍💻 About the Author

Aemal Sayer is a software engineer and AI enthusiast passionate about building practical solutions to real-world problems. With a background in automation and AI, he created Blind Clicks to solve a personal pain point that turned into a broader mission to make automation accessible for legacy systems.

👋 Try It. Break It. Improve It.

This is not a product pitch. It's a project, a creative challenge, an open invitation.

If you're excited by deterministic AI agents, OS-level automation, or voice-based workflows that just work…

📬 Reach out or
📺 Subscribe on YouTube, and let's build this together.
