An offline AI assistant that runs entirely in your browser using WebLLM and WebGPU! ✨
This application allows you to chat with LLMs directly in your browser without sending data to external servers. All processing happens locally on your device.
- 🌐 A modern browser with WebGPU support (a quick feature check is sketched after this list):
  - Chrome 113+
  - Edge 113+
  - Firefox 118+
- 💻 A device with a GPU capable of running the chosen model
- 💾 Approximately 1-4GB of storage space (depending on model size)
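If you're unsure whether your browser qualifies, a minimal feature check like the sketch below can confirm WebGPU availability before the app tries to load a model. It uses the standard `navigator.gpu` entry point; no WebLLM code is needed for this step.

```js
// Minimal WebGPU feature check. navigator.gpu is only defined in
// browsers that ship WebGPU, and requestAdapter() can still return
// null if no suitable GPU adapter is available on the device.
async function hasWebGPU() {
  if (!("gpu" in navigator)) return false;
  const adapter = await navigator.gpu.requestAdapter();
  return adapter !== null;
}

hasWebGPU().then((ok) => {
  if (!ok) console.warn("WebGPU unavailable - the assistant cannot run here.");
});
```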
- Open the `index.html` file in a supported browser
- Select a model from the dropdown menu
- Click "Load Model" and wait for the download to complete
- Start chatting with the AI! 💬
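Behind these steps, the app drives WebLLM's engine API. Here's a minimal sketch of loading a model and sending a first message; the model ID string is an assumption, so check WebLLM's prebuilt model list for the exact names:

```js
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Download, cache, and compile the selected model.
// "SmolLM2-360M-Instruct-q4f16_1-MLC" is an assumed ID; the app's
// dropdown may map to different model strings.
const engine = await CreateMLCEngine("SmolLM2-360M-Instruct-q4f16_1-MLC", {
  initProgressCallback: (report) => console.log(report.text), // download progress
});

// OpenAI-style chat completion, running entirely in the browser.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(reply.choices[0].message.content);
```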
This app uses:
- 🧠 WebLLM library to run models in the browser
- ⚡ WebGPU for hardware acceleration
- 🗜️ Quantized models for efficient performance
- 📝 Basic HTML, CSS, and JavaScript (no framework dependencies)
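As a sketch of how these pieces combine in practice, WebLLM's OpenAI-style API also supports streaming, so the UI can render tokens as the GPU produces them. This continues from the `engine` created in the earlier sketch:

```js
// Stream the response token by token instead of waiting for the full reply.
const stream = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Explain WebGPU in one sentence." }],
  stream: true,
});

let text = "";
for await (const chunk of stream) {
  text += chunk.choices[0]?.delta?.content ?? "";
  // Update the chat UI incrementally here.
}
console.log(text);
```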
Available models:
- SmolLM2 360M: 🐥 A very small model, great for basic tasks
- Llama 3.1 8B: 🦙 A medium-sized model with good capabilities
- Phi 3.5 Mini: 🦊 A compact model with enhanced response quality for its size
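A plausible sketch of wiring the dropdown to these models is shown below. The element ID and the exact model ID strings are assumptions; WebLLM publishes the real IDs in its prebuilt model list.

```js
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Hypothetical mapping from dropdown labels to WebLLM model IDs.
const MODEL_IDS = {
  "SmolLM2 360M": "SmolLM2-360M-Instruct-q4f16_1-MLC",
  "Llama 3.1 8B": "Llama-3.1-8B-Instruct-q4f32_1-MLC",
  "Phi 3.5 Mini": "Phi-3.5-mini-instruct-q4f16_1-MLC",
};

// "#model-select" is an assumed element ID for the dropdown.
const select = document.querySelector("#model-select");
const engine = await CreateMLCEngine(MODEL_IDS[select.value]);
```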
Feel free to modify the app to suit your needs. The entire application is contained in a single HTML file for simplicity.
This project is open source and available under the MIT License.
For a more feature-rich implementation, check out:
- WebLLM Offline AI Assistant - A more advanced version with:
  - 🖥️ PC-themed desktop interface
  - 💬 Chat history support
  - 🗃️ IndexedDB caching
  - 📝 Logger
  - 🖱️ Draggable windows
  - 🔽 Taskbar and window controls
  - 📱 Responsive design for mobile and desktop
✨ Live demo: chat.ebenezerdon.com