This project implements a fully offline chatbot that runs a local LLM directly in your browser using WebAssembly. After the initial load, all inference happens client-side, with no server processing and no internet access required, so your conversations never leave your device.

Features:

- 100% Client-side Processing: All inference happens locally in the browser
- Privacy-focused: No data leaves your device
- Offline Operation: Works without internet once loaded
- Multiple Model Support: Compatible with GGUF format models (Llama 2, Mistral, Phi-2, etc.)
- Interactive UI: Clean chat interface with streaming responses
- Memory Usage Monitoring: Track resource consumption
- Demo Mode: Test UI functionality without building WebAssembly files
Try it online at: https://sukur123.github.io/localLLMWeb (demo mode)

Requirements:

- Modern web browser (Chrome/Edge recommended for best performance)
- For building:
  - Linux/macOS environment
  - Emscripten SDK
  - CMake (3.14+)
  - libcurl development files (for some build options)

Quick start:

- Clone the repository:

      git clone https://github.com/sukur123/localLLMWeb.git
      cd localLLMWeb

- Download a model file:

      ./download-models.sh

- Set up Emscripten (one-time setup):

      ./setup-emscripten.sh
      source ./source-emscripten.sh

- Build the WebAssembly files:

      # Option 1: Simple build without curl dependency (recommended)
      ./build-wasm-nocurl.sh

      # Option 2: Build with advanced features
      ./build-wasm-simple.sh

- Start the server (required; see the note after these steps):

      ./start-server.sh

- Open in browser: visit the URL displayed by the server (typically http://localhost:8080).
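
A note on the server step: most browsers block `fetch()` of local files from `file://` pages, so the compiled `.wasm` module and the GGUF model cannot be loaded if you open the page straight from disk. The snippet below is an illustrative startup guard, not code from this repository:

```js
// Illustrative guard (not part of this repository): warn when the page was
// opened from disk, where fetch() of the .wasm module and the GGUF model
// file will be blocked by the browser.
if (window.location.protocol === "file:") {
  console.warn(
    "Serve this page over HTTP (e.g. via ./start-server.sh); " +
    "WebAssembly and model files cannot be fetched from file:// URLs."
  );
}
```
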
If you encounter build errors:

- Run the diagnostic tool: `./diagnostic.sh`
- Install dependencies: `./install-dependencies.sh`
- Common issues:
  - libcurl not found: `sudo apt-get install libcurl4-openssl-dev`
  - Emscripten not in PATH: `source ./source-emscripten.sh`
  - Build fails: try the no-curl build with `./build-wasm-nocurl.sh`
See TROUBLESHOOTING.md for detailed assistance.
| Browser | Support | Notes |
|---|---|---|
| Chrome/Edge | ✅ Excellent | Best performance and highest memory limits |
| Firefox | ✅ Good | Slightly lower memory limits than Chrome |
| Safari | ⚠️ Limited | More restrictive memory limits; works with smaller models |
| Mobile browsers | ⚠️ Limited | Works with tiny models (≤1B parameters) |
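
Before loading a model, you can probe for WebAssembly support explicitly. This is a minimal sketch of a standard capability check; the project itself may or may not perform an equivalent test:

```js
// Returns true if the browser exposes a working WebAssembly API.
// The byte array is the smallest valid wasm module: "\0asm" magic + version 1.
function hasWasmSupport() {
  return (
    typeof WebAssembly === "object" &&
    typeof WebAssembly.validate === "function" &&
    WebAssembly.validate(new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]))
  );
}

console.log(hasWasmSupport() ? "WebAssembly is available" : "WebAssembly is not supported");
```
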
For optimal performance, use smaller instruction-tuned models with aggressive quantization:

| Model | Size | Quantization | Performance |
|---|---|---|---|
| SmolLM2-135M-Instruct | 85MB | Q2_K | Ultra-lightweight, fast |
| TinyLlama-1.1B-Chat | ~600MB | Q4_K | Good balance of size/quality |
| Phi-2 | ~1.2GB | Q4_K | Excellent quality for size |
| Mistral-7B-Instruct | ~4GB | Q4_K | Better quality, requires powerful device |
The application includes a "demo mode" that activates when:
- WebAssembly files aren't available
- No model file is found
- `?demo=true` is added to the URL
This allows testing the UI without compiling WebAssembly or downloading models.
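
For reference, the `?demo=true` trigger can be read with the standard URLSearchParams API. This is a hedged sketch of the idea; the flags `wasmAvailable` and `modelAvailable` are hypothetical inputs, and the project's actual detection code in `js/` may look different:

```js
// Illustrative demo-mode check. wasmAvailable and modelAvailable are
// hypothetical flags the application would supply after probing its assets.
function isDemoMode(wasmAvailable, modelAvailable) {
  const params = new URLSearchParams(window.location.search);
  return params.get("demo") === "true" || !wasmAvailable || !modelAvailable;
}
```
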

Approximate memory requirements:

- Small models (1-2B parameters): 1-2 GB RAM
- Medium models (7B parameters): 4-8 GB RAM
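
As a rough client-side sanity check, Chromium-based browsers expose an approximate RAM figure via `navigator.deviceMemory`. The sketch below is illustrative only: the property is undefined in Firefox and Safari, and `modelRamGb` is a hypothetical estimate you would supply for the chosen model:

```js
// Rough memory check: navigator.deviceMemory reports an approximate RAM
// figure (in GB, capped at 8) on Chromium-based browsers and is undefined
// elsewhere. modelRamGb is a hypothetical estimate for the chosen model.
function warnIfLowMemory(modelRamGb) {
  const deviceGb = navigator.deviceMemory;
  if (deviceGb !== undefined && deviceGb < modelRamGb) {
    console.warn(`Device reports ~${deviceGb} GB RAM; the model may need ~${modelRamGb} GB.`);
  }
}

warnIfLowMemory(2); // e.g. before loading a small 1-2B parameter model
```
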

How it works:

- The llama.cpp library is compiled to WebAssembly using Emscripten
- The WASM module loads the GGUF model file into memory
- User input is processed into the appropriate prompt format
- The model generates tokens that are streamed back to the UI (a minimal sketch follows this list)
- The chat history maintains context for the conversation
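
To make the streaming step concrete, here is a minimal sketch of a streaming loop. `generate(prompt, onToken)` stands in for a hypothetical wrapper around the compiled llama.cpp module; it is not the project's actual API:

```js
// Hypothetical streaming loop: every token emitted by the wasm module is
// appended to the current assistant message as it arrives. `generate` is
// assumed to call onToken(token) for each piece and resolve when done.
async function streamReply(generate, prompt, messageElement) {
  let reply = "";
  await generate(prompt, (token) => {
    reply += token;                      // accumulate for the chat history
    messageElement.textContent = reply;  // update the UI incrementally
  });
  return reply;                          // stored so later turns keep context
}
```

Passing `generate` in as a parameter keeps the sketch independent of how the compiled module is actually exposed to the application code.
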

Project structure:

    localLLMWeb/
    ├── css/                     # Styling
    ├── js/                      # Application logic
    │   ├── chat.js              # Chat management
    │   ├── log.js               # Logging utilities
    │   └── ui.js                # UI interactions
    ├── models/                  # GGUF model storage
    ├── wasm/                    # WebAssembly files
    ├── build-wasm-nocurl.sh     # Build script without curl dependency
    ├── build-wasm-simple.sh     # Simplified build script
    ├── build-wasm.sh            # Original build script
    ├── diagnostic.sh            # Environment diagnostics
    ├── download-models.sh       # Model downloader
    ├── install-dependencies.sh  # Dependency installer
    ├── setup-emscripten.sh      # Emscripten setup
    ├── source-emscripten.sh     # Environment sourcing
    ├── start-server.sh          # Local web server
    └── TROUBLESHOOTING.md       # Detailed troubleshooting

To customize the chatbot:

- Edit the system prompt in `chat.js` to change the model's behavior
- Adjust generation parameters in `chat.js` for different response styles (an illustrative sketch follows below)
- Modify the UI in `styles.css` to customize the appearance
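
For orientation, this is what such settings could look like. The property names (`systemPrompt`, `temperature`, `topP`, `maxTokens`) and values are illustrative assumptions, not the identifiers actually used in `chat.js`:

```js
// Illustrative configuration block -- the real names and defaults in
// chat.js may differ.
const chatConfig = {
  systemPrompt: "You are a helpful assistant running entirely in the browser.",
  temperature: 0.7,  // higher = more varied responses
  topP: 0.9,         // nucleus sampling cutoff
  maxTokens: 256,    // cap on generated tokens per reply
};
```
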
This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments:

- llama.cpp for the core inference engine
- Emscripten for WebAssembly compilation
- GGUF model creators for providing open models
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
