This web sample demonstrates how to use the LLM Inference API to run common text-to-text generation tasks, such as information retrieval, email drafting, and document summarization, in the browser.
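For orientation, here is a minimal sketch of how a page like this one might create the task and generate text with the LLM Inference API from `@mediapipe/tasks-genai`. The CDN path, model file name, prompt, and option values are illustrative assumptions, not the sample's exact code:

```js
import {FilesetResolver, LlmInference} from '@mediapipe/tasks-genai';

// Load the WASM assets for GenAI tasks (CDN URL is illustrative).
const genai = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm');

// Create the LLM Inference task from a locally hosted model file.
// 'gemma-2b-it-gpu-int4.bin' is an assumed name; see the steps below.
const llmInference = await LlmInference.createFromOptions(genai, {
  baseOptions: {modelAssetPath: '/gemma-2b-it-gpu-int4.bin'},
  maxTokens: 512,
  topK: 40,
  temperature: 0.8,
});

// Run a text-to-text generation request.
const response = await llmInference.generateResponse(
    'Draft a short email asking for a project status update.');
console.log(response);
```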
- A browser with WebGPU support (e.g., Chrome on macOS or Windows) is required.
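If you want to verify WebGPU support before loading the model, you can feature-detect it with the standard `navigator.gpu` API. This check is not part of the sample, just a sketch:

```js
// navigator.gpu is only defined in WebGPU-capable browsers
// such as recent Chrome on macOS or Windows.
if (!navigator.gpu) {
  throw new Error('WebGPU is not supported in this browser.');
}
const adapter = await navigator.gpu.requestAdapter();
if (!adapter) {
  throw new Error('No suitable GPU adapter found.');
}
```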
Follow these instructions to run the sample on your device:
- Make a folder for the task named `llm_task`, and copy the `index.html` and `index.js` files into your `llm_task` folder.
- Download Gemma 2B (TensorFlow Lite `2b-it-gpu-int4` or `2b-it-gpu-int8`) or convert an external LLM (Phi-2, Falcon, or StableLM) following the guide (only the GPU backend is currently supported), and place the model file in the `llm_task` folder.
- In your `index.js` file, update `modelFileName` with your model file's name (see the sketch after this list).
- Run `python3 -m http.server 8000` in the `llm_task` folder to host the three files (or `python -m SimpleHTTPServer 8000` for older Python versions).
- Open `localhost:8000` in Chrome. The button on the webpage will be enabled once the task is ready (~10 seconds).
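As a hedged example of the `index.js` change, assuming you downloaded the int4 Gemma 2B variant (the exact file name depends on the model you chose):

```js
// index.js: point the demo at the model file served from the llm_task folder.
// 'gemma-2b-it-gpu-int4.bin' is an assumed file name; replace it with the
// name of the model you actually downloaded or converted.
const modelFileName = 'gemma-2b-it-gpu-int4.bin';
```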