llm-rag-assistant is a fully local, retrieval-augmented chatbot powered by llama-cpp-python, designed to answer questions in Spanish using your own Q&A dataset. It uses semantic search via FAISS + multilingual sentence-transformers to retrieve relevant answers and combines them with a local instruction-tuned LLM (e.g., Mistral-7B-Instruct in GGUF format) for contextual response generation.
- 🔍 Semantic Search with multilingual embeddings (sentence-transformers)
- 🧠 Local LLM inference without a GPU using optimized GGUF models + llama-cpp-python
- 💻 Runs on standard laptops and desktops — no CUDA, no GPU, no special hardware required
- 🔒 No API keys, no cloud dependency — fully private and offline
- 🌐 Instant web interface with Streamlit
- 🐳 Docker & Docker Compose ready for easy deployment
- 🗂️ Plug-and-play with any Q&A dataset in JSON format
This package lets you run a console chatbot with semantic retrieval (RAG) on your machine, with no need for a GPU or external connection.
This version works in the console. For a UI version, see the Streamlit version.
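To make the flow concrete, below is a minimal sketch of the retrieve-then-generate step, assuming the scibot_index.faiss index and the aligned qa.json list produced by prepare_embeddings.py. The prompt wording, file paths, and the answer() helper are illustrative assumptions, not the exact code in this repository.

```python
# Minimal retrieve-then-generate sketch (illustrative; paths, prompt and helper name are assumptions).
import json
import faiss
from sentence_transformers import SentenceTransformer
from llama_cpp import Llama

embedder = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
index = faiss.read_index("scibot_index.faiss")      # built by prepare_embeddings.py
with open("qa.json", encoding="utf-8") as f:
    qa = json.load(f)                               # same order as the indexed vectors
llm = Llama(model_path="models/mistral-7b-instruct.Q4_K_M.gguf", n_ctx=2048, verbose=False)

def answer(question: str, k: int = 3) -> str:
    # 1) Semantic retrieval: embed the question and fetch the k closest Q&A pairs.
    query_vec = embedder.encode([question], convert_to_numpy=True)
    _, idx = index.search(query_vec, k)
    context = "\n".join(f"P: {qa[i]['pregunta']}\nR: {qa[i]['respuesta']}" for i in idx[0])
    # 2) Generation: the instruction-tuned model composes an answer from the retrieved context.
    prompt = (
        "[INST] Responde en español usando únicamente el contexto.\n\n"
        f"Contexto:\n{context}\n\nPregunta: {question} [/INST]"
    )
    out = llm(prompt, max_tokens=256, temperature=0.2)
    return out["choices"][0]["text"].strip()

print(answer("¿Cuál es el horario de atención?"))
```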
- Python 3.9+
- Install dependencies: pip install llama-cpp-python faiss-cpu sentence-transformers
Tested with Python 3.13.5; specific versions are listed in environment.yml.
# On macOS, if the build fails, try:
conda install -c conda-forge llama-cpp-python
pip install faiss-cpu sentence-transformers
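Optionally, a quick sanity check that all three libraries import correctly (a convenience snippet, not part of the project):

```python
# Sanity check: confirm the three dependencies import and report their versions.
import faiss
import llama_cpp
import sentence_transformers

print("faiss:", faiss.__version__)
print("llama-cpp-python:", llama_cpp.__version__)
print("sentence-transformers:", sentence_transformers.__version__)
```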
- Download the GGUF model:
For example
wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf -O mistral-7b-instruct.Q4_K_M.gguf
Mistral-7B-Instruct-v0.1 is an open-source model released under the Apache 2.0 license: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
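Once downloaded, you can check that the GGUF file loads with llama-cpp-python before wiring up the rest of the pipeline (the path and prompt below are only examples; adjust the path to wherever you saved the model):

```python
# Smoke test: load the quantized model on CPU and generate a short completion.
from llama_cpp import Llama

llm = Llama(model_path="models/mistral-7b-instruct.Q4_K_M.gguf", n_ctx=2048, verbose=False)
out = llm("[INST] Saluda en una frase. [/INST]", max_tokens=32)
print(out["choices"][0]["text"].strip())
```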
- Build a question-and-answer dataset
Important: save it as qa_dataset.json
It should have the following structure (example):
[
  {
    "pregunta": "¿Cuál es el horario de atención?",
    "respuesta": "Nuestro horario de atención es de lunes a viernes de 9:00 a 18:00 horas y sábados de 9:00 a 14:00."
  },
  {
    "pregunta": "¿Cómo puedo contactar con soporte técnico?",
    "respuesta": "Puede contactar con soporte técnico a través del email [email protected], llamando al 900-123-456 o mediante el chat en vivo de nuestra web."
  },
  ...
]
- Create the config.yaml file for the RAG system configuration
For example
models:
  embeddings:
    model_name: "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
  generation:
    llama_cpp_model_path: "models/mistral-7b-instruct.Q4_K_M.gguf"
    max_tokens: 256
Note: To work with this type of Q&A dataset, you need an instruction-tuned model.
- Optionally add a temperature setting to the generation section (see the loading sketch below)
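A minimal sketch of how such a config could be read with PyYAML, falling back to a default when temperature is not set; the default values and any keys beyond those shown above are assumptions, not the project's actual loader:

```python
# Load config.yaml and read generation settings, with a fallback temperature.
import yaml

with open("config.yaml", encoding="utf-8") as f:
    config = yaml.safe_load(f)

gen_cfg = config["models"]["generation"]
model_path = gen_cfg["llama_cpp_model_path"]
max_tokens = gen_cfg.get("max_tokens", 256)
temperature = gen_cfg.get("temperature", 0.2)   # assumed key; add it to config.yaml as needed
print(model_path, max_tokens, temperature)
```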
- prepare_embeddings.py → generates scibot_index.faiss and qa.json from your dataset (sketched after the run steps below)
- app.py → runs the Streamlit app
- qa_dataset.json → your knowledge base
Use Docker Compose (see below) or run manually:
- Run: python prepare_embeddings.py
- Run: streamlit run app.py
- Chat with your knowledge base using a Spanish bot :)
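For reference, the indexing step performed by prepare_embeddings.py roughly amounts to the following (a simplified sketch under the assumptions above, not the script itself):

```python
# Simplified sketch of the indexing step: embed the questions and persist a FAISS index.
import json
import faiss
from sentence_transformers import SentenceTransformer

with open("qa_dataset.json", encoding="utf-8") as f:
    qa = json.load(f)

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
embeddings = model.encode([item["pregunta"] for item in qa], convert_to_numpy=True)

index = faiss.IndexFlatL2(embeddings.shape[1])   # exact (brute-force) L2 index
index.add(embeddings)
faiss.write_index(index, "scibot_index.faiss")

# Keep the Q&A pairs alongside the index, in the same order as the vectors.
with open("qa.json", "w", encoding="utf-8") as f:
    json.dump(qa, f, ensure_ascii=False, indent=2)
```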
- 8GB RAM minimum (16GB recommended)
- ~5GB of space for the models
docker-compose build
docker-compose up -d
docker-compose down
docker-compose logs -f
Open your browser at: http://localhost:8501
# Rebuild from scratch
docker-compose build --no-cache
# Execute inside the container
docker-compose exec rag-app python prepare_embeddings.py