Welcome! This project turns an ordinary folder of Standard Operating Procedures (SOPs) and Support Articles into an intelligent, chat-like search assistant. Under the hood it uses Retrieval-Augmented Generation (RAG) with two Chroma vector databases and a local Large Language Model (LLM) served by Ollama.
Dental offices accumulate dozens of SOP documents and support articles scattered across shared drives. Finding the right paragraph quickly is frustrating. This assistant lets you ask a question in plain English (e.g. "How do I process refunds?") and get a Markdown answer with the exact document excerpts that back it up.
High-level flow:
```mermaid
flowchart LR
    A(User Question) --> B(Flask Web App)
    B --> C(RAG Engine in rag.py)
    C --> D{Chroma DB – SOPs}
    C --> E{Chroma DB – Support}
    C --> F(Ollama LLM)
    D & E --> C
    F --> C
    C --> B --> G(Browser Answer)
```
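The round-trip above can be sketched in a few lines of Python. Everything here is a toy stand-in: the retriever uses naive word overlap instead of Chroma's vector search, and `generate` stands in for the Ollama call.

```python
# Toy end-to-end sketch of the RAG flow in the diagram above.
# The real project uses Chroma for retrieval and Ollama for generation.

def retrieve(question, store, k=2):
    """Rank stored chunks by naive word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        store.items(),
        key=lambda kv: len(q_words & set(kv[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:k]]

def generate(prompt):
    """Stand-in for the local LLM; just reports how much context it got."""
    return f"(answer grounded in {prompt.count('EXCERPT')} excerpts)"

def answer(question, store):
    chunks = retrieve(question, store)
    prompt = "\n".join(f"EXCERPT: {c}" for c in chunks) + f"\nQ: {question}"
    return generate(prompt)

store = {"SOP-045": "how to process refunds for patients",
         "SOP-001": "patient check in procedure front desk"}
print(answer("How do I process refunds?", store))
# → (answer grounded in 2 excerpts)
```

The real engine replaces `retrieve` with an embedding similarity search and `generate` with an HTTP call to Ollama, but the shape of the loop is the same: retrieve, assemble a grounded prompt, generate.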
Paste the following into PowerShell from the project folder (`ai-enhanced-search`):

```powershell
# 1. Create and activate a Python virtual environment
python -m venv .venv
.venv\Scripts\Activate.ps1

# 2. Install dependencies
pip install -r requirements.txt

# 3. Download the LLM (≈ 4 GB)
ollama pull qwen3:latest

# 4. Build or refresh the SOP database (the first run may take several minutes)
python index_sop.py

# 5. Launch the web app
python app.py
```

Open your browser at http://localhost:5000 and start asking questions!
Tip: The first query is slower because embeddings are generated; subsequent searches are fast.
- Windows 10/11 (macOS & Linux work too – commands are analogous).
- Python 3.10+ – Download from https://www.python.org/downloads and check the "Add Python to PATH" box during install.
- Git – optional but handy (https://git-scm.com/download/win).
- Ollama – local LLM server. Get the Windows installer from https://ollama.ai/download. After install, Ollama runs in the background on port 11434.
- (Optional) ocrmypdf – only needed if you have image-only PDFs. Install via:

```powershell
choco install ocrmypdf   # requires Chocolatey
```
All tunable settings live in rag.py and index_sop.py. Most people can skip this, but if you need to customise paths or models, edit the constants near the top of each file.
| Setting | File | Purpose | Default |
|---|---|---|---|
| `OLLAMA_MODEL` | `rag.py` | Which LLM Ollama serves | `qwen3:latest` |
| `N_CHUNKS` | `rag.py` | Number of document chunks retrieved | 4 |
| `CHUNK_SIZE` | `index_sop.py` | Words per chunk when splitting docs | 300 |
| `SOURCE_DIR` | `index_sop.py` | Folder containing your raw SOP files | `sop_documents` |
| `CHROMA_PATH` | both | Where the vector DB is stored on disk | `./chroma_sops` + `./chromadb_data` |
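A typical way such constants are defined so that environment variables can override them (a sketch of the pattern; the actual lines in `rag.py` and `index_sop.py` may differ):

```python
import os

# Each setting falls back to its default when no environment variable is set.
OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "qwen3:latest")
N_CHUNKS = int(os.environ.get("N_CHUNKS", "4"))
CHUNK_SIZE = int(os.environ.get("CHUNK_SIZE", "300"))
SOURCE_DIR = os.environ.get("SOURCE_DIR", "sop_documents")
```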
You can also set any of these as environment variables before running the app, e.g.:
```powershell
$Env:OLLAMA_MODEL = "mistral:7b"
$Env:SOURCE_DIR = "C:\path\to\your\documents"
python app.py
```

- Place your SOP PDFs, DOCX, or TXT files in the folder defined by `SOURCE_DIR` (defaults to `sop_documents/` in the project root).
- (Optional) Add front-matter metadata inside the document, or maintain a CSV named `SOP_metadata.csv` – the script will merge these.
- Run `python index_sop.py`.
- The script extracts text, OCRs if needed, splits into overlapping chunks, embeds them, and stores everything in Chroma.
- You should see a success message like `✓ Indexed 12,345 chunks into 'sop_vectors'.`
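The splitting step can be pictured with a minimal word-based chunker. The sizes below are illustrative – `CHUNK_SIZE` in `index_sop.py` controls the real values, and the overlap parameter is an assumption on our part:

```python
def chunk_words(text, size=300, overlap=50):
    """Split text into overlapping word windows, as the indexer does."""
    words = text.split()
    step = size - overlap
    return [
        " ".join(words[i:i + size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]

chunks = chunk_words("one two three four five six seven eight nine ten",
                     size=4, overlap=1)
print(chunks)
# → ['one two three four', 'four five six seven', 'seven eight nine ten']
```

The overlap means each chunk repeats a little of its neighbour, so a sentence that straddles a chunk boundary is still retrievable as a whole.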
Repeat the same steps for Support Articles, or add your own knowledge base by following the same pattern, making sure to update the `CHROMA_PATH` and `COLLECTION_NAME` variables.
Each chunk stored in Chroma contains:
| Canonical Key | Aliases in raw docs | Example |
|---|---|---|
| `title` | `title`, `sop_name`, `name` | "Patient Check-In" |
| `id` | `sop_id`, `guid` | `SOP-123` |
| `department` | `dept`, `team` | "Front Desk" |
The mapping is defined in DB_CFG inside rag.py. Add a new key or alias just by editing that dictionary.
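The alias mapping amounts to a simple lookup applied to each raw metadata dict. A sketch of the idea (not the literal structure of `DB_CFG`):

```python
# Hypothetical alias table in the spirit of the mapping in rag.py.
ALIASES = {
    "title": ["title", "sop_name", "name"],
    "id": ["sop_id", "guid"],
    "department": ["dept", "team"],
}

def normalize(meta):
    """Map whatever keys a raw document used onto the canonical ones."""
    out = {}
    for canonical, names in ALIASES.items():
        for name in names:
            if name in meta:
                out[canonical] = meta[name]
                break  # first matching alias wins
    return out

print(normalize({"sop_name": "Patient Check-In", "dept": "Front Desk"}))
# → {'title': 'Patient Check-In', 'department': 'Front Desk'}
```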
POST /search
Request JSON:
```json
{
  "query": "How do I process refunds?",
  "domain": "sop"
}
```

(`domain` is either `"sop"` or `"support"`.) Successful Response:

```json
{
  "answer": "Markdown answer …",
  "sources": [
    { "title": "Refund Policy", "relevance": 92.3, "preview": "…", "id": "SOP-045", "department": "Finance" }
  ]
}
```

Error Codes:
- 400 – Empty query or unknown domain
- 500 – Server error (see console log)
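The 400 case boils down to simple input validation. A sketch of the kind of check the server performs before running a search (function name and messages are hypothetical):

```python
VALID_DOMAINS = {"sop", "support"}

def validate_search(payload):
    """Return an error message (for HTTP 400) or None when the payload is OK."""
    query = (payload.get("query") or "").strip()
    if not query:
        return "Empty query"
    if payload.get("domain") not in VALID_DOMAINS:
        return "Unknown domain"
    return None

print(validate_search({"query": "How do I process refunds?", "domain": "sop"}))  # → None
print(validate_search({"query": "  ", "domain": "sop"}))                          # → Empty query
print(validate_search({"query": "refunds", "domain": "billing"}))                 # → Unknown domain
```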
- Choose Company SOPs or Support Articles from the dropdown.
- Type your question and press Enter.
- The answer appears in nicely formatted Markdown; supporting source cards show relevance percentages.
- Click Ask Another Question to reset.
This repo currently has no automated tests. A simple manual test:
- Start Ollama (`ollama run qwen3`) and the Flask app (`python app.py`).
- In a separate terminal run:

```powershell
curl -X POST http://localhost:5000/search -H "Content-Type: application/json" -d '{"query":"What is HIPAA?","domain":"sop"}'
```

- Verify that `answer` is non-empty and `sources` is an array.

Developers: add `pytest` and an in-memory client (`chromadb.Client(chromadb.config.Settings(is_persistent=False))`) for unit tests.
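A starting point for such unit tests is to swap the real vector store for an in-memory fake so tests run without Chroma or Ollama. Everything below is a hypothetical sketch – `FakeStore` and this `search_similar_chunks` are illustrations, not the project's actual code:

```python
# pytest-style sketch: stub out the vector store so tests need no services.

class FakeStore:
    def __init__(self, chunks):
        self.chunks = chunks

    def query(self, text, k=4):
        # A real store would do a vector similarity search;
        # the fake just returns the first k chunks.
        return self.chunks[:k]

def search_similar_chunks(store, text, k=4):
    return store.query(text, k)

def test_search_returns_chunks():
    store = FakeStore(["Refund Policy: …", "Check-In: …"])
    hits = search_similar_chunks(store, "refunds", k=1)
    assert hits == ["Refund Policy: …"]

test_search_returns_chunks()
print("ok")
```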
- Production server – run via gunicorn (Linux) or waitress (Windows):

```powershell
pip install waitress
waitress-serve --call 'app:create_app'
```

- Behind a reverse proxy (Nginx/Apache), forward port 80 → 5000.
- Back up the folders `chroma_sops` and `chromadb_data` regularly – they hold all embeddings.
- Logs: Flask prints to the console; redirect to a file with `>> app.log 2>&1` if needed.
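For the reverse-proxy option, a minimal Nginx server block might look like this (the server name is a placeholder; adapt paths and TLS to your environment):

```nginx
server {
    listen 80;
    server_name sop-search.example.internal;

    location / {
        # Forward everything to the Flask/waitress process on port 5000
        proxy_pass http://127.0.0.1:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```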
Want to add a third knowledge-base?
- Build a new Chroma collection as shown in `index_sop.py`.
- Add a new entry in `DB_CFG` (copy-paste and tweak). Point `persist_dir` and `collection` to your new folder.
- Add an option to the `<select>` dropdown in `templates/index.html`.
- Restart the app – no further code changes needed!
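Step 2 might look like this. The keys shown (`persist_dir`, `collection`) are the ones this README mentions; the collection names for the existing entries and the exact shape of `DB_CFG` in `rag.py` are assumptions here:

```python
# Hypothetical DB_CFG with a third knowledge base ("hr_policies") added.
DB_CFG = {
    "sop": {
        "persist_dir": "./chroma_sops",
        "collection": "sop_vectors",
    },
    "support": {
        "persist_dir": "./chromadb_data",
        "collection": "support_vectors",
    },
    # New entry: copy an existing one and point it at the new folder.
    "hr_policies": {
        "persist_dir": "./chroma_hr",
        "collection": "hr_vectors",
    },
}

print(sorted(DB_CFG))  # → ['hr_policies', 'sop', 'support']
```

The key (`"hr_policies"`) is what the front-end sends as `domain`, so it must match the value you add to the dropdown.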
- All data and the LLM run locally – nothing is sent to external servers.
- There is no authentication baked in; if you expose the app beyond localhost you must add a login layer (Flask-Login, reverse-proxy auth, etc.).
- Files stored: `chroma_*/` (vector DB) • `*.pdf`/`.docx`/`.txt` (raw docs) • log files you create.
This project is licensed under the MIT License - see the LICENSE file for details.
- Fork → create feature branch → commit with clear messages.
- Run `black` (`pip install black`) for formatting.
- Submit a Pull Request.
- The maintainer will review, request changes if necessary, and merge.
Create CHANGELOG.md using the Keep a Changelog format. Example entry:
```markdown
## [1.1.0] – 2025-03-27
### Added
- Support Article database
- Front-end domain selector
```
Key functions such as `rag_inference()` and `search_similar_chunks()` have inline docstrings – open the files in any IDE for a deeper understanding.
| Term | Meaning |
|---|---|
| RAG | Retrieval-Augmented Generation – combining search results with an LLM answer. |
| SOP | Standard Operating Procedure document. |
| Chroma | Open-source vector database storing embeddings. |
| Embedding | Numeric representation of text that captures semantic meaning. |
| Ollama | Local server that runs LLMs on your machine. |
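To make "embedding" concrete: semantic similarity is usually measured as the cosine of the angle between two embedding vectors. A toy 3-dimensional example (real embeddings have hundreds of dimensions, and the numbers below are made up):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

refund  = [0.9, 0.1, 0.0]  # pretend embedding of "refund policy"
billing = [0.8, 0.2, 0.1]  # pretend embedding of "billing question"
checkin = [0.0, 0.1, 0.9]  # pretend embedding of "patient check-in"

# Semantically close texts have close vectors:
print(cosine(refund, billing) > cosine(refund, checkin))  # → True
```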
- Max tokens generated set to 4096 – very long answers may be cut off.
- Only two knowledge-bases (SOP, Support) are wired in code.
- No authentication or role-based access control.
- Indexer scripts for Support Articles are not yet open-sourced.
Planned: add a settings UI, implement user login, and ship Docker images for one-command deployment.