This project demonstrates how to build a simple Retrieval-Augmented Generation (RAG) system using:
- Local Text Embeddings via SmartComponents.LocalEmbeddings
- Vector Storage & Search via Qdrant
- Ollama to serve a local language model (e.g., mistral:7b)
A core challenge in any RAG system is text embedding: converting text into high-dimensional vectors so that semantically similar passages can be retrieved. While many solutions rely on cloud APIs or hand-rolled ONNX inference pipelines, this project demonstrates how to:
- Use local embeddings with minimal setup (see the sketch below)
- Achieve fast and accurate semantic search
- Keep everything offline and privacy-preserving
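For a sense of how little setup this takes, here is a minimal sketch of the embedding step with SmartComponents.LocalEmbeddings (the example strings are illustrative; the library runs a small ONNX model fully in-process):

```csharp
using SmartComponents.LocalEmbeddings;

// LocalEmbedder runs a small embedding model entirely in-process:
// no network calls, no external service.
var embedder = new LocalEmbedder();

// Embed two strings. EmbeddingF32 is the default representation,
// a 384-dimensional float vector.
var a = embedder.Embed("How do I reset my password?");
var b = embedder.Embed("Steps to recover a forgotten password");

// Cosine similarity; higher means semantically closer.
float score = LocalEmbedder.Similarity(a, b);
Console.WriteLine($"Similarity: {score:F3}");
```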
| Component | Tool/Library |
|---|---|
| Embeddings | SmartComponents.LocalEmbeddings |
| Vector DB | Qdrant (running locally) |
| LLM | Ollama (mistral:7b) |
| API Layer | ASP.NET Core Web API |
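At indexing time the same embedder feeds Qdrant. Below is a sketch using the official Qdrant.Client NuGet package; the `docs` collection name and `text` payload key are illustrative (not from this repo), 384 matches the embedder's output dimension, and `EmbeddingF32.Values` is assumed to expose the raw float vector:

```csharp
using Qdrant.Client;
using Qdrant.Client.Grpc;
using SmartComponents.LocalEmbeddings;

var embedder = new LocalEmbedder();
var qdrant = new QdrantClient("localhost", 6334); // default gRPC port

// Create a collection sized to the embedder's 384-dim vectors.
await qdrant.CreateCollectionAsync("docs",
    new VectorParams { Size = 384, Distance = Distance.Cosine });

string[] documents =
{
    "Qdrant is a vector database for similarity search.",
    "Ollama serves large language models on your own machine."
};

// Embed each document and store it with its text as payload.
var points = documents.Select((text, i) => new PointStruct
{
    Id = (ulong)i,
    Vectors = embedder.Embed(text).Values.ToArray(), // assumption: raw vector via Values
    Payload = { ["text"] = text }
}).ToList();

await qdrant.UpsertAsync("docs", points);
```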
```
[User Question]
      ↓
Local Embedder (SmartComponents.LocalEmbeddings)
      ↓
Qdrant Search (Vector DB)
      ↓
Top Document Retrieved
      ↓
Mistral via Ollama (LLM)
      ↓
Final Answer
```
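The query path maps nearly line-for-line onto that diagram. A sketch, assuming the `docs` collection and `text` payload key from the indexing example above, Ollama on its default port 11434, and a hypothetical `OllamaReply` record as a local deserialization helper:

```csharp
using System.Net.Http.Json;
using Qdrant.Client;
using SmartComponents.LocalEmbeddings;

var embedder = new LocalEmbedder();
var qdrant = new QdrantClient("localhost", 6334);
var http = new HttpClient();

// 1. Embed the user question locally.
var question = "What is retrieval-augmented generation?";
var vector = embedder.Embed(question).Values.ToArray();

// 2. Retrieve the closest document from Qdrant.
var hits = await qdrant.SearchAsync("docs", vector, limit: 1);
var context = hits[0].Payload["text"].StringValue;

// 3. Ask Mistral via Ollama's REST API, grounding it in the retrieved context.
var reply = await http.PostAsJsonAsync("http://localhost:11434/api/generate", new
{
    model = "mistral",
    prompt = $"Answer using only this context:\n{context}\n\nQuestion: {question}",
    stream = false
});
var answer = await reply.Content.ReadFromJsonAsync<OllamaReply>();
Console.WriteLine(answer!.response);

// Local helper type for Ollama's non-streaming JSON reply.
record OllamaReply(string response);
```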
- Clone the repo.
- Install Qdrant locally or run it via Docker (see the command below).
- Pull and run the model with Ollama: `ollama run mistral`
- Launch the ASP.NET Core Web API project.
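For reference, Qdrant's standard Docker invocation, exposing its default REST (6333) and gRPC (6334) ports:

```bash
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
```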
MIT License.