CyGen is a powerful Retrieval-Augmented Generation (RAG) system built with FastAPI, MongoDB, Qdrant, and Groq LLM, featuring a Streamlit frontend for seamless interaction. This system allows you to upload PDF documents, process them intelligently, and have natural language conversations about their content.
Key features:

- **📄 Advanced PDF Document Ingestion**
  - Multi-threaded PDF processing
  - Intelligent text chunking with configurable parameters
  - Background task queue for non-blocking operations
  - Progress tracking for document processing
- **🔍 Smart Vector Search**
  - Semantic similarity search using embeddings
  - Context-aware document retrieval
  - Configurable relevance thresholds
  - Metadata-enhanced document chunks
- **💬 Interactive Chat Interface**
  - Real-time chat with HTTP POST endpoint
  - Context window management
  - Conversation history with MongoDB
  - Automatic conversation title generation
- **🧠 Groq LLM Integration**
  - Fast inference with an 8k context window
  - Optimized prompting strategy
  - Balanced context retrieval
  - Temperature control for response diversity
- **🖥️ User-friendly Web UI**
  - Document upload with progress indicators
  - Conversation management
  - Responsive design
  - Real-time chat updates
The architecture is summarized in the following diagram:

```mermaid
flowchart TD
    subgraph Client
        UI[Streamlit Frontend]
    end

    subgraph Backend
        API[FastAPI Backend]
        TaskQueue[Background Task Queue]
        VectorDB[(Qdrant Vector DB)]
        MongoDB[(MongoDB)]
        LLM[Groq LLM API]
    end

    subgraph Processing
        PDF[PDF Processor]
        Chunker[Text Chunker]
        Embedder[Embedding Model]
    end

    %% Client to Backend interactions
    UI -->|1. Upload PDF| API
    UI -->|5. Send Query| API
    API -->|8. Stream Response| UI

    %% Document Processing Flow
    API -->|2. Process Document| TaskQueue
    TaskQueue -->|3. Extract & Chunk| PDF
    PDF -->|3.1. Split Text| Chunker
    Chunker -->|3.2. Generate Embeddings| Embedder
    Embedder -->|3.3. Store Vectors| VectorDB
    Embedder -->|3.4. Store Metadata| MongoDB

    %% Query Processing Flow
    API -->|6. Retrieve Context| VectorDB
    API -->|6.1. Get History| MongoDB
    API -->|7. Generate Response| LLM
    VectorDB -->|6.2. Relevant Chunks| API
    MongoDB -->|6.3. Conversation History| API

    %% Styles
    classDef primary fill:#4527A0,stroke:#4527A0,color:white,stroke-width:2px
    classDef secondary fill:#7E57C2,stroke:#7E57C2,color:white
    classDef database fill:#1A237E,stroke:#1A237E,color:white
    classDef processor fill:#FF7043,stroke:#FF7043,color:white
    classDef client fill:#00ACC1,stroke:#00ACC1,color:white

    class API,TaskQueue primary
    class PDF,Chunker,Embedder processor
    class VectorDB,MongoDB database
    class LLM secondary
    class UI client
```
The system comprises several key components that work together:
- **FastAPI Backend**
  - RESTful API endpoints and background task processing
  - Asynchronous request handling for high concurrency
  - Dependency injection for clean service management (see the sketch after this list)
  - Error handling and logging
- **MongoDB**
  - Conversation history storage
  - Document metadata and status tracking
  - Asynchronous operations with the Motor client
  - Indexed collections for fast retrieval
- **Qdrant Vector Database**
  - High-performance vector storage and retrieval
  - Scalable embedding storage
  - Similarity search with metadata filtering
  - Optimized for semantic retrieval
- **Groq LLM Integration**
  - Ultra-fast inference for responsive conversation
  - 8k token context window
  - Adaptive system prompts based on query context
  - Clean API integration with error handling
- **Streamlit Frontend**
  - Intuitive user interface for document uploads
  - Conversation management and history
  - Real-time chat interaction
  - Mobile-responsive design
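As an illustration of the dependency-injection pattern, here is a minimal sketch of wiring the Motor and Qdrant clients into a FastAPI route. The client setup, collection name, and document schema are assumptions for illustration, not the project's actual code:

```python
# A minimal sketch (assumed names and schema) of wiring the async clients
# into FastAPI via dependency injection.
from fastapi import Depends, FastAPI
from motor.motor_asyncio import AsyncIOMotorClient
from qdrant_client import QdrantClient

app = FastAPI()
mongo = AsyncIOMotorClient("mongodb://localhost:27017")
qdrant = QdrantClient(url="http://localhost:6333")

def get_db():
    # Database name would come from MONGODB_DB_NAME in practice
    return mongo["rag_system"]

@app.get("/api/v1/chat/conversations")
async def list_conversations(db=Depends(get_db)):
    # Query the (assumed) indexed conversations collection, newest first
    cursor = db["conversations"].find({}, {"_id": 0}).sort("updated_at", -1)
    return await cursor.to_list(length=100)
```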
Our PDF processing pipeline is designed for efficiency and accuracy (a simplified sketch follows the list):

1. **Text Extraction**: Extract raw text from PDF documents using PyPDF2
2. **Text Cleaning**: Remove artifacts and normalize text
3. **Chunking Strategy**: Apply recursive chunking with smart boundary detection
4. **Metadata Enrichment**: Add page numbers, file paths, and other metadata
5. **Vector Embedding**: Generate embeddings for each chunk
6. **Storage**: Store vectors in Qdrant and metadata in MongoDB
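The sketch below illustrates steps 1 and 3 under simplified assumptions: the function names and file path are hypothetical, boundary detection is omitted, and the chunk parameters mirror the `CHUNK_SIZE`/`CHUNK_OVERLAP` defaults:

```python
# Simplified extraction + sliding-window chunking; names are illustrative.
from PyPDF2 import PdfReader

def extract_pages(path: str) -> list[tuple[int, str]]:
    """Extract raw text per page, keeping the page number for metadata."""
    reader = PdfReader(path)
    return [(i + 1, page.extract_text() or "") for i, page in enumerate(reader.pages)]

def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Overlapping fixed-size windows; smart boundary detection omitted."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start : start + chunk_size])
        start += chunk_size - overlap
    return chunks

for page_num, text in extract_pages("uploads/example.pdf"):
    for chunk in chunk_text(text):
        record = {"page": page_num, "source": "example.pdf", "text": chunk}
        # ...embed record["text"] and store in Qdrant/MongoDB (steps 5-6)
```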
The RAG system follows a structured approach to content retrieval (a hedged end-to-end sketch follows the list):

1. **Query Analysis**: Analyze the user query for intent and keywords
2. **Context Retrieval**: Retrieve relevant document chunks from the vector store
3. **Threshold Filtering**: Filter results against the similarity score threshold
4. **Context Assembly**: Combine retrieved chunks with conversation history
5. **Prompt Construction**: Build a prompt from system instructions and context
6. **LLM Generation**: Generate the response using the Groq LLM
7. **Response Delivery**: Deliver the response to the user in real time
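Under stated assumptions, the flow might look like the following. The embedding model, collection name, payload schema, and Groq model name are illustrative guesses; only the numeric defaults (`TOP_K`, `RAG_THRESHOLD`, `TEMPERATURE`, `N_LAST_MESSAGE`) come from the configuration table below:

```python
# Hedged sketch of steps 2-6; model and collection names are assumptions.
from groq import Groq
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
qdrant = QdrantClient(url="http://localhost:6333")
llm = Groq(api_key="your_groq_api_key")

def answer(query: str, history: list[dict]) -> str:
    # Steps 2-3: retrieve chunks scoring above the similarity threshold
    hits = qdrant.search(
        collection_name="documents",                  # assumed collection name
        query_vector=embedder.encode(query).tolist(),
        limit=5,                                      # TOP_K
        score_threshold=0.75,                         # RAG_THRESHOLD
    )
    context = "\n\n".join(hit.payload["text"] for hit in hits)  # assumed payload key
    # Steps 4-5: assemble prompt from system instructions, context, and history
    messages = [{"role": "system", "content": f"Answer using this context:\n{context}"}]
    messages += history[-5:]                          # N_LAST_MESSAGE
    messages.append({"role": "user", "content": query})
    # Step 6: generate with Groq (assumed 8k-context model)
    resp = llm.chat.completions.create(
        model="llama3-8b-8192", messages=messages, temperature=0.7
    )
    return resp.choices[0].message.content
```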
You will need the following prerequisites:

- Docker and Docker Compose
- Python 3.11+
- uv package manager (recommended for local development)
- Groq API key
- MongoDB instance (local or Atlas)
- Qdrant instance (local or cloud)
1. Clone the repository:

   ```bash
   git clone https://github.com/NnA301023/cygen.git
   cd cygen
   ```

2. Copy the example environment file:

   ```bash
   cp .env.example .env
   ```

3. Update the following variables in `.env`:

   ```env
   GROQ_API_KEY=your_groq_api_key
   MONGODB_URL=mongodb://username:password@host:port/db_name
   QDRANT_URL=http://qdrant_host:port
   MAX_WORKERS=4
   CHUNK_SIZE=512
   CHUNK_OVERLAP=50
   TOP_K=5
   RAG_THRESHOLD=0.75
   TEMPERATURE=0.7
   N_LAST_MESSAGE=5
   ```
Make the launcher script executable and run it:

```bash
chmod +x start.sh
./start.sh
```
The launcher offers the following options:
- Start both the FastAPI backend and Streamlit frontend with Docker Compose
- Start only the FastAPI backend
- Start only the Streamlit frontend (with Docker or locally)
Alternatively, use Docker Compose directly. Start all services:

```bash
docker-compose up --build
```

Start only specific services:

```bash
docker-compose up --build app        # Backend only
docker-compose up --build streamlit  # Frontend only
```
For local development without Docker:

1. Create and activate a virtual environment:

   ```bash
   uv venv
   source .venv/bin/activate  # Linux/macOS
   .venv\Scripts\activate     # Windows
   ```

2. Install dependencies:

   ```bash
   uv pip install -e .
   ```

3. Start the FastAPI backend:

   ```bash
   uvicorn src.main:app --reload --port 8000
   ```

4. Start the Streamlit frontend (in a separate terminal):

   ```bash
   cd streamlit
   ./run.sh  # or `streamlit run app.py`
   ```
Once running, the services are available at:

- Streamlit Frontend: http://localhost:8501
- FastAPI Swagger Docs: http://localhost:8000/docs
- API Base URL: http://localhost:8000/api/v1
To upload a document:

1. Navigate to the Streamlit web interface
2. Click the "Upload Documents" section in the sidebar
3. Select a PDF file (limit: 200MB per file)
4. Click "Process Document"
5. Wait for processing to complete (progress will be displayed)
- Click "New Conversation" in the sidebar
- A new conversation will be created with a temporary title
- The title will be automatically updated based on your first message
To chat with your documents:

- Type your question in the chat input
- The system will:
  - Retrieve relevant context from your documents
  - Consider your conversation history
  - Generate a comprehensive answer
- Continue the conversation with follow-up questions
To manage your conversations:

- All your conversations are saved and accessible from the sidebar
- Select any conversation to continue where you left off
- Conversation history is preserved between sessions
The system exposes the following key API endpoints:
- `POST /api/v1/documents/upload`: Upload a PDF document
- `GET /api/v1/documents/task/{task_id}`: Check document processing status
- `PUT /api/v1/chat/conversation`: Create a new conversation
- `GET /api/v1/chat/conversations`: List all conversations
- `GET /api/v1/chat/conversations/{conversation_id}`: Get a specific conversation
- `DELETE /api/v1/chat/conversations/{conversation_id}`: Delete a conversation
- `POST /api/v1/chat/{conversation_id}`: Send a message in a conversation
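For reference, here is a hedged example of exercising these endpoints with Python's `requests` library. The request payloads and response field names are assumptions, not the documented schema:

```python
# Hedged example of calling the API; payload/response field names are assumed.
import requests

BASE = "http://localhost:8000/api/v1"

# Upload a PDF and check its processing status
with open("report.pdf", "rb") as f:
    task = requests.post(f"{BASE}/documents/upload", files={"file": f}).json()
status = requests.get(f"{BASE}/documents/task/{task['task_id']}").json()  # assumed field
print(status)

# Create a conversation and send a message in it
conv = requests.put(f"{BASE}/chat/conversation").json()
conv_id = conv["conversation_id"]  # assumed response field
reply = requests.post(
    f"{BASE}/chat/{conv_id}",
    json={"message": "What does the uploaded report conclude?"},  # assumed payload
)
print(reply.json())
```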
The project is organized as follows:

```
.
├── docker/                  # Docker configuration files
│   ├── app/                 # Backend Docker setup
│   └── streamlit/           # Frontend Docker setup
├── logs/                    # Application logs
├── src/                     # Backend source code
│   ├── router/              # API route definitions
│   │   ├── chat.py          # Chat endpoints
│   │   └── documents.py     # Document endpoints
│   ├── utils/               # Utility modules
│   │   ├── llm.py           # LLM integration
│   │   ├── pdf_processor.py # PDF processing
│   │   ├── text_chunking.py # Text chunking
│   │   └── vector_store.py  # Vector database interface
│   ├── main.py              # FastAPI application entry
│   └── settings.py          # Application settings
├── streamlit/               # Streamlit frontend
│   ├── app.py               # Main Streamlit application
│   └── utils.py             # Frontend utilities
├── tests/                   # Test suite
│   ├── unit/                # Unit tests
│   └── integration/         # Integration tests
├── uploads/                 # Uploaded documents storage
├── .env.example             # Example environment variables
├── docker-compose.yml       # Docker Compose configuration
├── Dockerfile               # Backend Dockerfile
├── pyproject.toml           # Python project configuration
├── start.sh                 # Interactive launcher script
└── README.md                # Project documentation
```
The system can be configured through environment variables:
| Variable | Description | Default |
|---|---|---|
| `GROQ_API_KEY` | Groq API key for LLM integration | - |
| `MONGODB_URL` | MongoDB connection string | `mongodb://localhost:27017` |
| `MONGODB_DB_NAME` | MongoDB database name | `rag_system` |
| `QDRANT_URL` | Qdrant server URL | `http://localhost:6333` |
| `MAX_WORKERS` | Maximum worker threads for PDF processing | `4` |
| `CHUNK_SIZE` | Target chunk size for document splitting | `512` |
| `CHUNK_OVERLAP` | Overlap between consecutive chunks | `50` |
| `TOP_K` | Number of chunks to retrieve per query | `5` |
| `RAG_THRESHOLD` | Similarity threshold for relevance | `0.75` |
| `TEMPERATURE` | LLM temperature setting | `0.7` |
| `N_LAST_MESSAGE` | Number of previous messages to include | `5` |
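For orientation, here is a minimal sketch of how `src/settings.py` might load these variables with pydantic-settings; the project's actual implementation may differ:

```python
# A minimal sketch, assuming pydantic-settings; src/settings.py may differ.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    groq_api_key: str
    mongodb_url: str = "mongodb://localhost:27017"
    mongodb_db_name: str = "rag_system"
    qdrant_url: str = "http://localhost:6333"
    max_workers: int = 4
    chunk_size: int = 512
    chunk_overlap: int = 50
    top_k: int = 5
    rag_threshold: float = 0.75
    temperature: float = 0.7
    n_last_message: int = 5

settings = Settings()  # defaults above are overridden by the environment / .env
```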
Contributions are welcome! Here's how you can help:
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Commit your changes: `git commit -m 'Add amazing feature'`
4. Push to the branch: `git push origin feature/amazing-feature`
5. Open a pull request
Please ensure your code follows our style guidelines and includes appropriate tests.
This project is licensed under the MIT License - see the LICENSE file for details.
Project Link: https://github.com/NnA301023/cygen
- Magazine: ITSec Buzz
- Engineering Space: ITSec Asia Tech
Built with ❤️ by RnD Team