A sophisticated AI agent system that combines traditional RAG (Retrieval-Augmented Generation) with knowledge graphs to provide comprehensive insights using vector similarity and knowledge graph traversal.
- Python 3.11 or higher
- PostgreSQL with pgvector extension
- Neo4j database
- LLM Provider API key (OpenAI, Anthropic, Gemini, etc.)
```bash
# Clone and enter the project
cd agentic-rag-knowledge-graph

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

**PostgreSQL with pgvector:**
```bash
# Execute the schema (adjust embedding dimensions if needed)
psql -d your_database -f sql/schema.sql
```

**Neo4j:**
- Install Neo4j Desktop or use a cloud instance
- Create a new database and note the connection details
```bash
# Copy example environment file
cp .env.example .env

# Edit .env with your actual configuration
nano .env
```

Key configuration:

- `DATABASE_URL`: PostgreSQL connection string
- `NEO4J_*`: Neo4j connection details
- `LLM_PROVIDER` and `LLM_API_KEY`: Your preferred LLM provider and its API key
- `EMBEDDING_MODEL`: Embedding model (ensure dimensions match the SQL schema)
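As an illustration, a minimal `.env` might look like the following. All values are placeholders, and the exact `NEO4J_*` variable names shown here are assumptions; treat `.env.example` as the authoritative list:

```ini
DATABASE_URL=postgresql://user:password@localhost:5432/your_database
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password
LLM_PROVIDER=openai
LLM_API_KEY=sk-...
LLM_CHOICE=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
```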
```bash
# Add documents to the documents/ folder
mkdir -p documents
# Copy your markdown documents here

# Run ingestion (this may take a while for knowledge graph extraction)
python -m ingestion.ingest

# Options:
python -m ingestion.ingest --clean        # Clean the database first
python -m ingestion.ingest --no-semantic  # Skip semantic chunking
python -m ingestion.ingest --no-entities  # Skip entity extraction
```

```bash
# Start the FastAPI server
python -m agent.api
# Server will run on http://localhost:8058
```

```bash
# In a new terminal, start the interactive CLI
python cli.py
# The CLI will connect to your API server automatically
```

```
You: What are Microsoft's main AI initiatives?

🤖 Assistant:
Microsoft has several major AI initiatives including Azure OpenAI Service,
Copilot integration across Office 365, and significant investments in AI
research through Microsoft Research...

🛠 Tools Used:
1. vector_search (query='Microsoft AI initiatives', limit=10)
   └ Found 8 relevant chunks about Microsoft's AI research
```
```bash
# Health check
curl http://localhost:8058/health

# Chat (non-streaming)
curl -X POST "http://localhost:8058/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "How are Microsoft and OpenAI connected?"}'

# Streaming chat
curl -X POST "http://localhost:8058/chat/stream" \
  -H "Content-Type: application/json" \
  -d '{"message": "Compare AI strategies of tech giants"}'
```
- **Agent System** (`agent/`)
  - `agent.py`: Main Pydantic AI agent with tool orchestration
  - `providers.py`: Multi-provider LLM abstraction (OpenAI, Anthropic, Gemini, Ollama)
  - `models.py`: Pydantic models for API and data structures
  - `prompts.py`: System prompts and templates
  - `api.py`: FastAPI web server with streaming endpoints
- **Ingestion Pipeline** (`ingestion/`)
  - `ingest.py`: Main ingestion orchestrator
  - `chunker.py`: Semantic document chunking using LLM analysis
  - `embedder.py`: Embedding generation with multiple provider support
- **Database Layer**
  - PostgreSQL with pgvector for vector similarity search
  - Neo4j with Graphiti for temporal knowledge graphs
  - Optimized schemas and indexes for performance
The agent intelligently chooses between three search strategies:
- Vector Search: Semantic similarity across document chunks
- Graph Search: Entity relationships and knowledge traversal
- Hybrid Search: Combined approach for comprehensive analysis
- "What is X?" → Vector search for content understanding
- "How are X and Y connected?" → Graph search for relationships
- "Compare X and Y" → Hybrid search for comprehensive analysis
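The mapping above can be sketched as a keyword heuristic. This is only an illustrative stand-in: in the actual system the agent's LLM decides which search tool to call, and the function below is hypothetical:

```python
def choose_strategy(query: str) -> str:
    """Toy heuristic mirroring the query patterns above (hypothetical;
    the real agent selects tools via the LLM, not keyword matching)."""
    q = query.lower()
    if "compare" in q:
        return "hybrid"   # "Compare X and Y" -> combined analysis
    if "connected" in q or "relationship" in q:
        return "graph"    # "How are X and Y connected?" -> graph traversal
    return "vector"       # default: semantic similarity search

print(choose_strategy("How are Microsoft and OpenAI connected?"))  # graph
```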
Supported embedding providers and their dimensions:
```bash
# OpenAI (1536 dimensions)
EMBEDDING_MODEL=text-embedding-3-small

# Ollama (768 dimensions - update SQL schema!)
EMBEDDING_MODEL=nomic-embed-text
```

**Important:** Ensure your SQL schema dimensions match your embedding model.
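One way to catch a mismatch before ingesting is to look the configured model up in a small table. A sketch; the table only covers the two models listed above, and `expected_dimension` is a hypothetical helper, not part of the codebase:

```python
# Embedding dimensions for the models mentioned in this README.
MODEL_DIMENSIONS = {
    "text-embedding-3-small": 1536,  # OpenAI
    "nomic-embed-text": 768,         # Ollama
}

def expected_dimension(model: str) -> int:
    """Return the vector dimension the SQL schema must use for `model`."""
    try:
        return MODEL_DIMENSIONS[model]
    except KeyError:
        raise ValueError(f"Unknown embedding model: {model!r}; add it to MODEL_DIMENSIONS")

print(expected_dimension("nomic-embed-text"))  # 768
```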
```bash
# OpenAI
LLM_PROVIDER=openai
LLM_CHOICE=gpt-4o-mini

# Anthropic Claude
LLM_PROVIDER=anthropic
LLM_CHOICE=claude-3-5-sonnet-20241022

# Local Ollama
LLM_PROVIDER=ollama
LLM_CHOICE=llama3.2:3b

# Google Gemini
LLM_PROVIDER=gemini
LLM_CHOICE=gemini-1.5-flash

# OpenRouter (access to multiple models)
LLM_PROVIDER=openrouter
LLM_CHOICE=anthropic/claude-3.5-sonnet
```

```bash
# Chunking parameters
python -m ingestion.ingest --chunk-size 800 --overlap 150

# Embedding batch size
# Modify EmbeddingGenerator.generate_embeddings_batch(batch_size=20)

# Database connection pooling
DB_POOL_SIZE=15
DB_MAX_OVERFLOW=30
```

```bash
# API health
curl http://localhost:8058/health

# Database connectivity
curl http://localhost:8058/stats
```

CLI commands:

- `health` - Check API server status
- `stats` - View system statistics
- `clear` - Clear conversation session
- `help` - Show available commands
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=agent --cov=ingestion --cov-report=html

# Test specific components
pytest tests/agent/
pytest tests/ingestion/
```

**Database Connection Errors:**
```bash
# Test PostgreSQL connection
psql -d "$DATABASE_URL" -c "SELECT 1;"

# Verify pgvector extension
psql -d "$DATABASE_URL" -c "SELECT * FROM pg_extension WHERE extname = 'vector';"
```

**Neo4j Connection Issues:**
```bash
# Test Neo4j connectivity (cypher-shell ships with Neo4j and with the
# neo4j Docker image; adjust credentials to match your setup)
cypher-shell -a bolt://localhost:7687 -u neo4j -p your_password "RETURN 1;"
```

**No Search Results:**
```bash
# Verify documents were ingested
psql -d "$DATABASE_URL" -c "SELECT COUNT(*) FROM documents;"
psql -d "$DATABASE_URL" -c "SELECT COUNT(*) FROM chunks WHERE embedding IS NOT NULL;"
```

**Embedding Dimension Mismatch:**
```sql
-- Check the current schema dimension: run \d+ chunks in psql and
-- look for embedding VECTOR(dimension)

-- Update the schema if needed (the default is 1536; this example
-- switches to 768 for nomic-embed-text)
ALTER TABLE chunks ALTER COLUMN embedding TYPE VECTOR(768);
ALTER TABLE relationships ALTER COLUMN embedding TYPE VECTOR(768);
```

**Slow Ingestion:**
- Use `--no-semantic` for faster chunking
- Use `--no-entities` to skip knowledge graph extraction
- Reduce chunk size or batch size
**Slow Queries:**

- Check database indexes: `\di` in psql
- Monitor query performance: `EXPLAIN ANALYZE SELECT ...`
- Consider increasing the database connection pool size
```
agentic-rag-knowledge-graph/
├── agent/              # AI agent and API
│   ├── agent.py        # Main Pydantic AI agent
│   ├── api.py          # FastAPI application
│   ├── models.py       # Data models
│   ├── providers.py    # LLM provider abstraction
│   └── prompts.py      # System prompts
├── ingestion/          # Document processing
│   ├── ingest.py       # Main ingestion pipeline
│   ├── chunker.py      # Semantic chunking
│   └── embedder.py     # Embedding generation
├── sql/                # Database schema
│   └── schema.sql      # PostgreSQL schema with pgvector
├── tests/              # Test suites
├── documents/          # Your documents (create this)
├── cli.py              # Interactive CLI
├── requirements.txt    # Dependencies
└── .env.example        # Configuration template
```
- Fork the repository
- Create a feature branch: `git checkout -b feature-name`
- Make your changes with tests
- Run the test suite: `pytest`
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Pydantic AI for the agent framework
- Graphiti for temporal knowledge graphs
- pgvector for vector similarity search
- FastAPI for the web framework
- Rich for beautiful CLI output
Built with ❤️ for the AI community