49 changes: 47 additions & 2 deletions README.md
@@ -63,6 +63,7 @@
- [X] [2024.11.11]🎯📢LightRAG now supports [deleting entities by their names](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#delete).
- [X] [2024.11.09]🎯📢Introducing the [LightRAG Gui](https://lightrag-gui.streamlit.app), which allows you to insert, query, visualize, and download LightRAG knowledge.
- [X] [2024.11.04]🎯📢You can now [use Neo4J for Storage](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-neo4j-for-storage).
- [X] [2024.11.04]🎯📢You can now [use FalkorDB for Storage](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-falkordb-for-storage).
- [X] [2024.10.29]🎯📢LightRAG now supports multiple file types, including PDF, DOC, PPT, and CSV via `textract`.
- [X] [2024.10.20]🎯📢We've added a new feature to LightRAG: Graph Visualization.
- [X] [2024.10.18]🎯📢We've added a link to a [LightRAG Introduction Video](https://youtu.be/oageL-1I0GE). Thanks to the author!
@@ -264,7 +265,7 @@ A full list of LightRAG init parameters:
| **workspace** | str | Workspace name for data isolation between different LightRAG Instances | |
| **kv_storage** | `str` | Storage type for documents and text chunks. Supported types: `JsonKVStorage`,`PGKVStorage`,`RedisKVStorage`,`MongoKVStorage` | `JsonKVStorage` |
| **vector_storage** | `str` | Storage type for embedding vectors. Supported types: `NanoVectorDBStorage`,`PGVectorStorage`,`MilvusVectorDBStorage`,`ChromaVectorDBStorage`,`FaissVectorDBStorage`,`MongoVectorDBStorage`,`QdrantVectorDBStorage` | `NanoVectorDBStorage` |
| **graph_storage** | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`,`Neo4JStorage`,`PGGraphStorage`,`AGEStorage` | `NetworkXStorage` |
| **graph_storage** | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`,`Neo4JStorage`,`FalkorDBStorage`,`PGGraphStorage`,`AGEStorage` | `NetworkXStorage` |
| **doc_status_storage** | `str` | Storage type for documents process status. Supported types: `JsonDocStatusStorage`,`PGDocStatusStorage`,`MongoDocStatusStorage` | `JsonDocStatusStorage` |
| **chunk_token_size** | `int` | Maximum token size per chunk when splitting documents | `1200` |
| **chunk_overlap_token_size** | `int` | Overlap token size between two chunks when splitting documents | `100` |
@@ -819,6 +820,49 @@ see test_neo4j.py for a working example.
</details>

<details>

<summary> <b>Using FalkorDB for Storage</b> </summary>

* FalkorDB is a high-performance graph database that runs as a Redis module and supports the Cypher query language
* Running FalkorDB in Docker is recommended for seamless local testing
* See: https://hub.docker.com/r/falkordb/falkordb

```python
# Set these in your shell (or in .env) before running:
#   export FALKORDB_HOST="localhost"
#   export FALKORDB_PORT="6379"
#   export FALKORDB_PASSWORD="password"  # optional
#   export FALKORDB_USERNAME="username"  # optional
#   export FALKORDB_GRAPH_NAME="lightrag_graph"  # optional, defaults to namespace

from lightrag import LightRAG
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.llm.openai import gpt_4o_mini_complete
from lightrag.utils import setup_logger

WORKING_DIR = "./rag_storage"  # example path (assumption); any writable directory works

# Set up the logger for LightRAG
setup_logger("lightrag", level="INFO")

# LightRAG uses NetworkX as its graph store by default.
# Override it by specifying graph_storage="FalkorDBStorage".
async def initialize_rag():
    rag = LightRAG(
        working_dir=WORKING_DIR,
        llm_model_func=gpt_4o_mini_complete,  # Use the gpt_4o_mini_complete LLM model
        graph_storage="FalkorDBStorage",  # <----------- override the KG default
    )

    # Initialize database connections
    await rag.initialize_storages()
    # Initialize pipeline status for document processing
    await initialize_pipeline_status()

    return rag
```

See `examples/falkordb_example.py` for a working example.
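
The connection variables listed above can be collected into a single settings object before the storage is created. The following is a minimal sketch and not part of LightRAG: `FalkorDBSettings` and `load_falkordb_settings` are hypothetical names, and the fallback values (`localhost`, `6379`) are assumptions taken from the example values shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class FalkorDBSettings:
    """Hypothetical container for the FALKORDB_* variables listed above."""

    host: str
    port: int
    username: Optional[str]
    password: Optional[str]
    graph_name: Optional[str]


def load_falkordb_settings(env: Mapping[str, str] = os.environ) -> FalkorDBSettings:
    """Read FALKORDB_* variables; optional ones become None when unset or empty."""
    return FalkorDBSettings(
        host=env.get("FALKORDB_HOST", "localhost"),  # assumed fallback
        port=int(env.get("FALKORDB_PORT", "6379")),  # assumed fallback
        username=env.get("FALKORDB_USERNAME") or None,
        password=env.get("FALKORDB_PASSWORD") or None,
        graph_name=env.get("FALKORDB_GRAPH_NAME") or None,
    )
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise without touching the real environment.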

</details>

<details>

<summary> <b>Using PostgreSQL Storage</b> </summary>

For production level scenarios you will most likely want to leverage an enterprise solution. PostgreSQL can provide a one-stop solution for you as KV store, VectorDB (pgvector) and GraphDB (apache AGE). PostgreSQL version 16.6 or higher is supported.
@@ -934,8 +978,9 @@ The `workspace` parameter ensures data isolation between different LightRAG inst
- **For databases that store data in collections, it's done by adding a workspace prefix to the collection name:** `RedisKVStorage`, `RedisDocStatusStorage`, `MilvusVectorDBStorage`, `QdrantVectorDBStorage`, `MongoKVStorage`, `MongoDocStatusStorage`, `MongoVectorDBStorage`, `MongoGraphStorage`, `PGGraphStorage`.
- **For relational databases, data isolation is achieved by adding a `workspace` field to the tables for logical data separation:** `PGKVStorage`, `PGVectorStorage`, `PGDocStatusStorage`.
- **For the Neo4j graph database, logical data isolation is achieved through labels:** `Neo4JStorage`
- **For the FalkorDB graph database, logical data isolation is achieved through labels:** `FalkorDBStorage`
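
Label-based isolation means that every node written for a workspace carries the workspace name as a label, so a query scoped to that label never matches nodes from another workspace. A minimal sketch of how such a scoped Cypher query could be built is shown below. `workspace_match_query` is a hypothetical helper, and the queries that `Neo4JStorage` and `FalkorDBStorage` actually issue may differ.

```python
def workspace_match_query(workspace: str) -> str:
    """Build a Cypher query that matches only nodes labelled with `workspace`."""
    # The label is quoted with backticks, so a backtick inside the name
    # would break out of the quoting; reject it rather than guess at escaping.
    if "`" in workspace:
        raise ValueError("workspace name must not contain a backtick")
    # $entity_id is a query parameter, supplied separately at execution time.
    return f"MATCH (n:`{workspace}` {{entity_id: $entity_id}}) RETURN n"
```

With the default workspace `base`, the helper returns a query whose `MATCH` clause is restricted to nodes labelled `base`.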

To maintain compatibility with legacy data, the default workspace for PostgreSQL non-graph storage is `default` and, for PostgreSQL AGE graph storage is null, for Neo4j graph storage is `base` when no workspace is configured. For all external storages, the system provides dedicated workspace environment variables to override the common `WORKSPACE` environment variable configuration. These storage-specific workspace environment variables are: `REDIS_WORKSPACE`, `MILVUS_WORKSPACE`, `QDRANT_WORKSPACE`, `MONGODB_WORKSPACE`, `POSTGRES_WORKSPACE`, `NEO4J_WORKSPACE`.
To maintain compatibility with legacy data, the following defaults apply when no workspace is configured: `default` for PostgreSQL non-graph storage, null for PostgreSQL AGE graph storage, and `base` for both Neo4j and FalkorDB graph storage. For all external storages, the system provides dedicated workspace environment variables that override the common `WORKSPACE` environment variable. These storage-specific workspace environment variables are: `REDIS_WORKSPACE`, `MILVUS_WORKSPACE`, `QDRANT_WORKSPACE`, `MONGODB_WORKSPACE`, `POSTGRES_WORKSPACE`, `NEO4J_WORKSPACE`, `FALKORDB_WORKSPACE`.
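
The precedence described here (storage-specific variable first, then the common `WORKSPACE`, then the legacy default) can be sketched as follows. `resolve_workspace` is a hypothetical helper written for illustration. Only the variable names and the `base` default come from the text above, and the way LightRAG itself performs the lookup is an assumption.

```python
import os
from typing import Mapping


def resolve_workspace(
    specific_var: str,
    legacy_default: str,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Pick a workspace: storage-specific variable, then WORKSPACE, then the default."""
    for name in (specific_var, "WORKSPACE"):
        value = env.get(name, "").strip()
        if value:  # unset or blank values fall through to the next candidate
            return value
    return legacy_default
```

For FalkorDB this would be called as `resolve_workspace("FALKORDB_WORKSPACE", "base")`, which yields `base` when neither variable is set.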

## Edit Entities and Relations

7 changes: 7 additions & 0 deletions env.example
@@ -261,6 +261,7 @@ OLLAMA_EMBEDDING_NUM_CTX=8192
# LIGHTRAG_DOC_STATUS_STORAGE=JsonDocStatusStorage
# LIGHTRAG_GRAPH_STORAGE=NetworkXStorage
# LIGHTRAG_VECTOR_STORAGE=NanoVectorDBStorage
# LIGHTRAG_GRAPH_STORAGE=FalkorDBStorage

### Redis Storage (Recommended for production deployment)
# LIGHTRAG_KV_STORAGE=RedisKVStorage
@@ -324,6 +325,12 @@ NEO4J_LIVENESS_CHECK_TIMEOUT=30
NEO4J_KEEP_ALIVE=true
# NEO4J_WORKSPACE=forced_workspace_name

# FalkorDB Configuration
FALKORDB_URI=falkordb://xxxxxxxx.falkordb.cloud
FALKORDB_GRAPH_NAME=lightrag_graph
# FALKORDB_HOST=localhost
# FALKORDB_PORT=6379

### MongoDB Configuration
MONGO_URI=mongodb://root:root@localhost:27017/
#MONGO_URI=mongodb+srv://xxxx
130 changes: 130 additions & 0 deletions examples/falkordb_example.py
@@ -0,0 +1,130 @@
#!/usr/bin/env python
"""
Example of using LightRAG with FalkorDB - Updated Version
=========================================================
Fixed imports and modern LightRAG syntax.

Prerequisites:
1. FalkorDB running: docker run -p 6379:6379 falkordb/falkordb:latest
2. OpenAI API key in .env file
3. Required packages: pip install lightrag-hku falkordb openai python-dotenv
"""

import asyncio
import os
from dotenv import load_dotenv
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed
from lightrag.kg.shared_storage import initialize_pipeline_status

# Load environment variables
load_dotenv()


async def main():
    """Example usage of LightRAG with FalkorDB"""

    # Set up environment for FalkorDB (values already set in the environment win)
    os.environ.setdefault("FALKORDB_HOST", "localhost")
    os.environ.setdefault("FALKORDB_PORT", "6379")
    os.environ.setdefault("FALKORDB_GRAPH_NAME", "lightrag_example")
    os.environ.setdefault("FALKORDB_WORKSPACE", "example_workspace")

    # Initialize LightRAG with FalkorDB
    rag = LightRAG(
        working_dir="./falkordb_example",
        llm_model_func=gpt_4o_mini_complete,  # LLM completion function
        embedding_func=openai_embed,  # Embedding function
        graph_storage="FalkorDBStorage",  # Specify FalkorDB backend
    )

    # Initialize storage connections
    await rag.initialize_storages()
    await initialize_pipeline_status()

    # Example text to process
    sample_text = """
    FalkorDB is a high-performance graph database built on Redis.
    It supports OpenCypher queries and provides excellent performance for graph operations.
    LightRAG can now use FalkorDB as its graph storage backend, enabling scalable
    knowledge graph operations with Redis-based persistence. This integration
    allows developers to leverage both the speed of Redis and the power of
    graph databases for advanced AI applications.
    """

    print("Inserting text into LightRAG with FalkorDB backend...")
    await rag.ainsert(sample_text)

    # Check what was created
    storage = rag.chunk_entity_relation_graph
    nodes = await storage.get_all_nodes()
    edges = await storage.get_all_edges()
    print(f"Knowledge graph created: {len(nodes)} nodes, {len(edges)} edges")

    print("\nQuerying the knowledge graph...")

    # Ask several questions, all in hybrid mode
    questions = [
        "What is FalkorDB and how does it relate to LightRAG?",
        "What are the benefits of using Redis with graph databases?",
        "How does FalkorDB support OpenCypher queries?",
    ]

    for i, question in enumerate(questions, 1):
        print(f"\n--- Question {i} ---")
        print(f"Q: {question}")

        try:
            response = await rag.aquery(
                question, param=QueryParam(mode="hybrid", top_k=3)
            )
            print(f"A: {response}")
        except Exception as e:
            print(f"Error querying: {e}")

    # Show some graph statistics
    print("\n--- Graph Statistics ---")
    try:
        all_labels = await storage.get_all_labels()
        print(f"Unique entities: {len(all_labels)}")

        if nodes:
            print("Sample entities:")
            for i, node in enumerate(nodes[:3]):
                entity_id = node.get("entity_id", "Unknown")
                entity_type = node.get("entity_type", "Unknown")
                print(f"  {i+1}. {entity_id} ({entity_type})")

        if edges:
            print("Sample relationships:")
            for i, edge in enumerate(edges[:2]):
                source = edge.get("source", "Unknown")
                target = edge.get("target", "Unknown")
                print(f"  {i+1}. {source} → {target}")

    except Exception as e:
        print(f"Error getting statistics: {e}")


if __name__ == "__main__":
    print("LightRAG with FalkorDB Example")
    print("==============================")
    print("Note: This requires FalkorDB running on localhost:6379")
    print(
        "You can start FalkorDB with: docker run -p 6379:6379 falkordb/falkordb:latest"
    )
    print()

    # Check OpenAI API key
    if not os.getenv("OPENAI_API_KEY"):
        print("❌ Please set your OpenAI API key in .env file!")
        print("   Create a .env file with: OPENAI_API_KEY=your-actual-api-key")
        raise SystemExit(1)

    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        print("\n👋 Example interrupted. Goodbye!")
    except Exception as e:
        print(f"\n💥 Unexpected error: {e}")
        print("🔧 Make sure FalkorDB is running and your .env file is configured")