GraphRAG Chatbot

A chatbot built on GraphRAG (graph-based Retrieval-Augmented Generation) with a Streamlit UI. It follows Microsoft Research's GraphRAG approach, which is designed to give better context awareness and reasoning than traditional RAG systems.

🤔 Why GraphRAG?

Traditional RAG systems, while useful, face several limitations when dealing with complex information retrieval and reasoning tasks. GraphRAG addresses these limitations through a structured, hierarchical approach:

Limitations of Traditional RAG:

Limited Connection Synthesis: Basic RAG struggles to "connect the dots" between related pieces of information that aren't explicitly linked.
Poor Holistic Understanding: Traditional approaches have difficulty comprehending and summarizing broad semantic concepts across large document collections.
Context Loss: Simple vector similarity search can miss important contextual relationships.

GraphRAG Advantages:

Knowledge Graph Structure:
- Creates an LLM-generated knowledge graph from your input corpus.
- Captures complex relationships between entities.
- Enables traversal of related concepts through shared attributes.
Hierarchical Understanding:
- Uses the Leiden algorithm for hierarchical clustering.
- Generates community-level summaries.
- Provides both granular and holistic views of information.
Advanced Query Processing:
- Global Search: For reasoning about holistic questions across the entire corpus.
- Local Search: For a detailed exploration of specific entities and their relationships.
Enhanced Context Window:
- Uses community summaries for better context.
- Maintains relationship awareness during queries.
- Improves synthesis of information from multiple sources.
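
To make the graph traversal idea concrete, here is a minimal sketch in plain Python. The triples, the function names build_graph and local_neighborhood, and the hop limit are all illustrative assumptions. GraphRAG itself builds its graph from LLM-extracted entities and relationships and does not expose these functions.

```python
from collections import defaultdict, deque

# Hypothetical (source, relation, target) triples; in GraphRAG these come
# from LLM-based extraction, not from hand-written data.
TRIPLES = [
    ("Ada Lovelace", "collaborated_with", "Charles Babbage"),
    ("Charles Babbage", "designed", "Analytical Engine"),
    ("Ada Lovelace", "wrote_notes_on", "Analytical Engine"),
    ("Alan Turing", "proposed", "Turing Machine"),
]


def build_graph(triples):
    """Build an undirected adjacency map: entity -> {(relation, neighbor)}."""
    graph = defaultdict(set)
    for source, relation, target in triples:
        graph[source].add((relation, target))
        graph[target].add((relation, source))
    return graph


def local_neighborhood(graph, entity, max_hops=1):
    """Return entities reachable from `entity` within `max_hops` edges (BFS)."""
    seen = {entity}
    queue = deque([(entity, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for _relation, neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    seen.discard(entity)
    return seen


graph = build_graph(TRIPLES)
print(sorted(local_neighborhood(graph, "Ada Lovelace")))
# prints ['Analytical Engine', 'Charles Babbage']
```

A plain vector search treats each chunk independently. Here the related entities are found by following edges, which is the intuition behind entity-centered (local) retrieval.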

🚀 Quick Start

Prerequisites

Python 3.8 or higher
Git
OpenAI API key

Installation

Clone the repository

git clone git@github.com:Saifullah3711/graph_rag.git
cd graph_rag

Install required dependencies

pip install -r requirements.txt

Set up secrets.toml

Create a .streamlit folder in the root directory.
Inside .streamlit, create a secrets.toml file.
Add your OpenAI API key to the secrets.toml file:

[general]
GRAPHRAG_API_KEY = "your_openai_api_key"

This setup is especially useful for deploying the app to Streamlit Cloud.
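
As a rough sketch of how an app might read this key: Streamlit exposes secrets.toml through st.secrets, which behaves like a nested mapping. The helper name get_api_key and the environment-variable fallback below are assumptions for illustration, not necessarily what st_chatbot.py does.

```python
import os
from typing import Mapping, Optional


def get_api_key(secrets: Mapping, env: Optional[Mapping] = None) -> str:
    """Return GRAPHRAG_API_KEY from the [general] table, else from the environment."""
    env = os.environ if env is None else env
    general = secrets.get("general", {})
    # Assumption: fall back to an environment variable of the same name.
    key = general.get("GRAPHRAG_API_KEY") or env.get("GRAPHRAG_API_KEY")
    if not key:
        raise RuntimeError("GRAPHRAG_API_KEY is not set in secrets.toml or the environment")
    return key


# In a Streamlit app this would be called as get_api_key(st.secrets); a plain
# dict stands in here so the sketch runs without Streamlit installed.
example_secrets = {"general": {"GRAPHRAG_API_KEY": "your_openai_api_key"}}
print(get_api_key(example_secrets))
```

Raising an error when the key is missing surfaces a misconfigured deployment at startup rather than at the first API call.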

Streamlit Cloud Deployment

To deploy the app on Streamlit Cloud for demo purposes:

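Ensure your secrets.toml file is correctly set up as described above.
Push your code to a GitHub repository, keeping secrets.toml itself out of version control.
Link your repository to Streamlit Cloud, paste the contents of secrets.toml into the app's secrets settings, and deploy the app.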

💻 Usage

Running the Chatbot

To run the chatbot locally:

streamlit run st_chatbot.py

This will launch the Streamlit interface in your default web browser.

Using Your Own Data

To use your own data with the chatbot, follow these steps:

Prepare Your Data:
- Create an input folder in the root directory.
- Place your text documents (.txt files) in this folder.
Initialize GraphRAG:

python -m graphrag.index --init --root .

Index Your Documents:

python -m graphrag.index --root .

Add More Documents (Optional):
- Add new .txt files to the input folder.
- Re-run the indexing process:

python -m graphrag.index --root .

Launch the Chatbot:

streamlit run st_chatbot.py
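
Before indexing, it can help to confirm that the input folder actually contains .txt files. The following is an optional helper sketch, not part of the repository. The names find_input_documents and indexing_command are hypothetical, and the command it prints is the same one shown in the steps above.

```python
import sys
from pathlib import Path


def find_input_documents(root: Path) -> list:
    """Return the .txt files directly inside <root>/input, sorted by name."""
    input_dir = root / "input"
    if not input_dir.is_dir():
        return []
    return sorted(p for p in input_dir.iterdir() if p.is_file() and p.suffix == ".txt")


def indexing_command(root: Path) -> list:
    """Build the indexing command from the steps above as an argument list."""
    return [sys.executable, "-m", "graphrag.index", "--root", str(root)]


if __name__ == "__main__":
    root = Path(".")
    documents = find_input_documents(root)
    if not documents:
        print("No .txt files found in ./input; add documents before indexing.")
    else:
        print(f"Found {len(documents)} document(s).")
        print("Run:", " ".join(indexing_command(root)))
```

The sketch only prints the command instead of running it, since indexing calls the OpenAI API and may incur cost.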

🛠️ How It Works

GraphRAG works in two phases:

Indexing Phase:
- Document Slicing: Breaks down documents into manageable TextUnits.
- Entity Extraction: Identifies key entities, relationships, and claims.
- Clustering: Groups related information using the Leiden algorithm.
- Summary Generation: Creates hierarchical summaries of communities.
Query Phase:
- Global Search: For corpus-wide understanding.
- Local Search: For entity-specific exploration.
- Context Enhancement: Uses community structures for better responses.
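
The indexing steps above can be illustrated with a toy, LLM-free sketch. Everything here is a simplification under stated assumptions: word windows stand in for TextUnits, a fixed list of names stands in for LLM entity extraction, and connected components stand in for Leiden clustering (a deliberately simpler technique). None of these function names come from the graphrag package.

```python
from collections import defaultdict
from itertools import combinations


def slice_into_text_units(text, size=12, overlap=3):
    """Split text into overlapping word windows (a stand-in for TextUnits)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def extract_entities(unit, known_entities):
    """Toy extraction: report which known entity names occur in the unit."""
    return {name for name in known_entities if name in unit}


def cluster_entities(units, known_entities):
    """Link entities that co-occur in a unit, then return connected components."""
    neighbors = defaultdict(set)
    for unit in units:
        found = extract_entities(unit, known_entities)
        for name in found:
            neighbors[name]  # register entities that have no co-occurrences
        for a, b in combinations(sorted(found), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    communities, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        communities.append(sorted(component))
    return communities


# Hypothetical documents and entity names, for illustration only.
documents = [
    "Alice manages the Berlin office. Alice reports to Bob.",
    "Carol maintains the Lisbon warehouse with Dave.",
]
units = [unit for doc in documents for unit in slice_into_text_units(doc)]
communities = cluster_entities(units, ["Alice", "Bob", "Carol", "Dave"])
print(communities)
# prints [['Alice', 'Bob'], ['Carol', 'Dave']]
```

In the real pipeline, each community would then be summarized by an LLM, and those summaries are what global search draws on.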

🎯 Features

Interactive Streamlit UI.
Graph-based retrieval.
Hierarchical information processing.
Community-aware responses.
Source tracking.
Context-rich answers.
Entity relationship mapping.

⚠️ Limitations

Currently supports only .txt files.
Requires an OpenAI API key; API usage can be costly for large documents.
Initial indexing can take a long time for large documents.
Indexing is resource-intensive for very large datasets.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📣 Support

If you encounter any issues or have questions, please open an issue in the GitHub repository.
