# URL RAG Chatbot

A Retrieval-Augmented Generation (RAG) chatbot that processes URLs and allows users to ask questions about their content.
## Features

- Process multiple URLs to extract and clean content
- Build an in-memory vector store for efficient similarity search
- Interactive chat interface for asking questions
- Context-aware responses using a RAG architecture
- A "No relevant information" response when no appropriate context is found
## Prerequisites

- Miniconda/Anaconda installed locally to set up the environment (see the Miniconda documentation for installation instructions)
- Python 3.10+
- Ollama installed locally with the `deepseek-r1:1.5b` model (see the Ollama documentation for installation instructions)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/rohan25k/url_rag_chatbot.git
   cd url_rag_chatbot
   ```

2. Create a virtual environment:

   ```bash
   conda create -n my_env python=3.10
   conda activate my_env
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Make sure Ollama is running with the required model:

   ```bash
   ollama pull deepseek-r1:1.5b
   ollama serve  # If not already running as a service
   ```
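Optionally, you can confirm from Python that Ollama is reachable and the model is pulled. This check uses Ollama's documented `/api/tags` REST endpoint on the default port 11434; it is a convenience sketch, not part of the repository:

```python
# Sanity check: is Ollama up, and is deepseek-r1:1.5b pulled?
# Assumes Ollama's default port (11434); adjust the URL if yours differs.
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
names = [m["name"] for m in resp.json().get("models", [])]
if any(name.startswith("deepseek-r1:1.5b") for name in names):
    print("deepseek-r1:1.5b is available")
else:
    print("Model missing -- run: ollama pull deepseek-r1:1.5b")
```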
## Usage

1. Run the Streamlit app:

   ```bash
   streamlit run app.py
   ```

2. Open your web browser to the URL shown in the terminal (typically http://localhost:8501)
3. Enter the URLs to process in the text area, comma-separated (see the parsing sketch below)
4. Click "Process URLs" and wait for the processing to complete
5. Ask questions about the content in the chat interface
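For reference, comma-separated input of the kind the text area expects can be split along these lines; `parse_urls` is a hypothetical helper for illustration, not necessarily the app's actual code:

```python
# Hypothetical helper showing how comma-separated URL input can be parsed.
def parse_urls(raw: str) -> list[str]:
    """Split comma-separated input into a clean list of URLs."""
    return [u.strip() for u in raw.split(",") if u.strip()]

print(parse_urls("https://example.com/a, https://example.com/b"))
# ['https://example.com/a', 'https://example.com/b']
```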
## Project Structure

- `app.py`: Main Streamlit application
- `src/data_processing.py`: Functions for loading and processing URL content
- `src/rag_model.py`: RAG model implementation
- `src/utils.py`: Utility functions
## How It Works

1. URL Processing:
   - The application fetches content from the specified URLs
   - Content is cleaned to remove unnecessary HTML elements
   - Text is split into manageable chunks
   - Chunks are embedded and stored in an in-memory vector store (see the sketch after this list)
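A condensed sketch of that ingestion pipeline is shown below. It assumes `requests`, `beautifulsoup4`, `sentence-transformers`, and `numpy`, with an `all-MiniLM-L6-v2` embedding model; the repository's actual libraries, cleaning rules, and chunking parameters may differ:

```python
# Minimal sketch of the ingestion pipeline described above.
import numpy as np
import requests
from bs4 import BeautifulSoup
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def fetch_and_clean(url: str) -> str:
    """Fetch a page and strip scripts, styles, and tags down to plain text."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

def build_store(urls: list[str]) -> tuple[np.ndarray, list[str]]:
    """Embed all chunks and keep them in memory as a (matrix, chunks) pair."""
    chunks = [c for u in urls for c in chunk(fetch_and_clean(u))]
    vectors = embedder.encode(chunks, normalize_embeddings=True)
    return np.asarray(vectors), chunks
```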
2. Query Processing:
   - User questions are embedded and compared against the stored chunks
   - The most relevant chunks are retrieved
   - The LLM generates an answer based on the retrieved context
   - If no relevant information is found, a specific message is shown instead (see the sketch below)
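And a matching sketch of the query path, reusing the `(vectors, chunks)` store built above and the official `ollama` Python client; the similarity threshold, prompt wording, and fallback message are illustrative placeholders rather than the app's exact values:

```python
# Sketch of retrieval + generation with a "no relevant information" fallback.
import numpy as np
import ollama
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # same assumed model as above

def answer(question: str, vectors: np.ndarray, chunks: list[str],
           top_k: int = 3, threshold: float = 0.3) -> str:
    q = embedder.encode([question], normalize_embeddings=True)[0]
    scores = vectors @ q  # cosine similarity, since vectors are normalized
    best = np.argsort(scores)[::-1][:top_k]
    if scores[best[0]] < threshold:  # illustrative cutoff, not the app's
        return "No relevant information found in the processed URLs."
    context = "\n\n".join(chunks[i] for i in best)
    response = ollama.chat(
        model="deepseek-r1:1.5b",
        messages=[{"role": "user", "content":
                   f"Answer using only this context:\n{context}\n\nQuestion: {question}"}],
    )
    return response["message"]["content"]
```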
## License

MIT