ReLaterBot: Chatbot as Curator for Collective Reflection

Developers: Sabrina Zaki Hansen and Amos Blanton

Overview

This project implements a Retrieval-Augmented Generation (RAG) pipeline built on a large language model to help users ask questions about and explore meeting transcripts, summaries, and project data.

The code is developed in collaboration with, and for, the EER project. The chatbot integrates:

- Document Retrieval: Retrieves data from meeting transcripts, summaries, and related documents.
- Conversation History: Draws on past chatbot interactions as additional context.
- LLM-Powered Summaries: Uses a Large Language Model (LLM) to generate summaries of transcripts.
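
The three parts above can be pictured as a retrieve-then-prompt loop. The following is a minimal sketch, not the project's implementation: it swaps in a toy bag-of-words similarity where the real pipeline uses Pinecone and HuggingFace embeddings, and every function name, document, and history entry is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query (toy stand-in for a vector store)."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc["text"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, retrieved, history):
    """Combine retrieved passages, past exchanges, and the question into one prompt."""
    context = "\n".join(f"[{doc['source']}] {doc['text']}" for doc in retrieved)
    past = "\n".join(f"Q: {q}\nA: {a}" for q, a in history) or "(none)"
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Earlier conversations:\n{past}\n\n"
        f"Question: {query}"
    )


# Made-up example data, not taken from the project.
documents = [
    {"source": "meeting_2024-01-10", "text": "We discussed the workshop budget and travel costs."},
    {"source": "meeting_2024-02-07", "text": "The team reviewed interview transcripts from the pilot study."},
    {"source": "project_notes", "text": "Action item: send the budget draft to all partners."},
]
history = [("Who attended the January meeting?", "The attendee list is in the January summary.")]

query = "What was decided about the budget?"
top = retrieve(query, documents)
prompt = build_prompt(query, top, history)
# In a real pipeline the prompt would now be sent to an LLM; that call is omitted here.
print(prompt)
```

Keeping the source label next to each retrieved passage is what lets an interface later show which documents an answer drew on.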

For a full code walkthrough, see `src/RAG_tutorial.ipynb`.

File Structure

Project.
├── .venv                           # Virtual environment directory
├── data                            # Directory for storing input data (transcripts PDFs)
├── src                             # Source code directory
│    ├── preprocessing
│    │    ├── reformatting_data.py  # Transcript reformatting scripts
│    │    └── data_chunking.py      # Data processing and chunking logic
│    ├── streamlit_rag_chatbot      # Chatbot pipeline and Streamlit app
│    │    ├── main.py               # Core chatbot pipeline
│    │    └── streamlit_app.py      # Streamlit app for the chatbot  
│    └── upserting_transcripts      # Scripts for upserting transcripts to database
│         ├── a2t.py                # Incomplete script for the pipeline; currently focuses on adding transcripts to Pinecone
│         └── streamlit_a2t.py      # Streamlit app interface for managing the pipeline
├── .env                            # Environment variables (API keys for HuggingFace and Pinecone)
├── .gitignore                      # Excluded files and directories
├── Dockerfile                      # Docker configuration for deploying the app
├── LICENSE.txt                     # License for the project
├── README.md                       # Readme file
└── requirements.txt                # Dependencies for the project

Modules

Core Components

| File | Summary |
| --- | --- |
| `src/streamlit_rag_chatbot/main.py` | Sets up the chatbot pipeline, integrating document retrieval with Pinecone and HuggingFace embeddings for querying and summarization. |
| `src/streamlit_rag_chatbot/streamlit_app.py` | Implements the Streamlit-based user interface for interacting with the chatbot, meeting summaries, and referenced data. |
| `src/preprocessing/reformatting_data.py` | Cleans raw transcript files and reformats them into a structured format (CSV) suitable for further processing. |
| `src/preprocessing/data_chunking.py` | Splits transcripts into manageable chunks, enriches them with metadata, and prepares them for storage in the vector store. |
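
To illustrate the chunking step, here is a minimal sketch of splitting a transcript into overlapping word-based chunks with metadata attached. It is an assumption about the general technique, not the logic in `data_chunking.py`: the function name, the `chunk_size` and `overlap` parameters, and the metadata fields are all hypothetical.

```python
def chunk_transcript(text, source, chunk_size=50, overlap=10):
    """Split text into overlapping word-based chunks, each tagged with metadata."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for index, start in enumerate(range(0, len(words), step)):
        piece = words[start:start + chunk_size]
        chunks.append({
            "id": f"{source}-{index}",
            "text": " ".join(piece),
            # Hypothetical metadata fields; a real store may need others.
            "metadata": {"source": source, "chunk_index": index, "start_word": start},
        })
        # Stop once the window has reached the end of the transcript.
        if start + chunk_size >= len(words):
            break
    return chunks


# Made-up transcript of twelve placeholder words.
example = " ".join(f"w{i}" for i in range(12))
chunks = chunk_transcript(example, "meeting_a", chunk_size=5, overlap=2)
for chunk in chunks:
    print(chunk["id"], "|", chunk["text"])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.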

Getting Started

Prerequisites

Python 3.11.9
API keys for:
- HuggingFace
- Pinecone

Installation

Clone the repository:

git clone https://github.com/sabszh/EER-chatbot-UI/

Navigate to the project directory:
```
cd EER-chatbot-UI
```

Set up a virtual environment (optional but recommended):

python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate

Install dependencies:
```
pip install -r requirements.txt
```
Configure environment variables: Create a .env file in the root directory and add your API keys:
```
HUGGINGFACE_API_KEY=your_huggingface_api_key
PINECONE_API_KEY=your_pinecone_api_key
```
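
Before launching the app, it can help to confirm that both keys are actually set. The sketch below is a standalone check using only the standard library. It is not part of the repository, its function names are hypothetical, and its parser handles only plain `KEY=value` lines.

```python
import os
from pathlib import Path

REQUIRED_KEYS = ("HUGGINGFACE_API_KEY", "PINECONE_API_KEY")


def read_env_file(path=".env"):
    """Parse simple KEY=value lines, skipping blank lines and comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def load_settings(path=".env"):
    """Merge the .env file with the process environment (the environment wins)."""
    return {**read_env_file(path), **os.environ}


missing = missing_keys(load_settings())
print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run it from the project root so that the relative `.env` path resolves to the file created above.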

Running the Application

Using the Streamlit App

To launch the Streamlit app:

streamlit run src/streamlit_rag_chatbot/streamlit_app.py

Using Docker

Build the Docker image:
```
docker build -t eer-chatbot .
```
Run the Docker container:
```
docker run -p 8501:8501 eer-chatbot
```

Features

Meeting Summary Fetching

Fetch concise summaries of past meetings, filtered by specific dates. Summaries highlight discussion points, action items, and speaker lists.
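
As a rough illustration of date filtering, the sketch below selects summaries within a date range. The record layout (`date`, `discussion_points`, `action_items`, `speakers`) and the sample entries are assumptions made for this example, not the app's actual data model.

```python
from datetime import date


def summaries_between(summaries, start, end):
    """Return summaries whose meeting date falls within [start, end], oldest first."""
    selected = [
        s for s in summaries
        if start <= date.fromisoformat(s["date"]) <= end
    ]
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return sorted(selected, key=lambda s: s["date"])


# Made-up records for illustration only.
summaries = [
    {"date": "2024-03-05", "discussion_points": ["pilot results"],
     "action_items": ["share slides"], "speakers": ["A", "B"]},
    {"date": "2024-01-10", "discussion_points": ["budget"],
     "action_items": ["draft budget"], "speakers": ["A", "C"]},
    {"date": "2024-02-07", "discussion_points": ["interview guide"],
     "action_items": ["recruit participants"], "speakers": ["B", "C"]},
]

selected = summaries_between(summaries, date(2024, 2, 1), date(2024, 3, 31))
for s in selected:
    print(s["date"], "|", ", ".join(s["discussion_points"]))
```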

Referenced Data Display

View data sources referenced by the chatbot in its answers, including meeting transcripts and related documents.

Conversation History Integration

Explore connections between your queries and those from other users, using past conversations for context-aware insights.

License

This project is licensed under the GNU General Public License v3.0 (GPLv3).

TL;DR

Anyone can copy, modify and distribute this software.
You have to include the license and copyright notice with each and every distribution.
You can use this software privately.
You can use this software for commercial purposes.
If you dare build your business solely from this code, you risk open-sourcing the whole code base.
If you modify it, you have to indicate changes made to the code.
Any modifications of this code base MUST be distributed with the same license, GPLv3.
This software is provided without warranty.
The software author or licensor cannot be held liable for any damages inflicted by the software.

Access the full license text in the LICENSE.txt file.

For more details on the terms of this license, please visit GNU Licenses.
