
Deep Thematic Analysis with Iterative LLM Support

License: GPL-3.0-only

📦 Official Builds

For convenience, you can download our latest pre-built binaries/installer here:

✨ What is this?

DeTAILS (Deep Thematic Analysis with Iterative LLM Support) is a sophisticated desktop application designed to assist qualitative researchers in performing reflexive thematic analysis (TA) on large text datasets, such as social media posts from platforms like Reddit. By leveraging local large language models (LLMs), DeTAILS enables researchers to efficiently analyze vast amounts of unstructured data while preserving the interpretive depth and researcher reflexivity inherent in traditional TA. It is built on a modern, privacy-preserving architecture, running entirely on your local machine to ensure data security and confidentiality.

With DeTAILS, researchers can:

  • Input research questions and background literature to create a contextual "memory snapshot."
  • Load and filter datasets from Reddit.
  • Collaboratively code data with AI assistance, refining codes through iterative feedback.
  • Review and cluster codes.
  • Generate and refine overarching themes with interactive AI support.
  • Export structured reports for further analysis or publication.

🤔 Why?

DeTAILS addresses the challenges of scaling traditional thematic analysis, offering a researcher-centric tool that balances automation with human control. Here’s why it stands out:

  • Scalability: Analyze large datasets that would be impractical to code manually.
  • Efficiency: Reduce the time and effort required for coding and theme development with LLM assistance.
  • Depth: Maintain the interpretive richness and reflexivity of qualitative analysis.
  • Privacy & Security: Keep sensitive data local, with all processing performed on your machine.
  • Flexibility: Customize the analysis process to align with specific research needs.
  • Transparency: Interrogate and refine AI suggestions through interactive feedback loops, ensuring trustworthiness.

🏗️ Architecture Overview

The application follows a multi-layered architecture designed for modularity and efficient communication:

  1. 💻 UI Layer (Frontend - React + Electron):

    • Provides the graphical user interface using React.
    • Runs within an Electron container, enabling desktop integration.
    • Communicates with the backend via HTTP REST APIs.
    • Uses IPC (Inter-Process Communication) to interact with Electron-specific features.
    • Receives real-time updates from the backend via WebSocket messages that Electron relays to React over IPC (a minimal sketch follows this list).
  2. ⚙️ Backend Layer (FastAPI - Data Modeling Server):

    • The central hub built with Python and FastAPI.
    • Exposes REST APIs for the frontend (React & Electron).
    • Manages a WebSocket endpoint for pushing real-time events to Electron.
    • Orchestrates interactions with various underlying services and data stores.
  3. 🛠️ Services & Data Stores:

    • Ollama: Runs open-source Large Language Models locally.
    • ChromaDB: Provides efficient vector storage and similarity search.
    • ripgrep: Enables fast text search across files.
    • Zstandard (zstd): Used for high-speed data compression/decompression.
    • SQLite: Embedded relational database for structured data storage.
  4. ☁️ External Service Connections:

    • Integrates with OpenAI and Gemini APIs (optional, via user-provided keys).
    • Uses Transmission daemon to download Reddit data via Academic Torrents.
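
A minimal FastAPI sketch of the REST + WebSocket split described above; the route paths and payloads here are illustrative assumptions, not DeTAILS's actual API:

    # Sketch only: a FastAPI app exposing a REST route for the frontend and a
    # WebSocket route that pushes events to Electron (which relays them to React).
    from fastapi import FastAPI, WebSocket

    app = FastAPI()

    @app.get("/health")
    async def health() -> dict:
        # Polled by React/Electron over HTTP REST.
        return {"status": "ok"}

    @app.websocket("/ws")
    async def events(ws: WebSocket) -> None:
        # Electron connects here; each pushed message is forwarded to the
        # renderer over IPC.
        await ws.accept()
        await ws.send_json({"type": "progress", "value": 0.5})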

Communication Flow:

%%{init:{
  "theme":"base",
  "themeVariables":{
    "background":"#fafafa",
    "primaryColor":"#ffffff",
    "clusterBkg":"#f8f9fa",
    "clusterBorder":"#cccccc",
    "edgeLabelBackground":"#ffffff",
    "lineColor":"#444444",
    "arrowheadColor":"#444444"
  },
  "flowchart":{
    "nodeSpacing":80,
    "rankSpacing":60
  }
}}%%
flowchart LR
  subgraph App["DeTAILS Application"]
    style App fill:#f5f5f5,stroke:#999999,stroke-width:2px,stroke-dasharray:5 5
    direction LR

    subgraph UI["Frontend (React + Electron)"]
      style UI fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
      direction LR

      R["React"]
      E["Electron"]

      R <-. IPC .-> E
    end

    subgraph BE["Backend (Server + Services)"]
      style BE fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
      direction LR

      F["Data Modeling Server"]
      Transmission["Transmission Daemon"]
      Ollama["Ollama (Local LLM)"]
      ripgrep["ripgrep (Text Search)"]
      zstd["zstd (Compression/Decompression)"]
      OpenAI["OpenAI API"]
      Gemini["Gemini API"]

      %% Datastores as cylinders %%
      SQLite[("SQLite Database")]
      ChromaDB[("ChromaDB Vector Store")]
    end

    %% Cross‑cluster interactions %%
    R -->|HTTP REST| F
    E -->|HTTP REST| F
    E -->|Spawns & Manages| F
    E -->|Spawns & Manages| Ollama
    E -->|Spawns & Manages| ChromaDB

    F --> Transmission
    F --> ripgrep
    F --> zstd
    F --> SQLite
    F --> OpenAI
    F --> Gemini
    F --> Ollama
    F --> ChromaDB

    F ==>|WebSocket Push| E
  end

Technology Stack

  • Frontend: React, Electron, TypeScript, Tailwind CSS
  • Backend: Python, FastAPI
  • AI/LLM: Ollama (Local) [Source Code], OpenAI API, Gemini API, Vertex AI API (Google Cloud)
  • Search: ChromaDB (Vector) [Source Code], ripgrep (Text) [Source Code]
  • Database: SQLite
  • Utilities: Zstandard (file decompression) [Source Code], Transmission (torrents; Academic Torrents primary, with fallback)

📁 Folder Structure

.
├── frontend/  # React UI & Electron Wrapper
│   ├── electron/  # Electron-specific code (main process, handlers)
│   │   ├── handles/
│   │   ├── templates/
│   │   └── utils/
│   └── src/ # React application source code
│       ├── components/
│       ├── constants/
│       ├── hooks/
│       ├── pages/
│       ├── reducers/
│       ├── router/
│       ├── styles/
│       ├── types/
│       ├── utility/
│       └── App.tsx     # Main React App component
├── backend/  # Backend services and tools source
│   ├── data_modeling_server/ # Main FastAPI application
│   │   ├── controllers/
│   │   ├── database/
│   │   ├── decorators/
│   │   ├── errors/
│   │   ├── headers/
│   │   ├── middlewares/
│   │   ├── models/
│   │   ├── routes/
│   │   ├── services/ # Integration logic (Transmission, LLMs, etc.)
│   │   ├── utils/
│   │   ├── app_ws.py # FastAPI WebSocket entry point
│   │   ├── app_http.py # FastAPI HTTP entry point
│   │   └── main.py # Main entry point
│   ├── chroma/ # ChromaDB
│   ├── ollama-0.4.2/ # Ollama
│   ├── ripgrep/ # Ripgrep
│   └── zstd/ # Zstandard
├── executables_mac/ # Build scripts to make executables for MacOS
├── executables_linux/ # Build scripts to make executables for Linux
├── executables_windows/ # Build scripts to make executables for Windows
└── README.md

🔧 Build Instructions

Important: Building requires specific prerequisites (languages, compilers, libraries) installed on your system. The provided build scripts assume these are available. Executables for the core backend services are produced by the build scripts in the executables_* directories. Running an npm run make-* command in the frontend folder builds the frontend as well as the backend services.

1. Frontend (React + Electron)

🔐 Google OAuth Setup (Client JSON File)

DeTAILS reads a single JSON credentials file from GCP to configure OAuth. Follow these steps:

  • Generate your OAuth client credentials

    • In the Google Cloud Console → APIs & Services → Credentials, click Create Credentials → OAuth client ID, choose Desktop application, and fill in the remaining required information.
    • Download the resulting client-*.json file.
    • Add the client JSON file path to the .env file inside the frontend folder (a hypothetical example follows this list).
    • For builds, set the path explicitly in frontend/electron/handles/authentication.js.
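
    For reference, the .env entry might look like the line below. The variable name is a hypothetical placeholder; check frontend/electron/handles/authentication.js for the exact key the app reads:

      GOOGLE_OAUTH_CLIENT_JSON=/absolute/path/to/client-*.json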
  • Prerequisites: Node.js, npm

  • Steps:

    cd frontend
    npm install
  • To Run (Development):

    npm run dev
  • To Build Application:

    # For macOS
    npm run make-mac
    
    # For Windows
    npm run make-win
    
    # For Linux
    npm run make-linux

    (Output found in frontend/out)

2. Backend Services (Build from Source)

a) Data Modeling Server (FastAPI)

  • Prerequisites: Python >= 3.11, venv module

  • Steps:

    cd backend/data_modeling_server
    
    # Create virtual environment
    # macOS:
    python -m venv .venv
    # Linux:
    python -m venv linenv
    # Windows:
    python -m venv winenv
    
    # Activate virtual environment
    # macOS:
    source .venv/bin/activate
    # Linux:
    source linenv/bin/activate
    # Windows (PowerShell):
    .\winenv\Scripts\Activate.ps1
    # Windows (CMD):
    .\winenv\Scripts\activate.bat
    
    # Install dependencies (choose correct file)
    pip install -r requirements_mac.txt    # or _linux.txt or _windows.txt
    
    # To Build Executable:
    pyinstaller main.spec
    # Executable found in ./dist/main (or .\dist\main.exe on Windows)
    
    # To Run Server Directly (Development):
    python main.py
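
To sanity-check the dev server, you can hit FastAPI's built-in docs page; the host and port below are assumptions, so check main.py for the actual bind address:

    # Sketch: verify the dev server responds (assumed address 127.0.0.1:8000).
    import urllib.request
    print(urllib.request.urlopen("http://127.0.0.1:8000/docs").status)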

b) Ripgrep

  • Prerequisites: Rust, Cargo
  • Steps:
    cd backend/ripgrep
    cargo build --release --features 'pcre2'
    (Executable found in ./target/release/rg)
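
If you want to exercise the binary programmatically, a subprocess call like the sketch below works; the pattern and search path are placeholders, and DeTAILS's actual invocation may differ:

    # Sketch: call the rg binary built above and collect JSON-formatted matches.
    import subprocess

    result = subprocess.run(
        ["./target/release/rg", "--json", "pattern", "some/data/dir"],
        capture_output=True, text=True,
    )
    print(result.stdout)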

c) Zstandard (zstd)

  • Prerequisites (Unix - macOS/Linux): make, gcc

    cd backend/zstd
    make

    (Executable zstd found in backend/zstd)

  • Prerequisites (Windows): cmake, make (e.g., MinGW), ensure both are in PATH.

    cd backend/zstd/build/cmake
    mkdir builddir
    cd builddir
    # Using Command Prompt (cmd.exe)
    cmake -G "MinGW Makefiles" ..
    make

    (Executable zstd.exe found in .\build\cmake\builddir\programs\)
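
As a quick end-to-end check of decompression, the zstandard Python package can stream-decompress a dump without the CLI (pip install zstandard; the file names here are placeholders):

    # Sketch: stream-decompress a .zst file in Python.
    import zstandard as zstd

    with open("dump.zst", "rb") as src, open("dump.ndjson", "wb") as dst:
        zstd.ZstdDecompressor().copy_stream(src, dst)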

d) ChromaDB

  • Prerequisites: Python >= 3.11, venv module

  • Steps:

    cd backend/chroma
    
    # Create virtual environment
    # macOS:
    python -m venv env
    # Linux
    python -m venv linenv
    # Windows:
    python -m venv winenv
    
    # Activate virtual environment
    # macOS:
    source env/bin/activate
    # Linux:
    source linenv/bin/activate
    # Windows (PowerShell):
    .\winenv\Scripts\Activate.ps1
    
    # Install dependencies (choose correct file)
    pip install -r requirements_exe.txt        # For macOS/Linux
    # pip install -r requirements_exe_windows.txt # For Windows
    
    # Navigate to CLI directory
    cd chromadb/cli
    
    # To Build Executable:
    pyinstaller cli.spec
    # Executable found in ./dist/cli (or .\dist\cli.exe on Windows)
    
    # To Run Server Directly (Development):
    # Ensure you are back in the backend/chroma directory (e.g., cd ../.. from chromadb/cli) with the venv activated
    python chromadb/cli/cli.py run --path /path/to/persist/chroma --host <your_ip> --port <port>
    # Example: python chromadb/cli/cli.py run --path ./chroma_data --host 127.0.0.1 --port 8001
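
Once the server is up, a short client-side smoke test can confirm it accepts writes and queries; this sketch assumes the chromadb Python client is installed and uses the example host/port above:

    # Sketch: connect to the running ChromaDB server and round-trip a document.
    import chromadb

    client = chromadb.HttpClient(host="127.0.0.1", port=8001)
    collection = client.get_or_create_collection("smoke_test")
    collection.add(ids=["1"], documents=["hello details"])
    print(collection.query(query_texts=["hello"], n_results=1))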

e) Ollama

  • Prerequisites: Go >= 1.23.3, make, gcc.

    • NVIDIA GPU (Linux/Windows): NVIDIA CUDA Toolkit installed.
    • Windows: Visual Studio Build Tools (ensure cl.exe is in PATH).
    • macOS (Metal): No extra GPU requirements.
  • Full Instructions: See Ollama Development Guide: link

  • Steps:

    cd backend/ollama-0.4.2
    
    # For macOS:
    ./scripts/build.sh 0.4.2
    
    # For Linux:
    make -j$(nproc)
    go build -v -x .
    
    # For Windows (using Git Bash or similar):
    export CGO_ENABLED=1 # or `set CGO_ENABLED=1` in CMD
    make -j$NUMBER_OF_PROCESSORS # use available cores (%NUMBER_OF_PROCESSORS% in CMD)
    go build -v -x .

    (Executable ollama or ollama.exe found in the root directory - backend/ollama-0.4.2/)
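
To confirm the freshly built binary serves requests, start it with ollama serve and hit the generate endpoint; the model name below is a placeholder and must be pulled first (e.g., ollama pull llama3):

    # Sketch: minimal request against a local Ollama server (default port 11434).
    import json
    import urllib.request

    req = urllib.request.Request(
        "http://127.0.0.1:11434/api/generate",
        data=json.dumps({"model": "llama3", "prompt": "Say hi", "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])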

▶️ Running the Application

  1. Run Packaged Application: After building the application using npm run make-*, navigate to the output directory (frontend/out/...) and run the generated executable.

🤝 Contributing

Contributions are welcome! Please read our Contributing Guidelines before submitting pull requests.

📄 License

This project is licensed under GPL 3.0 only. It contains two MIT-licensed modules, one Apache 2.0 module (GPL 3.0 is compatible with Apache 2.0’s patent and notice terms), and one GPL 2.0-or-later module, which we use under its GPL 3.0 option.

See the LICENSE file for details.

