Kubernetes troubleshooting Slack bot with HolmesGPT integration for intelligent cluster analysis and incident response

cpareek/k8s-slackbot

Kubernetes Slack Bot with GitHub Integration

A Slack bot for Kubernetes troubleshooting with HolmesGPT and GitHub repository integration. This bot helps teams quickly identify and diagnose Kubernetes issues by combining real-time cluster monitoring with repository context analysis.

Features

  • 🔍 Pod Troubleshooting: Analyze failed pods with AI-powered insights
  • 📊 Cluster Monitoring: Get real-time status of your Kubernetes cluster
  • 🔄 Flux Integration: Monitor GitOps workflows and identify sync issues
  • 🤖 HolmesGPT Integration: AI-powered analysis and recommendations
  • GitHub Repository Integration: Cross-reference issues with your Flux/GitOps repository
  • 💬 Interactive Slack Interface: Easy-to-use commands and buttons
  • 📝 Log Analysis: Automated log collection and analysis
  • 🔗 Context-Aware Troubleshooting: Leverage repository configuration and history

🚨 Important: Security Setup First

Before installation, please read SECURITY.md for important information about protecting your API keys and credentials.

Prerequisites

  • Node.js 16+
  • Access to a Kubernetes cluster
  • Slack app with bot permissions
  • HolmesGPT instance (optional - fallback analysis available)
  • GitHub repository with Flux/GitOps configurations (optional)
  • GitHub Personal Access Token (optional - needed only for repository integration)

Installation

  1. Clone and install dependencies:

    git clone <repository-url>
    cd k8s-slackbot
    npm install
  2. Set up environment variables:

    cp .env.example .env
    # Edit .env with your configuration
  3. Configure your Slack app:

    • Create a new Slack app at https://api.slack.com/apps
    • Enable Socket Mode
    • Add bot scopes: app_mentions:read, channels:read, chat:write, commands, im:read, im:write
    • Install the app to your workspace
    • Copy the tokens to your .env file
  4. Set up Kubernetes access:

    • Ensure your kubeconfig is properly configured
    • The bot uses the path in KUBECONFIG_PATH if it is set; otherwise it uses the default kubeconfig location
  5. Configure GitHub Integration (Optional):

    ./setup-github-integration.sh

    Or see the detailed guide: GITHUB_INTEGRATION.md

Configuration

Environment Variables

| Variable | Description | Required |
|---|---|---|
| SLACK_BOT_TOKEN | Bot User OAuth Token from Slack | Yes |
| SLACK_SIGNING_SECRET | Signing Secret from Slack | Yes |
| SLACK_APP_TOKEN | App-Level Token for Socket Mode | Yes |
| HOLMESGPT_API_URL | HolmesGPT API endpoint | No |
| HOLMESGPT_API_KEY | HolmesGPT API key | No |
| KUBECONFIG_PATH | Path to kubeconfig file | No |
| K8S_NAMESPACE | Default namespace | No |
| PORT | Server port | No |
| GIT_REPO | GitHub repository (owner/repo) for integration | No |
| GIT_CREDENTIALS | GitHub Personal Access Token | No |
| GIT_BRANCH | GitHub branch to analyze | No |
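
As an illustration of how these variables could be consumed, here is a minimal sketch of a config loader that fails fast when a required Slack variable is missing. The function name `loadConfig`, the shape of the returned object, and every default value (port 3000, namespace "default", branch "main", `~/.kube/config`) are assumptions made for this example; they are not taken from the bot's source.

```javascript
// Sketch only: loadConfig and all defaults below are illustrative assumptions.
const os = require("os");
const path = require("path");

// The three variables marked "Yes" in the Required column.
const REQUIRED = ["SLACK_BOT_TOKEN", "SLACK_SIGNING_SECRET", "SLACK_APP_TOKEN"];

function loadConfig(env = process.env) {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    slack: {
      botToken: env.SLACK_BOT_TOKEN,
      signingSecret: env.SLACK_SIGNING_SECRET,
      appToken: env.SLACK_APP_TOKEN,
    },
    holmes: {
      apiUrl: env.HOLMESGPT_API_URL || null, // null: rely on fallback analysis
      apiKey: env.HOLMESGPT_API_KEY || null,
    },
    kubernetes: {
      // Assumed default: the conventional kubeconfig path in the home directory.
      kubeconfigPath: env.KUBECONFIG_PATH || path.join(os.homedir(), ".kube", "config"),
      namespace: env.K8S_NAMESPACE || "default",
    },
    port: Number(env.PORT) || 3000,
    // GitHub integration is treated as disabled unless GIT_REPO is set.
    github: env.GIT_REPO
      ? {
          repo: env.GIT_REPO,
          token: env.GIT_CREDENTIALS || null,
          branch: env.GIT_BRANCH || "main",
        }
      : null,
  };
}
```

Calling `loadConfig()` once at startup keeps missing-token errors out of the Slack event handlers, where they are harder to diagnose.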

Slack App Setup

  1. Create a new Slack app:

    • Go to https://api.slack.com/apps
  2. Configure Socket Mode:

    • Go to "Socket Mode" in the sidebar
    • Enable Socket Mode
    • Generate an App-Level Token with connections:write scope
  3. Set up Bot User:

    • Go to "OAuth & Permissions"
    • Add these Bot Token Scopes:
      • app_mentions:read
      • channels:read
      • chat:write
      • commands
      • im:read
      • im:write
    • Install the app to your workspace
  4. Create Slash Commands:

    • Go to "Slash Commands"
    • Create these commands:
      • /k8s-pods - List pods in namespace
      • /k8s-debug - Debug specific pod
      • /k8s-flux - Check Flux resources
      • /k8s-help - Show help
  5. Configure Event Subscriptions:

    • Go to "Event Subscriptions"
    • Subscribe to these bot events:
      • app_mention
      • message.im

Usage

Slash Commands

# List pods in default namespace
/k8s-pods

# List pods in specific namespace
/k8s-pods production

# Debug a specific pod
/k8s-debug my-app-pod production

# Check Flux resources
/k8s-flux

# Get help
/k8s-help
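
For readers extending the commands above, the following sketch shows one way the command name and its argument text could be turned into a structured request. Slack delivers a slash command as a `command` string plus a free-form `text` string; the function `parseSlashCommand`, the action names, and the "default" namespace fallback are hypothetical and not taken from the bot's code.

```javascript
// Sketch only: parseSlashCommand and its return shape are hypothetical.
function parseSlashCommand(command, text, defaultNamespace = "default") {
  // Split the argument text on whitespace and drop empty pieces.
  const args = (text || "").trim().split(/\s+/).filter(Boolean);

  switch (command) {
    case "/k8s-pods":
      // Optional first argument selects the namespace.
      return { action: "listPods", namespace: args[0] || defaultNamespace };
    case "/k8s-debug":
      if (args.length === 0) {
        return { action: "error", message: "Usage: /k8s-debug <pod> [namespace]" };
      }
      return { action: "debugPod", pod: args[0], namespace: args[1] || defaultNamespace };
    case "/k8s-flux":
      return { action: "checkFlux" };
    case "/k8s-help":
      return { action: "help" };
    default:
      return { action: "error", message: `Unknown command: ${command}` };
  }
}
```

With this sketch, `parseSlashCommand("/k8s-debug", "my-app-pod production")` yields an object with `pod` set to `my-app-pod` and `namespace` set to `production`.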

Mentions

# Find failed pods across all namespaces
@k8s-bot failed pods

# Check Flux status
@k8s-bot flux issues

# Ask general questions
@k8s-bot why is my pod crashing?

# Get help
@k8s-bot help
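
The mention examples above map naturally onto a small intent classifier. The sketch below is one possible approach using keyword matching; `classifyMention`, the intent names, and the matching rules are assumptions for illustration, and the bot may route mentions differently.

```javascript
// Sketch only: classifyMention and the intent names are hypothetical.
function classifyMention(text) {
  // Slack encodes user mentions as "<@U123ABC>"; also strip a literal "@k8s-bot".
  const stripped = text
    .replace(/<@[^>]+>/g, "")
    .replace(/@k8s-bot/gi, "")
    .trim();
  const lower = stripped.toLowerCase();

  if (lower === "" || lower === "help") return { intent: "help" };
  if (/\bfailed pods?\b/.test(lower)) return { intent: "failedPods" };
  if (/\bflux\b/.test(lower)) return { intent: "fluxIssues" };

  // Anything else is treated as a free-form question for analysis.
  return { intent: "question", question: stripped };
}
```

Keyword matching is deliberately simple here; anything it does not recognize falls through to the general question path.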

Direct Messages

Send a direct message to the bot for private troubleshooting sessions.

HolmesGPT Integration

The bot integrates with HolmesGPT for advanced AI-powered analysis. If HolmesGPT is not available, the bot provides fallback analysis based on:

  • Pod status and conditions
  • Container restart counts
  • Kubernetes events
  • Resource requirements
  • Common failure patterns
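
To make the fallback signals listed above concrete, here is a rough sketch of a rule-based check over a pod object shaped like the Kubernetes API response (`status.phase`, `status.conditions`, `status.containerStatuses`). The function `fallbackAnalysis`, the restart threshold of 5, and the hint texts are assumptions for this example and do not describe the bot's actual rules.

```javascript
// Sketch only: fallbackAnalysis, the threshold, and the hints are illustrative.
const WAITING_HINTS = {
  CrashLoopBackOff: "Container keeps crashing; check its logs for the exit cause.",
  ImagePullBackOff: "Image cannot be pulled; check the image name, tag, and pull secrets.",
  ErrImagePull: "Image cannot be pulled; check the image name, tag, and pull secrets.",
  CreateContainerConfigError: "Container config is invalid; check referenced ConfigMaps and Secrets.",
};

function fallbackAnalysis(pod, events = [], restartThreshold = 5) {
  const findings = [];
  const status = pod.status || {};

  // Pod status and conditions
  if (status.phase && !["Running", "Succeeded"].includes(status.phase)) {
    findings.push(`Pod phase is ${status.phase}.`);
  }
  for (const cond of status.conditions || []) {
    if (cond.status === "False") {
      findings.push(`Condition ${cond.type} is False${cond.reason ? ` (${cond.reason})` : ""}.`);
    }
  }

  for (const cs of status.containerStatuses || []) {
    // Container restart counts
    if (cs.restartCount >= restartThreshold) {
      findings.push(`Container ${cs.name} has restarted ${cs.restartCount} times.`);
    }
    // Common failure patterns
    const waiting = cs.state && cs.state.waiting;
    if (waiting && WAITING_HINTS[waiting.reason]) {
      findings.push(`Container ${cs.name}: ${waiting.reason}. ${WAITING_HINTS[waiting.reason]}`);
    }
    const terminated = cs.lastState && cs.lastState.terminated;
    if (terminated && terminated.reason === "OOMKilled") {
      findings.push(`Container ${cs.name} was OOMKilled; compare memory limits with actual usage.`);
    }
  }

  // Kubernetes events: surface warnings only
  for (const ev of events) {
    if (ev.type === "Warning") findings.push(`Event ${ev.reason}: ${ev.message}`);
  }
  return findings;
}
```

An empty result means none of these rules fired, not that the pod is healthy.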

Setting up HolmesGPT

  1. Deploy HolmesGPT in your cluster or as a standalone service
  2. Configure the HOLMESGPT_API_URL in your environment
  3. If authentication is required, set HOLMESGPT_API_KEY
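
The "use HolmesGPT if configured, otherwise fall back" behavior can be sketched as a small wrapper. This example deliberately does not assume any HolmesGPT endpoint or request format: `askHolmes` and `fallback` are hypothetical functions supplied by the caller, and `analyzeWithFallback` is a name invented for this illustration.

```javascript
// Sketch only: askHolmes and fallback are hypothetical caller-supplied functions.
async function analyzeWithFallback(input, { holmesUrl, askHolmes, fallback }) {
  // No URL configured: skip the remote call entirely.
  if (!holmesUrl) {
    return { source: "fallback", result: await fallback(input) };
  }
  try {
    return { source: "holmesgpt", result: await askHolmes(holmesUrl, input) };
  } catch (err) {
    // Any HolmesGPT failure (network, auth, timeout) degrades to local analysis.
    console.warn(`HolmesGPT request failed (${err.message}); using fallback analysis`);
    return { source: "fallback", result: await fallback(input) };
  }
}
```

Returning the `source` alongside the result lets the Slack message say which kind of analysis the user is reading.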

GitHub Repository Integration

The bot can integrate with your Flux/GitOps repository to provide enhanced troubleshooting context.

Features with GitHub Integration

  • Configuration Analysis: Cross-references cluster issues with repository manifests
  • Change History: Analyzes recent commits that might affect deployments
  • Best Practices: Suggests fixes based on repository patterns
  • Pull Request Management: Creating PRs for fixes (planned; not yet available)

Quick Setup

# Run the setup script
./setup-github-integration.sh

Manual Setup

  1. Create GitHub Personal Access Token:

  2. Configure HolmesGPT:

    # Edit ~/.holmes/config.yaml
    toolsets:
      git:
        enabled: true
        config:
          git_repo: "your-org/your-flux-repo"
          git_credentials: "your-github-token"
          git_branch: "main"
  3. Refresh Toolsets:

    holmes toolset refresh

For detailed setup instructions, see GITHUB_INTEGRATION.md

Enhanced Commands

When GitHub integration is active:

  • /k8s-debug includes repository analysis
  • General questions automatically use repo context for deployment-related issues
  • Bot provides richer context by referencing your actual configurations

Development

Running Locally

# Development mode with auto-reload
npm run dev

# Production mode
npm start

Project Structure

src/
├── app.js              # Main application entry point
├── services/
│   ├── kubernetes.js   # Kubernetes API interactions
│   └── holmesgpt.js    # HolmesGPT API integration
└── handlers/
    └── slack.js        # Slack event handlers

Adding New Features

  1. New Kubernetes Resources: Extend KubernetesService class
  2. New Slack Commands: Add handlers in SlackHandlers class
  3. New Analysis Types: Extend HolmesGPTService class

Troubleshooting

Common Issues

  1. Bot not responding:

    • Check Slack tokens in .env
    • Verify Socket Mode is enabled
    • Check bot permissions
  2. Kubernetes connection failed:

    • Verify kubeconfig path
    • Check cluster connectivity
    • Ensure proper RBAC permissions
  3. HolmesGPT integration issues:

    • Check API URL and connectivity
    • Verify API key if required
    • Bot works with fallback analysis if HolmesGPT is unavailable

Logs

The bot provides detailed logging for troubleshooting:

# View logs in development
npm run dev

# Production logs
npm start

Security Considerations

  • Store sensitive tokens in environment variables
  • Use Kubernetes RBAC to limit bot permissions
  • Consider network policies for HolmesGPT communication
  • Regularly rotate API keys and tokens

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

MIT License - see LICENSE file for details

Support

For issues and questions:

  • Create an issue in the repository
  • Check the troubleshooting section
  • Review Slack API documentation
