Juno445/ticket-classification-model

🎫 AI-Powered Ticket Classification System

An intelligent support ticket routing system that uses machine learning to automatically classify and assign support tickets to appropriate agent groups.

Python 3.8+ | PyTorch | License: MIT

🌟 Features

  • Automated Processing: Integrates with ticketing systems via REST API
  • Web Interface: Interactive Gradio-based demo for testing predictions
  • Flexible Architecture: Easily adaptable to different ticketing platforms
  • Production Ready: Includes error handling, logging, and configuration management

🚀 Quick Start

Prerequisites

  • Python 3.8 or higher
  • pip package manager

Installation

  1. Clone the repository

    git clone https://github.com/Juno445/ticket-classification-model.git
    cd ticket-classification-model
  2. Install dependencies

    pip install -r requirements.txt
  3. Prepare your data

    • Create inputdata.csv with columns: subject, agent_group, priority
    • Create workspace_agent_groups.csv with columns: group_name, group_id, workspace_id
  4. Set up environment variables

    cp .env.example .env
    # Edit .env with your configuration

📊 Usage

1. Data Preparation

Split your dataset into training and testing sets:

python data_splitter.py

This creates train.csv and test.csv from your input data with an 80/20 split.
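
The internals of data_splitter.py are not shown here, so the following is only a minimal standard-library sketch of what an 80/20 split involves. The function name `split_rows` and the fixed seed are assumptions made for illustration.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and return (train, test) lists."""
    shuffled = list(rows)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for repeatable splits
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = [{"subject": f"ticket {i}", "agent_group": "IT Helpdesk", "priority": "Low"}
        for i in range(10)]
train, test = split_rows(rows)
print(len(train), len(test))
```

Shuffling before slicing matters when inputdata.csv is sorted by group or date, since an unshuffled split could leave some agent groups out of the training set entirely.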

2. Model Training

Train the multi-task neural network:

python train_model.py

This will:

  • Process and vectorize ticket subjects using TF-IDF
  • Train a neural network with two output heads (agent group + priority)
  • Save the trained model and preprocessing components
  • Use early stopping to prevent overfitting

Training Output:

Using device: cuda
Loaded 10000 records from inputdata.csv
Training samples: 8000, Validation samples: 2000
Model initialized with 3,276,842 parameters
Starting training...
Epoch 1/50, Training Loss: 2.1543, Validation Loss: 1.8932
...
New best model saved to ticket_multitask_model.pth
Training completed!

3. Interactive Testing

Launch the web interface to test predictions:

python gradio_demo.py

Open your browser to http://localhost:7860 and test the model with sample ticket subjects.

4. Production Deployment

Run the automated ticket processing system:

python main.py

This will:

  • Connect to your ticketing system API
  • Fetch unassigned tickets
  • Predict appropriate assignments using the trained model
  • Automatically update ticket assignments
  • Add explanatory notes to processed tickets

🏗️ Architecture

Model Architecture

Input: Ticket Subject (Text)
    ↓
TF-IDF Vectorization (25,000 features)
    ↓
Neural Network:
├── Dense Layer (25,000 → 128) + ReLU
├── Dense Layer (128 → 64) + ReLU
├── Agent Head (64 → num_agent_groups)
└── Priority Head (64 → num_priorities)
    ↓
Outputs: Agent Group + Priority
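
The layer sizes in the diagram determine the parameter count up to the size of the two heads. The sketch below works through that arithmetic, assuming every dense layer has a bias term and that there are no other trainable layers, neither of which the diagram confirms. The head sizes of 3 and 3 are borrowed from the sample CSV purely for illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def model_params(n_features, n_agent_groups, n_priorities, hidden=(128, 64)):
    """Total trainable parameters for the shared trunk plus both heads."""
    total, width = 0, n_features
    for size in hidden:
        total += dense_params(width, size)
        width = size
    total += dense_params(width, n_agent_groups)  # agent head
    total += dense_params(width, n_priorities)    # priority head
    return total


print(model_params(25_000, n_agent_groups=3, n_priorities=3))  # 3208774
```

Almost all of the parameters sit in the first dense layer (25,000 × 128), so the TF-IDF vocabulary size is the main lever on model size.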

System Components

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Data Splitter │    │  Model Training  │    │  Gradio Demo    │
│                 │    │                  │    │                 │
│ • Train/Test    │    │ • Multi-task NN  │    │ • Web Interface │
│   Split         │    │ • TF-IDF         │    │ • Live Testing  │
│ • 80/20 Ratio   │    │ • Early Stop     │    │ • Confidence    │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                    ┌──────────────────┐
                    │  Main System     │
                    │                  │
                    │ • API Integration│
                    │ • Auto-Assignment│
                    │ • Batch Process  │
                    └──────────────────┘

📁 Project Structure

ticket-classification-model/
├── data_splitter.py          # Split dataset into train/test
├── train_model.py            # Train the multi-task model
├── main.py                   # Production ticket processing
├── gradio_demo.py            # Interactive web interface
├── requirements.txt          # Python dependencies
├── .env.example             # Environment variables template
├── .gitignore               # Git ignore rules
└── README.md                # This file

# Local files (not in repo): user-provided data, generated artifacts, and configuration
├── inputdata.csv            # Your training data
├── train.csv               # Training split
├── test.csv                # Testing split
├── workspace_agent_groups.csv  # Group/workspace mappings
├── vectorizer.pkl          # Trained TF-IDF vectorizer
├── agent_encoder.pkl       # Agent group label encoder
├── priority_encoder.pkl    # Priority label encoder
├── ticket_multitask_model.pth  # Trained model weights
└── .env                    # Your configuration (keep secret!)

⚙️ Configuration

Environment Variables

Create a .env file with your settings:

# API Configuration
FRESHSERVICE_API_KEY=your_api_key_here
FRESHSERVICE_DOMAIN=your_domain_here

# Contact Information
CONTACT_EMAIL=your_email_here
CONTACT_NAME=IT Support Team

# Processing Settings
SOURCE_TAG=auto-classify
PROCESSED_TAG=auto-classified
DEFAULT_WORKSPACE_ID=1

# Model Settings
MAX_FEATURES=25000
BATCH_SIZE=10
LEARNING_RATE=0.0005
EPOCHS=50
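
How the scripts read these values is not shown here. As a hedged sketch, assuming the variables are available in the process environment (python-dotenv's `load_dotenv()` can populate it from .env), they could be parsed into typed settings as follows. `Settings` and `load_settings` are hypothetical names, and the defaults simply mirror the example file above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    domain: str
    source_tag: str
    processed_tag: str
    default_workspace_id: int
    max_features: int
    batch_size: int
    learning_rate: float
    epochs: int


def load_settings(env=None):
    """Build Settings from an environment mapping; defaults mirror the example file."""
    env = os.environ if env is None else env
    missing = [k for k in ("FRESHSERVICE_API_KEY", "FRESHSERVICE_DOMAIN") if not env.get(k)]
    if missing:
        raise KeyError("Missing required settings: " + ", ".join(missing))
    return Settings(
        api_key=env["FRESHSERVICE_API_KEY"],
        domain=env["FRESHSERVICE_DOMAIN"],
        source_tag=env.get("SOURCE_TAG", "auto-classify"),
        processed_tag=env.get("PROCESSED_TAG", "auto-classified"),
        default_workspace_id=int(env.get("DEFAULT_WORKSPACE_ID", "1")),
        max_features=int(env.get("MAX_FEATURES", "25000")),
        batch_size=int(env.get("BATCH_SIZE", "10")),
        learning_rate=float(env.get("LEARNING_RATE", "0.0005")),
        epochs=int(env.get("EPOCHS", "50")),
    )


settings = load_settings({"FRESHSERVICE_API_KEY": "key", "FRESHSERVICE_DOMAIN": "example"})
print(settings.batch_size, settings.learning_rate)
```

Failing fast on a missing API key or domain surfaces configuration mistakes at startup rather than at the first API call.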

Data Format

Training Data (inputdata.csv):

subject,agent_group,priority
"Computer won't start",Hardware Support,High
"Password reset needed",IT Helpdesk,Medium
"New software request",Software Team,Low

Group Mappings (workspace_agent_groups.csv):

group_name,group_id,workspace_id
Hardware Support,101,1
IT Helpdesk,102,1
Software Team,103,2
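
The two files are linked by group name: the model predicts an agent_group label, and the mapping file supplies the numeric ids for it. The sketch below loads the sample rows shown above from in-memory strings and checks that every training label has a mapping. The function names `load_group_map` and `unmapped_groups` are hypothetical.

```python
import csv
import io

GROUPS_CSV = """group_name,group_id,workspace_id
Hardware Support,101,1
IT Helpdesk,102,1
Software Team,103,2
"""

TRAINING_CSV = '''subject,agent_group,priority
"Computer won't start",Hardware Support,High
"Password reset needed",IT Helpdesk,Medium
"New software request",Software Team,Low
'''


def load_group_map(handle):
    """Map group_name -> (group_id, workspace_id) as integers."""
    return {
        row["group_name"]: (int(row["group_id"]), int(row["workspace_id"]))
        for row in csv.DictReader(handle)
    }


def unmapped_groups(training_handle, group_map):
    """Return agent_group labels in the training data that have no mapping."""
    labels = {row["agent_group"] for row in csv.DictReader(training_handle)}
    return sorted(labels - set(group_map))


group_map = load_group_map(io.StringIO(GROUPS_CSV))
print(group_map["Software Team"])
print(unmapped_groups(io.StringIO(TRAINING_CSV), group_map))
```

Running a check like this before training catches label typos (for example "IT Help Desk" versus "IT Helpdesk") that would otherwise produce predictions with no valid group id.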

🔧 API Integration

The system supports integration with ticketing platforms through REST APIs. Currently includes:

  • Freshservice: Full integration with ticket fetching and updating
  • Generic REST API: Easily adaptable to other platforms

Adding New Integrations

  1. Implement API client in main.py
  2. Update configuration for your platform
  3. Modify ticket processing logic as needed
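
One way to keep platform-specific code isolated when adding an integration is a small client interface. This is a sketch only: `TicketingClient`, its method names, and `InMemoryClient` are hypothetical and not taken from main.py.

```python
from abc import ABC, abstractmethod


class TicketingClient(ABC):
    """Operations the classifier needs from any ticketing platform."""

    @abstractmethod
    def fetch_unassigned(self):
        """Return a list of dicts with at least 'id' and 'subject'."""

    @abstractmethod
    def assign(self, ticket_id, group_id, priority):
        """Set the ticket's group and priority."""

    @abstractmethod
    def add_note(self, ticket_id, text):
        """Attach an explanatory note to the ticket."""


class InMemoryClient(TicketingClient):
    """Stand-in platform, useful for dry runs and tests."""

    def __init__(self, tickets):
        self.tickets = {t["id"]: dict(t) for t in tickets}

    def fetch_unassigned(self):
        return [t for t in self.tickets.values() if t.get("group_id") is None]

    def assign(self, ticket_id, group_id, priority):
        self.tickets[ticket_id].update(group_id=group_id, priority=priority)

    def add_note(self, ticket_id, text):
        self.tickets[ticket_id].setdefault("notes", []).append(text)


client = InMemoryClient([{"id": 7, "subject": "New laptop setup needed"}])
client.assign(7, group_id=101, priority="Low")
print(client.fetch_unassigned())
```

A new platform would then need only its own subclass, leaving the prediction and routing logic unchanged.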

📈 Model Performance

Training Metrics

  • Multi-task Loss: Combined cross-entropy for both tasks
  • Early Stopping: Prevents overfitting with patience mechanism
  • Validation Split: 20% of data reserved for validation

Evaluation

Use the Gradio interface to:

  • Test edge cases and unusual ticket subjects
  • Evaluate prediction confidence scores
  • Compare against human assignments

🛠️ Development

Adding New Features

  1. New Prediction Tasks: Extend the model architecture

Testing

# Test individual components
python -c "from train_model import load_model_components; print('✓ Model loads correctly')"

# Validate data format
python -c "import pandas as pd; df = pd.read_csv('inputdata.csv'); print(f'✓ Data shape: {df.shape}')"

🚨 Security Considerations

  • API Keys: Never commit .env files or hardcode credentials
  • Data Privacy: Ensure training data doesn't contain sensitive information
  • Access Control: Limit API permissions to minimum required scope
  • Logging: Avoid logging sensitive ticket content

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

# Install development dependencies
pip install -r requirements-dev.txt

📋 Requirements

Core Dependencies

torch>=2.0.0
scikit-learn>=1.3.0
pandas>=2.0.0
numpy>=1.24.0
requests>=2.31.0
python-dotenv>=1.0.0
gradio>=4.0.0

Optional Dependencies

# For development
pytest>=7.0.0
black>=23.0.0
flake8>=6.0.0

# For enhanced features
transformers>=4.30.0  # For BERT-based models
plotly>=5.0.0        # For training visualizations

📊 Example Results

Sample Predictions

| Ticket Subject | Predicted Group | Predicted Priority | Confidence |
|---|---|---|---|
| "Server is down - urgent!" | Infrastructure | High | 94% |
| "Password reset request" | IT Helpdesk | Medium | 87% |
| "New laptop setup needed" | Hardware Support | Low | 91% |

Performance Metrics

  • Agent Group Accuracy: ~85-90% on validation set
  • Priority Accuracy: ~80-85% on validation set
  • Processing Speed: ~100 tickets/minute
  • API Response Time: <2 seconds per ticket

πŸ› Troubleshooting

Common Issues

ModuleNotFoundError: Install missing dependencies

pip install -r requirements.txt

CUDA out of memory: Reduce batch size in configuration

CONFIG['batch_size'] = 5  # Reduce from default 10

API authentication errors: Check your .env file

# Verify API key is correct
curl -u "your_api_key:X" https://yourdomain.freshservice.com/api/v2/tickets
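
The same check can be prepared from Python. The curl call above uses HTTP Basic authentication with the API key as the username and a literal X as the password. The sketch below builds an equivalent request with the standard library without sending it. `build_tickets_request` is a hypothetical helper, and it assumes the domain setting holds only the subdomain part, as in the curl URL.

```python
import base64
import urllib.request


def build_tickets_request(domain, api_key):
    """Build (but do not send) a GET request equivalent to the curl call."""
    token = base64.b64encode(f"{api_key}:X".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"https://{domain}.freshservice.com/api/v2/tickets",
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


request = build_tickets_request("yourdomain", "your_api_key")
print(request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=10)
# An HTTP 401 response would typically point to a wrong or missing API key.
```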

Model file not found: Train the model first

python train_model.py

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support


⭐ If this project helps you, please consider giving it a star!
