An intelligent support ticket routing system that uses machine learning to automatically classify and assign support tickets to appropriate agent groups.
- Automated Processing: Integrates with ticketing systems via REST API
- Web Interface: Interactive Gradio-based demo for testing predictions
- Flexible Architecture: Easily adaptable to different ticketing platforms
- Production Ready: Includes error handling, logging, and configuration management
- Python 3.8 or higher
- pip package manager
- Clone the repository:
  git clone https://github.com/Juno445/ticket-classification-model.git
  cd ticket-classification-model
- Install dependencies:
  pip install -r requirements.txt
- Prepare your data:
  - Create inputdata.csv with columns: subject, agent_group, priority
  - Create workspace_agent_groups.csv with columns: group_name, group_id, workspace_id
- Set up environment variables:
  cp .env.example .env
  # Edit .env with your configuration
Split your dataset into training and testing sets:
python data_splitter.py
This creates train.csv and test.csv from your input data with an 80/20 split.
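For illustration, an 80/20 split can be sketched with the standard library alone. This is a minimal sketch and not the contents of data_splitter.py: the function name `split_rows`, the fixed seed, and the shuffle-then-slice approach are assumptions, and the actual script may instead use a stratified split or scikit-learn's `train_test_split`.

```python
import random


def split_rows(rows, test_ratio=0.2, seed=42):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    shuffled = list(rows)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle gives the same split every run
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Ten illustrative rows in the inputdata.csv column layout
rows = [
    {"subject": f"ticket {i}", "agent_group": "IT Helpdesk", "priority": "Low"}
    for i in range(10)
]
train, test = split_rows(rows)
print(len(train), len(test))  # 8 2
```

Fixing the seed keeps train.csv and test.csv stable between runs, which makes later accuracy numbers comparable.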
Train the multi-task neural network:
python train_model.py
This will:
- Process and vectorize ticket subjects using TF-IDF
- Train a neural network with two output heads (agent group + priority)
- Save the trained model and preprocessing components
- Use early stopping to prevent overfitting
Training Output:
Using device: cuda
Loaded 10000 records from inputdata.csv
Training samples: 8000, Validation samples: 2000
Model initialized with 3,276,842 parameters
Starting training...
Epoch 1/50, Training Loss: 2.1543, Validation Loss: 1.8932
...
New best model saved to ticket_multitask_model.pth
Training completed!
Launch the web interface to test predictions:
python gradio_demo.py
Open your browser to http://localhost:7860 and test the model with sample ticket subjects.
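The demo reports a confidence next to each prediction. One common way to get such a score from a classifier head is to apply softmax to its raw outputs and take the highest probability. The sketch below assumes that approach; the README does not state how gradio_demo.py computes confidence, and `softmax` and `top_prediction` are hypothetical helper names.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)                           # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, labels):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Illustrative logits for the three priority classes
label, confidence = top_prediction([2.0, 0.5, -1.0], ["High", "Medium", "Low"])
print(f"{label} ({confidence:.0%})")
```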
Run the automated ticket processing system:
python main.py
This will:
- Connect to your ticketing system API
- Fetch unassigned tickets
- Predict appropriate assignments using the trained model
- Automatically update ticket assignments
- Add explanatory notes to processed tickets
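The steps above can be sketched as a single loop. This is an illustrative outline, not the code in main.py: `process_tickets` and its four callable parameters are hypothetical names, and the ticket dictionaries with `id` and `subject` keys are an assumed shape.

```python
def process_tickets(fetch_unassigned, predict, update_ticket, add_note):
    """Classify each unassigned ticket and write the result back.

    All four arguments are caller-supplied callables (a hypothetical interface).
    Returns the number of tickets processed successfully.
    """
    processed = 0
    for ticket in fetch_unassigned():
        try:
            group, priority = predict(ticket["subject"])
            update_ticket(ticket["id"], group, priority)
            add_note(ticket["id"], f"Auto-classified: group={group}, priority={priority}")
            processed += 1
        except Exception as exc:  # keep the batch going if one ticket fails
            print(f"Skipping ticket {ticket.get('id')}: {exc}")
    return processed


# Stand-ins for the API client and the trained model
tickets = [{"id": 1, "subject": "Password reset needed"}]
updates = []
count = process_tickets(
    fetch_unassigned=lambda: tickets,
    predict=lambda subject: ("IT Helpdesk", "Medium"),
    update_ticket=lambda tid, group, priority: updates.append((tid, group, priority)),
    add_note=lambda tid, text: None,
)
print(count, updates)  # 1 [(1, 'IT Helpdesk', 'Medium')]
```

Passing the API calls in as functions keeps the loop independent of any one ticketing platform and makes it testable without network access.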
Input: Ticket Subject (Text)
    ↓
TF-IDF Vectorization (25,000 features)
    ↓
Neural Network:
├── Dense Layer (25,000 → 128) + ReLU
├── Dense Layer (128 → 64) + ReLU
├── Agent Head (64 → num_agent_groups)
└── Priority Head (64 → num_priorities)
    ↓
Outputs: Agent Group
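The layer sizes in the diagram translate fairly directly into PyTorch. The following is a minimal sketch under stated assumptions rather than the model defined in train_model.py: the class name `TicketMultiTaskNet` is hypothetical, the class counts of 3 are illustrative, and any dropout or normalization the real model may use is omitted.

```python
import torch
from torch import nn


class TicketMultiTaskNet(nn.Module):
    """Shared trunk with one classification head per task."""

    def __init__(self, num_features, num_agent_groups, num_priorities):
        super().__init__()
        # Shared layers: 25,000 -> 128 -> 64, each followed by ReLU
        self.trunk = nn.Sequential(
            nn.Linear(num_features, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
        )
        # Task-specific output heads
        self.agent_head = nn.Linear(64, num_agent_groups)
        self.priority_head = nn.Linear(64, num_priorities)

    def forward(self, x):
        shared = self.trunk(x)
        return self.agent_head(shared), self.priority_head(shared)


model = TicketMultiTaskNet(num_features=25_000, num_agent_groups=3, num_priorities=3)
batch = torch.rand(4, 25_000)  # stand-in for four TF-IDF vectors
agent_logits, priority_logits = model(batch)
print(agent_logits.shape, priority_logits.shape)
```

Both heads read the same 64-dimensional representation, so the two tasks share most of the model's parameters.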
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Data Splitter  │    │  Model Training  │    │   Gradio Demo   │
│                 │    │                  │    │                 │
│ • Train/Test    │    │ • Multi-task NN  │    │ • Web Interface │
│   Split         │    │ • TF-IDF         │    │ • Live Testing  │
│ • 80/20 Ratio   │    │ • Early Stop     │    │ • Confidence    │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                       ┌──────────────────┐
                       │   Main System    │
                       │                  │
                       │ • API Integration│
                       │ • Auto-Assignment│
                       │ • Batch Process  │
                       └──────────────────┘
ticket-classification-model/
├── data_splitter.py              # Split dataset into train/test
├── train_model.py                # Train the multi-task model
├── main.py                       # Production ticket processing
├── gradio_demo.py                # Interactive web interface
├── requirements.txt              # Python dependencies
├── .env.example                  # Environment variables template
├── .gitignore                    # Git ignore rules
└── README.md                     # This file

# Generated files (not in repo)
├── inputdata.csv                 # Your training data
├── train.csv                     # Training split
├── test.csv                      # Testing split
├── workspace_agent_groups.csv    # Group/workspace mappings
├── vectorizer.pkl                # Trained TF-IDF vectorizer
├── agent_encoder.pkl             # Agent group label encoder
├── priority_encoder.pkl          # Priority label encoder
├── ticket_multitask_model.pth    # Trained model weights
└── .env                          # Your configuration (keep secret!)
Create a .env file with your settings:
# API Configuration
FRESHSERVICE_API_KEY=your_api_key_here
FRESHSERVICE_DOMAIN=your_domain_here
# Contact Information
CONTACT_EMAIL=[email protected]
CONTACT_NAME=IT Support Team
# Processing Settings
SOURCE_TAG=auto-classify
PROCESSED_TAG=auto-classified
DEFAULT_WORKSPACE_ID=1
# Model Settings
MAX_FEATURES=25000
BATCH_SIZE=10
LEARNING_RATE=0.0005
EPOCHS=50
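As a rough sketch of how these variables might be read into typed settings, the example below uses only `os.environ`. The function name `load_settings` and the dictionary keys are hypothetical, and the defaults simply mirror the sample values above. With python-dotenv installed, calling `load_dotenv()` first would copy the .env entries into the environment.

```python
import os


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ) with typed values."""
    env = os.environ if env is None else env
    return {
        "api_key": env.get("FRESHSERVICE_API_KEY", ""),
        "domain": env.get("FRESHSERVICE_DOMAIN", ""),
        "source_tag": env.get("SOURCE_TAG", "auto-classify"),
        "processed_tag": env.get("PROCESSED_TAG", "auto-classified"),
        # Numeric values arrive as strings and must be converted explicitly
        "default_workspace_id": int(env.get("DEFAULT_WORKSPACE_ID", "1")),
        "max_features": int(env.get("MAX_FEATURES", "25000")),
        "batch_size": int(env.get("BATCH_SIZE", "10")),
        "learning_rate": float(env.get("LEARNING_RATE", "0.0005")),
        "epochs": int(env.get("EPOCHS", "50")),
    }


# Passing a plain dict makes the function easy to try without touching the real environment
settings = load_settings({"BATCH_SIZE": "5"})
print(settings["batch_size"], settings["learning_rate"])  # 5 0.0005
```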
Training Data (inputdata.csv):
subject,agent_group,priority
"Computer won't start",Hardware Support,High
"Password reset needed",IT Helpdesk,Medium
"New software request",Software Team,Low
Group Mappings (workspace_agent_groups.csv):
group_name,group_id,workspace_id
Hardware Support,101,1
IT Helpdesk,102,1
Software Team,103,2
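To show how the group mapping file could be used, the sketch below loads it into a lookup table so that a predicted group name resolves to its group_id and workspace_id. The helper name `load_group_mappings` is hypothetical, and the sample is read from a string here so the example runs without any files on disk.

```python
import csv
import io

SAMPLE = """group_name,group_id,workspace_id
Hardware Support,101,1
IT Helpdesk,102,1
Software Team,103,2
"""


def load_group_mappings(csv_file):
    """Map each group_name to its (group_id, workspace_id) pair."""
    reader = csv.DictReader(csv_file)
    required = {"group_name", "group_id", "workspace_id"}
    missing = required - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    return {
        row["group_name"]: (int(row["group_id"]), int(row["workspace_id"]))
        for row in reader
    }


mappings = load_group_mappings(io.StringIO(SAMPLE))
print(mappings["Software Team"])  # (103, 2)
```

With a real file, `open("workspace_agent_groups.csv", newline="")` can be passed in place of the `StringIO` object.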
The system supports integration with ticketing platforms through REST APIs. Currently includes:
- Freshservice: Full integration with ticket fetching and updating
- Generic REST API: Easily adaptable to other platforms
- Implement API client in main.py
- Update configuration for your platform
- Modify ticket processing logic as needed
- Multi-task Loss: Combined cross-entropy for both tasks
- Early Stopping: Prevents overfitting with patience mechanism
- Validation Split: 20% of data reserved for validation
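The loss combination and the patience mechanism can be sketched in plain Python. Both pieces are assumptions for illustration: the README does not say whether the two cross-entropy terms are weighted, so `combined_loss` defaults to an unweighted sum, and the `EarlyStopping` class and its patience value are hypothetical rather than taken from train_model.py.

```python
def combined_loss(agent_loss, priority_loss, priority_weight=1.0):
    """Combine the two per-task losses into one training objective."""
    return agent_loss + priority_weight * priority_loss


class EarlyStopping:
    """Signal a stop after `patience` epochs without validation improvement."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss  # improvement: a checkpoint would be saved here
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=2)
for epoch, loss in enumerate([1.89, 1.52, 1.55, 1.60, 1.40], start=1):
    if stopper.step(loss):
        print(f"Stopping at epoch {epoch}, best validation loss {stopper.best_loss}")
        break
```

In this run the loss stops improving after epoch 2, so training halts at epoch 4 and never reaches the fifth value.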
Use the Gradio interface to:
- Test edge cases and unusual ticket subjects
- Evaluate prediction confidence scores
- Compare against human assignments
- New Prediction Tasks: Extend the model architecture
# Test individual components
python -c "from train_model import load_model_components; print('✓ Model loads correctly')"
# Validate data format
python -c "import pandas as pd; df = pd.read_csv('inputdata.csv'); print(f'✓ Data shape: {df.shape}')"
- API Keys: Never commit .env files or hardcode credentials
- Data Privacy: Ensure training data doesn't contain sensitive information
- Access Control: Limit API permissions to minimum required scope
- Logging: Avoid logging sensitive ticket content
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
# Install development dependencies
pip install -r requirements-dev.txt
torch>=2.0.0
scikit-learn>=1.3.0
pandas>=2.0.0
numpy>=1.24.0
requests>=2.31.0
python-dotenv>=1.0.0
gradio>=4.0.0
# For development
pytest>=7.0.0
black>=23.0.0
flake8>=6.0.0
# For enhanced features
transformers>=4.30.0 # For BERT-based models
plotly>=5.0.0 # For training visualizations
| Ticket Subject | Predicted Group | Predicted Priority | Confidence |
|---|---|---|---|
| "Server is down - urgent!" | Infrastructure | High | 94% |
| "Password reset request" | IT Helpdesk | Medium | 87% |
| "New laptop setup needed" | Hardware Support | Low | 91% |
- Agent Group Accuracy: ~85-90% on validation set
- Priority Accuracy: ~80-85% on validation set
- Processing Speed: ~100 tickets/minute
- API Response Time: <2 seconds per ticket
ModuleNotFoundError: Install missing dependencies
pip install -r requirements.txt
CUDA out of memory: Reduce batch size in configuration
CONFIG['batch_size'] = 5 # Reduce from default 10
API authentication errors: Check your .env file
# Verify API key is correct
curl -u "your_api_key:X" https://yourdomain.freshservice.com/api/v2/tickets
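The `curl -u "your_api_key:X"` form sends HTTP Basic authentication with the API key as the username and a placeholder password. A small sketch of the equivalent header in Python follows; `basic_auth_header` is a hypothetical helper, and with the requests library the same effect is usually achieved by passing `auth=(api_key, "X")` to `requests.get`.

```python
import base64


def basic_auth_header(api_key, password="X"):
    """Build the HTTP Basic Authorization header that `curl -u key:X` sends."""
    credentials = f"{api_key}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder key; substitute the value from your .env file
headers = basic_auth_header("your_api_key")
print(headers)
```

If a request made with this header still returns 401, the key itself or the domain in .env is the likely culprit rather than the request format.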
Model file not found: Train the model first
python train_model.py
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with PyTorch for deep learning
- Gradio for the interactive interface
- scikit-learn for preprocessing utilities
- 📧 Email: [email protected]
- 💬 Issues: GitHub Issues
- 📖 Documentation: Wiki
⭐ If this project helps you, please consider giving it a star!