This repository contains a production-ready Gradient Boosting Machine (GBM) model deployed as an API service using Docker.
- The model has been approved by the business partner and should not be modified
- Do not add or use additional data
- The API should be able to handle both single requests and batch requests of up to 10,000 rows
- Code should work with Python 3.10 and 3.11
- A business outcome of 1 is assigned if the predicted probability meets or exceeds the 0.75 threshold (see the sketch below)
- A business outcome of 0 is assigned if the threshold is not met
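As a minimal sketch of this rule (illustrative only; the approved model and the actual code in `src/model.py` are not shown here):

```python
# Illustrative only: the documented 0.75 decision threshold.
THRESHOLD = 0.75

def business_outcome(prediction: float) -> int:
    """Return 1 when the predicted probability meets or exceeds the threshold, else 0."""
    return int(prediction >= THRESHOLD)
```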
The output JSON contains the following fields:
- `business_outcome`: 0 or 1, based on the 0.75 threshold
- `prediction`: the predicted probability
- `feature_inputs`: the specific feature input values used in the classification
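For illustration, a hypothetical Pydantic schema matching these fields (the actual schema in `src/api.py` may differ):

```python
# Hypothetical response schema; field names follow the documented output JSON.
from pydantic import BaseModel


class PredictionResponse(BaseModel):
    business_outcome: int           # 0 or 1, based on the 0.75 threshold
    prediction: float               # predicted probability
    feature_inputs: dict[str, str]  # feature values echoed back from the request
```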
- Python 3.10 or 3.11
- Docker
- Required Python packages listed in `requirements.txt`
- Port 8080 for API service
.
├── src/ # Source code for the API and model
│ ├── api.py # FastAPI implementation
│ ├── model.py # Model loading and prediction logic
│ └── utils.py # Helper functions
├── tests/ # Unit tests
├── docs/ # Additional documentation
├── output/ # Model predictions
│ └── verify_test_inference.json # Test output file
└── requirements.txt # Python dependencies
- Clone the repository:
git clone <repository-url>
cd gbm-model-api
- Build and run the Docker container:
./run_api.sh
The API accepts both single and batch prediction requests on port 8080 and is designed to handle up to 10,000 rows, either in a single batch call or as individual calls.
Single prediction request:

curl --request POST \
--url http://localhost:8080/predict \
--header 'content-type: application/json' \
--data '{"x_0": "81.81515", "x_1": "0.66585831", ..., "x_99": "0.055098569"}'
Batch prediction request:

curl --request POST \
--url http://localhost:8080/predict \
--header 'content-type: application/json' \
--data '[{"x_0": "81.81", ...}, {"x_0": "-57.78", ...}]'
Example response:

{
"business_outcome": 1,
"prediction": 0.85,
"feature_inputs": {
"x_0": "81.81515",
"x_1": "0.66585831"
}
}
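For reference, a minimal sketch of an endpoint that accepts either a single JSON object or a list (an assumed shape, not necessarily the implementation in `src/api.py`; `score_records` is a hypothetical wrapper around the approved model):

```python
# Sketch of a /predict endpoint that handles both single and batch payloads.
from fastapi import FastAPI, Request

app = FastAPI()


@app.post("/predict")
async def predict(request: Request):
    payload = await request.json()
    is_batch = isinstance(payload, list)
    records = payload if is_batch else [payload]

    if len(records) > 10_000:
        return {"error": "batch size exceeds 10,000 rows"}

    # score_records is hypothetical: it calls the model and applies the 0.75 threshold.
    results = score_records(records)
    return results if is_batch else results[0]
```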
Run the test suite:
python -m pytest tests/
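A hypothetical test of the documented threshold rule might look like this (names are illustrative, not the contents of `tests/`):

```python
# tests/test_threshold.py -- illustrative only.
def apply_threshold(probability: float, threshold: float = 0.75) -> int:
    """Reference implementation of the documented business rule."""
    return int(probability >= threshold)


def test_probability_at_threshold_maps_to_one():
    assert apply_threshold(0.75) == 1


def test_probability_below_threshold_maps_to_zero():
    assert apply_threshold(0.7499) == 0
```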
Test your API with the included `candidate_27_test_inference.json` file:
python sample_api_invoke.py
The output will be saved as `verify_test_inference.json` in the `output/` directory.
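Conceptually, the invocation script posts the test records and writes the response; a hedged sketch (the actual `sample_api_invoke.py` may differ):

```python
# Sketch: post the test records to the API and save the predictions.
# Assumes the `requests` package; file names follow this README.
import json

import requests

with open("candidate_27_test_inference.json") as f:
    records = json.load(f)

response = requests.post("http://localhost:8080/predict", json=records, timeout=60)
response.raise_for_status()

with open("output/verify_test_inference.json", "w") as f:
    json.dump(response.json(), f, indent=2)
```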
The service relies on the following design choices:
- Docker Containerization
  - Ensures consistent deployment across environments
  - Enables easy scaling through container orchestration
- FastAPI Framework
  - Async support for high-performance concurrent requests
  - Automatic OpenAPI documentation
  - Built-in validation and serialization
- Batch Processing
  - Supports both single and batch predictions
  - Optimized for handling large datasets efficiently (see the vectorized scoring sketch after this list)
  - Capable of processing 10,000 rows per batch
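The vectorized scoring sketch referenced above (assuming a scikit-learn-style `predict_proba` on the approved GBM; this is not the actual `src/model.py` code):

```python
# Sketch: score an entire batch with one vectorized call instead of row-by-row.
import pandas as pd


def score_batch(model, records: list[dict]) -> list[dict]:
    X = pd.DataFrame(records).astype(float)       # one frame for the whole batch
    probabilities = model.predict_proba(X)[:, 1]  # single vectorized model call
    return [
        {
            "business_outcome": int(p >= 0.75),
            "prediction": float(p),
            "feature_inputs": record,
        }
        for p, record in zip(probabilities, records)
    ]
```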
Possible future enhancements for scaling include:
- Load Balancing
  - Implement Kubernetes for container orchestration
  - Use horizontal pod autoscaling based on CPU/memory metrics
- Caching Layer
  - Add Redis for caching frequent predictions (see the cache sketch after this list)
  - Implement a model prediction caching strategy
- Monitoring and Logging
  - Implement Prometheus metrics
  - Set up the ELK stack for centralized logging
  - Add detailed performance monitoring
- API Gateway
  - Rate limiting
  - Request throttling
  - Authentication/authorization
- Model Serving Optimization
  - Model quantization for faster inference
  - GPU support for larger batch processing
  - Model versioning and A/B testing capability
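The cache sketch referenced above: a hedged illustration of a Redis-backed prediction cache keyed on the request features (the `redis` client, key scheme, and `score_record` helper are assumptions, not existing code):

```python
# Sketch: cache predictions for repeated feature payloads (not part of the current codebase).
import hashlib
import json

import redis

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
TTL_SECONDS = 3600


def cached_predict(features: dict) -> dict:
    # Deterministic key: hash of the sorted feature payload.
    key = "pred:" + hashlib.sha256(json.dumps(features, sort_keys=True).encode()).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = score_record(features)  # hypothetical call into the approved model
    cache.setex(key, TTL_SECONDS, json.dumps(result))
    return result
```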
- PEP 8 compliant
- Type hints for better code maintainability
- Comprehensive docstrings
- Unit test coverage for critical components
- Input validation and sanitization
- Rate limiting
- Error handling without exposing internal details
- Secure dependencies with regular updates
- Request/response logging (see the middleware sketch after this list)
- Performance metrics
- Error tracking
- Resource utilization monitoring
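The middleware sketch referenced above: one possible way to log each request and its latency in FastAPI (illustrative, not the current implementation):

```python
# Sketch: log method, path, status code, and latency for every request.
import logging
import time

from fastapi import FastAPI, Request

logger = logging.getLogger("api")
app = FastAPI()


@app.middleware("http")
async def log_requests(request: Request, call_next):
    start = time.perf_counter()
    response = await call_next(request)
    elapsed_ms = (time.perf_counter() - start) * 1000
    logger.info(
        "%s %s -> %s in %.1f ms",
        request.method, request.url.path, response.status_code, elapsed_ms,
    )
    return response
```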
- Ensure Docker is installed and running
- Clone the repository
- Navigate to the project directory
- Run the deployment script:
./run_api.sh
Important: For security scanning during submission, rename code file extensions:
- Python files: `.py` → `.txt` (e.g., `preprocess.py` → `preprocess.txt`)
- Shell scripts: `.sh` → `.txt` (e.g., `run_api.sh` → `run_api.txt`)
- Model files: `.pkl` → `.txt` (e.g., `model.pkl` → `model.txt`)
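A small hypothetical helper for the renaming step (convenience only; run it from the project root):

```python
# Hypothetical helper: rename .py, .sh, and .pkl files to .txt for the security scan.
from pathlib import Path

EXTENSIONS = {".py", ".sh", ".pkl"}

files = [p for p in Path(".").rglob("*") if p.is_file() and p.suffix in EXTENSIONS]
for path in files:
    target = path.with_suffix(".txt")
    path.rename(target)
    print(f"renamed {path} -> {target}")
```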
Required submission files:
- Python files for API, data prep, and tests
- Pickle files (if applicable)
- Dockerfile
- `requirements.txt`
- `run_api.sh` startup script
- `verify_test_inference.json` output file
Common issues and solutions:
- Port 8080 already in use:
  - Change the port in the Docker configuration
  - Stop any existing services using port 8080
- Memory issues during batch processing:
  - Adjust the batch size in the API configuration (see the chunking sketch after this list)
  - Increase the Docker container memory limits
- Slow prediction times:
  - Check system resources
  - Monitor API metrics
  - Consider scaling horizontally
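The chunking sketch referenced above: bounding memory by scoring a large batch in fixed-size slices (`score_records` is a hypothetical wrapper around the approved model):

```python
# Sketch: process a large batch in fixed-size chunks to bound memory use.
def predict_in_chunks(records: list[dict], chunk_size: int = 1000) -> list[dict]:
    results: list[dict] = []
    for start in range(0, len(records), chunk_size):
        chunk = records[start:start + chunk_size]
        results.extend(score_records(chunk))  # hypothetical model call
    return results
```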
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
[Your License Here]