This project implements an automated CI/CD pipeline to integrate and deploy a sentiment analysis LLM from HuggingFace, continuously evaluating a company's sentiment across social media posts.
The application is composed of two main parts:
- The tracking and scheduling server, which runs Apache Airflow and MLflow
- The ModelRunner server, on which the inference and training endpoints run
For the scope of this project, both components live in the same repository under different subdirectories, as this made testing easier via Docker Compose.
For a production-ready deployment, however, they should be separated into different repositories so that code from one application does not end up in the deployment base of the other.
Airflow is mainly used for scheduling different operations:
- Weekly re-training of the LLM
  - Intended to be used with fresh datasets containing new samples, which is not implemented at this stage
  - Sends requests to the ModelRunner server's endpoints to initiate the training and continuously polls for status updates until the training completes or fails (see the sketch after this list)
  - If training fails, an alert is sent via email
- Model performance evaluation
  - Checks whether the accuracy of the model over the last X trainings has worsened by a given threshold; in that case, it alerts via email
- Sentiment Inspector
  - Checks social media posts to evaluate customers' sentiment towards the company
  - Currently, only the Twitter API is implemented, but it can easily be extended to other platforms
  - Could also be used to collect samples for new datasets used in weekly re-trainings (not implemented)
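A minimal sketch of the trigger-and-poll logic the re-training DAG runs, assuming illustrative endpoint names (`/train`, `/get_state`) and host; the real names may differ:

```python
import time

import requests

MODELRUNNER_URL = "http://modelrunner:7860"  # placeholder host/port


def trigger_training_and_wait(poll_interval: int = 60, timeout: int = 6 * 3600) -> None:
    """Start a training run on ModelRunner and poll until it finishes.

    Intended to be called from a PythonOperator/@task inside the weekly DAG;
    raising an exception fails the task, which triggers the email alert.
    """
    resp = requests.post(f"{MODELRUNNER_URL}/train")
    resp.raise_for_status()

    deadline = time.time() + timeout
    while time.time() < deadline:
        state = requests.get(f"{MODELRUNNER_URL}/get_state").json()["state"]
        if state == "finished":
            return
        if state == "error":
            raise RuntimeError("ModelRunner reported a failed training run")
        time.sleep(poll_interval)
    raise TimeoutError("Training did not complete within the allotted time")
```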
MLflow keeps track of the training logs and metrics, giving an overall picture of how the model performs on sample datasets.
Inference metrics will also be stored here, allowing the company to track what people think or feel about it over time.
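Logging to a remote MLflow tracking server looks roughly like this (the tracking URI, experiment name, and metric names are assumptions, not the project's actual values):

```python
import mlflow

# Point the client at the tracking server (placeholder address).
mlflow.set_tracking_uri("http://tracking-server:5000")
mlflow.set_experiment("sentiment-llm")

with mlflow.start_run():
    # Per-epoch training metrics...
    mlflow.log_metric("eval_accuracy", 0.91, step=1)
    # ...and aggregated inference-time sentiment scores.
    mlflow.log_metric("mean_sentiment", 0.34)
```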
ModelRunner uses FastAPI and Uvicorn to host a self-contained training and inference infrastructure.
When Airflow sends a training request, ModelRunner receives it and schedules a training right away, unless one is already running. In either case, it gives direct feedback on whether the request was accepted or discarded by returning an appropriate response.
Training may take a long time, depending on the performance of the server the application is deployed on. For this reason, a "get_state" endpoint was added so that Airflow can poll it at regular intervals to see whether the training is still ongoing, has finished, or has ended with an error.
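A condensed sketch of how these two endpoints could look (simplified for illustration; the endpoint names and state handling are assumptions about the actual implementation):

```python
import threading

from fastapi import BackgroundTasks, FastAPI
from fastapi.responses import JSONResponse

app = FastAPI()
state = {"status": "idle"}  # idle | running | finished | error
lock = threading.Lock()


def run_training() -> None:
    try:
        # Fine-tuning happens here (omitted in this sketch).
        ...
        state["status"] = "finished"
    except Exception:
        state["status"] = "error"


@app.post("/train", status_code=202)
def train(background_tasks: BackgroundTasks):
    with lock:
        if state["status"] == "running":
            # Direct feedback: the request is discarded, a run is in progress.
            return JSONResponse(status_code=409, content={"accepted": False})
        state["status"] = "running"
    background_tasks.add_task(run_training)
    return {"accepted": True}


@app.get("/get_state")
def get_state():
    return {"state": state["status"]}
```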
Once the training completes successfully, the updated model is pushed to HuggingFace, and the Inference class reloads it to stay up to date.
For simplicity, it pulls the model back from HuggingFace, but a more efficient approach would be to directly reuse the model that was just trained.
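In transformers terms, the push-and-reload round trip looks roughly like this (the repository name and checkpoint path are placeholders):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

REPO_ID = "my-org/sentiment-llm"  # placeholder Hub repository

# After training: publish the fine-tuned weights (loaded here from a
# local checkpoint for the sake of a self-contained example).
model = AutoModelForSequenceClassification.from_pretrained("./checkpoint")
tokenizer = AutoTokenizer.from_pretrained("./checkpoint")
model.push_to_hub(REPO_ID)
tokenizer.push_to_hub(REPO_ID)

# In the Inference class: pull the freshly pushed model back from the Hub.
model = AutoModelForSequenceClassification.from_pretrained(REPO_ID)
tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
```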
The pipeline makes use of GitHub Actions, which takes care of the following:
- For pull requests, run the ModelRunner Python integration tests
- For merged PRs, sync the repository with the HuggingFace Space
The HuggingFace Space is where the application (the ModelRunner) is deployed; it runs within a Docker container and is accessible via http://xxx:7860/xxxx. Every time a PR is merged into this GitHub repository, a CI job automatically pushes it to HuggingFace, keeping the deployed application in sync.
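The sync job follows the pattern from the HuggingFace Spaces docs listed below; a minimal version, with the user, Space, and secret names as placeholders:

```yaml
name: Sync to HuggingFace Space
on:
  push:
    branches: [main]

jobs:
  sync-to-hub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0  # full history is needed to push to the Space
      - name: Push to HuggingFace
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: git push https://USER:$HF_TOKEN@huggingface.co/spaces/USER/SPACE main
```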
Resources used for this project:
- https://github.com/peter-evans/docker-compose-actions-workflow
- https://github.com/marketplace/actions/run-pytest
- https://huggingface.co/docs/hub/spaces-github-actions
- https://developer.x.com/en/docs/x-api
- https://huggingface.co/docs/hub/spaces-sdks-docker-examples
- https://huggingface.co/spaces/SpacesExamples/fastapi_dummy/tree/main
- https://mlflow.org/docs/latest/tracking/tutorials/remote-server/
- https://stackoverflow.com/questions/75118992/docker-error-response-from-daemon-could-not-select-device-driver-with-capab
- https://huggingface.co/docs/transformers/en//training
- https://mlflow.org/docs/latest/llms/transformers/tutorials/fine-tuning/transformers-fine-tuning
This is the final project I made as part of Profession.AI's AI Engineering course, Module 5, "MLOps and Machine Learning in Production".