- An exploratory project that supports labeling for object detection and segmentation datasets
- Leverages SAM2 and YOLO-World models to automatically generate bounding boxes and segmentation masks
Note
This is an experimental project for learning purposes.
Demo video: demo-yolo-world-auto-label.mp4
Tip
Future Work: To reduce auto-labeling latency, consider implementing model warm-up or a caching strategy.
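The warm-up/caching idea in the tip above can be sketched with a memoized loader. This is a minimal illustration, not the project's code: `load_model` is a hypothetical stand-in for loading YOLO-World / SAM2 weights.

```python
from functools import lru_cache
import time

@lru_cache(maxsize=2)
def load_model(name: str) -> str:
    # Hypothetical loader standing in for loading YOLO-World / SAM2 weights.
    time.sleep(0.1)  # simulate expensive model initialization
    return f"model:{name}"

def warm_up() -> None:
    # Call once at service startup so the first real request hits a warm cache.
    for name in ("yolo-world", "sam2"):
        load_model(name)

warm_up()
print(load_model("yolo-world"))  # served from cache, no reload
```

With `lru_cache`, the expensive load runs once per model at startup; later requests reuse the cached object instead of paying the initialization cost.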
Demo video (gray background): demo-yolo-world-auto-label-gray-bg.mp4
Tech stack
- Python & FastAPI: Backend code and API
- MongoDB: NoSQL database for flexible data schema
- GraphQL: Flexible API query language
- Docker: Containerization for development and deployment
- Ultralytics: Computer vision model support (YOLO-World, SAM)
- Google Cloud Platform: Cloud infrastructure and services
- GitHub Actions: CD pipeline automation
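To illustrate the GraphQL item above, a dataset query might look like the following. This is a hypothetical sketch: the field names (`dataset`, `images`, `annotations`, `bbox`) are illustrative, not the project's actual schema.

```graphql
query DatasetImages {
  dataset(id: "<dataset-id>") {
    name
    images(limit: 20) {
      url
      annotations {
        label
        bbox
      }
    }
  }
}
```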
Build images and deploy services to GCP
- Build images and upload to the registry
  - Use GitHub Actions to build new images when corresponding changes are made
  - Upload images to the GCP Artifact Registry
- Deploy
  - Backend app -> Serverless (Cloud Run)
  - Inference gateway -> Serverless (Cloud Run)
  - Inference services -> VM (Compute Engine)
  - Utilize Watchtower to update to the latest images
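The build-and-push step above could look roughly like this GitHub Actions fragment. It is a sketch under assumptions: the registry path, `PROJECT_ID`, secret name, and service layout are placeholders, not the project's actual workflow.

```yaml
# Hypothetical workflow fragment; registry path and secret names are assumptions.
name: build-backend-image
on:
  push:
    paths:
      - "backend/**"          # rebuild only when backend code changes
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Authenticate to GCP
        uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}
      - name: Build and push to Artifact Registry
        run: |
          docker build -t us-docker.pkg.dev/PROJECT_ID/labeling/backend:latest backend/
          docker push us-docker.pkg.dev/PROJECT_ID/labeling/backend:latest
```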
Running locally, with GCS resources on the cloud
1. Set up environment variables

   ```shell
   cp backend/app/.env.example backend/app/dev.env
   # And fill in the env vars
   ```

2. Set up credentials
   - Place the GCP service account JSON at `backend/app/gcp_service_account.json`
   - Place the Clerk JWT signing key at `backend/app/jwt_public.pem`

3. Start Docker containers for the backends

   ```shell
   docker-compose up
   ```

4. Start the frontend

   ```shell
   cd frontend/app
   npm i
   npm run dev
   ```

Future work
- Integrate YOLO-World with SAM to support segmentation masks
- Performance optimizations
- Model warm-up or caching strategy to reduce inference latency
- Pagination (since a dataset might contain many images)
- Feature to export labeled datasets
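The pagination item above could be sketched as a simple offset-cursor helper. This is an in-memory illustration only; the real backend would translate the cursor into a MongoDB `skip`/`limit` query.

```python
from typing import Any

def paginate(items: list[Any], cursor: int = 0, limit: int = 20) -> dict:
    """Offset-cursor pagination sketch; a None next_cursor means the last page."""
    page = items[cursor:cursor + limit]
    next_cursor = cursor + limit if cursor + limit < len(items) else None
    return {"items": page, "next_cursor": next_cursor}

images = [f"img_{i}.jpg" for i in range(45)]
first = paginate(images, cursor=0, limit=20)
print(len(first["items"]), first["next_cursor"])  # 20 20
```

Returning a `next_cursor` instead of a page count lets the frontend fetch pages lazily and stop when the cursor comes back empty.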
