[Feat] Add production-ready vLLM Nebius MK8s terraform tutorial #748
base: main
Conversation
Summary of Changes

Hello @brokedba, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly expands the vLLM Production Stack's cloud provider coverage by introducing a production-ready Terraform stack for Nebius MK8s. It enables users to deploy GPU-accelerated vLLM inference environments on Nebius, complete with automated infrastructure provisioning, secure ingress, and comprehensive monitoring, offering a turnkey solution for cost-efficient AI/ML deployments.

Highlights
Code Review
This pull request adds a comprehensive Terraform stack for deploying vLLM on Nebius MK8s, which is a great addition. The code is well-structured and covers infrastructure provisioning, Kubernetes add-ons, and the vLLM application stack. My review focuses on several critical issues that could prevent the stack from deploying correctly, such as incorrect provider configurations and hardcoded values that should be variables. I've also pointed out several areas where the documentation and comments are misleading due to copy-pasting from other cloud provider examples (AWS, EKS, AKS), which could cause significant confusion for users. Finally, there are some suggestions for code cleanup and modernization, like removing commented-out code and replacing deprecated data sources. Addressing these points will significantly improve the robustness, maintainability, and user-friendliness of this new Nebius tutorial.
Resolved review threads (outdated):
- tutorials/terraform/nebius/config/llm-stack/helm/gpu/gpu-operator-values.yaml
- tutorials/terraform/nebius/config/llm-stack/helm/gpu/gpu-tinyllama-light-ingress-nebius.tpl
Force-pushed: dc22dde to 36e8f53
zerofishnoodles left a comment:
LGTM, would you be able to show a demo for our next community meeting?
@zerofishnoodles Absolutely, looking forward to it.
zerofishnoodles left a comment:
LGTM
Hi, can you update the branch?
Includes:
- GPU autoscaling support
- Secure ingress + TLS
- Prometheus + Grafana monitoring
- Built-in vLLM Grafana dashboards
- Terraform + Helm integration

Signed-off-by: Kosseila (CloudThrill) <[email protected]>
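The Terraform + Helm integration listed above is typically wired through `helm_release` resources. The following is a minimal, hypothetical sketch (release names, chart repository URLs, and the values file path are assumptions for illustration, not the PR's actual code):

```hcl
# Hypothetical sketch of the Terraform + Helm integration described above.
# Release names, repositories, and values paths are assumptions, not the PR's code.
resource "helm_release" "vllm_stack" {
  name             = "vllm"
  repository       = "https://vllm-project.github.io/production-stack"
  chart            = "vllm-stack"
  namespace        = "vllm"
  create_namespace = true

  # GPU model settings and ingress/TLS options would come from a values file
  values = [file("${path.module}/config/llm-stack/helm/gpu/gpu-operator-values.yaml")]
}

# Prometheus + Grafana monitoring via the community kube-prometheus-stack chart
resource "helm_release" "kube_prometheus_stack" {
  name             = "monitoring"
  repository       = "https://prometheus-community.github.io/helm-charts"
  chart            = "kube-prometheus-stack"
  namespace        = "monitoring"
  create_namespace = true
}
```

Grafana dashboards for vLLM can then be shipped as ConfigMaps or chart values consumed by the monitoring release.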
Force-pushed: 36e8f53 to f4cbeb3
Just did; it should be good. No conflicts with the base branch.
📋 Summary
This PR adds a complete Nebius MK8s deployment tutorial for the vLLM Production Stack - extending support to another modern Kubernetes cloud provider with GPU acceleration.
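At a high level, such a stack declares the Nebius provider and an MK8s cluster with a GPU node group. The sketch below is purely illustrative: the provider source address, resource names, and attributes are hypothetical placeholders, not the PR's actual code:

```hcl
terraform {
  required_providers {
    nebius = {
      source = "nebius/nebius" # assumption: provider source address
    }
  }
}

# Hypothetical shape of an MK8s cluster with a GPU node group;
# resource and attribute names here are illustrative only.
resource "nebius_mk8s_cluster" "this" {
  name = var.cluster_name
}

resource "nebius_mk8s_node_group" "gpu" {
  cluster_id = nebius_mk8s_cluster.this.id
  # e.g. an NVIDIA GPU platform/preset selected via variables,
  # with autoscaling bounds for cost-efficient inference
}
```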
Contributed on behalf of CloudThrill — Cloud infrastructure specialists focused on production-grade AI/ML deployments.
🎯 What This Adds
✅ New tutorial path structure
🚀 Core Features
🏗️ Technical Highlights
Additional Notes:
✅ Why This Matters
📚 Included Documentation
terraform.tfvars example
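The documentation references a `terraform.tfvars` example. A hypothetical sketch of what such a file might contain (all variable names and values below are assumptions, not the PR's actual inputs):

```hcl
# Hypothetical terraform.tfvars sketch; variable names are illustrative only.
cluster_name      = "vllm-mk8s"
region            = "eu-north1"        # assumed Nebius region identifier
gpu_node_count    = 1
gpu_platform      = "gpu-h100"         # assumed GPU preset name
enable_monitoring = true
ingress_domain    = "vllm.example.com"
tls_email         = "admin@example.com"
```

Users would copy the example file, adjust these values, and run `terraform init` and `terraform apply` to provision the stack.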