Add max_model_len field support to router #638
base: main
| Diff line change |
|---|
| @@ -0,0 +1,44 @@ |
| name: Build Custom Router Image |
| |
| on: |
| push: |
| branches: |
| - fix-max-model-len |
| - main |
| workflow_dispatch: |
| |
| jobs: |
| build: |
| permissions: |
| contents: read |
| packages: write |
| runs-on: ubuntu-latest |
| steps: |
| - name: Checkout repository |
| uses: actions/checkout@v4 |
| with: |
| fetch-depth: 0 |
| |
| - name: Set up Docker Buildx |
| uses: docker/setup-buildx-action@v3 |
| |
| # Login to GitHub Container Registry (GHCR) |
| - name: Login to GHCR |
| uses: docker/login-action@v3 |
| with: |
| registry: ghcr.io |
| username: ${{ github.actor }} |
| password: ${{ secrets.GITHUB_TOKEN }} |
| |
| - name: Build and push image |
| uses: docker/build-push-action@v5 |
| with: |
| context: . |
| file: docker/Dockerfile |
| push: true |
| tags: \| |
| ghcr.io/${{ github.repository }}/router:latest |
| ghcr.io/${{ github.repository }}/router:max-model-len-fix |
| ghcr.io/${{ github.repository }}/router:${{ github.sha }} |
| cache-from: type=registry,ref=ghcr.io/${{ github.repository }}/router:buildcache |
| cache-to: type=registry,ref=ghcr.io/${{ github.repository }}/router:buildcache,mode=max |

| Diff line change |
|---|
| @@ -50,6 +50,7 @@ class ModelInfo: |
| root: Optional[str] = None |
| parent: Optional[str] = None |
| is_adapter: bool = False |
| max_model_len: Optional[int] = None |

Contributor

While

| Diff line change |
|---|
| |
| @classmethod |
| def from_dict(cls, data: Dict) -> "ModelInfo": |
| @@ -62,6 +63,7 @@ def from_dict(cls, data: Dict) -> "ModelInfo": |
| root=data.get("root", None), |
| parent=data.get("parent", None), |
| is_adapter=data.get("parent") is not None, |
| max_model_len=data.get("max_model_len", None), |
| ) |
| |
| def to_dict(self) -> Dict: |
| @@ -74,6 +76,7 @@ def to_dict(self) -> Dict: |
| "root": self.root, |
| "parent": self.parent, |
| "is_adapter": self.is_adapter, |
| "max_model_len": self.max_model_len, |
| } |

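The hunks above add one optional field to `ModelInfo` and thread it through both `from_dict` and `to_dict`. A minimal, self-contained sketch of that round trip follows. The class here is a trimmed stand-in: the `id` field is an assumption for illustration, since the diff shows only part of the real `ModelInfo`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ModelInfo:
    """Trimmed stand-in: fields visible in the diff plus an assumed `id`."""

    id: str  # assumed field, not shown in the diff
    root: Optional[str] = None
    parent: Optional[str] = None
    is_adapter: bool = False
    max_model_len: Optional[int] = None  # the field this PR adds

    @classmethod
    def from_dict(cls, data: Dict) -> "ModelInfo":
        # A missing "max_model_len" key falls back to None, as in the diff.
        return cls(
            id=data["id"],
            root=data.get("root", None),
            parent=data.get("parent", None),
            is_adapter=data.get("parent") is not None,
            max_model_len=data.get("max_model_len", None),
        )

    def to_dict(self) -> Dict:
        return {
            "id": self.id,
            "root": self.root,
            "parent": self.parent,
            "is_adapter": self.is_adapter,
            "max_model_len": self.max_model_len,
        }


# Payload that reports a context length, and one that omits it.
with_len = ModelInfo.from_dict({"id": "example-model", "max_model_len": 8192})
without_len = ModelInfo.from_dict({"id": "example-model"})
```

Because the field defaults to `None`, payloads from backends that do not report `max_model_len` still parse, and the key is always present in the serialized dictionary.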
This change correctly passes the `max_model_len` to the `ModelCard`. To ensure this functionality is robust and to prevent future regressions, it would be beneficial to add a unit test for the `/v1/models` endpoint. The test should verify that when a model's `ModelInfo` includes a `max_model_len`, this value is correctly included in the API response.
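One possible shape for such a test, sketched with only the standard library. `list_models_payload` is a hypothetical stand-in for the real `/v1/models` handler, and the `ModelInfo` below is a reduced copy with an assumed `id` field; an actual test would exercise the router's endpoint through its own test client instead.

```python
import unittest
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ModelInfo:
    """Reduced stand-in for the router's ModelInfo."""

    id: str  # assumed field
    max_model_len: Optional[int] = None


def list_models_payload(models: List[ModelInfo]) -> Dict:
    """Hypothetical stand-in for the /v1/models handler."""
    return {
        "object": "list",
        "data": [
            {"id": m.id, "object": "model", "max_model_len": m.max_model_len}
            for m in models
        ],
    }


class ModelsEndpointTest(unittest.TestCase):
    def test_max_model_len_is_forwarded(self):
        payload = list_models_payload([ModelInfo(id="m1", max_model_len=4096)])
        self.assertEqual(payload["data"][0]["max_model_len"], 4096)

    def test_missing_max_model_len_is_none(self):
        payload = list_models_payload([ModelInfo(id="m2")])
        self.assertIsNone(payload["data"][0]["max_model_len"])


def run_tests() -> unittest.TestResult:
    """Run the suite programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ModelsEndpointTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Covering both the populated and the missing case guards against a regression in either direction: dropping the field from the response, or failing when a backend does not report it.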