Hi Team, I am following the guide at https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/docs/encoder_decoder.md to run an mBART model on Triton Inference Server, and the output from my model is empty. On further debugging, I realized that the example Triton server configs referenced in https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/docs/encoder_decoder.md#4-prepare-tritonserver-configs- (i.e. tensorrt_llm/triton_backend/all_models/inflight_batcher_llm) don't actually use the encoder anywhere. The input to the tensorrt_llm model is the input_ids tensor produced by the preprocessing model, and neither the preprocessing model nor the tensorrt_llm model uses the encoder parameter from config.pbtxt.
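For concreteness, this is the kind of parameters entry in tensorrt_llm/config.pbtxt I am referring to. This is a sketch from memory, so treat the exact key name (encoder_model_path) and the template variable as assumptions about the repo's current layout:

```
# Parameter in all_models/inflight_batcher_llm/tensorrt_llm/config.pbtxt
# that should point at the encoder engine (filled in by tools/fill_template.py).
parameters: {
  key: "encoder_model_path"
  value: {
    string_value: "${encoder_engine_dir}"
  }
}
```

As far as I can tell from reading the configs, nothing in the preprocessing model or the ensemble wiring consumes this parameter.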
Am I missing something?
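For reference, I am querying the server the way the guide does, through the generate endpoint of the ensemble model (the prompt below is illustrative, not my actual input):

```
curl -X POST localhost:8000/v2/models/ensemble/generate -d \
  '{"text_input": "translate English to German: The house is wonderful.", "max_tokens": 64, "bad_words": "", "stop_words": ""}'
```

and text_output in the response comes back empty.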