Replies: 2 comments
-
Thanks for opening an issue, @zhemaituk. I was able to reproduce this on both Scout and Maverick with the example provided. While this is an unusual case, Llama4 appears to be deliberately including these tokens as part of the raw model output. It's unclear to me whether this is expected behavior, so I'm hesitant to change anything on our side for now. The workaround via stop sequences is much appreciated! Will keep this issue open for visibility.
-
Moving to discussion.
-
When using `meta.llama4-scout-17b-instruct-v1:0`, the model responses may contain special tokens, such as `<|eot_id|>`.

Example:
Prompts were simplified to make the example shorter while keeping it reproducible; the actual output included a stray `<|eot_id|>` in the response text.
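A minimal reproduction along these lines might look like the following sketch, assuming boto3's Converse API; the prompt and region are placeholders, not the exact values from the report:

```python
# Minimal sketch, assuming boto3's Converse API; the prompt and region are
# placeholders, not the exact values from the report.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="meta.llama4-scout-17b-instruct-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "Reply with a one-sentence greeting."}]}
    ],
    inferenceConfig={"maxTokens": 256, "temperature": 0},
)

text = response["output"]["message"]["content"][0]["text"]
print(text)  # With Llama 4 Scout, the text may end with a stray "<|eot_id|>".
```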
The same problem is not observed when using `us.meta.llama3-2-11b-instruct-v1:0`.
Workaround:
Adding `stop=["<|eot_id|>"]` somewhat helps for this particular example, but it is not sufficient for all cases; other cases produce different special tokens. To suppress all occurrences in my cases I had to add all of these: `stop=["<|eot_id|>", "<|e.generation_suffix|>", "<|end_header_id|>", "<|eassistant<|header_end|>"]`
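If the call goes through boto3's Converse API directly, the same stop list maps to `inferenceConfig.stopSequences`; this is only a sketch, and other client libraries may spell the parameter differently:

```python
# Sketch: the four stop sequences expressed via Converse inferenceConfig.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="meta.llama4-scout-17b-instruct-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "Reply with a one-sentence greeting."}]}
    ],
    inferenceConfig={
        "maxTokens": 256,
        "stopSequences": [
            "<|eot_id|>",
            "<|e.generation_suffix|>",
            "<|end_header_id|>",
            "<|eassistant<|header_end|>",
        ],
    },
)
```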
Update

As the Converse API limits the stop list to just 4 elements, this worked much better: `stop=["<|"]`
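In the same boto3 sketch as above, the single-prefix variant would be:

```python
# Sketch: one "<|" prefix stops generation before any special token is emitted,
# staying well under the Converse API's limit on stop sequences.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="meta.llama4-scout-17b-instruct-v1:0",
    messages=[
        {"role": "user", "content": [{"text": "Reply with a one-sentence greeting."}]}
    ],
    inferenceConfig={"maxTokens": 256, "stopSequences": ["<|"]},
)
```

The trade-off is that any legitimate `<|` in the response text would also cut generation short.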