Is your feature request related to a problem? Please describe.
Currently, the dataset formatting utilities (ChatDataset, ColumnMappedTextInstructionDataset) do not truncate sequences that exceed a given length. seq_length is used only for padding, not for truncation, so longer sequences pass through at their full length.
Describe the solution you'd like
Truncation support, so that users can specify a maximum sequence length.
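
To make the request concrete, here is a rough sketch of the behavior I have in mind. It is not the library's code: the function name `truncate_and_pad`, the `truncate` flag, the field names, and the pad values are all hypothetical, and truncating from the right is just one possible policy.

```python
from typing import Dict, List


def truncate_and_pad(
    example: Dict[str, List[int]],
    seq_length: int,
    pad_token_id: int = 0,
    label_pad_id: int = -100,
    truncate: bool = True,
) -> Dict[str, List[int]]:
    """Hypothetical helper: cut each field to seq_length, then pad up to it."""
    # Assumed per-field padding values; unknown fields fall back to pad_token_id.
    pad_values = {"labels": label_pad_id, "attention_mask": 0}
    out = {}
    for key, values in example.items():
        values = list(values)
        if truncate:
            # Keep only the first seq_length tokens (right truncation).
            values = values[:seq_length]
        fill = pad_values.get(key, pad_token_id)
        # A negative repeat count yields an empty list, so sequences that are
        # already long enough are left unpadded.
        out[key] = values + [fill] * (seq_length - len(values))
    return out


if __name__ == "__main__":
    long_example = {"input_ids": [5, 6, 7, 8, 9], "labels": [5, 6, 7, 8, 9]}
    print(truncate_and_pad(long_example, seq_length=3))
```

With `truncate=False` this matches what is described above as the current behavior (padding only), so the flag could default to off to stay backward compatible.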