One quick follow-up: I just realized that the earlier message is only a warning, not an error, and that it comes from the tokenizer portion. The actual error is raised afterwards, in the model portion:
IndexError: index out of range in self
So I have two questions:
- Is there a way to just add an argument somewhere that does the truncation automatically?
- Is there a way for me to split out the tokenizer and the model, truncate in the tokenizer, and then run the truncated input through the model?
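
To make the second question concrete, here is a rough sketch of the flow I have in mind. The names `toy_tokenize`, `truncate`, `toy_model`, and `MAX_POSITIONS` are placeholders of my own, not functions from any real library, and I am assuming the IndexError happens because the input is longer than the model's maximum length:

```python
# Toy stand-ins (hypothetical, not a real library API) that illustrate
# tokenizing, truncating, and running the model as three separate steps.
MAX_POSITIONS = 8  # assumed model limit, chosen small for the example


def toy_tokenize(text):
    """Map each whitespace-separated word to an integer id."""
    return [len(word) for word in text.split()]


def truncate(token_ids, max_length):
    """Keep at most max_length token ids."""
    return token_ids[:max_length]


def toy_model(token_ids):
    """Look up one position entry per token; fails past MAX_POSITIONS."""
    position_table = list(range(MAX_POSITIONS))
    return [position_table[i] + tok for i, tok in enumerate(token_ids)]


text = "one two three four five six seven eight nine ten"
ids = toy_tokenize(text)                 # 10 ids, longer than the limit
safe_ids = truncate(ids, MAX_POSITIONS)  # cut down to 8 ids
output = toy_model(safe_ids)             # runs without an IndexError
# Calling toy_model(ids) directly would raise IndexError instead.
```

In other words, I would like to do the equivalent of the `truncate` step between the real tokenizer and the real model.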
Thank you!