Hi all,
Thanks for making this forum!
I have a list of texts, one of which apparently happens to be 516 tokens long. I have been using the feature-extraction pipeline to process the texts, just using the simple function:
nlp = pipeline('feature-extraction')
When it gets up to the long text, I get an error:
Token indices sequence length is longer than the specified maximum sequence length for this model (516 > 512). Running this sequence through the model will result in indexing errors
Alternatively, if I use the sentiment-analysis pipeline (created by nlp2 = pipeline('sentiment-analysis')), I do not get the error.
Is there a way for me to put an argument in the pipeline function to make it truncate at the max model input length? I tried reading this, but I was not sure how to keep everything else in pipeline the same/default, except for this truncation.
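
To be concrete about the behavior I am after, here is a rough sketch in plain Python. It is only an illustration of the truncation itself, not the transformers API: truncate_ids is a hypothetical helper, and the token ids (including 101 and 102 as stand-ins for the start and end special tokens) are made up.

```python
# Hypothetical helper, NOT part of transformers: it only illustrates
# the kind of truncation I would like the pipeline to apply.
def truncate_ids(token_ids, max_len=512):
    """Cut a token-id list to max_len, keeping the final special token."""
    if max_len < 2:
        raise ValueError("max_len must be at least 2")
    if len(token_ids) <= max_len:
        return list(token_ids)
    # Keep the first max_len - 1 ids, then re-append the closing token.
    return list(token_ids[: max_len - 1]) + [token_ids[-1]]


# Made-up ids: 101 and 102 stand in for the start and end special tokens.
ids = [101] + list(range(1000, 1514)) + [102]  # 516 ids, like my long text
short = truncate_ids(ids)
print(len(ids), len(short))  # 516 512
```

So the 516-token text would be cut down to 512 ids and still end with the closing special token, and shorter texts would pass through unchanged.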