Pretrained model with stride doesn't predict long text

My objective is to annotate long documents with bioformer-8L. I was told to use stride and truncation so I don't have to split my documents into chunks of 512 tokens myself.

In the training phase, I called the tokenizer like this:

tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, stride=128, return_overflowing_tokens=True, model_max_length=512, truncation=True, is_split_into_words=True)

For the prediction I do:

model = AutoModelForTokenClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, stride=128, return_overflowing_tokens=True, model_max_length=512, truncation=True, is_split_into_words=True)
ner = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="first")

But it does not work: the model stops producing annotations partway through the text. To test this, I duplicated the same sentence several times, a sentence that I know contains the annotations I am looking for.

Help please.

The solution is to move stride from from_pretrained to the pipeline call. A stride passed to from_pretrained is not applied when the pipeline tokenizes the input, whereas the token-classification pipeline accepts stride directly and uses it to process long inputs in overlapping chunks.

tokenizer = AutoTokenizer.from_pretrained(model_path, return_overflowing_tokens=True, model_max_length=512, truncation=True, is_split_into_words=True)

ner = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="first", stride=128)
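To see what stride buys you, here is a minimal, self-contained sketch of the overlapping-window idea, independent of transformers (the function name and defaults are illustrative, not the pipeline's internals): each window holds up to model_max_length tokens, and consecutive windows overlap by stride tokens so an entity cut off at one window boundary is seen whole in the next window.

```python
def make_windows(tokens, max_length=512, stride=128):
    """Split a token list into overlapping windows.

    Each window holds up to `max_length` tokens; consecutive windows
    overlap by `stride` tokens, so content near a window boundary is
    fully covered by at least one window.
    """
    if len(tokens) <= max_length:
        return [tokens]
    step = max_length - stride  # advance by this many tokens per window
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + max_length])
        if start + max_length >= len(tokens):
            break  # the final window already reaches the end
    return windows


# A 1000-token document with max_length=512 and stride=128 yields
# three windows covering tokens 0-511, 384-895, and 768-999.
tokens = list(range(1000))
windows = make_windows(tokens)
print([(w[0], w[-1]) for w in windows])  # [(0, 511), (384, 895), (768, 999)]
```

Predictions from the overlapping regions then have to be merged back into one sequence, which is exactly the bookkeeping the pipeline does for you when stride is passed to it rather than to the tokenizer constructor.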