Hello, I am trying to create a pipeline from a trained model. From what I understand, I need to provide a tokenizer so that my new input will be tokenised. I guess it should look like this:
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

model_name = "TestModel"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
classifier = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer, return_all_scores=True)
My question is: where do the other steps of the tokenisation process take place, like the padding and truncation? During training, my sequences were processed as follows:
train_encodings = tokenizer(seq_train, truncation=True, padding=True, max_length=1024, return_tensors="pt")
Is that no longer needed?
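For reference, this toy sketch (not the real tokenizer, just my understanding of it) shows what I expect truncation=True and padding=True to do, so it is clear which behaviour I am asking about:

def toy_encode(seqs, max_length=8, pad_id=0):
    # truncation: clip each sequence to at most max_length tokens
    truncated = [s[:max_length] for s in seqs]
    # padding: pad every sequence to the length of the longest one in the batch
    longest = max(len(s) for s in truncated)
    return [s + [pad_id] * (longest - len(s)) for s in truncated]

batch = toy_encode([[5, 6, 7], list(range(12))], max_length=8)
# the short sequence is padded to length 8, the long one is truncated to 8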