I am trying to load the model "roberta-large-mnli" from disk into the pipeline module. The only way I have managed to do so is the following:
model = AutoModelForSequenceClassification.from_pretrained('/path-to-roberta-large-mnli')
tokenizer = AutoTokenizer.from_pretrained('/path-to-roberta-large-mnli')
pipe = pipeline(task='zero-shot-classification', tokenizer=tokenizer, model=model)
candidate_labels = ['NEUTRAL', 'ENTAILMENT', 'CONTRADICTION']
output = pipe(data_to_test, candidate_labels)
However, this returns significantly different, and much poorer, prediction scores than when I load the model directly from the web and pass it straight to the pipeline module:
pipe = pipeline(model="roberta-large-mnli", device=device)
output = pipe(data_to_test)
I suspect the zero-shot setting is not appropriate for my case, because the model was trained for NLI and I am also testing it on entailment-type data.
Ideally, I could simply specify the model path in the model parameter of the pipeline module. However, when I do that, I get the following error:
"RuntimeError: Instantiating a pipeline without a task set raised an error: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'path-to-roberta-large-mnli'. Use repo_type argument if needed."