A few questions about beginning with Huggingface

Hi all, I picked up the recently released ‘NLP with Transformers’ book and have begun reading it. This is my first foray into NLP and Transformers, so I have a few questions:

  • When I’ve used PyTorch previously, I was able to grid-search hyperparameters by wrapping the model in a Skorch wrapper. Is there a way to grid-search hyperparameters when training transformer models?

  • I used the following code to train my first text classification model, but in order to run it on custom sentences I first had to push the model to the Hub and then load it back as a classifier. Is there a way to test the model directly after training, instead of pushing it to the Hub first?

# Now we set the trainer up
trainer = Trainer(
    model=model,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=emotions_encoded["train"],
    eval_dataset=emotions_encoded["validation"],
    tokenizer=tokenizer,
)

trainer.train()

# Load the model from the Hub and test it out
model_id = "Jimchoo91/distilbert-base-uncased-finetuned-emotion"
classifier = pipeline("text-classification", model=model_id)

# Classify a custom sentence
custom_tweet = "I really love you to pieces"
preds = classifier(custom_tweet, return_all_scores=True)

Thanks all!

Hey there, welcome to the community!

The model parameter in pipeline() can be a string (i.e. the model ID, like you used), or you can pass it the trained model object directly, along with the tokenizer:

classifier = pipeline('text-classification', model=trainer.model, tokenizer=trainer.tokenizer)

Let me know if that doesn’t end up working for you!
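On your first question: the Trainer class has a built-in hyperparameter_search() method (it requires passing a model_init function instead of a prebuilt model, plus a backend such as Optuna or Ray Tune installed). If you'd rather stay close to the sklearn GridSearch pattern you know from Skorch, you can also just loop over a parameter grid yourself. Here's a minimal sketch of that loop — `train_and_eval` is a hypothetical stand-in for rebuilding the Trainer and returning a validation metric, not a real transformers function:

```python
from itertools import product

# Hypothetical stand-in: in real code this would build fresh TrainingArguments
# and a Trainer for the given hyperparameters, run trainer.train(), and return
# a validation metric such as trainer.evaluate()["eval_f1"].
def train_and_eval(learning_rate, batch_size):
    # Dummy score so the sketch runs standalone; replace with actual training.
    return -abs(learning_rate - 3e-5) - abs(batch_size - 32) / 1000

# The grid of hyperparameters to try, GridSearchCV-style
grid = {
    "learning_rate": [1e-5, 3e-5, 5e-5],
    "batch_size": [16, 32],
}

best_score, best_params = float("-inf"), None
for values in product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    score = train_and_eval(**params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params)  # the combination with the highest validation score
```

Note that each full training run is expensive for transformer models, so people usually search a small grid (or use hyperparameter_search(), which can prune bad trials early) rather than an exhaustive sklearn-style sweep.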