Well, in both cases you need to instantiate a Trainer, with slightly different arguments. Something like this, a bit simplified.
For training:
from transformers import Trainer, TrainingArguments

# training arguments for Trainer
training_args = TrainingArguments(
    output_dir = OUTPUT_DIR,
    do_train = True,
    do_eval = True,
    per_device_train_batch_size = BATCH_SIZE,
    learning_rate = 2e-5,
    num_train_epochs = 10,
    dataloader_drop_last = False
)
# init trainer (model is the model you want to fine-tune)
trainer = Trainer(
    model = model,
    args = training_args,
    train_dataset = train_dataset,
    eval_dataset = valid_dataset,
    compute_metrics = compute_metrics
)
trainer.train()

# take care of distributed/parallel training, then save the fine-tuned model
model_to_save = trainer.model.module if hasattr(trainer.model, 'module') else trainer.model
model_to_save.save_pretrained(OUTPUT_DIR)
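By the way, compute_metrics is a function you have to define yourself; the Trainer calls it during evaluation. A minimal sketch for classification, assuming plain accuracy is all you want to report:

import numpy as np

def compute_metrics(eval_pred):
    # eval_pred has .predictions (the logits) and .label_ids (the true labels)
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    preds = np.argmax(logits, axis = -1)
    return {"accuracy": (preds == labels).mean()}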
For inference:
from transformers import AutoModelForSequenceClassification

# loading the model you previously fine-tuned
model = AutoModelForSequenceClassification.from_pretrained(OUTPUT_DIR)

# arguments for Trainer
test_args = TrainingArguments(
    output_dir = OUTPUT_DIR,
    do_train = False,
    do_predict = True,
    per_device_eval_batch_size = BATCH_SIZE,
    dataloader_drop_last = False
)
# init trainer
trainer = Trainer(
    model = model,
    args = test_args,
    compute_metrics = compute_metrics
)
test_results = trainer.predict(test_dataset)
Then, from test_results you can easily derive the predicted labels and probabilities: test_results.predictions contains the raw logits, test_results.label_ids the ground-truth labels (if your test set has them), and test_results.metrics whatever compute_metrics returned.
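For example (a minimal sketch, assuming a standard single-label classification head, so test_results.predictions holds one row of logits per example):

import numpy as np
from scipy.special import softmax

logits = test_results.predictions
probabilities = softmax(logits, axis = -1)       # per-class probabilities
predicted_labels = np.argmax(logits, axis = -1)  # index of the most likely class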
Of course you will need to set your own constants (OUTPUT_DIR, BATCH_SIZE) and datasets, and there are many more arguments that can be passed to TrainingArguments, but the main ideas are there.