Saving/Loading a Model in Colab and Making Predictions

I'm fairly new to Python and HuggingFace and have what is probably a simple question about saving and loading a model. I can't figure out how to save a trained classifier model and then reload it so I can make target-variable predictions on new data. As an example, I trained a model to predict IMDB ratings using an example from the HuggingFace resources, shown below. I've tried a number of approaches (save_model, save_pretrained) and either struggle to save the model at all or, once it is loaded, can't figure out what to call to get predictions. Any help with the steps involved in saving/loading/predicting new scores would be greatly appreciated.

#example mainly from here: https://huggingface.co/transformers/training.html
!pip install transformers
!pip install datasets

from datasets import load_dataset
raw_datasets = load_dataset("imdb")

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize_function(examples):
    return tokenizer(examples["text"], max_length=128, padding="max_length", truncation=True)

tokenized_datasets = raw_datasets.map(tokenize_function, batched=True)

# choosing small datasets for the example
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(500))

### TRAINING classification ###
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

from transformers import TrainingArguments
from transformers import Trainer

training_args = TrainingArguments("test_trainer", evaluation_strategy="epoch", num_train_epochs=2, weight_decay=.0001, learning_rate=0.00001, per_device_train_batch_size=32) 

trainer = Trainer(model=model, args=training_args, train_dataset=small_train_dataset, eval_dataset=small_eval_dataset)
trainer.train()

# predictions from the freshly trained model (via the Trainer; the model itself has no .predict method)
y_test_predicted_original = trainer.predict(small_eval_dataset)

### Saving ###
from google.colab import drive
drive.mount('/content/gdrive')
%cd /content/gdrive/My\ Drive/FOLDER

trainer.save_pretrained("Trained model")  # assumed this would save, but Trainer has no save_pretrained method
model.save_pretrained("Trained model")  # this did save

### Loading Model and Creating Predicted Scores ###

# perhaps this...
from transformers import BertConfig
conf = BertConfig.from_pretrained("Trained model", num_labels=2)
model_loaded = AutoModelForSequenceClassification.from_pretrained("Trained model", config=conf)

# ...or this
model_loaded = AutoModelForSequenceClassification.from_pretrained("Trained model", local_files_only=True)
model_loaded

# with the ultimate goal of getting predicted scores (not sure what to call here)...
y_test_predicted_loaded = model_loaded.predict(small_eval_dataset)  # this fails: the model object has no .predict method

Any insights on this? I can't find any start-to-finish examples, even though this seems like it should be straightforward.

I think this may work:

After training, I saved the model with:

trainer.save_model("gdrive/My Drive/LOCATION")
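
You can also save the tokenizer into the same folder if you want to reload it from there later (optional; I believe this is all it takes):

tokenizer.save_pretrained("gdrive/My Drive/LOCATION")  # optional: keeps the tokenizer next to the model weights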

Then you can start a new session, re-run all of the code above up to (but not including) the training, and run this:

model = AutoModelForSequenceClassification.from_pretrained("gdrive/My Drive/LOCATION", local_files_only=True)
trainer = Trainer(model=model)           # a bare Trainer is enough for inference
trainer.model = model.cuda()             # move the model to the GPU
y = trainer.predict(small_eval_dataset)  # returns predictions (logits), label_ids and metrics
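
The y you get back is a PredictionOutput: y.predictions holds the raw logits and y.label_ids the true labels. To turn the logits into predicted classes, something like this should do it (a quick sketch with numpy):

import numpy as np

y_pred = np.argmax(y.predictions, axis=1)  # predicted class (0 or 1) for each example
y_true = y.label_ids                       # the dataset's labels, for comparison
print((y_pred == y_true).mean())           # rough accuracy check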
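
And if the goal is scoring brand-new text rather than an already tokenized dataset, you can skip the Trainer and call the loaded model directly. A rough sketch (the example sentences are made up; it assumes the same tokenizer from earlier is available in the session):

import torch

new_texts = ["This movie was fantastic!", "Total waste of two hours."]  # made-up examples
inputs = tokenizer(new_texts, max_length=128, padding="max_length", truncation=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}  # keep tensors on the same device as the model

model.eval()
with torch.no_grad():
    logits = model(**inputs).logits   # raw scores, shape (num_texts, num_labels)
preds = torch.argmax(logits, dim=1)   # predicted class index per text
print(preds.tolist())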