I see I’m not the first one to have this problem, but unfortunately it looks like previous users with similar troubles didn’t get responses. Here’s hoping I’ll be luckier!
I’m trying to tune hyperparameters for fine-tuning a BERT model (specifically, distilbert-base-uncased).
I’ve tried many versions of the objective function based on what I’ve found online, with and without a model_init. A few have given me KeyError: 142224. I know this number depends on the random seed, because it stays the same until I change the seed. My training set has over 250,000 rows. People with smaller datasets in the forums seem to get smaller numbers. If I shorten the training dataset, the KeyError number changes, so I strongly suspect it’s trying to access a particular row.
My indices in the dataframe were originally random numbers up into the millions (because of the way I selected the dev set from my full dataset) but the error persists even if I reset_index().
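To make that suspicion concrete, here is a minimal sketch of how a seeded shuffle over row positions behaves. It is a stand-in written with Python's `random` module, not the sampler PyTorch actually uses, and the helper name `first_sampled_index` is hypothetical. It only shows that the first index drawn stays fixed for a given seed and dataset length, and that it always falls inside the range of row positions.

```python
import random


def first_sampled_index(num_rows: int, seed: int) -> int:
    """Return the first row position produced by a seeded shuffle.

    Hypothetical helper: it mimics the idea of a random sampler,
    not PyTorch's actual implementation.
    """
    rng = random.Random(seed)          # independent, seeded generator
    positions = list(range(num_rows))  # candidate row positions 0..num_rows-1
    rng.shuffle(positions)             # deterministic for a given seed
    return positions[0]


# Same seed and same length: the same first index every time.
a = first_sampled_index(250_000, seed=42)
b = first_sampled_index(250_000, seed=42)
print(a == b)  # True

# The index is always a row position, so a shorter dataset caps it lower.
c = first_sampled_index(1_000, seed=42)
print(0 <= c < 1_000)  # True
```

That would match what I see: a number below the row count that only moves when the seed or the dataset length changes.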
I guess without further ado, here’s what I’ve got. Here’s a sample of my dataset so you can see if it’s in the correct format:
             input_ids  attention_mask  labels
0  [101, 1037, 3803...  [1, 1, 1, 1...       2
1  [101, 2307, 2326...  [1, 1, 1, 1...       2
2  [101, 1996, 2326...
3  [101, 2077, 1045...  [1, 1, 1, 1...       1
4  [101, 3083, 3319...  [1, 1, 1, 1...       1
And here’s the code leading up to the error:
import optuna
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

# train_set is a pandas DataFrame in the format shown above.


def model_init(trial):
    # Define hyperparameters
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 5e-5, log=True)
    num_train_epochs = trial.suggest_int("num_train_epochs", 1, 3)
    gradient_accumulation_steps = trial.suggest_int("gradient_accumulation_steps", 1, 8)
    per_device_train_batch_size = trial.suggest_int("per_device_train_batch_size", 4, 16)
    evaluation_strategy = trial.suggest_categorical("evaluation_strategy", ['steps', 'epoch'])
    per_device_eval_batch_size = trial.suggest_int("per_device_eval_batch_size", 4, 16)
    warmup_steps = trial.suggest_int("warmup_steps", 100, 500)
    weight_decay = trial.suggest_float("weight_decay", 0.0, 0.1)
    model = AutoModelForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)
    return model


def objective(trial):
    # Define training arguments
    training_args = TrainingArguments(
        output_dir='drive/MyDrive/BERT Sentiment/output',
        seed=42,
        logging_dir='drive/MyDrive/BERT Sentiment/output/logs',
        logging_steps=1000
    )
    print("Defined the training arguments")
    model = model_init(trial)
    print("Initialized the model")
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_set,
        eval_dataset=eval_set)
    print("Created the trainer")
    trainer.train()
    print("Trained the model")
    results = trainer.hyperparameter_search(model=None, direction='maximize', args=training_args, model_init=model_init)
    print(results.metrics['f1'])


study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=1)
best_hyperparameters = study.best_params
print("Best hyperparameters" + str(best_hyperparameters))
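For comparison, the pattern I've seen in the Transformers documentation passes `model_init` to the `Trainer` constructor and gives `hyperparameter_search` a separate `hp_space` function that returns the sampled values as a dict. Below is a minimal sketch of such a function, assuming the same ranges as in my code above. `StubTrial` is a hypothetical stand-in for an Optuna trial so the snippet runs on its own; it is not part of Optuna, and nothing here has been run against a real `Trainer`.

```python
import math
import random


class StubTrial:
    """Hypothetical stand-in for an Optuna trial, used only so this runs alone."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)

    def suggest_float(self, name: str, low: float, high: float, log: bool = False) -> float:
        if log:
            # Sample uniformly in log space, map back, and clamp against rounding.
            value = math.exp(self._rng.uniform(math.log(low), math.log(high)))
            return min(max(value, low), high)
        return self._rng.uniform(low, high)

    def suggest_int(self, name: str, low: int, high: int) -> int:
        return self._rng.randint(low, high)  # inclusive on both ends


def hp_space(trial) -> dict:
    """Map a trial to TrainingArguments field names and sampled values."""
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-5, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 3),
        "per_device_train_batch_size": trial.suggest_int("per_device_train_batch_size", 4, 16),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.1),
    }


params = hp_space(StubTrial(seed=42))
print(sorted(params))
```

As far as I understand the API, with a real trainer this function would be passed as `trainer.hyperparameter_search(hp_space=hp_space, direction="maximize", backend="optuna", n_trials=...)`, with `model_init` given to `Trainer(...)` rather than to the search call.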
And here’s the error all of that throws.
[I 2023-10-28 05:20:15,502] A new study created in memory with name: no-name-681288ba-c1a7-4d00-bd14-ccb4fad8cdac
Defined the training arguments
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier.bias', 'pre_classifier.weight', 'classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Initialized the model
Created the trainer
[W 2023-10-28 05:20:16,477] Trial 0 failed with parameters: {'learning_rate': 3.948249070738038e-05, 'num_train_epochs': 3, 'gradient_accumulation_steps': 2, 'per_device_train_batch_size': 7, 'evaluation_strategy': 'steps', 'per_device_eval_batch_size': 6, 'warmup_steps': 333, 'weight_decay': 0.048816647569152063} because of the following error: KeyError(142224).
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/optuna/study/_optimize.py", line 200, in _run_trial
value_or_values = func(trial)
File "<ipython-input-18-715f9f522d66>", line 25, in objective
trainer.train()
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1591, in train
return inner_training_loop(
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1870, in _inner_training_loop
for step, inputs in enumerate(epoch_iterator):
File "/usr/local/lib/python3.10/dist-packages/accelerate/data_loader.py", line 451, in __iter__
current_batch = next(dataloader_iter)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 674, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.10/dist-packages/pandas/core/frame.py", line 3807, in __getitem__
indexer = self.columns.get_loc(key)
File "/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py", line 3804, in get_loc
raise KeyError(key) from err
KeyError: 142224
[W 2023-10-28 05:20:16,483] Trial 0 failed with value None.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3801 try:
-> 3802 return self._engine.get_loc(casted_key)
3803 except KeyError as err:
17 frames
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 142224
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3802 return self._engine.get_loc(casted_key)
3803 except KeyError as err:
-> 3804 raise KeyError(key) from err
3805 except TypeError:
3806 # If we have a listlike key, _check_indexing_error will raise
KeyError: 142224
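The last frames of the traceback show `self.dataset[idx]` landing in `DataFrame.__getitem__`, which resolves the key through `self.columns.get_loc(key)`. In other words, the integer is being treated as a column label, not a row position. Here is a dependency-free sketch of that behavior. `ColumnTable` is a hypothetical stand-in for a DataFrame and `RowDataset` is a hypothetical wrapper; neither is a pandas or PyTorch class, and the values are the (truncated) ones from the first two sample rows above.

```python
class ColumnTable:
    """Hypothetical stand-in for a DataFrame: `table[key]` looks up a column."""

    def __init__(self, columns: dict) -> None:
        self.columns = columns

    def __len__(self) -> int:
        # Number of rows = length of any column.
        return len(next(iter(self.columns.values())))

    def __getitem__(self, key):
        # Mirrors the traceback: the key is resolved against column labels.
        return self.columns[key]


class RowDataset:
    """Hypothetical map-style wrapper: `dataset[i]` returns row i as a dict."""

    def __init__(self, table: ColumnTable) -> None:
        self.table = table

    def __len__(self) -> int:
        return len(self.table)

    def __getitem__(self, idx: int) -> dict:
        return {name: values[idx] for name, values in self.table.columns.items()}


table = ColumnTable({
    "input_ids": [[101, 1037, 3803], [101, 2307, 2326]],
    "attention_mask": [[1, 1, 1], [1, 1, 1]],
    "labels": [2, 2],
})

try:
    table[1]  # integer key is treated as a column label
except KeyError as err:
    print("KeyError:", err)  # KeyError: 1

print(RowDataset(table)[1]["labels"])  # 2
```

With an actual DataFrame, the row lookup inside such a wrapper would presumably be `df.iloc[idx]`, or the frame could be converted with `datasets.Dataset.from_pandas`. Both are assumptions that haven't been tested against this setup.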
There are any number of places I could’ve made an error, of course. I’ve tried a few different tutorials on HuggingFace, as well as asking ChatGPT to explain or fix the errors, but so far I’ve made no progress on this one.
I’m working on Google Colab, if it’s relevant. Please let me know if there’s any other information I could give you that would help with diagnosis.
Thanks for reading!