Using hyperparameter-search in Trainer

Hey @dunalduck0, one usually just tracks the loss or perplexity for GPT-like models. You can compute the losses by adapting the evaluation code in this example 🙂

I have a question: if I want to test different learning rates, should I write "learning_rate": tune.loguniform(1e-4, 2e-5, 5e-5, 1e-5, 1e-2), or will tune.loguniform(1e-4, 1e-2) try different learning rates on its own?
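(For what it's worth: as far as I know, tune.loguniform only takes a lower and an upper bound and samples continuously on a log scale; to try an explicit list of values you would use tune.choice instead. A minimal sketch:)

from ray import tune

# continuous: samples a value between 1e-4 and 1e-2 on a log scale
space_continuous = {"learning_rate": tune.loguniform(1e-4, 1e-2)}

# discrete: picks one value from a fixed list per trial
space_discrete = {"learning_rate": tune.choice([1e-5, 2e-5, 5e-5, 1e-4, 1e-2])}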

Hello,

I am using this code to find the best parameters for my model.

from ray.tune.schedulers import PopulationBasedTraining
from random import randint, uniform  # random.uniform/randint return concrete samples

scheduler = PopulationBasedTraining(
    mode="max",
    metric="exact_match",
    perturbation_interval=2,
    hyperparam_mutations={
        # callables are re-sampled at each perturbation
        "weight_decay": lambda: uniform(0.0, 0.3),
        "learning_rate": lambda: uniform(1e-5, 5e-5),
        # lists mean: pick one of the given values
        "per_gpu_train_batch_size": [3, 4, 5],
        "num_train_epochs": [10, 11, 12],
        "warmup_steps": lambda: randint(0, 500),
    }
)

best_trial = trainer.hyperparameter_search(
    direction="maximize",
    backend="ray",
    n_trials=4,
    keep_checkpoints_num=2,  # extra kwargs are forwarded to ray.tune.run
    scheduler=scheduler,
)

However, I am getting this error. Do you have any advice?

/usr/local/lib/python3.7/dist-packages/pyarrow/io.pxi in pyarrow.lib.Buffer.__reduce_ex__()

AttributeError: module 'pickle' has no attribute 'PickleBuffer'

Some people recommend using Python 3.8 instead of Python 3.7, but that workaround did not resolve the issue for me. I am working in Google Colab.
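(For context: pickle.PickleBuffer only exists from Python 3.8 onward (PEP 574 / pickle protocol 5), which is presumably why pyarrow's pickling path fails on a 3.7 runtime. A quick way to check what the Colab runtime actually provides:)

import sys
import pickle

print(sys.version)                      # interpreter version in use
print(hasattr(pickle, "PickleBuffer"))  # False on 3.7, True on 3.8+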

Thanks in advance.


I see strange behaviour when using a custom HP space function:
the results are the same across all trials and epochs.

default example:
import numpy as np
from datasets import load_metric

def compute_metrics(eval_preds):
    metric = load_metric("f1")
    logits, labels = eval_preds
    # pick the highest-scoring class for each example
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels, average="weighted")

args = TrainingArguments(
    MODEL_NAME,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=TR_BATCH_SIZE,
    per_device_eval_batch_size=TEST_BATCH_SIZE,
    num_train_epochs=5,
    weight_decay=0.01,
    load_best_model_at_end=True,
    metric_for_best_model="f1",
    push_to_hub=False,
)
# use a tenth of the training data to speed up the search
train_dataset = tokenized_train["train"].shard(index=1, num_shards=10)
trainer = Trainer(
    model_init=model_init,  # required so each trial starts from a fresh model
    args=args,
    train_dataset=train_dataset,
    eval_dataset=tokenized_test["train"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
best_run = trainer.hyperparameter_search(n_trials=10, direction="maximize")
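(For reference, when no hp_space is passed, transformers falls back to a default Optuna space that, if I remember correctly, looks roughly like this, so the "default example" above searches over these ranges:)

def default_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 5),
        "seed": trial.suggest_int("seed", 1, 40),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [4, 8, 16, 32, 64]),
    }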

the results are:

[screenshot of the per-trial results]

but when I use a custom HP space:

def my_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 3),
        "seed": trial.suggest_int("seed", 1, 40),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [1, 2, 4, 6, 8]),
    }
trainer.hyperparameter_search(direction="maximize", hp_space=my_hp_space)

[screenshot: identical metrics across all trials]

This helped me on Google Colab:

!pip install pickle5

Then:

import pickle5 as pickle

After the first run there will be the pickle warning telling you to restart the notebook, and the same error. After the second "Restart and run all", the Ray Tune hyperparameter search begins.

Hey @sgugger, do you know if it's possible to use cross-validation with Optuna for the hyperparameter search?
I found this, which resembles what I'm looking for. I was wondering if it is implemented inside the Trainer?
https://optuna.readthedocs.io/en/stable/reference/generated/optuna.integration.OptunaSearchCV.html
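Roughly, a manual k-fold loop like the sketch below is the behaviour I'm after (assuming a tokenized dataset ds plus the model_init, args, and compute_metrics from the earlier post):

import numpy as np
from sklearn.model_selection import KFold
from transformers import Trainer

# train once per fold and average the evaluation metric
scores = []
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, val_idx in kfold.split(np.arange(len(ds))):
    trainer = Trainer(
        model_init=model_init,            # fresh model for every fold
        args=args,
        train_dataset=ds.select(train_idx),
        eval_dataset=ds.select(val_idx),
        compute_metrics=compute_metrics,
    )
    trainer.train()
    scores.append(trainer.evaluate()["eval_f1"])
print(np.mean(scores))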

Thanks !