I am attempting to fine-tune a model in an AzureML Notebook, setting the training arguments as outlined in chapter 2 of Natural Language Processing with Transformers. I have the push_to_hub argument set to True (the error does not occur when it is set to False). I am receiving this error:
OSError: WARNING: 'git lfs clone' is deprecated and will not be updated
with new flags from 'git clone'
'git clone' has been updated in upstream Git to have comparable
speeds to 'git lfs clone'.
Cloning into '.'...
I am using an AzureML Notebook on a Tesla K80 GPU instance.
OS: Ubuntu/focal
Cell:
from transformers import Trainer, TrainingArguments
batch_size = 64
logging_steps = len(emotions_encoded["train"]) // batch_size
model_name = f"./{model_ckpt}-finetuned-emotion"
training_args = TrainingArguments(output_dir=model_name,
                                  num_train_epochs=2,
                                  learning_rate=2e-5,
                                  per_device_train_batch_size=batch_size,
                                  per_device_eval_batch_size=batch_size,
                                  weight_decay=0.01,
                                  evaluation_strategy="epoch",
                                  disable_tqdm=False,
                                  logging_steps=logging_steps,
                                  push_to_hub=True,
                                  log_level="error")
Next Cell:
from transformers import Trainer
trainer = Trainer(model=model, args=training_args,
                  compute_metrics=compute_metrics,
                  train_dataset=emotions_encoded["train"],
                  eval_dataset=emotions_encoded["validation"],
                  tokenizer=tokenizer)
trainer.train();
I have:
- updated all libraries
- tried with conda virtual environments
- emulated the Colab setup from the Natural Language Processing with Transformers GitHub repo
- ensured git-lfs installation
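For anyone reproducing this, one way to confirm that git and git-lfs are visible from inside the notebook kernel (and not only from a separate terminal or conda environment) is a small check like the one below. This is a minimal sketch using only the standard library; the helper names tool_version and check_git_tools are made up for illustration, and a passing check does not by itself rule out other causes of the clone failure.

```python
import shutil
import subprocess


def tool_version(*command):
    """Return the first line of a command's output, or None if it cannot run."""
    # The executable must be on the PATH seen by this Python process.
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else None


def check_git_tools():
    """Report the versions of git and git-lfs as seen by the current kernel."""
    return {
        "git": tool_version("git", "--version"),
        # 'git lfs version' exits with an error when git-lfs is not installed.
        "git-lfs": tool_version("git", "lfs", "version"),
    }


if __name__ == "__main__":
    for name, version in check_git_tools().items():
        print(f"{name}: {version or 'not found'}")
```

Running this in the same notebook that runs the Trainer cell shows what the subprocess launched by huggingface_hub would find on its PATH.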
Can anybody advise whether they have successfully used these training_args on an AzureML Notebook, or should I just wait and push to the Hub after training completes?
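To make the second option concrete, here is a rough sketch of pushing after training has finished, assuming push_to_hub=False in TrainingArguments, a reasonably recent huggingface_hub that provides HfApi.create_repo and HfApi.upload_folder, and an already stored Hub token (for example via huggingface-cli login). The function name push_after_training is hypothetical. It uploads over HTTP rather than cloning the repository with git, so it is not the same code path as the one in the error above, and I have not verified it on AzureML.

```python
def push_after_training(trainer, output_dir, repo_id):
    """Save the trained model locally, then upload the folder to the Hub."""
    # Imported lazily so this sketch can be defined even where the
    # library is not installed; it is only needed when the function runs.
    from huggingface_hub import HfApi

    # Writes the model files (and the tokenizer, if one was passed to Trainer).
    trainer.save_model(output_dir)

    api = HfApi()
    # Create the repository first; exist_ok avoids an error if it already exists.
    api.create_repo(repo_id=repo_id, exist_ok=True)
    # Upload the saved files in a single commit, without a local git clone.
    api.upload_folder(folder_path=output_dir, repo_id=repo_id)
```

After trainer.train() completes, this would be called as push_after_training(trainer, model_name, "<username>/distilbert-base-uncased-finetuned-emotion"), with <username> replaced by the actual Hub account.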
Traceback (segment):
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/huggingface_hub/repository.py:725: FutureWarning: Creating a repository through 'clone_from' is deprecated and will be removed in v0.12. Please create the repository first using `create_repo(..., exists_ok=True)`.
warnings.warn(
Cloning https://huggingface.co/<username>/distilbert-base-uncased-finetuned-emotion into local empty directory.
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/huggingface_hub/repository.py:754, in Repository.clone_from(self, repo_url, token)
752 env.update({"GIT_LFS_SKIP_SMUDGE": "1"})
--> 754 run_subprocess(
755 f"{'git clone' if self.skip_lfs_files else 'git lfs clone'} {repo_url} .",
756 self.local_dir,
757 env=env,
758 )
759 else:
760 # Check if the folder is the root of a git repository