Trainer error on Azure Notebook - git lfs clone deprecated - Ch.2

I am attempting to fine-tune a model in an AzureML Notebook, setting the training arguments as outlined in chapter 2 of Natural Language Processing with Transformers. With the push_to_hub argument set to True I receive the error below; it does not occur when push_to_hub is set to False.

OSError: WARNING: 'git lfs clone' is deprecated and will not be updated
          with new flags from 'git clone'

'git clone' has been updated in upstream Git to have comparable
speeds to 'git lfs clone'.
Cloning into '.'...

I am using an AzureML Notebook on a Tesla K80 GPU instance.
OS: Ubuntu/focal

Cell:

from transformers import Trainer, TrainingArguments

batch_size = 64
logging_steps = len(emotions_encoded["train"]) // batch_size
model_name = f"./{model_ckpt}-finetuned-emotion"
training_args = TrainingArguments(output_dir=model_name,
                                  num_train_epochs=2,
                                  learning_rate=2e-5,
                                  per_device_train_batch_size=batch_size,
                                  per_device_eval_batch_size=batch_size,
                                  weight_decay=0.01,
                                  evaluation_strategy="epoch",
                                  disable_tqdm=False,
                                  logging_steps=logging_steps,
                                  push_to_hub=True,
                                  log_level="error")

Next Cell:

from transformers import Trainer

trainer = Trainer(model=model, args=training_args, 
                  compute_metrics=compute_metrics,
                  train_dataset=emotions_encoded["train"],
                  eval_dataset=emotions_encoded["validation"],
                  tokenizer=tokenizer)
trainer.train();

I have:

Can anybody advise whether they have been successful in an AzureML Notebook with these training_args, or should I just wait and push to the Hub after training completes?
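
For reference, the "push after completion" route could look roughly like the sketch below: train with push_to_hub=False, then upload the model and tokenizer afterwards. The helper name push_after_training is hypothetical, and it assumes you are already logged in to the Hub (for example via huggingface-cli login). Depending on the installed transformers version, push_to_hub() may itself clone the repo with git under the hood, so this is not guaranteed to avoid the error.

```python
def push_after_training(model, tokenizer, repo_name):
    """Upload a fine-tuned model and its tokenizer once training is done.

    Hypothetical helper: it relies only on the push_to_hub() method that
    transformers models and tokenizers expose.
    """
    model.push_to_hub(repo_name)      # uploads weights and config
    tokenizer.push_to_hub(repo_name)  # uploads tokenizer files to the same repo
    return repo_name
```

After trainer.train() has finished, this would be called as push_after_training(trainer.model, tokenizer, f"{model_ckpt}-finetuned-emotion").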

Traceback (segment):

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/huggingface_hub/repository.py:725: FutureWarning: Creating a repository through 'clone_from' is deprecated and will be removed in v0.12. Please create the repository first using `create_repo(..., exists_ok=True)`.
  warnings.warn(
Cloning https://huggingface.co/<username>/distilbert-base-uncased-finetuned-emotion into local empty directory.
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
File /anaconda/envs/azureml_py38/lib/python3.8/site-packages/huggingface_hub/repository.py:754, in Repository.clone_from(self, repo_url, token)
    752             env.update({"GIT_LFS_SKIP_SMUDGE": "1"})
--> 754         run_subprocess(
    755             f"{'git clone' if self.skip_lfs_files else 'git lfs clone'} {repo_url} .",
    756             self.local_dir,
    757             env=env,
    758         )
    759 else:
    760     # Check if the folder is the root of a git repository

I have exactly the same error. Any solutions yet?

That just solved it for me: before you run the code, you have to create a repository on Hugging Face with the same name as the model.
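
The repository can also be created from the notebook instead of the website. This is a minimal sketch, assuming a reasonably recent huggingface_hub in which create_repo accepts a repo_id and an exist_ok flag, and assuming you are already logged in. The helper names repo_id_for and ensure_repo are made up for illustration, and "<username>" stands for your own Hub username.

```python
def repo_id_for(username, model_ckpt):
    """Build the Hub repo id matching the output_dir name used in the cell above."""
    return f"{username}/{model_ckpt}-finetuned-emotion"


def ensure_repo(repo_id):
    """Create the repo on the Hub if it does not exist yet (needs network and a login)."""
    # Imported lazily so that the helper above works without huggingface_hub installed.
    from huggingface_hub import create_repo

    # exist_ok=True makes the call a no-op when the repo is already there.
    return create_repo(repo_id, exist_ok=True)
```

Running ensure_repo(repo_id_for("<username>", model_ckpt)) before building the Trainer should leave an existing repo for the clone step to find.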

Nope, doesn’t work for me. As far as I can tell, the problem is with huggingface_hub.utils.run_subprocess(), which is used to clone the repo from the Hub, but I can’t figure out the exact issue with it.
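
One way to narrow this down is to run the git commands by hand and look at their full output. The OSError text in the first post looks like git's stderr, so the deprecation warning may simply be the first thing printed rather than the real cause. The sketch below uses only the standard library; run_and_capture is a hypothetical helper, not part of huggingface_hub.

```python
import subprocess


def run_and_capture(cmd, cwd=None):
    """Run a command and return (returncode, stdout, stderr).

    Returns (None, "", <message>) if the executable cannot be found,
    e.g. when git or git-lfs is not installed on the compute instance.
    """
    try:
        proc = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    except FileNotFoundError as exc:
        return None, "", str(exc)
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Check that both git and the git-lfs extension are available.
    for cmd in (["git", "--version"], ["git", "lfs", "version"]):
        code, out, err = run_and_capture(cmd)
        print(cmd, "->", code, (out or err).strip())
```

If both commands succeed, the same helper can be pointed at a git clone of the repo URL in an empty directory to see the complete stderr, including whatever follows "Cloning into '.'...".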