Fine-tuning Wav2Vec2 for English ASR with 🤗 Transformers on a local machine

I’m running the https://huggingface.co/blog/fine-tune-wav2vec2-english#training–evaluation example on my local machine and getting `Training Loss = nan`:

| Step | Training Loss | Validation Loss | Wer      | Runtime    | Samples Per Second |
|-----:|--------------:|----------------:|---------:|-----------:|-------------------:|
| 200  | nan           | 13.842948       | 2.703102 | 204.199500 | 8.227000           |
| 400  | nan           | 13.842948       | 2.703102 | 204.301000 | 8.223000           |
| 600  | nan           | 13.842948       | 2.703102 | 204.371700 | 8.220000           |
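One way to pin down where the loss first turned `nan` is to scan the trainer's log history. Below is a minimal sketch (not from the blog post) that assumes entries shaped like those in `trainer.state.log_history`, i.e. dicts carrying `"step"` and `"loss"` keys:

```python
import math

def first_nan_step(log_history):
    """Return the first step whose logged training loss is NaN, or None.

    `log_history` mimics `trainer.state.log_history`: a list of dicts.
    Only the "step" and "loss" keys are assumed here; real entries
    carry more fields (learning rate, epoch, ...).
    """
    for entry in log_history:
        loss = entry.get("loss")
        if loss is not None and math.isnan(loss):
            return entry["step"]
    return None

# Example: the loss is finite at step 100 and NaN from step 200 on.
logs = [
    {"step": 100, "loss": 5.2},
    {"step": 200, "loss": float("nan")},
]
print(first_nan_step(logs))  # 200
```

Running this right after `trainer.train()` tells you whether the loss was `nan` from the very first logging step (suggesting a setup problem such as fp16 overflow) or diverged later.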

Local modifications to the example

training_args = TrainingArguments(
    output_dir="./wav2vec2-base-timit-demo",
    group_by_length=True,
    per_device_train_batch_size=4,  # changed from 32
    …
    save_steps=200,
    eval_steps=200,
    logging_steps=100,
    …
)

Update: after moving the job to the CPU, the training loss takes real values.
Steps so far:

  1. Make CUDA unavailable:

     import torch

     torch.cuda.is_available = lambda: False

  2. Disable mixed precision:

     training_args = TrainingArguments(
         …
         # fp16=True,
         …
     )
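Instead of monkey-patching `torch.cuda.is_available`, the same effect can be had through `TrainingArguments` itself. A sketch, assuming a transformers version that still accepts the `no_cuda` flag (newer releases rename it to `use_cpu`):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./wav2vec2-base-timit-demo",
    no_cuda=True,   # run the Trainer on CPU without patching torch.cuda
    # fp16=True,    # must stay off: mixed precision requires CUDA
)
```

This keeps the CPU/GPU choice visible in the training configuration rather than hidden in a runtime patch.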

*Mixed precision training with AMP or APEX (--fp16) and FP16 evaluation can only be used on CUDA devices.*