I am trying to run run_translation.py with mt5-large and DeepSpeed enabled, using ds_config_zero3.json as the config file. When I launch the script, I get the following error:
ValueError: fp16 is enabled but the following parameters have dtype that is not fp16: lm_head.weight
Is there some config setting I’m missing that could help resolve this issue?