Non-consecutive added token '<s>' found

```
python run_speech_recognition_ctc.py \
  --dataset_name="mozilla-foundation/common_voice_8_0" \
  --model_name_or_path="facebook/wav2vec2-xls-r-300m" \
  --dataset_config_name="it" \
  --output_dir="./" \
  --overwrite_output_dir \
  --num_train_epochs="50" \
  --per_device_train_batch_size="8" \
  --per_device_eval_batch_size="8" \
  --gradient_accumulation_steps="4" \
  --learning_rate="7.5e-5" \
  --warmup_steps="2000" \
  --length_column_name="input_length" \
  --evaluation_strategy="steps" \
  --text_column_name="sentence" \
  --save_steps="500" \
  --eval_steps="500" \
  --logging_steps="100" \
  --layerdrop="0.0" \
  --activation_dropout="0.1" \
  --save_total_limit="3" \
  --freeze_feature_encoder \
  --feat_proj_dropout="0.0" \
  --mask_time_prob="0.75" \
  --mask_time_length="10" \
  --mask_feature_prob="0.25" \
  --mask_feature_length="64" \
  --chars_to_ignore , ? . ! - \; \: \" “ % ‘ ” � — ’ … – \
  --gradient_checkpointing \
  --use_auth_token \
  --fp16 \
  --group_by_length \
  --do_train --do_eval \
  --push_to_hub
```
  File "run_speech_recognition_ctc.py", line 514, in main
    tokenizer = AutoTokenizer.from_pretrained(
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 460, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1773, in from_pretrained
    return cls._from_pretrained(
  File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1958, in _from_pretrained
    raise ValueError(
ValueError: Non-consecutive added token '<s>' found. Should have index 174 but has index 35 in saved vocabulary.
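
To make the numbers in the message concrete, here is a minimal sketch of the kind of consistency check that seems to be failing. It is a simplification and not the actual `transformers` code: the function name `first_non_consecutive` and its arguments are hypothetical, and it assumes added tokens are expected to be numbered one after another, starting right after the base vocabulary.

```python
from typing import Dict, Optional, Tuple


def first_non_consecutive(
    base_vocab_size: int, added_tokens: Dict[str, int]
) -> Optional[Tuple[str, int, int]]:
    """Return (token, expected_index, actual_index) for the first added token
    whose saved index does not directly follow the vocabulary built so far.

    Hypothetical helper for illustration only, not a transformers API.
    """
    expected = base_vocab_size
    # Walk the added tokens in order of their saved index.
    for token, index in sorted(added_tokens.items(), key=lambda kv: kv[1]):
        if index != expected:
            return token, expected, index
        expected += 1  # each accepted token grows the vocabulary by one
    return None


if __name__ == "__main__":
    # Numbers taken from the error message: 174 expected, 35 saved.
    print(first_non_consecutive(174, {"<s>": 35}))
```

With the values from the traceback, this reports `<s>` as the offending token: it is saved with index 35 while index 174 is expected.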
