AttributeError: 'str' object has no attribute 'dtype' when pretraining wav2vec2

Hi @omar47. I’m not sure we have the same original issue.

I see two possible causes for this error:

  1. Passing the class Wav2Vec2FeatureExtractor itself, rather than an instance of it, to DataCollatorForWav2Vec2Pretraining. Solution: instantiate the feature extractor before passing it to the data collator:
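    from transformers import Wav2Vec2Config, Wav2Vec2FeatureExtractor, Wav2Vec2ForPreTraining

    # Wav2Vec2ForPreTraining() expects a config object, not a path string
    config = Wav2Vec2Config.from_pretrained(args.model_path)
    model = Wav2Vec2ForPreTraining(config)
    # from_pretrained returns an instance; do not pass the class itself
    feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(args.model_path)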

    data_collator = DataCollatorForWav2Vec2Pretraining(
        model=model, 
        feature_extractor=feature_extractor
    )
  2. The dataset is still a dictionary:
    This problem appeared again for me because I did not remove the unused columns in the preprocessing step (prepare_dataset). As a result, each example is still a dictionary that includes the original columns, which I think the padding function in the data collator does not expect. Make sure you keep the line remove_columns=raw_datasets["train"].column_names when mapping prepare_dataset over your dataset:
    vectorized_datasets = raw_datasets.map(
        prepare_dataset,
        num_proc=args.preprocessing_num_workers,
        remove_columns=raw_datasets["train"].column_names,
        cache_file_names=cache_file_names,
    )
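
If it helps with debugging, both causes can be checked before the collator is ever called. The snippet below is only a rough sketch using the standard library: the helper names is_class_not_instance and string_fields are made up for illustration, and the column names "file" and "text" are assumptions about what a raw dataset might still contain. It also assumes the leftover columns hold plain strings, which would match the 'str' object in the error message.

```python
import inspect


def is_class_not_instance(obj):
    """Cause 1: True if obj is a class (e.g. Wav2Vec2FeatureExtractor)
    rather than an instance created with from_pretrained()."""
    return inspect.isclass(obj)


def string_fields(example):
    """Cause 2: names of fields in one preprocessed example whose values
    are plain strings, i.e. columns that were probably not removed."""
    return sorted(key for key, value in example.items() if isinstance(value, str))


# Hypothetical example that still carries two raw columns.
example = {"input_values": [0.1, -0.2], "file": "a.wav", "text": "hi"}
print(string_fields(example))  # ['file', 'text']
```

In a real run you could call string_fields(vectorized_datasets["train"][0]) and is_class_not_instance(feature_extractor) right before building the data collator. An empty list and False would suggest neither of the two causes applies.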

These are the two things I could identify that resolved the issue in my case. Good luck!