Hello. I am having a problem while fine-tuning my Hugging Face Transformers model. I successfully ran my notebook on Google Colab and fine-tuned the model there. However, when I run the same code on my local PC, I get the following error (attached image).
Here is my DataCollator class.
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

import torch
from transformers import Wav2Vec2Processor

@dataclass
class DataCollatorCTCWithPadding:
    processor: Wav2Vec2Processor
    padding: Union[bool, str] = True
    max_length: Optional[int] = None
    max_length_labels: Optional[int] = None
    pad_to_multiple_of: Optional[int] = None
    pad_to_multiple_of_labels: Optional[int] = None

    def __call__(self, features: List[Dict[str, Union[List[int], torch.Tensor]]]) -> Dict[str, torch.Tensor]:
        # split inputs and labels since they have to be of different lengths and need
        # different padding methods
        input_features = [{"input_values": feature["input_values"]} for feature in features]
        label_features = [{"input_ids": feature["labels"]} for feature in features]

        # pad the audio inputs with the feature extractor
        batch = self.processor.pad(
            input_features,
            padding=self.padding,
            max_length=self.max_length,
            pad_to_multiple_of=self.pad_to_multiple_of,
            return_tensors="pt",
        )

        # pad the tokenized labels with the tokenizer (target processor)
        with self.processor.as_target_processor():
            labels_batch = self.processor.pad(
                label_features,
                padding=self.padding,
                max_length=self.max_length_labels,
                pad_to_multiple_of=self.pad_to_multiple_of_labels,
                return_tensors="pt",
            )

        # replace padding with -100 so these positions are ignored by the loss
        labels = labels_batch["input_ids"].masked_fill(labels_batch.attention_mask.ne(1), -100)
        batch["labels"] = labels

        return batch
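In case it helps, this is roughly how I call the collator on my machine. The checkpoint name and the dummy features below are just placeholders to reproduce the call, not my real data or exact checkpoint:

import torch
from transformers import Wav2Vec2Processor

# placeholder checkpoint; mine is different
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
data_collator = DataCollatorCTCWithPadding(processor=processor, padding=True)

# two dummy examples of different lengths, like the ones my dataset produces
features = [
    {"input_values": torch.randn(16000).tolist(), "labels": [5, 8, 9]},
    {"input_values": torch.randn(12000).tolist(), "labels": [4, 2]},
]

batch = data_collator(features)
print(batch["input_values"].shape, batch["labels"])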
Can you tell me where I am going wrong? Thank you.