Hi,
I am applying a map function to my datasets, for example:
df, _ = df.map(preprocessor, batched=True, num_proc=self.num_cores)
The problem appears when a column contains empty values and all of them fall into a single batch. When the preprocessor is applied with map in batch mode and num_proc, the feature type of that column is identified as null for that particular batch, and the call fails with ValueError: Features must match for all datasets
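
To illustrate what I think is happening, here is a toy model in plain Python. It is not the library's actual inference code: infer_type and batch_types are hypothetical helpers, and the type names are simplified. It only shows how inferring the type per batch can yield null for a batch that holds nothing but empty values.

```python
from typing import List, Optional, Sequence


def infer_type(values: Sequence[Optional[str]]) -> str:
    """Toy inference (hypothetical): a batch with no non-None value is typed 'null'."""
    for value in values:
        if value is not None:
            return type(value).__name__
    return "null"


def batch_types(column: Sequence[Optional[str]], batch_size: int) -> List[str]:
    """Infer one type per batch, the way a batched map would see the column."""
    return [
        infer_type(column[start:start + batch_size])
        for start in range(0, len(column), batch_size)
    ]


# All empty values land in the second batch.
column = ["a", "b", None, None]
types = batch_types(column, batch_size=2)

# The per-batch types disagree, which corresponds to the
# "Features must match" failure when the results are concatenated.
features_match = len(set(types)) == 1
print(types, features_match)
```

With a batch size of 4 the same column gets a single, consistent type, which is why the error only shows up for some batch sizes or num_proc values.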
Is there a way to force the feature type, or to pass a parameter to ‘map’ that makes it ignore the feature type, so that it won’t be checked during concatenation? @lhoestq