Amharic BERT Training

@yjernite I am running into a problem while training an Amharic-language BERT on the OSCAR dataset.

Colab Link

You need to remove the id column from the dataset (along with the raw text column) when you tokenize:

tokenized_datasets = datasets.map(tokenize_function, batched=True, num_proc=4, remove_columns=["id", "text"])
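
To make the effect of remove_columns more concrete, here is a rough plain-Python sketch that imitates a batched map over a toy two-row dataset. It does not use the datasets library: map_batched, toy_tokenize, and the sample rows are hypothetical stand-ins, and the whitespace "tokenizer" is only there for illustration.

```python
# Hypothetical stand-in for a batched map call; NOT the datasets implementation.
def map_batched(columns, function, remove_columns=()):
    """Apply `function` to a dict of columns, then drop the columns in `remove_columns`."""
    new_columns = function(columns)
    kept = {
        name: values
        for name, values in columns.items()
        if name not in remove_columns
    }
    kept.update(new_columns)
    return kept


def toy_tokenize(batch):
    # Toy whitespace tokenizer (assumption): each word becomes its length,
    # standing in for a real vocabulary id.
    return {
        "input_ids": [[len(word) for word in text.split()] for text in batch["text"]]
    }


# Toy rows using the same column names the reply removes ("id" and "text").
raw = {"id": [0, 1], "text": ["selam alem", "amharic bert model"]}

with_extras = map_batched(raw, toy_tokenize)
tokenized = map_batched(raw, toy_tokenize, remove_columns=["id", "text"])

print(sorted(with_extras))  # "id" and "text" are still present
print(sorted(tokenized))    # only the tokenizer output remains
```

Without remove_columns the original id and text columns travel along with the tokenizer output; with it, only the new columns are passed on to the later training steps.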

That solved it, thank you.
