Amharic BERT Training

israel · February 23, 2021, 10:42pm

@yjernite I ran into a problem while training an Amharic BERT model on the OSCAR dataset.

yjernite · February 23, 2021, 10:58pm

You need to remove the id column in the dataset:

tokenized_datasets = datasets.map(tokenize_function, batched=True, num_proc=4, remove_columns=["id", "text"])
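For anyone wondering what remove_columns changes here, below is a rough plain-Python sketch of the idea. It does not use the datasets library: map_batched and the whitespace tokenize_function are hypothetical stand-ins made up for illustration, and the two sample sentences are invented. It only shows that columns not listed in remove_columns (such as id) stay in the output next to the tokenizer fields.

```python
# Plain-Python stand-in for a batched map with column removal.
# map_batched and tokenize_function are hypothetical helpers for
# illustration only; they are NOT the datasets library API.

def tokenize_function(batch):
    # Toy whitespace "tokenizer": one list of tokens per input text.
    return {"input_ids": [text.split() for text in batch["text"]]}


def map_batched(columns, fn, remove_columns=()):
    # Apply fn to the whole column dict, merge in its outputs,
    # then drop every column named in remove_columns.
    merged = {**columns, **fn(columns)}
    return {name: values for name, values in merged.items()
            if name not in remove_columns}


# Invented sample rows with the same two columns mentioned above.
raw = {"id": [0, 1], "text": ["ሰላም ዓለም", "አዲስ አበባ"]}

kept_id = map_batched(raw, tokenize_function, remove_columns=["text"])
cleaned = map_batched(raw, tokenize_function, remove_columns=["id", "text"])

print(sorted(kept_id))  # id is still present alongside input_ids
print(sorted(cleaned))  # only the tokenizer output remains
```

With only "text" removed, the id column travels along with every example; listing both columns leaves just the fields the tokenizer produced, which is what the call above does.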

israel · February 23, 2021, 11:57pm

That solved it, thank you.
