Any model for NER in French?

Hi all,

I have been looking for a model to run a NER task in French. I see there are CamemBERT and RoBERTa models for token classification, but these models are not fine-tuned for any NER task. Any suggestions? If no such model exists, is there a French dataset tagged for NER?

Thank you,
Sergul

I just asked Pedro (https://github.com/pjox), maybe he knows some good NER datasets for French that are publicly available for fine-tuning.

If someone could give me access to the FTB dataset (see the CamemBERT paper), I could fine-tune a model and upload it to the model hub :sweat_smile:

An alternative would be to use “silver standard” datasets like WikiANN/PAN-X or WikiNER (both include French) :slight_smile:

Thank you @stefan-it. If I use WikiNER, do you know if there is a good way to convert it to CoNLL format?

The last time I worked with WikiNER, I wrote my own script that converts the dataset into a CoNLL-like format - you can find it here:
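For readers without that script at hand, here is a minimal sketch of this kind of conversion. It is not the script referenced above. It assumes the WikiNER file has one sentence per line, with space-separated items of the form `token|POS|NER-tag` and blank lines between documents; please check your copy of the data before relying on it. The function names `wikiner_line_to_conll` and `convert` are made up for illustration.

```python
def wikiner_line_to_conll(line):
    """Convert one WikiNER sentence line into CoNLL-style 'token tag' rows.

    Assumed input format (verify against your file): token|POS|NER-tag items
    separated by spaces.
    """
    rows = []
    for item in line.split():
        # rsplit keeps any '|' that is part of the token itself intact
        token, _pos, tag = item.rsplit("|", 2)
        rows.append(f"{token} {tag}")
    return rows


def convert(lines):
    """Convert an iterable of WikiNER lines into one CoNLL-like string."""
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines separate documents in the assumed input
        out.extend(wikiner_line_to_conll(line))
        out.append("")  # a blank line ends each sentence in CoNLL format
    return "\n".join(out)


if __name__ == "__main__":
    sample = ["Emmanuel|NAM|I-PER Macron|NAM|I-PER visite|V|O Paris|NAM|I-LOC .|PONCT|O"]
    print(convert(sample))
```

The tags are copied unchanged, so if the training script expects a different tagging scheme (for example IOB2 with `B-` on every entity start), an extra conversion step would be needed.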

Excellent! Thank you so much and please let me know if you upload a NER model in French :slight_smile:

Oh, one more thing: @stefan-it, do you have an evaluation script (F1, accuracy, etc.) for CoNLL-format data?

Found it here https://github.com/huggingface/transformers/tree/master/examples/token-classification
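For anyone who wants to see what an entity-level F1 score on CoNLL-style tags involves, here is a small self-contained sketch. It is not the code from the linked example; it assumes tags of the form `B-TYPE`, `I-TYPE` and `O`, and counts a predicted entity as correct only when both its span and its type match the gold entity exactly. The function names are hypothetical. In practice a maintained library such as seqeval is probably the safer choice.

```python
def extract_entities(tags):
    """Return a set of (start, end, type) spans from a B-/I-/O tag sequence."""
    entities = set()
    start, etype = None, None
    # The trailing "O" acts as a sentinel that closes any open span.
    for i, tag in enumerate(list(tags) + ["O"]):
        begins = tag.startswith("B-")
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, etype))  # end index is exclusive
            start, etype = None, None
        if begins or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return entities


def f1_score(gold_sents, pred_sents):
    """Micro-averaged entity-level F1 over parallel lists of tag sequences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_sents, pred_sents):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [["B-PER", "I-PER", "O", "B-LOC"]]
    pred = [["B-PER", "I-PER", "O", "O"]]
    print(f"F1: {f1_score(gold, pred):.3f}")
```

Token-level accuracy is usually much higher than this score because most tokens are `O`, which is why entity-level F1 is the more informative number for NER.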