@thomwolf I wanna contribute too
Ok, added you all, check your spam folder if you feel like you didn't receive an invitation email!
The PR is already on its way: https://github.com/huggingface/datasets/pull/1129
I'm working on integrating the full_text of articles to extend the possibilities. Next steps would be to integrate the bib references and the doc embeddings.
@thomwolf I have a 400 GB German High Quality Web Text Corpus available and would be happy to contribute it. Is this possible or is it too large?
I am genuinely interested to be a part of this effort. Please add me to the Slack channel.
Hey @thomwolf, I already planned to contribute some argument mining datasets and would like to use this opportunity to finally do so!
@thomwolf Great project! We worked in the past on classification for low-resource languages and I'd like to add datasets from that area.
I would be glad to add one entity type prediction data and one knowledge graph dataset
That is very interesting! How can we participate?
Invited you all to the slack channel!
If you think you didn't receive the invitation, check your spam folder
See you on the slack
I'm gonna add one (or more) Dutch datasets! Sign me up please
You're added Niels! Welcome
Hi,
I would like to participate. Please add me.
Thanks.
Great initiative! Hope it's not late, I'd like to join as well and contribute a dataset in Hebrew (have some news dataset)
Hi,
I am interested in contributing. I know other languages like Hindi and Punjabi.
Hi, I could add some Malayalam and Urdu datasets. Please sign me up!
Hey! I want to participate.
Hello! I am interested in contributing. Please add me
You are now all invited to the slack channel!
If you don't see the invitation, check your spam folder
Talk to you on the (super active) slack!