I'd love to participate!
I'd like to join!
Hey, I'd love to participate!
I'd love to join! I can contribute resources for Tagalog, a low-resource language.
Email: jan_christian_cruz@dlsu.edu.ph
I think this initiative is awesome on so many levels. I need more high-definition "stack-sets" too for what I am trying to accomplish, but it might be ideal to have multiple libraries of open-cite datasets.
If the world wants to converge natural language here, I'll answer that call gladly.
Hello World. Converge on this convergence. All people. All language. All speech. Converge on this. All our systems of expression, contemplation, and communication. I was told an Apache Arrow Table can scale quite nicely? Let's converge on this and put that to the test, shall we?
Anyone feel like developing a Hugging Face -ax extension API to integrate this lovely Rosetta Stone with Jax? How about Neo4j graph platform integration?
I have a dataset to submit from RepEval 2019:
CODAH: An Adversarially-Authored Question Answering Dataset for Common Sense.
I just discovered earlier today that datasets could be tools for the flip side of the model: validation & testing as well.
In the overlap of linguistics, philosophy, and mathematics, let's get some logic connector and discourse marker datasets in the library as well. While on the subject of logic, let's experiment with 3VL or many-valued logic models in here too? Bonus points for finding fundamental logic insights by comparing and contrasting discourse markers across languages and running models through the infinite Gaussian process I learned about in the 2019 distill.pub visual exploration of Gaussian processes.
One more thing… with this emergence of a wide-angle, broad-spectrum multilingual Rosetta Stone library, take the opportunity to analyze all the markers, especially interpersonal markers, across many diverse languages. Compare and contrast that on dual-axis infinite Gaussian processing too, please, and thanks in advance.
Create inference engine for crowd sourced platforms naturally create a hypercore protocol blockchain aspect from stream crypto mining for open cite data?
Sorry for thinking out loud.
DW
cubytes@gmail.com
cubytes Twitter
I would like to contribute
Hi @thomwolf, I would love to contribute and do not want to be left aside in this historic sprint.
I would like to participate
Added you all! Ping me if you didn't receive the invitation!
Interested too!
Great initiative. I would love to contribute and add some Arabic datasets to the library.
Here are two Arabic datasets:
I am interested in Arabic datasets.
Not sure it counts purely as healthcare but I added the CORD-19 dataset made by AllenAI (see https://www.semanticscholar.org/cord19) in this PR: https://github.com/huggingface/datasets/pull/1129
It's a work in progress (only metadata loaded for now) but I'm working on adding article full text and pre-computed document embeddings.
Hey, amazing initiative! I would love to join as well, and maybe contribute a dataset in Hebrew.
I would love to join and participate!
I'd like to help out for this one if you're open to it!
Count me in, please
Great work, I would love to contribute some African language datasets. Thanks
Hi @thomwolf, I am interested in helping out. Please add me to the Slack channel and send any other information I need to contribute.