Hi there!
I want to load just the ‘tweets’ part of this dataset: https://huggingface.co/datasets/jorgeortizfuentes/chilean-spanish-corpus/viewer/default/train?p=100000
In other words, I want to know if there is an option to tell Hugging Face that I just want the rows where ‘source’ = ‘twitter’. I don’t know if there is a way to do this from the load_dataset() method. Any guidance would be super helpful. I am trying to create word embeddings with a word that only occurs in informal speech, so I do not need the rest of the dataset. I want to load just the tweets in the quickest way possible.
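For reference, here is a rough sketch of the fallback I have in mind: stream the train split and keep only the matching rows, rather than downloading everything first. It assumes the column really is called `source` and the value is exactly `twitter` (I have not confirmed the exact string); `only_source` and `stream_tweets` are just my own helper names, not part of the library.

```python
from typing import Dict, Iterable, Iterator


def only_source(rows: Iterable[Dict], wanted: str = "twitter") -> Iterator[Dict]:
    """Yield only the rows whose 'source' field equals `wanted`."""
    for row in rows:
        # Assumption: each row is a dict with a 'source' key.
        if row.get("source") == wanted:
            yield row


def stream_tweets() -> Iterator[Dict]:
    # Imported here so only_source() can be used without `datasets` installed.
    from datasets import load_dataset

    # streaming=True iterates over the data lazily instead of
    # downloading the whole dataset up front.
    ds = load_dataset(
        "jorgeortizfuentes/chilean-spanish-corpus",
        split="train",
        streaming=True,
    )
    return only_source(ds)
```

As far as I can tell this still has to read through every row, so it saves disk space and memory more than time. What I am really hoping for is a way to skip the non-tweet rows entirely.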
Thanks,
Joe