Dataset access with `use_auth_token`

I’m trying to load the following dataset (linked here), and I’m wondering what to do with use_auth_token. I’m running this in a Kaggle kernel, where I can store secrets, so is there a way to set an environment variable to skip this authentication? Otherwise, how can I authenticate to get access?

from datasets import load_dataset
pmd = load_dataset("facebook/pmd", use_auth_token=True)

Hi! You can use huggingface_hub’s notebook_login to log in:

from huggingface_hub import notebook_login
notebook_login()

from datasets import load_dataset
pmd = load_dataset("facebook/pmd", use_auth_token=True)

This method writes the user’s credentials to the config file and is the preferred way of authenticating inside a notebook/kernel.

Another approach is to directly specify a token (not suitable for public notebooks):

from datasets import load_dataset
pmd = load_dataset("facebook/pmd", use_auth_token="<token_string>")
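Since the question mentions stored secrets, the direct-token approach can be combined with an environment variable so the token never appears in the notebook itself. Here is a minimal sketch, not an official recipe: the variable name HF_TOKEN and the helpers resolve_hf_token and load_pmd are names made up for this example, and it assumes you have already copied your token into that variable (on Kaggle, for instance, from a stored secret).

```python
import os
from typing import Optional

# Hypothetical variable name chosen for this sketch; use whatever name
# you stored your token under.
TOKEN_ENV_VAR = "HF_TOKEN"


def resolve_hf_token(env_var: str = TOKEN_ENV_VAR) -> Optional[str]:
    """Return the token stored in the environment, or None if unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def load_pmd():
    """Load facebook/pmd with the token taken from the environment."""
    # Imported lazily so resolve_hf_token() works without datasets installed.
    from datasets import load_dataset

    token = resolve_hf_token()
    if token is None:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before loading the gated dataset")
    # Same call as above, with the token string passed instead of True.
    return load_dataset("facebook/pmd", use_auth_token=token)
```

Calling load_pmd() then behaves like the snippet above, just without a hard-coded token. Note that newer datasets releases name this argument token rather than use_auth_token, so you may need to adjust the keyword depending on your installed version.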

Where do you put use_auth_token=True when you load your custom dataset with a loading script and not with the load_dataset() function?

The loading scripts are meant to be loaded with load_dataset/load_dataset_builder, so please provide more info about your custom loading procedure.

@mariosasko, thanks,