Dataset repo requires arbitrary Python code execution

The viewer is disabled because this dataset repo requires arbitrary Python code execution. Please consider removing the loading script and relying on automated data support. If this is not possible, please open a discussion for direct help.

Why? It worked well before. By the way, load_dataset works, and the viewer worked yesterday as well!

The dataset:

  1. cis-lmu/udhr-lid · Datasets at Hugging Face
  2. cis-lmu/glotlid-corpus · Datasets at Hugging Face

I have the exact same problem and cannot tell where the error is, given I have not changed anything and this used to work. Hope we can get some assistance.


I have the exact same problem


@lhoestq, care to join? I don’t know who to tag.


We had to disable the viewer for datasets with a script for now, because some people were abusing it. Sorry for the inconvenience.

We’re seeing if there is a viable long-term solution.

In the meantime, if you want the dataset viewer to work, you need to remove the dataset script and use a supported data format (csv, parquet, etc.). Personally, I'd recommend uploading the dataset using the datasets library and push_to_hub().
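For anyone wondering what that looks like in practice, here is a minimal sketch, not an official recipe. It assumes your data fits in memory as a list of dicts; the example rows, the helper names `write_csv_split` and `push_rows_to_hub`, and the repo id in the comments are all hypothetical. The first helper writes a plain CSV file (one of the supported formats), and the second uploads the same rows with `Dataset.from_list` and `push_to_hub()`, which needs `pip install datasets` and a Hub login.

```python
import csv
from pathlib import Path


def write_csv_split(rows, path):
    """Write a list of dicts to a CSV file, a data-only format the viewer can read."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fieldnames = list(rows[0].keys())
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


def push_rows_to_hub(rows, repo_id):
    """Upload the same rows with the datasets library (needs network and a Hub login)."""
    # Imported lazily so the CSV helper above works without datasets installed.
    from datasets import Dataset

    dataset = Dataset.from_list(rows)
    dataset.push_to_hub(repo_id)  # no loading script is involved


# Hypothetical example rows, invented for illustration only.
EXAMPLE_ROWS = [
    {"sentence": "All human beings are born free.", "language": "eng"},
    {"sentence": "Alle Menschen sind frei geboren.", "language": "deu"},
]

# Usage (not executed here; the repo id is a placeholder):
# write_csv_split(EXAMPLE_ROWS, "data/train.csv")
# push_rows_to_hub(EXAMPLE_ROWS, "your-username/your-dataset")
```

Either route leaves the repo without a loading script, which is what the viewer needs right now.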


If you have a popular dataset (> 100 likes or downloads) that is affected, please let us know here – we can allowlist popular datasets.

Thanks.

I don’t have that much publicity. I was hoping to gain some :))
I fixed one of my datasets, deleted the other one, and I’m still trying to figure out what’s wrong with the third one, since the viewer doesn’t work even though I removed the loading script and moved to the automatic Hugging Face structure.

cc @severo maybe can help on that last one!

The viewer is being created on your third dataset, it will be available soon :slight_smile:


Nice

Hi @julien-c, @lhoestq :wave:

Is it possible to allow my dataset “lampent/IRFL” to use the dataset viewer with a script?
In my previous work (“nlphuji/vasr”), it worked great, and I want to customize this dataset as well.

Thank you.
Ron.

I see you made it work using data-only files, congrats!


We recently updated the docs to make it easier to structure your dataset without a dataset script:


Yes, and it works great! However I would like to use the dataset viewer with a script to enable image display instead of strings.

As you can see, the fields “distractors” and “answer” are actually image identifiers, and I would like to load them into the dataset viewer as images. The only method I am aware of is the one we used in “nlphuji/vasr” (see image below).

Indeed. We have an issue to handle that case, feel free to chime in, or +1.

Hi @severo @lhoestq

Is it possible to allow my dataset “leosocy/palmnet” to use the dataset viewer with a script?

Thank you.
Leosocy.

Hi, @severo @lhoestq
We have a dataset, Wenetspeech4TTS/WenetSpeech4TTS. Is it possible to auto-convert this dataset to parquet and enable the dataset viewer?

The latest datasets release (2.19.0) provides a CLI tool to convert to data-only (parquet): Command Line Interface (CLI)

Please tell us if it works well for you!
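As a rough sketch of how the conversion could be scripted, assuming the subcommand is `convert_to_parquet` as described in the CLI docs (run `datasets-cli convert_to_parquet --help` to check the options your version accepts). The helper names and the dataset id in the usage comment are hypothetical, and the call needs datasets >= 2.19.0, a Hub login, and network access.

```python
import subprocess


def build_convert_command(dataset_id):
    """Return the argument list for the datasets CLI conversion command."""
    # Assumption: no extra flags are required; check --help for your version.
    return ["datasets-cli", "convert_to_parquet", dataset_id]


def convert_to_parquet(dataset_id):
    """Run the conversion and raise CalledProcessError if the CLI exits with an error."""
    subprocess.run(build_convert_command(dataset_id), check=True)


# Usage (not executed here; the dataset id is a placeholder):
# convert_to_parquet("your-username/your-dataset")
```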

Hi, we are the owners of Wenetspeech4TTS/WenetSpeech4TTS. We tried to use the datasets CLI to convert the dataset, but in China we are facing network connection issues.

cc @Wauplin @lhoestq maybe