Can the parquet-converter bot handle list[str] dtypes found in a webdataset?

I have a large dataset in WebDataset format, and one of the columns is a list of strings.

I see here that the dataset viewer doesn't seem capable of handling the list[str] dtype at the moment, but I am wondering if the parquet-converter bot will modify the dtype to one that works when it converts the WebDataset files to Parquet files.

Edit: This repo seems to show that list dtypes work? HuggingFaceH4/OpenHermes-2.5-1k-longest · Datasets at Hugging Face

Do you have a dataset to share, so that we can investigate?

My dataset is available here: ProGamerGov/synthetic-dataset-1m-high-quality-captions · Datasets at Hugging Face

I set the YAML metadata to match what was done here, by using sequence: string instead of dtype: string: davanstrien/dataset-tldr-preference-dpo · Datasets at Hugging Face. I'm not sure if that's correct, though, as I haven't been able to find much documentation on how to do this.
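For anyone comparing the two declarations, here is a minimal sketch of what the difference looks like in the dataset card metadata. The helper `render_features_yaml` and the column names `caption` and `tags` are hypothetical, made up for illustration and not taken from either repo. It only prints the YAML fragment, it does not check whether the viewer or the parquet-converter bot accepts it.

```python
# Hypothetical helper: renders the `dataset_info.features` part of a
# dataset card's YAML metadata. Column names below are placeholders.
def render_features_yaml(columns):
    """columns: list of (name, kind), where kind is e.g. 'string' or 'list[string]'."""
    lines = ["dataset_info:", "  features:"]
    for name, kind in columns:
        lines.append(f"  - name: {name}")
        if kind.startswith("list[") and kind.endswith("]"):
            # A list column is declared with `sequence:` and its inner type.
            inner = kind[len("list["):-1]
            lines.append(f"    sequence: {inner}")
        else:
            # A scalar column is declared with `dtype:`.
            lines.append(f"    dtype: {kind}")
    return "\n".join(lines)


yaml_text = render_features_yaml([
    ("caption", "string"),     # plain string column
    ("tags", "list[string]"),  # list-of-strings column
])
print(yaml_text)
```

So the only change for a list[str] column is swapping the `dtype: string` line for `sequence: string` under that column's `name`.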

Have you looked at Data files Configuration? Maybe it would help to configure the YAML.

Anyway, maybe @lhoestq can give you a hand.

So I've looked at the configuration and I have been unable to figure out the issue.