I’m doing some transformations over a dataset with a `labels` column where some values are `None`, but after the first `.map` transformation over a new field, the `None` values are converted into empty lists.
Is this normal behaviour? How can I preserve the `None` values?
Thanks in advance!
Hi! Yes, this is a known bug, see "`None` replaced by `[]` after first batch in map" (Issue #3676, huggingface/datasets on GitHub).
This can be fixed once Apache Arrow has the feature we requested here: [ARROW-15839] [C++][Python] Allow to reconstruct a ListArray with ListArray.from_arrays and keep the nulls (ASF JIRA).
Feel free to post a message or vote for this issue on Arrow’s JIRA to express your need; this can probably help the Arrow team prioritize it.
In the meantime we’re looking at workarounds to fix this. I’ll let you know what we come up with.
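Until an official workaround is available, one possible stopgap is to record which rows were `None` in a separate boolean column before running any other transformation, so that information is not lost if the list column collapses to `[]`. This is only a sketch and not the fix the maintainers have in mind: the `labels_is_none` column and both helper functions are hypothetical names, and the bug is simulated here on plain Python data rather than reproduced with the library.

```python
from typing import Any, Dict, List

LABELS = "labels"        # column name taken from the question
FLAG = "labels_is_none"  # hypothetical helper column, not part of any library


def mark_missing_labels(batch: Dict[str, List[Any]]) -> Dict[str, List[bool]]:
    """Batched map function: flag the rows whose labels value is None."""
    return {FLAG: [value is None for value in batch[LABELS]]}


def restore_missing_labels(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a plain-Python copy of the row with None put back where flagged."""
    restored = dict(row)
    if restored.pop(FLAG, False):
        restored[LABELS] = None
    return restored


# A batch shaped like the dict of columns passed to a batched map function.
batch = {LABELS: [["a", "b"], None, []]}
flags = mark_missing_labels(batch)

# Simulate the reported behaviour: None has been turned into [].
rows_after_bug = [
    {LABELS: labels or [], FLAG: flag}
    for labels, flag in zip(batch[LABELS], flags[FLAG])
]

# Rows that were genuinely empty lists stay as [], flagged rows become None again.
restored_rows = [restore_missing_labels(row) for row in rows_after_bug]
```

With a real dataset, the flag column would presumably be added first, for example with `ds = ds.map(mark_missing_labels, batched=True)`, and `restore_missing_labels` applied to rows once they are read back as Python dicts. The flag is a plain boolean column, so it should not be affected by the list-null issue described above.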
Thanks @lhoestq for the response.
I hadn’t found the issue in the repo. I’ve voted for the issue on the Arrow JIRA.
Thanks in advance for letting me know about a workaround when it’s ready.