Thanks for the reply.
I proposed a solution in "Ability to split a dataset in multiple files" (Issue #3544, huggingface/datasets on GitHub).
In brief, we would edit state.json to keep track of the new columns added as files. What do you think?
Note that this does not solve the issue when we update a value in the dataset.
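To make the state.json idea concrete, here is a rough sketch of what I have in mind. It only uses the standard library. The `_column_files` key and the `register_column_file` helper are names I made up for illustration; `datasets` does not read such a key today, so this shows the proposed bookkeeping only, not an existing feature.

```python
import json
from pathlib import Path


def register_column_file(dataset_dir, column, filename):
    """Record in state.json that `column` lives in its own file.

    Hypothetical layout: "_column_files" maps a column name to the file
    holding it. This key is an assumption of the sketch, not something
    the `datasets` library currently understands.
    """
    state_path = Path(dataset_dir) / "state.json"
    state = json.loads(state_path.read_text())

    # Create the mapping on first use, then add or overwrite the entry.
    column_files = state.setdefault("_column_files", {})
    column_files[column] = filename

    # Write to a temporary file and rename it, so a worker reading
    # state.json never sees a half-written file.
    tmp_path = state_path.with_name("state.json.tmp")
    tmp_path.write_text(json.dumps(state, indent=2))
    tmp_path.replace(state_path)
    return state
```

The existing entries in state.json are left untouched, so a reader that ignores the extra key would still see the original dataset.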
For now I am using a versioning mechanism: every time I modify the dataset I save it as a new version, and the workers load the latest version.
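Roughly, the versioning looks like the sketch below. The directory naming (`v1`, `v2`, ... under one root folder) and the helper names are assumptions for this example, not my exact setup.

```python
import re
from pathlib import Path

# Assumed naming scheme: one sub-directory per version, named v1, v2, ...
VERSION_RE = re.compile(r"^v(\d+)$")


def list_versions(root):
    """Return (number, path) pairs for every version directory, oldest first."""
    versions = []
    for child in Path(root).iterdir():
        match = VERSION_RE.match(child.name)
        if child.is_dir() and match:
            versions.append((int(match.group(1)), child))
    # Sort numerically so that v10 comes after v9.
    return sorted(versions, key=lambda pair: pair[0])


def latest_version_dir(root):
    """Directory the workers should load, or None if nothing is saved yet."""
    versions = list_versions(root)
    return versions[-1][1] if versions else None


def next_version_dir(root):
    """Directory the writer should save the next modified copy into."""
    versions = list_versions(root)
    number = versions[-1][0] + 1 if versions else 1
    return Path(root) / f"v{number}"
```

The writer would call `dataset.save_to_disk(str(next_version_dir(root)))` after each modification, and each worker would call `datasets.load_from_disk(str(latest_version_dir(root)))`. The obvious cost is that every change rewrites the whole dataset.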
I am open to suggestions.