Datasets.load_datasets fails

Character code errors still occur in 2024…
Apparently there are cases where it can be avoided by explicitly specifying it at load time.
If this does not work, there may be another cause.

dataset = datasets.load_dataset(
“jxu124/OpenX-Embodiment”,
“berkeley_gnm_cory_hall”,
streaming=False,
split=“train”,
cache_dir=ds_root,
trust_remote_code=True,
encoding="utf-16",
)

or

dataset = datasets.load_dataset(
“jxu124/OpenX-Embodiment”,
“berkeley_gnm_cory_hall”,
streaming=False,
split=“train”,
cache_dir=ds_root,
trust_remote_code=True,
encoding="utf-8",
)