I think this doc is just a bit confusing. In particular, it mixes “formatting as TF” and “converting to TF”, which are not the same thing:
- format as TF in `datasets`: calling `with_format("tf")` doesn’t load the data in RAM, it only sets the output type of the Dataset to TF tensors (the data still lives on disk and is memory mapped)
- convert to TF in `tf.data`: this loads the full data in memory, using e.g. `tf.data.Dataset.from_tensor_slices()`
It would be great to rephrase it a bit to make it clearer, though. The docs can be modified here: datasets/docs/source/use_with_tensorflow.mdx at main · huggingface/datasets · GitHub