Memory error while loading custom dataset

If you do it as in the snippet I provided above, yielding {"path": stem_path} in the audio columns, the audio is stored as paths and the files are looked up on disk. You can write the bytes instead by yielding {"path": None, "bytes": stem_path_file.read()}. In that case, make sure not to put real full paths in "path" (you can set it to a relative audio path or filename to preserve the file extension), so that bytes are written instead of paths.
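To make the difference concrete, here is a minimal sketch of a generator that embeds bytes rather than paths. It uses only the standard library, and the names `audio_entry`, `generate_examples`, the `"audio"` column and the `*.wav` pattern are hypothetical placeholders, not taken from the original loading script. In a real loading script the same dicts would be yielded from `_generate_examples`.

```python
from pathlib import Path
from typing import Dict, Iterator, Optional, Tuple


def audio_entry(file_path: Path, root: Path, embed_bytes: bool = True) -> Dict[str, Optional[object]]:
    """Build the dict yielded for one audio column (hypothetical helper)."""
    if not embed_bytes:
        # Path only: the file has to be found on disk later.
        return {"path": str(file_path), "bytes": None}
    with open(file_path, "rb") as f:
        data = f.read()
    # A path relative to the dataset root keeps the file extension
    # without exposing the full path of the file on disk.
    return {"path": file_path.relative_to(root).as_posix(), "bytes": data}


def generate_examples(root: Path) -> Iterator[Tuple[int, dict]]:
    """Yield (key, example) pairs, one per audio file under `root`."""
    for key, file_path in enumerate(sorted(root.rglob("*.wav"))):
        yield key, {"audio": audio_entry(file_path, root)}
```

Each file is read only when its example is yielded, so only one file's bytes are held by the generator at a time.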
