Here is the code that precomputes data inside `accelerator.main_process_first()`: the first GPU (main process) performs all of the precomputation and the cached result is then shared with the other processes. However, precomputing over a large dataset on a single GPU is slow. Can we run the precomputation on all GPUs in parallel and then merge the results into one dataset that is shared across all processes (something like the sketch after the snippet below)?
with accelerator.main_process_first():
    from datasets.fingerprint import Hasher

    # fingerprint used by the cache for the other processes to load the result
    # details: https://github.com/huggingface/diffusers/pull/4038#discussion_r1266078401
    new_fingerprint = Hasher.hash(args)
    new_fingerprint_for_vae = Hasher.hash(vae_path)
    train_dataset_with_embeddings = train_dataset.map(
        compute_embeddings_fn, batched=True, new_fingerprint=new_fingerprint
    )
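Roughly, I was imagining something like the sketch below: each process precomputes only its own shard of the dataset on its own GPU, writes the result to a cache directory, and after a barrier every process reloads and concatenates all shards into the full dataset. This is only an idea, not part of the existing script; the `cache_dir` path is my own placeholder and it assumes a filesystem shared by all processes.

    from datasets import concatenate_datasets, load_from_disk

    cache_dir = "precomputed_shards"  # assumed shared path, not from the script

    # Each process maps only its own shard on its own GPU.
    # contiguous=True keeps the original row order after concatenation.
    shard = train_dataset.shard(
        num_shards=accelerator.num_processes,
        index=accelerator.process_index,
        contiguous=True,
    )
    shard = shard.map(compute_embeddings_fn, batched=True)
    shard.save_to_disk(f"{cache_dir}/shard_{accelerator.process_index}")

    # Wait until every process has written its shard, then reload and merge
    # so all processes end up with the same complete precomputed dataset.
    accelerator.wait_for_everyone()
    train_dataset_with_embeddings = concatenate_datasets(
        [load_from_disk(f"{cache_dir}/shard_{i}") for i in range(accelerator.num_processes)]
    )

Would this be a reasonable way to do it, or is there a recommended pattern for multi-GPU precomputation with `datasets` and `accelerate`?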