Hugging Face Forums
Distributed Training run_summarization.py
Amazon SageMaker
philschmid
July 29, 2021, 7:15am
For me it worked with ml.p3.16xlarge and a batch_size of 2.
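To make those settings concrete, here is a rough sketch of how they might be written down. It assumes that "batch_size of 2" means the per-device batch size (the `per_device_train_batch_size` argument of `run_summarization.py`), which the post does not state. The `effective_batch_size` helper is hypothetical and only there for illustration. An ml.p3.16xlarge has 8 V100 GPUs.

```python
# Sketch only: settings mirroring the reply above, not a confirmed configuration.
INSTANCE_TYPE = "ml.p3.16xlarge"  # one instance with 8 NVIDIA V100 GPUs
GPUS_PER_INSTANCE = 8

# Assumption: "batch_size of 2" refers to the per-device batch size.
hyperparameters = {
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
}


def effective_batch_size(per_device: int, gpus_per_instance: int, instance_count: int = 1) -> int:
    """Hypothetical helper: global batch size under data-parallel training."""
    return per_device * gpus_per_instance * instance_count


if __name__ == "__main__":
    total = effective_batch_size(
        hyperparameters["per_device_train_batch_size"], GPUS_PER_INSTANCE
    )
    print(f"{INSTANCE_TYPE}: global batch size = {total}")
```

With the SageMaker Python SDK, values like these would typically go into the `instance_type` and `hyperparameters` arguments of the `HuggingFace` estimator. If memory is still tight, lowering the per-device batch size further is the usual first thing to try.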