# How does one do full fine-tuning on Falcon 180B?

The blog post “Spread Your Wings: Falcon 180B is here” gives a breakdown of the instance types and memory needed for full fine-tuning.

Is there a guide on how to do that in SageMaker?

My questions, in parts:

• Does running a simple `train_clm.py` script work for full fine-tuning?

• What configurations are available when using the “distribution” argument for full fine-tuning?

• This won’t work easily for 180B, right? `"smdistributed": {"dataparallel": {"enabled": True}}`
• “8 x 8 x A100” would be 8 counts of `ml.p4d.24xlarge` on SageMaker, is that correct?

• 7,000,000 GPU hours on “8 x 8 x A100” would equate to:

• 7,000,000 GPU hours / 8 instances / 8 GPUs per instance = 109,375 wall-clock hours on “8 x 8 x A100”
• 109,375 hours / 24 hours a day / 365 days a year ≈ 12.5 years on “8 x 8 x A100”
• so, doing a full fine-tune with as much compute as the original training run would take ~12.5 years on “8 x 8 x A100”?
• and to achieve the same amount of training in, let’s say, 1 year, we would need roughly 96–100 instances of `ml.p4d.24xlarge`?
• and if we take \$19.22 per instance-hour, with 96 instances for a full year, the total cost to train the model would be around \$19.22 per instance-hour * 24 hours a day * 365 days * 96 instances ≈ US\$16.2 million (if we’re budgeting for a full fine-tune of a similar model, would ~\$16M be an appropriate number?)
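A quick sanity check of the arithmetic above in Python (the \$19.22/hour price is the figure I assumed, not a quoted AWS rate):

```python
# Back-of-envelope check of the GPU-hour and cost estimates above.
GPU_HOURS = 7_000_000        # reported pretraining compute for Falcon 180B
INSTANCES = 8                # "8 x 8 x A100" = 8 instances
GPUS_PER_INSTANCE = 8        # ml.p4d.24xlarge has 8 x A100
HOURS_PER_YEAR = 24 * 365

wall_clock_hours = GPU_HOURS / (INSTANCES * GPUS_PER_INSTANCE)
print(wall_clock_hours)                    # 109375.0
print(wall_clock_hours / HOURS_PER_YEAR)   # ~12.49 years

# Instances needed to finish in ~1 year instead:
instances_for_one_year = GPU_HOURS / (GPUS_PER_INSTANCE * HOURS_PER_YEAR)
print(instances_for_one_year)              # ~99.9, i.e. ~100 instances

PRICE_PER_INSTANCE_HOUR = 19.22            # assumed price, not a quoted rate
cost = PRICE_PER_INSTANCE_HOUR * HOURS_PER_YEAR * 96
print(round(cost / 1e6, 1))                # ~16.2 (million USD)
```

(So the stricter answer to my own bullet is ~100 instances, not 96, but the cost lands at the same order of magnitude either way.)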

Thank you in advance for the information! I look forward to hearing from anyone with more details on how to do full fine-tuning on the 180B model.
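For concreteness, here is the kind of launch I have in mind (untested; the script name `train_clm.py` is from my question above, and the container versions, role placeholder, and `torch_distributed` distribution are my assumptions, not a confirmed recipe for 180B):

```python
# Sketch of a SageMaker HuggingFace estimator launch -- NOT a verified
# 180B recipe; versions and distribution settings are assumptions.
from sagemaker.huggingface import HuggingFace

huggingface_estimator = HuggingFace(
    entry_point="train_clm.py",       # the causal-LM training script in question
    source_dir="./scripts",
    instance_type="ml.p4d.24xlarge",  # 8 x A100 per instance
    instance_count=8,                 # "8 x 8 x A100"
    transformers_version="4.28",      # assumed DLC versions
    pytorch_version="2.0",
    py_version="py310",
    role="<your-sagemaker-execution-role-arn>",
    # This is the part I'm unsure about for 180B: plain data parallelism
    # can't fit the full model on one GPU, so presumably torch_distributed
    # plus model sharding (DeepSpeed/FSDP) inside the script is needed.
    distribution={"torch_distributed": {"enabled": True}},
    hyperparameters={"model_id": "tiiuae/falcon-180B", "epochs": 1},
)
huggingface_estimator.fit({"train": "s3://my-bucket/train"})
```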
