How do you calculate max steps

imeese8 · May 17, 2023, 3:35pm

I’m trying to understand the math behind how max steps is calculated when it is left unset. I’ve tried to work backwards by changing epochs, batch size, and micro-batch size to see if I could figure out the formula, but I haven’t had any luck. I’ve also looked through the documentation for the transformers Trainer but couldn’t find it there. I’d like to be able to predict max steps from the number of epochs, the batch size, the micro-batch size, and the number of samples in the dataset.
Hopefully I’m just missing something simple. Thanks for your help!

bozden · May 19, 2023, 4:42am

As a newbie, I was just trying to figure out these relations. I don’t know about micro-batching, but I wanted epoch-based calculations.

    evaluation_strategy = "epoch",
    save_strategy = "epoch",

    per_device_train_batch_size = 64,
    gradient_accumulation_steps = 4,
    per_device_eval_batch_size = 64,

My train set had 31,091 samples, and these settings were giving me 605 steps. So I “back-propagated”…

    epochs = 5
    train_size = 31091
    train_batch_size = 64
    ga_steps = 4
    virtual_batch_size = train_batch_size * ga_steps   # "invented name" => 256
    per_epoch_steps = int(train_size / virtual_batch_size + 0.5)   # round => 121
    total_steps = epochs * per_epoch_steps   # => 605

If I’m not mistaken…
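
To make the guess easier to test against other settings, here is a minimal sketch that wraps the same arithmetic in a function, next to a second guess that counts batches first and then floor-divides by the accumulation steps. Both function names are made up for illustration, both assume a single device, and neither is confirmed to be what the Trainer does internally. They agree on my numbers but not on every input, so it is worth checking against the step count your own run reports.

```python
import math

def steps_by_rounding(train_size, batch_size, ga_steps, epochs):
    """Guess 1 (the calculation above): round samples / virtual batch, times epochs."""
    virtual_batch_size = batch_size * ga_steps
    per_epoch_steps = int(train_size / virtual_batch_size + 0.5)
    return epochs * per_epoch_steps

def steps_by_floor_of_batches(train_size, batch_size, ga_steps, epochs):
    """Guess 2 (an assumption): count batches per epoch, keeping the last
    partial batch, then floor-divide by the accumulation steps."""
    batches_per_epoch = math.ceil(train_size / batch_size)
    per_epoch_steps = max(batches_per_epoch // ga_steps, 1)
    return math.ceil(epochs * per_epoch_steps)

# Settings from this post: both guesses give the observed 605 steps.
print(steps_by_rounding(31091, 64, 4, 5))
print(steps_by_floor_of_batches(31091, 64, 4, 5))

# A made-up case where the two guesses disagree (32 vs. 31 steps).
print(steps_by_rounding(1008, 8, 4, 1))
print(steps_by_floor_of_batches(1008, 8, 4, 1))
```

If your observed step count matches only one of the two on a case like the last one, that tells you which rounding rule applies to your setup.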

jasicarose75 · July 28, 2023, 7:22pm

I need to try
