Transformers and Hyperparameter search using Optuna

Steps with transformers are not epochs. In my understanding, 1 step = 1 batch.
So if you have a train set with, say, 1000 samples, and you train in batches of 64 samples, it will take you 16 steps (16 batches, the last one partial) to push the whole train set through training once, i.e. to do 1 epoch. In this example, 1 epoch = 16 steps. If you do the math for your script, you should find that, in your case, 500 steps = 1 epoch.
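A quick way to sanity-check the step/epoch arithmetic yourself (the numbers here are just the ones from the example above):

```python
import math

def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    # One step = one batch; the last batch may be smaller than
    # batch_size, so we round up.
    return math.ceil(num_samples / batch_size)

# Example from above: 1000 samples, batch size 64
print(steps_per_epoch(1000, 64))  # -> 16
```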

Not sure I understand which .json file you are referring to. What is the complete path?
Log files I have are organized like this:

There is one directory for every run (what Optuna calls “trials”), and every run contains its checkpoints; in my case they are saved every 32 steps, because here 1 epoch = 32 steps. In every checkpoint, trainer_state.json contains all the info for that specific run up to and including the given checkpoint.

If you are looking for a single, overall log file with all the runs together, no, I don’t have one. I get one trainer_state.json for every checkpoint of every run.
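If you want everything in one place, you could gather the latest trainer_state.json from each run yourself. A minimal sketch, assuming a layout like `output_dir/run-<n>/checkpoint-<step>/trainer_state.json` (the `run-*` and `checkpoint-*` names are just illustrative; your directory names may differ):

```python
import json
from pathlib import Path

def collect_latest_states(output_dir: str) -> dict:
    """Return {run_name: parsed trainer_state.json of its last checkpoint}."""
    states = {}
    for run_dir in sorted(Path(output_dir).glob("run-*")):
        # Since trainer_state.json is cumulative, the checkpoint with
        # the highest step number covers the whole run so far.
        checkpoints = sorted(
            run_dir.glob("checkpoint-*"),
            key=lambda p: int(p.name.split("-")[-1]),
        )
        if not checkpoints:
            continue
        state_file = checkpoints[-1] / "trainer_state.json"
        if state_file.exists():
            states[run_dir.name] = json.loads(state_file.read_text())
    return states
```

Because each trainer_state.json already contains the full history up to that checkpoint, reading only the last checkpoint per run is enough to reconstruct an overall picture.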