Hello everyone, I save my tokenized dataset as shards on local disk and use a for loop to load each shard for training. My question: since the weights of the model object keep being updated in memory, even if I don't manually reload it, the model is still using the latest weights, right?
To make it clearer, here is my code:
# Loop through shards and train on each one
# Start from the shard after the last one trained (last_shard + 1)
shard_start = last_shard + 1 if last_checkpoint else 0
for i in range(shard_start, 129):  # Assume shards go up to 128
    shard_path = f"./tokenized_dataset/tokenized_shard_{i}"
    # Check if the shard path exists
    if os.path.exists(shard_path):
        tokenized_shard = load_from_disk(shard_path)
        # Create a new output directory for each shard
        shard_output_dir = f"{base_output_dir}/shard_{i}"
        os.makedirs(shard_output_dir, exist_ok=True)
        # Update output_dir for this shard
        training_args.output_dir = shard_output_dir
        # Initialize Trainer object
        trainer = Trainer(
            model=model,
            args=training_args,
            train_dataset=tokenized_shard,
            data_collator=data_collator,
        )
        # Train on the current shard
        trainer.train()
        # Save the model after training on this shard
        trainer.save_model(shard_output_dir)
    else:
        print(f"Shard {i} not found: {shard_path}")

# Save the final model and tokenizer
trainer.save_model(f"{base_output_dir}/final_model")
tokenizer.save_pretrained(f"{base_output_dir}/final_model")
Summary question: do I need to put model = LlamaForCausalLM.from_pretrained(shard_output_dir)
after trainer.save_model(shard_output_dir)
so that the model object picks up the newest weights, or does it automatically keep using the newest weights?
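To show what I mean by "using the latest weights", here is a minimal sketch of how I was thinking of checking it on a single shard. The helper name param_checksum is just for this illustration; model, training_args, tokenized_shard, and data_collator are the same objects as in my script above.

import torch

def param_checksum(model):
    # Sum of all parameter values; a cheap fingerprint of the current weights
    with torch.no_grad():
        return sum(p.detach().float().sum().item() for p in model.parameters())

before = param_checksum(model)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_shard,
    data_collator=data_collator,
)
trainer.train()
# Same `model` object, no from_pretrained reload in between
after = param_checksum(model)
print(f"checksum before: {before}, after: {after}")  # differs if the weights were updated in place

If the two checksums differ, that would tell me the in-memory model object was trained in place and I can just pass it to the next Trainer; is that the right way to think about it?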