Hello everyone, I save my tokenized dataset as shards on local disk and use a for loop to load each shard for training. My question: since the weights of the model object keep being updated in memory, even if I don't manually reload it, the model is still using the latest weights, right?
To make it clearer, here is my code:
# Loop through shards and train on each one
# Start from the shard after the last one trained (last_shard + 1)
shard_start = last_shard + 1 if last_checkpoint else 0
for i in range(shard_start, 129):  # Assume shards go up to 128
    shard_path = f"./tokenized_dataset/tokenized_shard_{i}"
    # Check if the shard path exists
    if os.path.exists(shard_path):
        tokenized_shard = load_from_disk(shard_path)
        # Create a new output directory for each shard
        shard_output_dir = f"{base_output_dir}/shard_{i}"
        os.makedirs(shard_output_dir, exist_ok=True)
        # Update output_dir for this shard
        training_args.output_dir = shard_output_dir
        # Initialize Trainer object
        trainer = Trainer(
            model=model,
            args=training_args,
            train_dataset=tokenized_shard,
            data_collator=data_collator,
        )
        # Train on the current shard
        trainer.train()
        # Save the model after training on this shard
        trainer.save_model(shard_output_dir)
    else:
        print(f"Shard {i} not found: {shard_path}")

# Save the final model and tokenizer
trainer.save_model(f"{base_output_dir}/final_model")
tokenizer.save_pretrained(f"{base_output_dir}/final_model")
Summary question: do I need to put model = LlamaForCausalLM.from_pretrained(shard_output_dir)
after trainer.save_model(shard_output_dir)
so that the model object picks up the newest weights, or does it automatically keep using the newest weights?
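To show what I mean by "using the latest weights", here is a minimal sketch of how I was thinking of checking it on a single shard. The helper name param_checksum is just for this illustration; model, training_args, tokenized_shard, and data_collator are the same objects as in my script above.

import torch

def param_checksum(model):
    # Sum of all parameter values; a cheap fingerprint of the current weights
    with torch.no_grad():
        return sum(p.detach().float().sum().item() for p in model.parameters())

before = param_checksum(model)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_shard,
    data_collator=data_collator,
)
trainer.train()
# Same `model` object, no from_pretrained reload in between
after = param_checksum(model)
print(f"checksum before: {before}, after: {after}")  # differs if the weights were updated in place

If the two checksums differ, that would tell me the in-memory model object was trained in place and I can just pass it to the next Trainer; is that the right way to think about it?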