Hey there, I’m encountering the same issue while using structured pruning on a T5-based model architecture. Here’s my setup:
from transformers import EarlyStoppingCallback
from neural_compressor import WeightPruningConfig
from optimum.intel import INCModelForSeq2SeqLM, INCSeq2SeqTrainer

# Total number of training steps; assumes no gradient accumulation.
end_step = args.num_train_epochs * (
    len(tokenized_datasets["train"]) // args.per_device_train_batch_size
)
configs = [
    {
        "pattern": "4x1",          # structured 4x1 pruning pattern
        "target_sparsity": 0.9,    # target sparsity ratio of the pruned modules
        "pruning_scope": "local",  # sparsity enforced per layer rather than globally
        "end_step": end_step,
    }
]
pruning_config = WeightPruningConfig(configs)
model = INCModelForSeq2SeqLM.from_pretrained("Salesforce/codet5-small")
trainer = INCSeq2SeqTrainer(
model=model,
pruning_config=pruning_config,
args=args,
train_dataset=tokenized_datasets['train'],
eval_dataset=tokenized_datasets['validation'],
data_collator=data_collator,
tokenizer=tokenizer,
compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
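For completeness, this is how I then run the pruning-aware training; the output directory name is just a placeholder:

trainer.train()
# Save the pruned model along with the tokenizer ("./codet5-small-pruned" is a placeholder path).
trainer.save_model("./codet5-small-pruned")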
Besides reading the paper you suggested and going through the official documentation, I haven’t been able to find a solution. I’d be grateful for any hint that helps me solve this problem.