How to Prune a Transformer-Based Model?

Hey there, I'm encountering the same issue while using structured pruning on a T5-based model architecture.

from neural_compressor import WeightPruningConfig
from optimum.intel.neural_compressor import INCModelForSeq2SeqLM, INCSeq2SeqTrainer
from transformers import EarlyStoppingCallback

# Total number of optimizer steps (assuming no gradient accumulation);
# the pruning schedule runs up to this step.
end_step = args.num_train_epochs * (
    len(tokenized_datasets['train']) // args.per_device_train_batch_size
)
configs = [
    {
        "pattern": "4x1",          # Default structured pruning pattern.
        "target_sparsity": 0.9,    # Target sparsity ratio of the pruned modules.
        "pruning_scope": "local",  # Each layer is pruned to the target sparsity independently.
        "end_step": end_step,      # Last step of the pruning schedule.
    }
]

pruning_config = WeightPruningConfig(configs)
model = INCModelForSeq2SeqLM.from_pretrained("Salesforce/codet5-small")

trainer = INCSeq2SeqTrainer(
    model=model,
    pruning_config=pruning_config,
    args=args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['validation'],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
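
In case it helps, this is roughly how I then launch the run (a minimal sketch: train() and save_model() come from the underlying transformers Trainer API, and args.output_dir is just my usual output directory):

# Pruning-aware fine-tuning: INCSeq2SeqTrainer applies the sparsity
# schedule from pruning_config during training, up to end_step.
train_result = trainer.train()

# Persist the (now sparse) model for later evaluation.
trainer.save_model(args.output_dir)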

Besides reading the paper you suggested and going through the official documentation, I haven't been able to find a solution to this. I'd be grateful for any hint that helps me solve the problem.