How to Prune a Transformer-Based Model?

Hey there, I'm encountering the same issue while using structured pruning on a T5-based model architecture.

from neural_compressor import WeightPruningConfig
from optimum.intel.neural_compressor import INCModelForSeq2SeqLM, INCSeq2SeqTrainer
from transformers import EarlyStoppingCallback

# Total number of optimizer steps (assuming no gradient accumulation);
# the pruning schedule runs up to this step.
end_step = args.num_train_epochs * (
    len(tokenized_datasets['train']) // args.per_device_train_batch_size
)
configs = [
    {
        "pattern": "4x1",          # Default structured pruning pattern.
        "target_sparsity": 0.9,    # Target sparsity ratio of the pruned modules.
        "pruning_scope": "local",  # Each layer is pruned to the target sparsity independently.
        "end_step": end_step,      # Last step of the pruning schedule.
    }
]

pruning_config = WeightPruningConfig(configs)
model = INCModelForSeq2SeqLM.from_pretrained("Salesforce/codet5-small")

trainer = INCSeq2SeqTrainer(
    model=model,
    pruning_config=pruning_config,
    args=args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['validation'],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
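
In case it helps, this is roughly how I then launch the run (a minimal sketch: train() and save_model() come from the underlying transformers Trainer API, and args.output_dir is just my usual output directory):

# Pruning-aware fine-tuning: INCSeq2SeqTrainer applies the sparsity
# schedule from pruning_config during training, up to end_step.
train_result = trainer.train()

# Persist the (now sparse) model for later evaluation.
trainer.save_model(args.output_dir)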

Besides reading the paper you suggested and going through the official documentation, I haven't been able to find a solution to this. I'd be grateful for any hint that helps me solve the problem.