Functorch with transformers

jpcorb20 · November 15, 2022, 5:25pm

I want to accelerate per-sample gradient computations with functorch. How do we compile models from transformers to use with functorch?

Thanks

pkadambi · December 19, 2023, 8:18pm

Hi @jpcorb20, were you able to solve this issue? I’m trying to do the same right now: getting batched per-sample gradients using functorch (for BERT, ViT, and ResNet).

jpcorb20 · December 21, 2023, 4:40pm

Hello @pkadambi, unfortunately not. I ended up recomputing per-sample gradients on the side in pure torch, which is far from efficient, but my research only required the actual numbers in the end …
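
For readers landing on this thread, here is a minimal sketch of the per-sample-gradient pattern asked about above. It uses `torch.func` (functorch's home since PyTorch 2.0) and its `functional_call`, `grad`, and `vmap` functions. The small `nn.Sequential` model, the tensor shapes, and the names `sample_loss` and `per_sample_grad_fn` are illustrative assumptions, not something from this thread. The sketch has not been tried on a Hugging Face model here; a real transformer may hit operations that `vmap` does not support, so treat it as a starting point rather than a confirmed solution.

```python
import torch
import torch.nn as nn
from torch.func import functional_call, grad, vmap

torch.manual_seed(0)

# Hypothetical stand-in model; a transformers model would replace this,
# assuming all of its ops are supported under vmap (not verified here).
model = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 3))
model.eval()  # avoid dropout/batch-norm randomness in the sketch

# Detached copies of the parameters and buffers, keyed by name.
params = {name: p.detach() for name, p in model.named_parameters()}
buffers = {name: b.detach() for name, b in model.named_buffers()}


def sample_loss(params, buffers, x, y):
    """Loss for ONE sample; a batch dimension of size 1 is added for the model."""
    logits = functional_call(model, (params, buffers), (x.unsqueeze(0),))
    return nn.functional.cross_entropy(logits, y.unsqueeze(0))


# grad differentiates w.r.t. the first argument (params);
# vmap maps over the sample dimension of x and y only.
per_sample_grad_fn = vmap(grad(sample_loss), in_dims=(None, None, 0, 0))

inputs = torch.randn(4, 8)           # 4 samples, 8 features (made-up data)
labels = torch.randint(0, 3, (4,))   # 4 class labels in {0, 1, 2}

# Dict with the same keys as params; each value has a leading dim of 4.
per_sample_grads = per_sample_grad_fn(params, buffers, inputs, labels)

for name, g in per_sample_grads.items():
    print(name, tuple(g.shape))
```

Each entry of `per_sample_grads` holds one gradient per sample along its first dimension, so `per_sample_grads["0.weight"][i]` is the gradient of sample `i`'s loss with respect to the first layer's weight. A slow but simple correctness check is to compare against a Python loop that calls `torch.autograd.grad` once per sample.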
