Functorch with transformers

I want to accelerate per-sample gradient computation with functorch. How can we compile models from the transformers library so that they can be used with functorch?
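
For context, this is roughly the per-sample gradient pattern I am trying to reproduce. It is only a minimal sketch, and it rests on some assumptions: it uses `torch.func` (where the functorch transforms live in PyTorch 2.x) rather than the standalone `functorch` package, and a toy `nn.Sequential` is a hypothetical stand-in for a real transformers model. The names `sample_loss` and `per_sample_grads_fn` are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call, grad, vmap

torch.manual_seed(0)

# Hypothetical stand-in for a transformers model (assumption: any nn.Module
# can be treated the same way; this is not verified for a specific HF model).
model = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 3))

# Pull the state out of the module so it can be passed in explicitly.
params = {name: p.detach() for name, p in model.named_parameters()}
buffers = {name: b.detach() for name, b in model.named_buffers()}

def sample_loss(params, buffers, x, y):
    # vmap removes the batch dimension, so add a batch of size 1 back.
    logits = functional_call(model, (params, buffers), (x.unsqueeze(0),))
    log_probs = F.log_softmax(logits, dim=-1)
    return F.nll_loss(log_probs, y.unsqueeze(0))

# grad differentiates w.r.t. the first argument (params);
# vmap maps over the batch dimension of x and y only.
per_sample_grads_fn = vmap(grad(sample_loss), in_dims=(None, None, 0, 0))

x = torch.randn(4, 8)
y = torch.randint(0, 3, (4,))
per_sample_grads = per_sample_grads_fn(params, buffers, x, y)

# Each entry has a leading dimension equal to the batch size.
for name, g in per_sample_grads.items():
    print(name, tuple(g.shape))
```

What I do not know is how to get from this toy module to a model loaded from transformers, where the forward pass takes tokenized inputs and returns an output object instead of a plain tensor.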

Thanks