Model quantization

I am trying to apply static quantization to the T5 model (flexudy/t5-small-wav2vec2-grammar-fixer) to reduce inference time.

Code:

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

model_name = "flexudy/t5-small-wav2vec2-grammar-fixer"
torch_device = "cuda" if torch.cuda.is_available() else "cpu"
model = T5ForConditionalGeneration.from_pretrained(model_name).to(torch_device)
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
model_fused = torch.quantization.fuse_modules(model, [["linear", "linear"]])

But it fails with:

AttributeError: 'T5ForConditionalGeneration' object has no attribute 'linear'
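For context, fuse_modules takes the names of submodules that actually exist as attributes on the model, and T5ForConditionalGeneration has no submodule named "linear", which is exactly what the AttributeError is saying. A minimal sketch on a toy model (the ConvNet class here is a made-up illustration, not part of T5) shows how the call is meant to be used:

```python
import torch
import torch.nn as nn

# Toy model whose submodules are literally named "conv", "bn", "relu".
class ConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

m = ConvNet().eval()  # conv+bn fusion requires eval mode

# fuse_modules looks up each name with getattr, so the names must match
# real attributes. ["linear", "linear"] fails on T5 for this reason.
fused = torch.quantization.fuse_modules(m, [["conv", "bn", "relu"]])

# After fusion, conv holds the fused ConvReLU2d and bn/relu become Identity.
print(type(fused.conv).__name__)
```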

Use dynamic quantization instead. fuse_modules expects the names of submodules that actually exist on the model, and T5ForConditionalGeneration has no submodule named "linear", hence the AttributeError. Static quantization also requires a calibration pass, which is awkward for an encoder-decoder model. Dynamic quantization converts the weights of the nn.Linear layers to int8 with no fusion or calibration step, and is the usual choice for speeding up transformer inference on CPU.
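A minimal sketch of dynamic quantization, using a toy two-layer model as a stand-in so it runs without downloading weights (TinyModel is illustrative, not part of T5):

```python
import torch
import torch.nn as nn

# Stand-in for the real model: any module containing nn.Linear layers.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 32)
        self.fc2 = nn.Linear(32, 4)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyModel().eval()

# quantize_dynamic replaces every nn.Linear with an int8-weight
# counterpart; activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

out = quantized(torch.randn(2, 16))
print(out.shape)
```

For the T5 model in the question the call is the same: pass the loaded T5ForConditionalGeneration instance as the first argument and run inference on CPU (dynamic quantization is CPU-only), then call generate() on the quantized model as usual.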