I’m converting the SwitchTransformer model from HuggingFace to TorchScript, but I ran into the following error:

ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
Other Q&As on Stack Overflow say that an encoder-decoder model cannot be converted in one piece: it has to be split into the encoder and the decoder, and each part converted separately. I still don’t fully understand this, because the model itself is integrated at the function level. Does that mean I should split it myself by writing new wrapper functions? (My rough understanding of the split is sketched below, after the links.)
Here are the two links that I referred to.
- python - ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds - Stack Overflow
- python - how to convert HuggingFace's Seq2seq models to onnx format - Stack Overflow
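If I understand the split approach correctly, it would look roughly like the sketch below. This is only my guess at what those answers mean: the EncoderWrapper/DecoderWrapper classes are my own, the [0] tuple indexing assumes torchscript=True tuple outputs, and I’m not even sure tracing is safe through the MoE routing, since trace records data-dependent control flow.

import torch
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/switch-base-8")
model = SwitchTransformersForConditionalGeneration.from_pretrained(
    "google/switch-base-8", torchscript=True)  # torchscript=True -> tuple outputs
model.eval()

input_ids = tokenizer("A <extra_id_0> walks into a bar.", return_tensors="pt").input_ids

class EncoderWrapper(torch.nn.Module):
    """Exposes only the encoder; returns the last hidden state."""
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder

    def forward(self, input_ids):
        return self.encoder(input_ids=input_ids)[0]

class DecoderWrapper(torch.nn.Module):
    """Runs the decoder over precomputed encoder states and projects to logits."""
    def __init__(self, model):
        super().__init__()
        self.decoder = model.decoder
        self.lm_head = model.lm_head

    def forward(self, decoder_input_ids, encoder_hidden_states):
        hidden = self.decoder(
            input_ids=decoder_input_ids,
            encoder_hidden_states=encoder_hidden_states,
        )[0]
        # NOTE: not sure whether Switch rescales hidden states before
        # lm_head when embeddings are tied, the way T5 does.
        return self.lm_head(hidden)

encoder = EncoderWrapper(model.encoder).eval()
decoder = DecoderWrapper(model).eval()

with torch.no_grad():
    enc_states = encoder(input_ids)
    dec_start = torch.zeros((1, 1), dtype=torch.long)  # decoder_start_token_id = 0
    traced_encoder = torch.jit.trace(encoder, (input_ids,))
    traced_decoder = torch.jit.trace(decoder, (dec_start, enc_states))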
Also, in the answers at the links above, someone said that the converted model doesn’t have a generate function. If that’s the case, how can I run inference?
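My current guess is that, without generate, I’d have to re-implement the decoding loop myself on top of the two traced halves, something like the greedy-decoding sketch below (traced_encoder/traced_decoder come from the sketch above; decoder_start_token_id=0 and eos_token_id=1 are the usual T5-style IDs, and since tracing can bake in the decoder sequence length, this is untested):

import torch

def greedy_decode(traced_encoder, traced_decoder, input_ids,
                  decoder_start_token_id=0, eos_token_id=1, max_length=32):
    # Encode once, then extend the decoder sequence one token at a time.
    enc_states = traced_encoder(input_ids)
    decoder_input_ids = torch.full(
        (input_ids.size(0), 1), decoder_start_token_id, dtype=torch.long)
    for _ in range(max_length):
        logits = traced_decoder(decoder_input_ids, enc_states)
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        decoder_input_ids = torch.cat([decoder_input_ids, next_token], dim=-1)
        if (next_token == eos_token_id).all():
            break
    return decoder_input_ids

Is that roughly what people do in practice, or is there a better way?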
Here’s the code that I’m using now.
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration
import torch

torch.set_printoptions(threshold=10_000)

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "google/switch-base-8", resume_download=True)

input_text = "A <extra_id_0> walks into a bar a orders a <extra_id_1> with <extra_id_2> pinch of <extra_id_3>."
# .to(0) moves the inputs to GPU 0
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(0)
# device_map="auto" loads the model onto the GPU
model = SwitchTransformersForConditionalGeneration.from_pretrained(
    "google/switch-base-8",
    device_map="auto",
    resume_download=True,
    torch_dtype=torch.bfloat16,
    torchscript=True,  # required for TorchScript export
)

# This is for TorchScript; the trace call below is where the ValueError is raised
model.eval()
traced_model = torch.jit.trace(model, (input_ids,))

# Afterwards I want to generate, but (per the linked answers)
# the traced module presumably has no generate():
outputs = traced_model.generate(
    input_ids,
    decoder_start_token_id=0,
    bos_token_id=2,
)
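For what it’s worth, one workaround I tried to reason out (my own assumption from reading the error, untested) is to hand torch.jit.trace example decoder inputs, since trace() calls forward() with positional arguments and forward() needs decoder_input_ids:

# Untested sketch: give the tracer dummy decoder inputs so forward() doesn't raise.
# The positional order (input_ids, attention_mask, decoder_input_ids) matches the
# T5-style forward() signature, if I'm reading the source correctly.
attention_mask = torch.ones_like(input_ids)
decoder_input_ids = torch.zeros(
    (input_ids.size(0), 1), dtype=torch.long, device=input_ids.device)  # decoder_start_token_id = 0
traced_model = torch.jit.trace(model, (input_ids, attention_mask, decoder_input_ids))

Even if that traces, though, I’d still be missing generate(), which brings me back to the decoding-loop question above.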