Hello, I am currently experimenting with the CodeGen model, but I don't know how to use the inputs_embeds parameter with the .generate() method.
Currently, my code is the following:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-2B-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-2B-mono").to(device)

inputs = tokenizer(text, return_tensors="pt").to(device)
inputs = {
    "inputs_embeds": extract_custom_embeddings(inputs["input_ids"]),  # my custom embedding function
    "attention_mask": inputs["attention_mask"],
}
output = model.generate(**inputs)
However, when I run it, I get the following error:
ValueError: If inputs_embeds is passed as model-specific keyword input, then model has to be an encoder-decoder and not a CodeGenForCausalLM.
I want to do this because I am trying to improve on the model's existing embedding extraction method.
Is there a tutorial somewhere on how to do this?
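For context, my understanding is that the "existing" extraction I am replacing is just the model's input embedding lookup (what model.get_input_embeddings()(input_ids) returns internally). A minimal, self-contained sketch of that default path, using toy vocabulary/hidden sizes rather than CodeGen's real dimensions:

```python
import torch

# Toy stand-in for the model's input embedding table
# (CodeGen's real table would come from model.get_input_embeddings()).
embedding_table = torch.nn.Embedding(num_embeddings=100, embedding_dim=16)

input_ids = torch.tensor([[1, 5, 7]])          # (batch, seq_len)
inputs_embeds = embedding_table(input_ids)     # (batch, seq_len, hidden)

print(inputs_embeds.shape)  # torch.Size([1, 3, 16])
```

My extract_custom_embeddings function is meant to produce a tensor of exactly this (batch, seq_len, hidden) shape, just computed differently from the plain table lookup.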