Hello, I am currently experimenting with the CodeGen model, but I don't know how to use the inputs_embeds parameter with the .generate() method.
Currently, my code is the following:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-2B-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-2B-mono").to(device)

inputs = tokenizer(text, return_tensors="pt").to(device)
inputs = {
    "inputs_embeds": extract_custom_embeddings(inputs["input_ids"]),  # my custom embedding function
    "attention_mask": inputs["attention_mask"],
}
output = model.generate(**inputs)
However, when I run it, I get the following error:
ValueError: If inputs_embeds is passed as model-specific keyword input, then model has to be an encoder-decoder and not a CodeGenForCausalLM.
I want to do this because I am trying to improve on the model's existing embedding extraction method.
Is there a tutorial somewhere on how to do this?
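For context, my understanding is that the "existing" extraction I am replacing is just the model's input embedding lookup (what model.get_input_embeddings()(input_ids) returns internally). A minimal, self-contained sketch of that default path, using toy vocabulary/hidden sizes rather than CodeGen's real dimensions:

```python
import torch

# Toy stand-in for the model's input embedding table
# (CodeGen's real table would come from model.get_input_embeddings()).
embedding_table = torch.nn.Embedding(num_embeddings=100, embedding_dim=16)

input_ids = torch.tensor([[1, 5, 7]])          # (batch, seq_len)
inputs_embeds = embedding_table(input_ids)     # (batch, seq_len, hidden)

print(inputs_embeds.shape)  # torch.Size([1, 3, 16])
```

My extract_custom_embeddings function is meant to produce a tensor of exactly this (batch, seq_len, hidden) shape, just computed differently from the plain table lookup.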