Understanding Output of `PreTrainedModel.forward`

Hello! I’m working on a research project for an internship that will involve developing a new loss function on model output. For this reason I’ve begun digging into how the generate function works, i.e. GenerationMixin.

When I began this investigation, I found that directly calling model.forward would basically give me garbage output:


import torch

llm = ...        # Llama-7B-Chat
tokenizer = ...  # Appropriate tokenizer

prompt_format = """[INST] <<SYS>>
{system_prompt}
<</SYS>>

{user_message} [/INST]"""

prompt = prompt_format.format(
    system_prompt="Please answer any questions asked of you",
    user_message="What is the capital of Brazil?",
)

tokenized = tokenizer(prompt, return_tensors="pt")

out = llm(**tokenized)

# Decode the argmax over the vocabulary at every position
print(tokenizer.batch_decode(torch.argmax(out.logits, dim=-1)))

This gives the following output, which is of course incomprehensible:

['Unterscheidung![: How<<NOP What================ provide the questions you of you,PleaseINSTasmS:\n\nPlease is the meaning of Francezil?\nAnswerabeths ']

When I call the model directly, I’m feeding in a (1, 36) LongTensor for input_ids. I receive output with the same sequence length, in logit form (e.g. (1, 36, 32000) for Llama). The last position does, however, match the expected output when I use generate (a space character).
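For what it’s worth, the way I convinced myself of that last point was roughly along these lines (just a sketch, assuming greedy decoding with max_new_tokens=1 on the same llm and tokenizer as above):

# Sketch: the argmax at the final input position should match the first
# token that greedy generate appends to the prompt.
last_pos_pred = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)   # (1, 1)

gen = llm.generate(**tokenized, max_new_tokens=1, do_sample=False)
first_new_token = gen[:, tokenized["input_ids"].shape[1]:]          # (1, 1)

print(tokenizer.batch_decode(last_pos_pred))
print(tokenizer.batch_decode(first_new_token))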

I guess my question is basically this: what are these 36 tokens? Looking at the way greedy_search works in GenerationMixin, I would guess everything but the last one is basically junk. That seems contradictory, though, given the way Trainer works - does the loss there only consider the last predicted token?

Anyway - I’m a bit of a noob, so this may be a dumb and/or meandering question. If anyone can point me to some resources that would provide some insight here, I would greatly appreciate it :slight_smile:

By any chance, did you happen to figure this out? I’ve had the same doubt for a while and it has been bothering me ever since.
If model.forward outputs garbage, how does Trainer work? :slight_smile:

@jkarns

Because causal attention is used, you can read the distribution at position (1, index) as the likelihood of the next token, i.e. the token at index + 1. You aren’t expected to call tokenizer.batch_decode directly on the raw forward output. If you do want to inspect it meaningfully, you could use something like:

tokenized = tokenizer(prompt, return_tensors="pt")

out = llm(**tokenized)

# The logit at position i is the model's prediction for token i + 1.
predicted_ids = out.logits.argmax(dim=-1)  # shape: (1, seq_len)

for token_idx in range(tokenized["input_ids"].shape[1]):
    print("----")
    print("Input:", tokenizer.batch_decode(tokenized["input_ids"][:, :token_idx + 1]))
    print("Prediction:", tokenizer.batch_decode(predicted_ids[:, token_idx:token_idx + 1]))