Understanding Output of `PreTrainedModel.forward`

Hello! I’m working on a research project for an internship that will involve developing a new loss function on model output. For this reason I have begun digging into how the generate function works, i.e. GenerationMixin.

When I began this investigation, I found that directly calling model.forward would basically give me garbage output:

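import torch

llm = ...  # Llama-7B-Chat (model loading omitted)
tokenizer = ...  # Appropriate tokenizer (loading omitted)

# Llama 2 chat template with a system prompt and a single user message
prompt_format = """[INST] <<SYS>>
{system_prompt}
<</SYS>>

{user_message} [/INST]"""

prompt = prompt_format.format(
    system_prompt="Please answer any questions asked of you",
    user_message="What is the capital of Brazil?",
)

tokenized = tokenizer(prompt, return_tensors="pt")

# Single forward pass over the whole prompt (no generation loop)
out = llm(**tokenized)

# Greedy pick at every position, decoded back to text
print(tokenizer.batch_decode(torch.argmax(out.logits, dim=-1)))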

This gives the following output, which is of course incomprehensible:

['Unterscheidung![: How<<NOP What================ provide the questions you of you,PleaseINSTasmS:\n\nPlease is the meaning of Francezil?\nAnswerabeths ']

When I call the model directly, I’m feeding in a (1, 36) LongTensor for input_ids. I receive an output with the same batch and sequence dimensions in logit form (e.g. (1, 36, 32000) for Llama). The last token does, however, match what I get when I use generate (a space character).
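To make the shapes concrete, here is a minimal sketch in plain Python. It assumes the usual causal-LM convention that the logits at position i score the token at position i + 1. Random numbers stand in for `out.logits[0]`, and the toy vocabulary size, the toy prompt, and the `argmax` helper are made up for illustration, so this is not the real model.

```python
import random

random.seed(0)
vocab_size = 8                 # toy vocabulary (Llama uses 32000)
input_ids = [3, 1, 4, 1, 5]    # toy prompt of 5 token ids

# Stand-in for out.logits[0]: one row of vocab_size scores per input position.
logits = [[random.gauss(0.0, 1.0) for _ in range(vocab_size)] for _ in input_ids]

def argmax(row):
    """Index of the largest score in a row (hypothetical helper)."""
    return max(range(len(row)), key=row.__getitem__)

# Row i is the guess for token i + 1, given tokens 0..i.
predicted = [argmax(row) for row in logits]

# Guesses from rows 0..n-2 line up with the prompt tokens 1..n-1,
# so they re-predict text that is already in the input.
teacher_forced_pairs = list(zip(predicted[:-1], input_ids[1:]))

# Only the last row predicts a token that is not in the input;
# this is the one a greedy decoding step would append.
next_token = predicted[-1]
```

Under that assumption, decoding the argmax of every row gives a shifted, noisy echo of the prompt plus one new token at the end, which would be consistent with only the last token matching generate.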

I guess my question is basically this: what are these 36 tokens? Looking at the way greedy_search works in GenerationMixin I would guess all but the last token are basically junk. This seems contradictory though because of the way Trainer seems to work - does loss there only consider the last predicted token?
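For reference while thinking about the Trainer question, here is a sketch of the shifted cross-entropy that causal language models are typically trained with: every position except the last contributes a term, with the next input token as its target. This is written in plain Python under that assumption. The function names are hypothetical, and details such as label masking (e.g. ignoring labels set to -100) are left out.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def shifted_causal_lm_loss(logits, input_ids):
    """Mean next-token cross-entropy over a single sequence.

    logits:    seq_len rows, each with vocab_size scores
    input_ids: seq_len token ids
    Row i is scored against token i + 1, so the last row has no target.
    """
    losses = [
        -log_softmax(row)[target]
        for row, target in zip(logits[:-1], input_ids[1:])
    ]
    return sum(losses) / len(losses)

# Uniform logits over a 4-token vocabulary: each position costs log(4).
toy_logits = [[0.0] * 4 for _ in range(3)]
toy_ids = [0, 2, 1]
loss = shifted_causal_lm_loss(toy_logits, toy_ids)
```

If this is what happens during training, then the first 35 positions in my example would not be junk from the loss's point of view: each one is a prediction that gets scored against the next prompt token.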

Anyways - I’m a bit of a nooby so this may seem like a dumb and/or meandering question. If anyone can point me to some resources that would provide some insight here I would greatly appreciate it :slight_smile: