What is the index for last layer hidden states in CausalLMOutputWithPast?
I assume it is -1? I am unsure because, with decoder models, the hidden_states output has one more entry than the number of layers.
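For reference, the Transformers docs describe hidden_states as one tensor for the embedding output plus one per layer, which would explain the extra entry. Below is a minimal toy sketch of that layout in plain Python. It is not the library's code: `run_toy_decoder` and its fake "layers" are hypothetical stand-ins, used only to show why the tuple has num_layers + 1 entries and what -1 points at.

```python
# Toy stand-in, NOT the Transformers implementation: it only mimics how the
# hidden_states tuple is laid out (embedding output + one entry per layer).

def run_toy_decoder(token_ids, num_layers):
    """Return (last_hidden, all_hidden_states) for a hypothetical decoder."""
    # "Embedding" step: map each token id to a float (stand-in for a vector).
    hidden = [float(t) for t in token_ids]
    all_hidden_states = [hidden]  # index 0: embedding output

    for layer_idx in range(num_layers):
        # Stand-in for a decoder block: any transformation will do here.
        hidden = [h + (layer_idx + 1) for h in hidden]
        all_hidden_states.append(hidden)  # index layer_idx + 1: that layer's output

    return hidden, tuple(all_hidden_states)


num_layers = 4
last_hidden, hidden_states = run_toy_decoder([1, 2, 3], num_layers)

print(len(hidden_states))                # num_layers + 1 entries
print(hidden_states[-1] == last_hidden)  # -1 is the last layer's output
print(hidden_states[0])                  # embedding output, not a decoder layer
```

With a real model, the equivalent would be to pass output_hidden_states=True in the forward call and read outputs.hidden_states[-1]. One caveat worth checking for your specific model: in some implementations (GPT-2 and Llama, as far as I know) the final entry is stored after the model's final normalization layer, so it may not be the raw output of the last decoder block.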