Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation

When I use llama3-7b, it seems it can't stop inference until it reaches the maximum number of generated tokens. What should I do?
Is it related to this warning: "Setting pad_token_id to eos_token_id:128001 for open-end generation."?


I’m not familiar with it either, but I’ve heard that that warning and Llama 3’s odd stopping behavior are two separate issues. The first link explains the settings, and the second link explains what’s wrong with Llama 3. You can find more if you do a search.
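For reference, here is a minimal sketch of how those settings are usually passed with the Transformers library. The model id `meta-llama/Meta-Llama-3-8B-Instruct` is just an assumption for illustration; swap in whichever checkpoint you are actually using:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint for illustration; replace with your model.
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Llama 3 uses two stop tokens: <|end_of_text|> (128001) and <|eot_id|> (128009).
# If generate() only stops on 128001, chat-style output can run on
# until it hits max_new_tokens.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

inputs = tokenizer(
    "Explain what a pad token is in one sentence.", return_tensors="pt"
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    eos_token_id=terminators,
    # Passing pad_token_id explicitly also silences the
    # "Setting pad_token_id to eos_token_id" warning.
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```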

Thank you a lot, I understand it now. You’ve also shown me a way to find discussions about a specific model. I’m actually a beginner.

I’m also a newcomer to AI, only about six months in. I mostly just play around with image generation, so training language models isn’t my area of expertise.
I can at least guide you around HF, though. Is there anything you’re looking for?

If you’re looking for language models, the big ones are in the Inference Playground that HF is currently developing; that’s the first link.
Below are bookmarks of Spaces I’ve seen where you can actually try out language models.

Ahaa, I just thought the LLM’s responses were somewhat weird. I already found the reason in the link you shared.


I’m glad you got it resolved.
Well, the bottom line is that if you want conversational output, you should just use the Instruct variant. 😅
Some language models are simply broken and produce garbled output regardless of whether they’re Base or anything else, so you just have to keep trying them one after another. The leaderboards and other rankings help with that.
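If you do go the Instruct route, here is a rough sketch of the usual flow with `apply_chat_template`, again assuming the Transformers library and the `meta-llama/Meta-Llama-3-8B-Instruct` checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Why does my model keep generating until max tokens?"},
]

# apply_chat_template inserts the special tokens (including <|eot_id|>)
# that the Instruct model was trained to stop on.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")],
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

With the Base model there is no chat template, so it just continues your text; the Instruct model was trained to emit `<|eot_id|>` at the end of each turn, which is why it stops cleanly when that token is in the terminator list.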