I keep getting extra, unnecessary filler text until the max_new_tokens value is used up. What am I doing wrong? I'm using BLOOM, and I tried various sizes (560m, 1b7, 3b, 7b1), but they all behave like this.
Here’s an example:
Input: Al is a robotic AI. User: Hey Al, how's it going? Al: I'm fine, User. User: Something going on? Al: Output: Al is a robotic AI. User: Hey Al, how's it going? Al: I'm fine, User. User: Something going on? Al: Yeah! al... Hello everyone! In the latest release of Python 3 (the GNU PACKAGING), I added an optimization to improve performance during testing: This article analyzes different features and benefits available for Linux users in terms testing software running under specific circumstances such as production-driven environments or web server applications that use specialized libraries. Testing also includes debugging utilities like Vtest which are typically provided by cross-platform specialists called Testing Teams:
Here’s my code:
from transformers import AutoTokenizer, BloomForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m", cache_dir="./data/", local_files_only=True)
model = BloomForCausalLM.from_pretrained("bigscience/bloom-560m", cache_dir="./data/", local_files_only=True)
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    min_length=min_len,
    max_new_tokens=max_len,
    temperature=temp,
    top_k=100,
    top_p=1,
    repetition_penalty=1.5,
)
output = generator(inputstr, do_sample=True)
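For now I'm working around it by cutting the continuation off at the next speaker marker myself, since the model has no reason to stop on its own before max_new_tokens. A minimal sketch of that post-processing (the stop strings are just my guess at sensible markers for this chat format; truncate_at_stop is my own helper, not a transformers function):

```python
def truncate_at_stop(text: str, stop_strings=("User:", "Al:")) -> str:
    """Cut generated text at the first occurrence of any stop string."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()

# Strip the prompt from the pipeline output, then truncate the reply:
# reply = truncate_at_stop(output[0]["generated_text"][len(inputstr):])
```

Not sure if this is the intended way, but it at least keeps the bot's reply from running on into a new fake "User:" turn.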
Also, what are the maximum values of repetition_penalty, top_k, and top_p? I can't seem to find them in the documentation. Thanks in advance!