How do I get rid of the extra output?

I keep getting extra, unrelated text after the reply I actually want — the model seems to keep generating until it fills max_new_tokens. What am I doing wrong? I'm using BLOOM, and I tried various sizes (560m, 1b7, 3b, 7b1), but they all behave like this.
Here’s an example:

Input: Al is a robotic AI. User: Hey Al, how's it going? Al: I'm fine, User. User: Something going on? Al: 
Output: Al is a robotic AI. User: Hey Al, how's it going? Al: I'm fine, User. User: Something going on? Al:  Yeah! al...
Hello everyone! In the latest release of Python 3 (the GNU PACKAGING), I added an optimization to improve performance during testing:
This article analyzes different features and benefits available for Linux users in terms testing software running under specific circumstances such as production-driven environments or web server applications that use specialized libraries.
Testing also includes debugging utilities like Vtest which are typically provided by cross-platform specialists called Testing Teams:

Here’s my code:

    from transformers import AutoTokenizer, BloomForCausalLM, pipeline

    # min_len, max_len, temp, and inputstr are set elsewhere in my script
    tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m", cache_dir="./data/", local_files_only=True)
    model = BloomForCausalLM.from_pretrained("bigscience/bloom-560m", cache_dir="./data/", local_files_only=True)
    generator = pipeline("text-generation", model=model, tokenizer=tokenizer, min_length=min_len, max_new_tokens=max_len, temperature=temp, top_k=100, top_p=1, repetition_penalty=1.5)
    output = generator(inputstr, do_sample=True)
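
The only workaround I've found so far is trimming the generated text at the next speaker tag myself. This is just a rough sketch of what I mean, not a real fix — the stop strings are my own guess at sensible delimiters, and I'm relying on the pipeline's list-of-dicts return format:

    # Hacky workaround: keep only the completion up to the next speaker tag.
    # STOP_STRINGS is my own choice of delimiters, nothing official.
    STOP_STRINGS = ["User:", "\n"]

    def trim_completion(full_text, prompt):
        completion = full_text[len(prompt):]  # the pipeline echoes the prompt
        cut = len(completion)
        for s in STOP_STRINGS:
            idx = completion.find(s)
            if idx != -1:
                cut = min(cut, idx)
        return completion[:cut].strip()

    reply = trim_completion(output[0]["generated_text"], inputstr)

That at least leaves me with just Al's reply, but the model still wastes time generating all the extra tokens first.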

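I also experimented with a custom StoppingCriteria so generation halts as soon as a new "User:" turn starts. Again, just a sketch of my attempt — decoding on every step like this is slow, and I'm not certain the pipeline forwards stopping_criteria through to generate() in every version:

    from transformers import StoppingCriteria, StoppingCriteriaList

    class StopOnSubstring(StoppingCriteria):
        # Stop generation once stop_string appears after the prompt.
        def __init__(self, tokenizer, prompt, stop_string):
            self.tokenizer = tokenizer
            self.prompt_len = len(prompt)  # approximate; detokenized text may differ slightly
            self.stop_string = stop_string

        def __call__(self, input_ids, scores, **kwargs):
            text = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
            return self.stop_string in text[self.prompt_len:]

    stopping = StoppingCriteriaList([StopOnSubstring(tokenizer, inputstr, "User:")])
    output = generator(inputstr, do_sample=True, stopping_criteria=stopping)

This stops earlier, but the stop string still ends up in the output, so I need the trimming above anyway. Is there a cleaner built-in way?
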
Also, what are the maximum allowed values of repetition_penalty, top_k, and top_p? I can't find them in the documentation. Thanks in advance!