BART seq2seq -100 tokens in prediction

Hello

I am using facebook/BART for a seq2seq task. I followed along with this tutorial: Translation - Hugging Face NLP Course, and found something weird.

When I use BART to predict samples, there are -100 tokens in the output array:

array([    2,     0,  8800,  3850, 37589,  1000,  3675, 23054,  7778,
           2,     1,     1,     1,     1,     1,     1,     1,     1,
           1,     1,     1,     1,     1,     1,  -100,  -100,  -100,
        -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,
        -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,
        -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,
        -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,
        -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,  -100,
        -100,  -100,  -100,  -100,  -100,  -100,  -100])

You can see that it already uses 1, BART's padding token, for padding. So where does -100 come from?
The model used in the tutorial is Marian, and it does not predict any -100.

These are my training arguments:

args = Seq2SeqTrainingArguments(
    save_folder,  # output_dir
    overwrite_output_dir=True,
    logging_strategy="epoch",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-6,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    save_total_limit=3,
    num_train_epochs=20,
    predict_with_generate=True,
    fp16=True,
    report_to="none",
    load_best_model_at_end=True,
    seed=65,
    generation_max_length=128, 
    generation_num_beams=10,
)
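
For completeness, this is roughly how I set up the trainer, following the course (a sketch, not my exact code: the checkpoint name and the tokenized_datasets variables are placeholders for my actual setup):

from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
)

# Placeholders: the checkpoint and dataset splits below are assumptions,
# following the setup from the NLP Course translation chapter.
checkpoint = "facebook/bart-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# DataCollatorForSeq2Seq pads the labels with -100 by default
# (label_pad_token_id=-100), as in the course.
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    tokenizer=tokenizer,
)
trainer.train()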

When there are -100 tokens in the predictions, it breaks this function:

import numpy as np

def postprocess(predictions, labels):
    predictions = predictions.cpu().numpy()
    labels = labels.cpu().numpy()

    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)

    # Replace -100 in the labels as we can't decode them.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # Some simple post-processing
    decoded_preds = [pred.strip() for pred in decoded_preds]
    decoded_labels = [[label.strip()] for label in decoded_labels]
    return decoded_preds, decoded_labels
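
The only workaround I have found so far is to mirror the label handling and replace -100 in the predictions too before decoding (a sketch of the patched function; I am not sure this is the right fix):

import numpy as np

def postprocess_safe(predictions, labels):
    predictions = predictions.cpu().numpy()
    labels = labels.cpu().numpy()

    # Workaround: replace -100 in the *predictions* as well,
    # mirroring what is already done for the labels below.
    predictions = np.where(predictions != -100, predictions, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)

    # Replace -100 in the labels as we can't decode them.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # Some simple post-processing
    decoded_preds = [pred.strip() for pred in decoded_preds]
    decoded_labels = [[label.strip()] for label in decoded_labels]
    return decoded_preds, decoded_labels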

I would also like to know how to control the length of the predicted text. Right now my predictions have shape (m, 79). Why 79?
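
For reference, this is roughly how I run prediction (a sketch; tokenized_datasets and the split name are placeholders, not my exact code):

# Sketch: tokenized_datasets["test"] is a placeholder for my tokenized eval split.
pred_output = trainer.predict(tokenized_datasets["test"])
print(pred_output.predictions.shape)  # (m, 79) in my case, even with generation_max_length=128

# Seq2SeqTrainer.predict also accepts generation kwargs directly,
# but I am not sure whether this is the intended way to control output length:
pred_output = trainer.predict(tokenized_datasets["test"], max_length=128, num_beams=10)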