How to get better results with DistilGPT2?

I fine-tuned DistilGPT2 on the ArXiv papers dataset (CShorten/ML-ArXiv-Papers · Datasets at Hugging Face), but I got output like this:

Regression is a popular technique in deep neural networks (DNNs).
However, it is not well understood why DNNs are vulnerable. In this paper, we
propose a deep neural network (DNN) based on deep convolutional neural
networks (CNNs) based on deep convolutional neural networks (DNNs). We
propose a deep neural network (DNN) based on deep convolutional neural
networks (DNNs) based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
pro

Here are my training arguments:

batch_size = 32

args = TrainingArguments(
    output_dir="arxiv-papers-ds",
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    evaluation_strategy="epoch",
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    weight_decay=0.1,
    warmup_steps=200,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    learning_rate=3e-5,
    save_steps=500,
    fp16=True,  
    push_to_hub=False,
    report_to='none',
)
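For anyone who doesn't want to open the notebook: the rest of the setup is essentially the standard causal-LM fine-tuning recipe, roughly like the sketch below. This is not my exact code; tokenized_ds is a placeholder for the preprocessed dataset, and the real preprocessing is in the linked notebook.

from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    DataCollatorForLanguageModeling,
    Trainer,
)

# Sketch only -- names like tokenized_ds are placeholders.
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

model = AutoModelForCausalLM.from_pretrained("distilgpt2")

# Causal LM objective: the collator copies the inputs into the labels, so mlm=False.
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=args,                            # the TrainingArguments above
    train_dataset=tokenized_ds["train"],  # placeholder split names
    eval_dataset=tokenized_ds["validation"],
    data_collator=data_collator,
)
trainer.train()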

Full code: AI-projects/generate-arxiv-papers-with-distilgpt-2.ipynb at main · jrreda/AI-projects (github.com)

How can I prevent this repetitive, degenerate output and get better results?
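For example, would changing the decoding settings be the right fix, something like the sketch below (sampling with a repetition penalty and n-gram blocking instead of the defaults; the parameter values are only illustrative), or does the training setup itself need to change?

from transformers import pipeline

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Illustrative decoding settings, not what I currently use:
# sampling plus repetition_penalty / no_repeat_ngram_size to break the loops.
out = generator(
    "Regression is a popular technique in",
    max_new_tokens=120,
    do_sample=True,
    top_p=0.95,
    temperature=0.8,
    repetition_penalty=1.2,
    no_repeat_ngram_size=3,
)
print(out[0]["generated_text"])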