I fine-tuned DistilGPT2 on the ArXiv papers dataset CShorten/ML-ArXiv-Papers (Hugging Face Datasets), but got output like:
Regression is a popular technique in deep neural networks (DNNs).
However, it is not well understood why DNNs are vulnerable. In this paper, we
propose a deep neural network (DNN) based on deep convolutional neural
networks (CNNs) based on deep convolutional neural networks (DNNs). We
propose a deep neural network (DNN) based on deep convolutional neural
networks (DNNs) based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
propose a DNN based on deep convolutional neural networks (DNNs). We
pro
Here are my training arguments:
from transformers import TrainingArguments

batch_size = 32

args = TrainingArguments(
    output_dir="arxiv-papers-ds",
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    evaluation_strategy="epoch",
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    weight_decay=0.1,
    warmup_steps=200,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    learning_rate=3e-5,
    save_steps=500,
    fp16=True,
    push_to_hub=False,
    report_to="none",
)
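
For context, these args go into the standard Trainer roughly like this (a minimal sketch; the tokenized dataset variables are placeholders standing in for what's in the notebook, not the exact code):

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
)

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

model = AutoModelForCausalLM.from_pretrained("distilgpt2")

# mlm=False gives the causal-LM objective (labels = inputs shifted by one)
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,  # placeholder: tokenized abstracts
    eval_dataset=tokenized_eval,    # placeholder: held-out split
    data_collator=data_collator,
)
trainer.train()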
Full code: generate-arxiv-papers-with-distilgpt-2.ipynb in the jrreda/AI-projects repo on GitHub.
Strictly speaking this looks like repetitive, degenerate text rather than hallucination. How can I stop the model from looping like this and get better results?
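
Is part of the fix on the decoding side rather than in training? I've read that greedy decoding often falls into exactly these loops, and that sampling plus repetition constraints help. Something like the sketch below is what I had in mind (the checkpoint path, prompt, and parameter values are my guesses, untested):

from transformers import AutoModelForCausalLM, AutoTokenizer

# assuming the fine-tuned weights were saved to the output_dir above
model = AutoModelForCausalLM.from_pretrained("arxiv-papers-ds")
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")

inputs = tokenizer("Regression is", return_tensors="pt")

output = model.generate(
    **inputs,
    max_new_tokens=120,
    do_sample=True,            # sample instead of greedy decoding
    top_p=0.92,                # nucleus sampling
    temperature=0.8,
    no_repeat_ngram_size=3,    # forbid verbatim 3-gram repeats
    repetition_penalty=1.2,    # down-weight recently used tokens
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Or is the repetition mostly a training problem (too few epochs, learning rate, data preprocessing) that decoding tweaks would only paper over?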