Feasibility of Fine-Tuning GPT-2 XL on an RTX 3060 GPU for Academic Misinformation Identification

Hi everyone,

I’m working on a personal GPT-related project that aims to identify misinformation within academic content. The idea is to use GPT-2 XL to examine claims within academic texts and generate outputs that go beyond simple classification, providing more contextually relevant content.

For example, if a claim in a research paper states that “a specific drug has been proven effective in treating a disease,” the model would draw on the text of other academic papers on the topic and generate a detailed output that either supports or refutes the claim based on that evidence, with a brief explanation. The plan is to feed the text of those reference papers through a sliding window so that each chunk stays within GPT-2’s context-length limit (rough sketch below).
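For what it’s worth, this is roughly the windowing logic I have in mind. It’s just a minimal sketch using Hugging Face’s `GPT2TokenizerFast`; the window and stride sizes are placeholder values I haven’t tuned, and the prompt format at the end is only illustrative:

```python
from transformers import GPT2TokenizerFast

# Placeholder parameters: GPT-2's context limit is 1024 tokens, so I leave
# room for the claim text and the generated continuation.
WINDOW_TOKENS = 768
STRIDE_TOKENS = 256  # overlap between consecutive windows

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-xl")

def sliding_windows(paper_text: str):
    """Yield overlapping chunks of a reference paper as decoded text."""
    ids = tokenizer(paper_text, add_special_tokens=False)["input_ids"]
    step = WINDOW_TOKENS - STRIDE_TOKENS
    for start in range(0, max(len(ids) - STRIDE_TOKENS, 1), step):
        window_ids = ids[start:start + WINDOW_TOKENS]
        yield tokenizer.decode(window_ids)

# Each window would then be combined with the claim and passed to the model,
# e.g. f"Claim: {claim}\nEvidence: {window}\nVerdict:"
```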

I only have access to a GeForce RTX 3060 (12 GB of VRAM), and while GPT-2 XL runs fine for inference on my setup, I’ve noticed that the output quality is rather poor.

I’m considering fine-tuning GPT-2 XL on a custom dataset focused on academic language and misinformation to improve its performance for this specific task. However, I’m concerned about the feasibility given my hardware limitations.
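In case it helps to see what I mean, this is the kind of training setup I was picturing. It’s only a rough sketch with the Hugging Face `Trainer`; `claims.txt` is a placeholder for my custom dataset, and the batch size, gradient checkpointing, and fp16 settings are guesses at what 12 GB might tolerate, not something I’ve verified will fit:

```python
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-xl")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

model = GPT2LMHeadModel.from_pretrained("gpt2-xl")
model.gradient_checkpointing_enable()  # trade compute for activation memory

# "claims.txt" stands in for my academic-language / misinformation corpus.
dataset = load_dataset("text", data_files={"train": "claims.txt"})["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="gpt2xl-misinfo",
    per_device_train_batch_size=1,   # anything larger will likely OOM
    gradient_accumulation_steps=16,  # simulate a larger effective batch
    fp16=True,                       # halve activation memory
    num_train_epochs=1,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```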

Is it possible/practical to fine-tune GPT-2 XL on an RTX 3060 for this purpose, or would the process be too computationally expensive?