Feasibility of Fine-Tuning GPT-2 XL on an RTX 3060 GPU for Academic Misinformation Identification

Hi everyone,

I’m working on a personal GPT-related project that aims to identify misinformation within academic content. The idea is to use GPT-2 XL to examine claims within academic texts and generate outputs that go beyond simple classification, providing more contextually relevant content.

For example, if a claim in a research paper states that “a specific drug has been proven effective in treating a disease,” the model would draw on the text of other academic papers on the topic and generate a detailed output that either supports or refutes the claim based on that evidence, with a brief explanation. The plan is to feed the text of those reference papers through a sliding window so that each chunk stays within GPT-2’s context-length limit (rough sketch below).
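For what it’s worth, this is roughly the windowing logic I have in mind. It’s just a minimal sketch using Hugging Face’s `GPT2TokenizerFast`; the window and stride sizes are placeholder values I haven’t tuned, and the prompt format at the end is only illustrative:

```python
from transformers import GPT2TokenizerFast

# Placeholder parameters: GPT-2's context limit is 1024 tokens, so I leave
# room for the claim text and the generated continuation.
WINDOW_TOKENS = 768
STRIDE_TOKENS = 256  # overlap between consecutive windows

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-xl")

def sliding_windows(paper_text: str):
    """Yield overlapping chunks of a reference paper as decoded text."""
    ids = tokenizer(paper_text, add_special_tokens=False)["input_ids"]
    step = WINDOW_TOKENS - STRIDE_TOKENS
    for start in range(0, max(len(ids) - STRIDE_TOKENS, 1), step):
        window_ids = ids[start:start + WINDOW_TOKENS]
        yield tokenizer.decode(window_ids)

# Each window would then be combined with the claim and passed to the model,
# e.g. f"Claim: {claim}\nEvidence: {window}\nVerdict:"
```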

I only have access to a GeForce RTX 3060 (12 GB of VRAM), and while GPT-2 XL runs fine for inference on my setup, I’ve noticed that the output quality is rather poor.

I’m considering fine-tuning GPT-2 XL on a custom dataset focused on academic language and misinformation to improve its performance for this specific task. However, I’m concerned about the feasibility given my hardware limitations.
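In case it helps to see what I mean, this is the kind of training setup I was picturing. It’s only a rough sketch with the Hugging Face `Trainer`; `claims.txt` is a placeholder for my custom dataset, and the batch size, gradient checkpointing, and fp16 settings are guesses at what 12 GB might tolerate, not something I’ve verified will fit:

```python
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-xl")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

model = GPT2LMHeadModel.from_pretrained("gpt2-xl")
model.gradient_checkpointing_enable()  # trade compute for activation memory

# "claims.txt" stands in for my academic-language / misinformation corpus.
dataset = load_dataset("text", data_files={"train": "claims.txt"})["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="gpt2xl-misinfo",
    per_device_train_batch_size=1,   # anything larger will likely OOM
    gradient_accumulation_steps=16,  # simulate a larger effective batch
    fp16=True,                       # halve activation memory
    num_train_epochs=1,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```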

Is it possible/practical to fine-tune GPT-2 XL on an RTX 3060 for this purpose, or would the process be too computationally expensive?