SFT Trainer problem

I'm trying to train a model with the SFT Trainer (`SFTTrainer` from TRL) on a Google Colab A100 40GB. Training runs, but the model doesn't seem to properly learn the dataset I want to fine-tune on, so I'm having difficulty. I'm using a Korean-only Llama model, MLP-KTLim/llama-3-Korean-Bllossom-8B, and my trainer setup is:

```python
trainer = SFTTrainer(
    model=BASE_MODEL,
    train_dataset=train_data,
    args=TrainingArguments(
        output_dir='outputs',
        max_steps=120,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        optim='paged_adamw_8bit',
        warmup_steps=1,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=10,
        push_to_hub=False,
        report_to='none',
    ),
    peft_config=peft_config,
    formatting_func=prompts,
)
```

Please check whether these SFTTrainer settings themselves are incorrect. I would also appreciate it if you could tell me how to verify that the SFT training has actually been applied correctly.
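For context, my `prompts` formatting function is roughly like the sketch below (simplified; the actual dataset column names differ). It follows the batched `formatting_func` convention that `SFTTrainer` expects, returning one formatted string per example:

```python
# Simplified sketch of the formatting function passed to SFTTrainer.
# The column names "instruction" and "output" are placeholders for
# the actual fields in my dataset.
def prompts(example):
    # SFTTrainer calls this with a batch of examples; it must return
    # a list with one formatted training string per row.
    output_texts = []
    for i in range(len(example["instruction"])):
        text = (
            f"### 질문: {example['instruction'][i]}\n"
            f"### 답변: {example['output'][i]}"
        )
        output_texts.append(text)
    return output_texts
```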
