SFT Trainer problem

I'm trying to train a model with the SFT Trainer (`SFTTrainer` from TRL) on a Google Colab A100 40GB. Training runs, but the model doesn't seem to properly learn the dataset I want to fine-tune on, so I'm having difficulty. I'm using a Korean-only Llama model, MLP-KTLim/llama-3-Korean-Bllossom-8B, and my trainer setup is:

```python
trainer = SFTTrainer(
    model=BASE_MODEL,
    train_dataset=train_data,
    args=TrainingArguments(
        output_dir='outputs',
        max_steps=120,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        optim='paged_adamw_8bit',
        warmup_steps=1,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=10,
        push_to_hub=False,
        report_to='none',
    ),
    peft_config=peft_config,
    formatting_func=prompts,
)
```

Please check whether these SFTTrainer settings themselves are incorrect. I would also appreciate it if you could tell me how to verify that the SFT training has actually been applied correctly.
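For context, my `prompts` formatting function is roughly like the sketch below (simplified; the actual dataset column names differ). It follows the batched `formatting_func` convention that `SFTTrainer` expects, returning one formatted string per example:

```python
# Simplified sketch of the formatting function passed to SFTTrainer.
# The column names "instruction" and "output" are placeholders for
# the actual fields in my dataset.
def prompts(example):
    # SFTTrainer calls this with a batch of examples; it must return
    # a list with one formatted training string per row.
    output_texts = []
    for i in range(len(example["instruction"])):
        text = (
            f"### 질문: {example['instruction'][i]}\n"
            f"### 답변: {example['output'][i]}"
        )
        output_texts.append(text)
    return output_texts
```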
