I just want to confirm if this is normal behavior for the split step, and why the actual training might not be starting.
Yeah, maybe. I think AutoTrain is designed to fail fast, stopping as soon as any error occurs.
BTW, those JSON settings may correspond to the parameters of an older version of the Trainer. How about something like this (in YAML)?
```yaml
task: llm
base_model: nothingiisreal/MN-12B-Celeste-V1.9
project_name: mn12b-celeste-espanol-stage1
log: tensorboard

data:
  path: josecannete/large_spanish_corpus
  train_split: train
  valid_split: null
  chat_template: null
  column_mapping:
    text_column: text

params:
  trainer: sft
  block_size: -1
  model_max_length: 4096
  epochs: 1
  batch_size: 1
  gradient_accumulation: 16
  lr: 5e-5
  warmup_ratio: 0.1
  optimizer: adamw_torch
  scheduler: linear
  weight_decay: 0.01
  logging_steps: 25
  eval_strategy: epoch
  save_total_limit: 2
  mixed_precision: fp16
  # QLoRA
  peft: true
  quantization: int4
  target_modules: all-linear
  lora_r: 16
  lora_alpha: 32
  lora_dropout: 0.10
  padding: right
  seed: 42

hub:
  username: SlayerL99
  push_to_hub: true
```
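If you save that as, say, `config.yml` (the filename is just an example), you should be able to launch the run with `autotrain --config config.yml` in an environment where `autotrain-advanced` is installed. If training still stops right after the split step, the CLI output around that point usually contains the actual error.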