Hi! I’m having trouble with the AutoTrain feature on the Hugging Face website and I don’t know what I’m doing wrong. I created a Space and followed the instructions as closely as I could, but I keep getting the same error messages and can’t proceed any further. I’ve included the error log below in the hope that someone more familiar with this process can zero in on the problem. After that are the contents of the dataset (the prompt and response columns, listed as pairs) that I was trying to fine-tune the HuggingFaceH4/zephyr-7b-alpha LLM on. It comes from a very small CSV file built from a single scientific article, just to see whether a Q&A session with the LLM was possible. Note that the response to the final prompt is omitted: it was essentially the entire article, so I left it out to save space.
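For context, my understanding is that AutoTrain’s SFT trainer reads a single text column by default (text_column is listed among the defaulted parameters in the log below), so I assume the two CSV columns need to be flattened along these lines. This is only a minimal sketch, not my actual script; the file names and the Human/Assistant prompt format are placeholders I made up, so please correct me if this is the wrong approach:

```python
# Minimal sketch: flatten the prompt/response CSV into the single "text"
# column that the AutoTrain SFT trainer appears to read by default.
# File names and the "### Human / ### Assistant" format are placeholders.
import pandas as pd

df = pd.read_csv("article_qa.csv")  # columns: prompt, response
assert {"prompt", "response"} <= set(df.columns), "expected prompt/response columns"

# The last response is empty in my file, so guard against NaN before joining.
df["text"] = (
    "### Human: " + df["prompt"].str.strip()
    + "\n### Assistant: " + df["response"].fillna("").str.strip()
)
df[["text"]].to_csv("train.csv", index=False)
```

I also see apply_chat_template among the defaulted parameters in the log, so maybe that is the intended route instead of hand-formatting the text.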
Any help would be greatly appreciated. Thanks!
Error Log:
"===== Application Startup at 2024-01-22 05:06:52 =====
==========
== CUDA ==
CUDA Version 12.1.1
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see.
INFO: Will watch for changes in these directories: ['/app']
WARNING: "workers" flag is ignored when reloading is enabled.
INFO: Uvicorn running on http://0.0.0.0:7860 (Press CTRL+C to quit)
INFO: Started reloader process [35] using StatReload
INFO Authenticating user…
WARNING Parameters not supplied by user and set to default: model_max_length, text_column, prompt_text_column, lora_dropout, auto_find_batch_size, warmup_ratio, trainer, logging_steps, lr, rejected_text_column, gradient_accumulation, max_grad_norm, use_flash_attention_2, evaluation_strategy, lora_r, disable_gradient_checkpointing, model_ref, lora_alpha, weight_decay, merge_adapter, project_name, seed, optimizer, username, add_eos_token, data_path, push_to_hub, batch_size, apply_chat_template, save_total_limit, train_split, save_strategy, valid_split, dpo_beta, model, scheduler, repo_id, token
WARNING Parameters not supplied by user and set to default: text_column, project_name, seed, warmup_ratio, auto_find_batch_size, optimizer, username, data_path, push_to_hub, logging_steps, lr, batch_size, max_grad_norm, gradient_accumulation, save_total_limit, train_split, save_strategy, evaluation_strategy, valid_split, epochs, target_column, model, max_seq_length, scheduler, token, repo_id, weight_decay, log
WARNING Parameters not supplied by user and set to default: project_name, seed, warmup_ratio, auto_find_batch_size, optimizer, username, data_path, push_to_hub, logging_steps, lr, batch_size, max_grad_norm, gradient_accumulation, save_total_limit, train_split, save_strategy, evaluation_strategy, valid_split, epochs, target_column, model, scheduler, token, repo_id, image_column, weight_decay, log
WARNING Parameters not supplied by user and set to default: text_column, lora_dropout, warmup_ratio, auto_find_batch_size, target_modules, logging_steps, lr, gradient_accumulation, max_grad_norm, evaluation_strategy, lora_r, epochs, lora_alpha, weight_decay, project_name, seed, max_target_length, optimizer, username, data_path, push_to_hub, batch_size, save_total_limit, train_split, save_strategy, quantization, valid_split, target_column, model, max_seq_length, token, repo_id, scheduler, peft
WARNING Parameters not supplied by user and set to default: task, project_name, seed, username, data_path, push_to_hub, num_trials, numerical_columns, train_split, time_limit, id_column, categorical_columns, valid_split, model, token, repo_id, target_columns
WARNING Parameters not supplied by user and set to default: num_validation_images, prior_generation_precision, revision, adam_weight_decay, adam_beta1, warmup_steps, allow_tf32, center_crop, dataloader_num_workers, num_cycles, max_grad_norm, adam_beta2, class_labels_conditioning, local_rank, epochs, tokenizer, prior_loss_weight, bf16, validation_epochs, checkpoints_total_limit, validation_images, project_name, seed, pre_compute_text_embeddings, xl, text_encoder_use_attention_mask, username, validation_prompt, class_image_path, sample_batch_size, num_class_images, resume_from_checkpoint, lr_power, tokenizer_max_length, rank, push_to_hub, image_path, checkpointing_steps, prior_preservation, model, scheduler, adam_epsilon, token, repo_id, logging, class_prompt, scale_lr
INFO: Started server process [37]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: 10.16.20.172:19094 - "GET /?__sign=eyJhbGciOiJFZERTQSJ9.eyJyZWFkIjp0cnVlLCJvbkJlaGFsZk9mIjp7Il9pZCI6IjY1YTcyZDkwZTE3ODdhMjQxOWU3MWU1NSIsInVzZXIiOiJTeWxsb2dlbSJ9LCJpYXQiOjE3MDU5NDU2NDMsInN1YiI6Ii9zcGFjZXMvU3lsbG9nZW0vUGFpZFNlcnZpY2UiLCJleHAiOjE3MDYwMzIwNDMsImlzcyI6Imh0dHBzOi8vaHVnZ2luZ2ZhY2UuY28ifQ.8YoBPdvFWdeEyJ2RsVtVIsODG3yC1QrwxkT13yd_jOcepHP1RABcXDVQ__d9SDwp9_DObxkMSzlySovj-vbPAw HTTP/1.1" 200 OK
INFO Task: llm:sft
INFO: 10.16.20.172:12542 - "GET /params/llm%3Asft HTTP/1.1" 200 OK
INFO: 10.16.20.172:22882 - "GET /model_choices/llm%3Asft HTTP/1.1" 200 OK"
Dataset info:
prompt: What structure of borophene is used in the study?
response: Triangular corrugated borophene (TCB)

prompt: What is the cohesive energy of TCB borophene?
response: -6.65 eV/atom

prompt: Why was TCB borophene chosen over TFB borophene?
response: TCB is more stable than TFB.

prompt: What is the ideal dopant for borophene according to the study?
response: Nitrogen (N)

prompt: How can doping nitrogen change the properties of borophene?
response: It can change borophene from being corrugated and metallic to flat and insulating like h-BN.

prompt: What happens to the lattice constant with increasing N doping?
response: The lattice constant increases with N doping concentration.

prompt: How much does the lattice constant increase with 100% N doping?
response: 54.00%

prompt: What happens to the corrugation height with increasing N doping?
response: It initially increases then decreases, becoming 0 for 100% N doping.

prompt: How does the electronic structure change from borophene to h-BN?
response: Borophene is metallic while h-BN is insulating with a band gap.

prompt: What happens to the prominent peak near the Fermi level in borophene with increasing N doping?
response: It diminishes and disappears with 100% N doping.

prompt: Why does the structure change from corrugated to flat with 100% N doping?
response: The N atoms pull electrons from B atoms, breaking B-B bonds and forming B-N bonds in a planar configuration.

prompt: What causes the negative frequencies in the phonon band structure of borophene?
response: The corrugated structure leads to unstable flexural acoustic modes.

prompt: What happens to these negative frequencies with N doping?
response: They shift upward and become positive with increased N doping.

prompt: Why does borophene have unoccupied sp2 orbitals?
response: Each B atom has only 3 valence electrons but 6 nearest neighbors.

prompt: How do N atoms bond with these unoccupied orbitals?
response: They share electrons in the unoccupied orbitals, forming B-N bonds.

prompt: What is the bond length of B-N bonds in h-BN?
response: 1.44 Å

prompt: What is the structural anisotropy of borophene?
response: It has different lattice constants and bond lengths in different directions.

prompt: What happens to this anisotropy with 100% N doping?
response: It disappears and the structure becomes planar and homogeneous.

prompt: What are the lattice constants of borophene?
response: a1 = 1.61 Å, a2 = 2.86 Å

prompt: What is the lattice constant of h-BN?
response: 2.50 Å

prompt: What are the B-B bond lengths in borophene?
response: B1-B2 = 1.87 Å, B1-B1 (B2-B2) = 1.61 Å

prompt: What causes the corrugated structure of borophene?
response: A mixture of in-plane and out-of-plane hybrid orbitals.

prompt: How does N doping break the B-B bonds?
response: By sharing electrons in the unoccupied sp2 orbitals.

prompt: What is the corrugation height of pristine borophene?
response: 0.91 Å

prompt: What are the flexural acoustic modes in the phonon structure?
response: The out-of-plane modes with negative frequencies.

prompt: What is the flexural mode frequency at K for borophene?
response: -430 cm-1

prompt: What is the flexural mode frequency at M for borophene?
response: -320 cm-1

prompt: How do the flexural frequencies change with N doping?
response: They increase and become positive.

prompt: What indicates the stability of h-BN?
response: The lack of negative frequencies in the phonon band structure.

prompt: What is the electronegativity difference between B and N?
response: N is more electronegative by 1.0.

prompt: How does this affect their bonding?
response: N pulls electrons from B atoms.

prompt: What is the mass difference between B and N?
response: N is heavier.

prompt: How does this mass difference affect the phonon modes?
response: It causes the appearance of a phonon gap.

prompt: What is the electronic configuration of B?
response: 2s2 2p1

prompt: How many valence electrons does each B atom have?
response: 3

prompt: How many nearest neighbors does each B atom have in TCB?
response: 6

prompt: Why does this lead to unoccupied orbitals?
response: Each B atom has fewer valence electrons than nearest neighbors.

prompt: What causes the peak near EF in borophene?
response: The mixture of in-plane and out-of-plane hybrid orbitals.

prompt: Why do these states move below EF?
response: Due to the symmetry-reducing distortion that enhances binding.

prompt: What makes the unoccupied states prone to accept electrons?
response: Their availability to share electrons and form bonds.

prompt: What causes the insulating band gap in h-BN?
response: The separation of bonding and antibonding orbitals.

prompt: How are the doped adatoms studied?
response: By examining their electronic properties when doped into borophene.

prompt: What adatoms are considered?
response: C, N, O, Fe, Co, Ni.

prompt: What is the main finding about doping adatoms?
response: Both metallic and nonmetallic adatoms can tune the electronic properties of borophene.

prompt: Summarize the key findings and insights from this scientific article.
response: (omitted; it was essentially the entire article, as noted above)