Topic | Replies | Views | Activity
How is the eval dataset processed in a trainer? | 0 | 516 | August 28, 2023
Quantization Aware Training Error | 0 | 527 | August 28, 2023
Speech to Text concern | 0 | 387 | August 27, 2023
Can't save model to hub by TrainerCallback | 0 | 164 | August 27, 2023
What is the difference between gptqConfig and bitsAndBytesConfig | 0 | 337 | August 27, 2023
Using huggingface generate() with custom model | 1 | 2140 | August 26, 2023
Different model.generate() predictions between batched and unbatched/padded token inputs | 2 | 2283 | August 26, 2023
ValueError: The model did not return a loss from the inputs, only the following keys: logits. For reference, the inputs it received are input_ids,attention_mask | 1 | 1188 | August 25, 2023
Can't instantiate deberta model in Tensorflow with 'mixed_float16' global policy | 0 | 212 | August 25, 2023
Upload Bert style classifier with mean pooling | 0 | 353 | August 25, 2023
Training llama with Lora on multiple GPUs may exist bug | 10 | 9675 | August 25, 2023
Custom training set | 0 | 145 | August 25, 2023
Owl-vit bounding box format | 0 | 387 | August 25, 2023
Error when training with `peft` + `lora` | 1 | 1357 | August 25, 2023
KV cache sizing | 0 | 769 | August 24, 2023
How to use PEFT's LoRA with Optimum's BetterTransformer? | 0 | 911 | August 24, 2023
FSDP OOM issue and comparision to DeepSpeed | 3 | 2430 | August 23, 2023
Unable to load the transformer.trainer | 0 | 1488 | August 23, 2023
How to compute per-token loss when doing language modeling? | 3 | 3406 | August 23, 2023
Knowledge distillation for NER task | 0 | 292 | August 23, 2023
DetrImageProcessor.normalize_annotation() got an unexpected keyword argument 'input_data_format' | 1 | 403 | August 22, 2023
I am not able to install pipeline | 1 | 483 | August 22, 2023
Why "with training_args.main_process_first()" not exit? | 0 | 607 | August 22, 2023
RWKV for CTC decoding | 0 | 86 | August 22, 2023
`flan-alpaca-xl` model does not appear to have a file named `pytorch_model.bin` despite sharded model present | 0 | 397 | August 22, 2023
SFTTrain error with code from website | 0 | 417 | August 21, 2023
Does Trainer load checkpoints from previous fold in k-fold Cross Validation? | 0 | 599 | August 20, 2023
Gradient Checkpointing with FSDP efficiency | 0 | 567 | August 20, 2023
Training with IterableDataset is very slow when using a large number of workers | 0 | 1312 | August 19, 2023
Inquiry for adding a new layer for transformer model | 0 | 159 | August 18, 2023