Train/Test/Val Split for Informer model
|
|
1
|
413
|
August 3, 2024
|
How to evaluate before first training step?
|
|
10
|
6759
|
August 3, 2024
|
Loading a peft model which is saved on multiple nodes using sharded_state_dict?
|
|
0
|
43
|
August 2, 2024
|
Dynamically resizing input for Hugging Face's generate()
|
|
0
|
35
|
August 2, 2024
|
Difference between AutoModelForCausalLM and peft_model.merge_and_unload() for a LoRA model during inference
|
|
2
|
1345
|
August 2, 2024
|
Accelerating inference for local HuggingFacePipeline of Llama3
|
|
0
|
90
|
August 1, 2024
|
Custom QA model overfitting
|
|
0
|
9
|
August 1, 2024
|
ValueError: cannot reshape array of size (GGUF)
|
|
4
|
808
|
July 31, 2024
|
Change Generation Config of Transformer Model without getting UserWarning
|
|
0
|
164
|
July 31, 2024
|
Trainer not passing all features of the dataset?
|
|
3
|
41
|
July 30, 2024
|
How does one create a custom Hugging Face model with an already working tokenizer?
|
|
1
|
983
|
July 29, 2024
|
How to load a model fine-tuned with QLoRA
|
|
2
|
6867
|
July 29, 2024
|
PEFT LoRA GPT-NeoX - Backward pass failing
|
|
7
|
7291
|
July 29, 2024
|
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
|
|
3
|
13823
|
July 29, 2024
|
Compatible Streamer with Beam Search
|
|
1
|
185
|
July 29, 2024
|
Transformers llama flash_attn_varlen_func questions
|
|
0
|
241
|
July 29, 2024
|
How to add a stop sequence to a Pipeline?
|
|
0
|
302
|
July 29, 2024
|
compute_metrics() behaves strangely in distributed setting
|
|
0
|
55
|
July 28, 2024
|
Hyperparameter search with wandb
|
|
1
|
235
|
July 28, 2024
|
How to do parameter-efficient fine-tuning of the decoder in an encoder-decoder model?
|
|
4
|
140
|
July 27, 2024
|
Error Loading Custom Transformers.js model from Hugging Face Hub
|
|
1
|
669
|
July 27, 2024
|
Why does Pipeline inference with CPU and PyTorch for wav2vec only use 50% of the CPU, and does chunk length impact the model's speed?
|
|
1
|
630
|
July 26, 2024
|
DeepSpeed error: a leaf Variable that requires grad is being used in an in-place operation
|
|
1
|
81
|
July 26, 2024
|
adapter_model.safetensors size is very big
|
|
3
|
743
|
July 27, 2024
|
[T5] How to control the length of the generated summaries
|
|
0
|
35
|
July 26, 2024
|
A question about code on Mistral-7B attention
|
|
0
|
93
|
July 26, 2024
|
Epoch does not get updated
|
|
0
|
7
|
July 25, 2024
|
ReactCodeAgent - Local LLM
|
|
0
|
66
|
July 25, 2024
|
How to perform batch inference on GroundingDino model
|
|
2
|
708
|
July 25, 2024
|
Unrecognized configuration class in mT5-small-finetuned-tydiqa-for-xqa
|
|
6
|
14147
|
July 25, 2024
|