| Topic | Replies | Views | Activity |
|---|---|---|---|
| Is it possible to make the first batch as fast as the subsequent ones? | 1 | 81 | June 25, 2024 |
| How can I keep use of the base model version for inference after fine-tuning | 1 | 93 | May 12, 2024 |
| Workload evicted, storage limit exceeded (100G) Llama-4-Scout-17B-16E | 1 | 46 | April 7, 2025 |
| Exploring the Necessity of Annotation in Multi-Modal LLM Fine-Tuning for Enhanced Image Comprehension | 1 | 44 | October 8, 2024 |
| Calculate tokens per second while fine-tuning llm? | 0 | 111 | September 17, 2024 |
| Reading huggingface dataset in rust | 0 | 111 | July 23, 2024 |
| Updating assistant settings has resulted in an error | 0 | 115 | June 11, 2024 |
| Large Open Dataset Allowed? | 0 | 119 | June 5, 2024 |
| Create entirely new vocabulary for tokenizer | 0 | 117 | May 30, 2024 |
| PyTorch Traced model giving different results than true model | 0 | 119 | May 9, 2024 |
| Failure trying to download model on local machine using Automatic1111 | 2 | 37 | February 17, 2025 |
| How to configure the order of subsets in the dataset viewer | 2 | 36 | October 3, 2024 |
| Generate test case with Model Gpt2 | 1 | 81 | August 19, 2024 |
| NotImplementedError: ggml_type 21 not implemented | 2 | 62 | September 23, 2024 |
| How to Efficiently Fine-Tune Models on Custom Datasets with Limited Resources? | 0 | 111 | July 10, 2024 |
| Cannot pass a kwargs into `torch.onnx.export` arguments | 0 | 111 | June 28, 2024 |
| LayoutLMV3 mapping numerical predictions to class labels | 0 | 188 | June 25, 2024 |
| Adding compute_metrics produces Cuda OutOfMemoryError | 0 | 117 | May 22, 2024 |
| Fine tunning a Pre-trained Model | 0 | 118 | May 14, 2024 |
| Encode token without spaced between them | 0 | 143 | May 9, 2024 |
| HuggingChat edited prompt context | 0 | 123 | May 4, 2024 |
| How can I integrate the InternVL-Chat-V1.5 model into a web page without specialized hardware or API? | 0 | 116 | April 29, 2024 |
| Access to llama4 models denied (Meta Employee Affiliation) | 0 | 21 | April 8, 2025 |
| What Are the Common Challenges Businesses Face in LLM Training and Inference? | 0 | 20 | February 13, 2025 |
| Challenges with Real-time Inference at Scale | 0 | 20 | February 12, 2025 |
| Metrics for query decomposition | 0 | 21 | February 11, 2025 |
| Which classification model to use? | 0 | 19 | February 7, 2025 |
| Trying To Convert Paligemma model in npz to hugging face model format | 0 | 22 | January 12, 2025 |
| Choosing the right model to ready json data and give predictions | 0 | 19 | January 3, 2025 |
| Computational needs for AI/ML Researchers | 0 | 23 | December 5, 2024 |
| Fine-tunning techniques | 0 | 22 | November 7, 2024 |
| Questions about classification models | 0 | 31 | November 4, 2024 |
| What is the correct way to compute metrics while training using Accelerate? | 0 | 21 | October 29, 2024 |
| Problem with returning decoder cross attentions through generate function | 0 | 24 | October 25, 2024 |
| Inconsistent evaluation result | 0 | 20 | October 23, 2024 |
| Evaluation Metrics are not matching with Shuffle = False | 0 | 20 | October 19, 2024 |
| RLHF steps and logic question | 0 | 25 | October 8, 2024 |
| Token classification metric | 0 | 22 | October 3, 2024 |
| How shall make logon ui better? | 8 | 20 | October 31, 2024 |
| DiffuserCraft STILL keeps acting up but this time, whenever I use it | 3 | 32 | April 2, 2025 |
| 504 Gateway Time-out when uploading large dataset to enterprise organization | 3 | 40 | March 3, 2025 |
| How can I read PDFs with mPLUG/DocOwl2? | 3 | 35 | February 6, 2025 |
| Continual Training on my own checkpoint | 1 | 78 | June 27, 2024 |
| The loss quickly drops to a plateau at 8 | 1 | 77 | June 23, 2024 |
| Best model than can run locally on a Mac? | 0 | 110 | September 6, 2024 |
| Help using sfttrainer with data collator, peft, and tokenizer template | 0 | 111 | July 23, 2024 |
| Proper way to swap backbones from baseline model | 0 | 111 | July 6, 2024 |
| No module name nltk? | 0 | 111 | June 25, 2024 |
| Why are Initial latents weighted by mask only with unet nchannels=4? | 0 | 118 | June 6, 2024 |
| Need help about the using of transformers GPT2 for training | 0 | 111 | May 29, 2024 |