| Topic | Replies | Views | Activity |
|---|---|---|---|
| Question answer model for process data in IIoT | 3 | 17 | June 18, 2025 |
| Trainer in PEFT doesn't report evaluation metrics | 4 | 453 | June 17, 2025 |
| Apply PEFT on ViT | 2 | 452 | June 17, 2025 |
| Explicitly disable bf16 for some layers | 2 | 6 | June 17, 2025 |
| LoRA finetuning RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! | 4 | 15 | June 16, 2025 |
| StopIteration error | 4 | 147 | June 14, 2025 |
| How to use different learning rates when DeepSpeed is enabled | 1 | 16 | June 14, 2025 |
| CareerBERT-siamese | 1 | 8 | June 12, 2025 |
| [Bug] Error about InformerForPredict | 1 | 9 | June 11, 2025 |
| Correct way to load multiple LoRA adapters for inference | 4 | 26 | June 11, 2025 |
| Multi-GPU finetuning of NLLB produces RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0 | 2 | 1104 | June 9, 2025 |
| How was self.loss_function implemented? | 4 | 18 | June 9, 2025 |
| Transformers repo install error | 9 | 72 | June 6, 2025 |
| How many GPU resources do I need for full fine-tuning of the 7B model? | 2 | 5072 | June 5, 2025 |
| Generate: using k-v cache is faster but no difference to memory usage | 5 | 15709 | June 3, 2025 |
| Distributed training w/ Trainer | 11 | 8860 | June 3, 2025 |
| Grouping by length makes training loss oscillate and makes evaluation loss worse | 2 | 227 | June 3, 2025 |
| How can LLMs be fine-tuned for specialized domain knowledge? | 2 | 235 | June 3, 2025 |
| Implementing triplet loss in ViT | 1 | 23 | June 3, 2025 |
| Using Hugging Face for computer vision (TensorFlow)? | 3 | 405 | June 2, 2025 |
| ValueError: Supplied state dict for layers does not contain `bitsandbytes__*` and possibly other `quantized_stats` (when loading a saved quantized model) | 4 | 733 | May 30, 2025 |
| RGBA -> RGB default background color vs padding color | 1 | 9 | May 30, 2025 |
| Why is static cache latency high? | 2 | 17 | May 29, 2025 |
| Error using Trainer with Colab notebook, anyone have a solution? | 1 | 54 | May 29, 2025 |
| LoRA training with Accelerate / DeepSpeed | 3 | 2275 | May 28, 2025 |
| How does Q, K, V differ in an LLM? | 1 | 20 | May 28, 2025 |
| The effect of padding_side | 13 | 14536 | May 27, 2025 |
| Prompt caching in pipelines | 1 | 37 | May 27, 2025 |
| Getting error: AttributeError: 'InferenceClient' object has no attribute 'post' | 5 | 424 | May 27, 2025 |
| How does LlamaForSequenceClassification determine what class corresponds to what label? | 10 | 4862 | May 25, 2025 |