| Topic | Replies | Views | Activity |
|---|---|---|---|
| Confusion regarding when to use dict-styled chat dialogue vs. when to format using chat template | 0 | 42 | November 6, 2024 |
| Mistral - Sentence classification - mat1 and mat2 shapes cannot be multiplied | 4 | 576 | November 5, 2024 |
| Parallelise pipelines on a single GPU? | 3 | 731 | October 31, 2024 |
| Cannot Merge Lora weights back to the base model | 8 | 312 | October 29, 2024 |
| Invalid image format | 2 | 424 | October 29, 2024 |
| Build error while cloning | 7 | 51 | October 29, 2024 |
| Fine Tuning Format/Structure for data for llma3.1 models | 0 | 53 | October 28, 2024 |
| Need Help with Reliable Cross-Sentence Coreference Resolution for Document Summarization | 0 | 124 | October 26, 2024 |
| Input batch size not matching Target batch size | 0 | 89 | October 26, 2024 |
| Tokenizer deprecating in ORPO | 6 | 2846 | October 25, 2024 |
| Cross-encoder inference API DOWN? | 1 | 63 | October 25, 2024 |
| What's a low enough perplexity value | 1 | 255 | October 23, 2024 |
| Finetuning a Large Language Model | 0 | 82 | October 23, 2024 |
| How does SFTT trainer behave during evaluation? | 0 | 111 | October 23, 2024 |
| I always get a json response from nvidia model, how to remove it? [ Intermediate ] | 0 | 14 | October 22, 2024 |
| Remove causal mask from Llama decoder | 5 | 685 | October 22, 2024 |
| How to add EOS when training T5? | 1 | 134 | October 21, 2024 |
| Best tool/method for AI model traceability management? | 0 | 13 | October 14, 2024 |
| Client Js Failed to fetch file (gradio api) | 1 | 105 | October 11, 2024 |
| Issue in deploying quantized meta-llama/Llama-3.1-8B-Instruct in aws sagemaker | 0 | 70 | October 10, 2024 |
| How to make multiple async calls to AsyncOpenAI and return results to Gradio UI | 1 | 3108 | October 10, 2024 |
| Transfer Learning on yolov8 object detection weights | 1 | 374 | October 10, 2024 |
| Whisper V3 finetuning with qlora | 0 | 138 | October 10, 2024 |
| TypeError: unhashable type: 'list', When trying to create a knowledge graph from a list of documents using `convert_to_graph_documents` | 1 | 323 | October 9, 2024 |
| RLHF steps and logic question | 0 | 29 | October 8, 2024 |
| Why i can't use or can't pass past_key_values = DynamicCache() into Llama 3 model | 1 | 276 | October 8, 2024 |
| Lack of pipeline parallelism examples for image-based transformers | 3 | 62 | October 7, 2024 |
| How to parallelize inference on a quantized model | 5 | 244 | October 7, 2024 |
| Abstracted Application Access via Dynamic URL Distribution | 0 | 11 | October 4, 2024 |
| Windows 11 does not see my 2nd GPU (4090 + 4080) | 2 | 304 | October 3, 2024 |