| Topic | Replies | Views | Date |
|---|---|---|---|
| Generate() and automatic truncation of context | 0 | 126 | June 13, 2024 |
| Custom modification on transformers | 1 | 170 | June 13, 2024 |
| Multinode worse performance than single node with same settings | 5 | 314 | June 13, 2024 |
| Model Randomness Introduced by DDP | 1 | 81 | June 13, 2024 |
| Create Dataset from Images and Annotations locally | 3 | 424 | June 12, 2024 |
| Longformer for sequence classification throwing error regarding data format and shape | 2 | 224 | June 11, 2024 |
| Unixcoder finetuned returns only a .bin file called model.bin, how to use it for inference? | 0 | 88 | June 11, 2024 |
| When to use AutoModelForSeq2SeqLM? | 3 | 12329 | June 10, 2024 |
| Difference in return sequence for Phi3 model | 11 | 371 | June 10, 2024 |
| Attempting to unscale FP16 gradients | 3 | 8612 | June 10, 2024 |
| Using ViTMAEModel as an encoder for a UNet decoder for semantic segmentation | 0 | 176 | June 9, 2024 |
| Llama 3 Instruct taking too long all of a sudden | 1 | 1470 | June 9, 2024 |
| GPU is far slower than CPU for patch embedding | 0 | 365 | June 8, 2024 |
| Running ctransformers with cuda 11.4 or lower | 1 | 2986 | June 7, 2024 |
| Active Learning code example | 1 | 205 | June 7, 2024 |
| Deploying LLM in Production: Performance Degradation with Multiple Users | 6 | 4834 | June 7, 2024 |
| Text-generation pipeline taking much more time than sentiment-analysis? | 0 | 89 | June 7, 2024 |
| Arabic Female TTS model | 0 | 75 | June 6, 2024 |
| "attention_mask" + `pad_token_id` | 2 | 5303 | June 6, 2024 |
| Trainer API showing loss = 0 after first log | 0 | 76 | June 6, 2024 |
| OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU | 0 | 523 | June 5, 2024 |
| Trainer API for Model Parallelism using AutoModelForQuestionAnswering | 1 | 152 | June 5, 2024 |
| Loading Llama 3 | 2 | 7112 | June 5, 2024 |
| Loss in a Seq2Seq task | 0 | 161 | June 5, 2024 |
| Llama3 OutOfMemory on an A100 when doing CausalLM | 0 | 207 | June 5, 2024 |
| Deepspeed ZeRO2, PEFT, bitsnbytes training | 0 | 129 | June 4, 2024 |
| Generating text with multiple styles using llama | 0 | 95 | June 3, 2024 |
| TypeError: map() got an unexpected keyword argument 'num_proc' | 7 | 1037 | June 3, 2024 |
| Parameter Count & Shape Discrepancies in 4-bit vs. Higher bit LLM models | 2 | 684 | June 3, 2024 |
| Pre/Post Normalization-layers | 0 | 90 | June 3, 2024 |