Hyperparameter search for model config
|
|
0
|
175
|
February 22, 2024
|
QLoRA with CPU offload
|
|
1
|
951
|
February 22, 2024
|
Fine-tuning throws "index out of range in self"
|
|
6
|
10332
|
February 21, 2024
|
Same checkpoint produces different output
|
|
0
|
148
|
February 20, 2024
|
Llama-2 Sequence Classification: Much lower accuracy on inference from checkpoint compared to model
|
|
5
|
5973
|
February 20, 2024
|
It says that `bfloat16.enabled` without `auto` needs to be specified when training T5; is anyone aware of how to solve that?
|
|
0
|
257
|
February 20, 2024
|
Adding categorical and numerical data to BERT model
|
|
0
|
1004
|
February 20, 2024
|
Gradually increasing CPU load when using a sentence embeddings model with k-means
|
|
0
|
537
|
February 20, 2024
|
Cannot see training accuracy, only validation accuracy
|
|
2
|
1305
|
February 20, 2024
|
Hallucination with trainer.evaluate() on LLMs
|
|
1
|
679
|
February 19, 2024
|
Running ASR inference pipeline on multiple GPUs
|
|
0
|
137
|
February 19, 2024
|
generate() returns full prompt plus answer
|
|
1
|
6320
|
February 19, 2024
|
Pipelines without a tokenizer
|
|
1
|
643
|
February 19, 2024
|
Token level representations
|
|
0
|
190
|
February 17, 2024
|
repetition_penalty not working?
|
|
1
|
209
|
February 18, 2024
|
How to set stopping criteria in model.generate() when a certain word appears
|
|
3
|
3789
|
February 18, 2024
|
Fine-tuning using LoftQ - CUDA out of memory error
|
|
4
|
382
|
February 18, 2024
|
Which hidden states have the highest score in beam search?
|
|
0
|
106
|
February 18, 2024
|
Any model is much larger when saved locally than the pretrained version downloaded from the Hub
|
|
3
|
380
|
February 17, 2024
|
decoder_start_token_id per sample or per batch during training
|
|
0
|
231
|
February 16, 2024
|
Some RoBERTa weights are not initializing from the checkpoint
|
|
0
|
791
|
February 16, 2024
|
From Transformers v4.12.0 onwards, the example BERT2BERT Colab is wrong (things to keep in mind when using `from transformers import EncoderDecoderModel`)
|
|
0
|
272
|
February 16, 2024
|
How to force bos_token_id for each example individually in MBart?
|
|
3
|
1216
|
February 16, 2024
|
What on earth is point_batch_size for the transformers SamModel?
|
|
0
|
321
|
February 15, 2024
|
T5-XXL MLM distributed training?
|
|
1
|
372
|
February 15, 2024
|
Finetune LLaMA2 model with datasets missing labels
|
|
0
|
373
|
February 15, 2024
|
Trainer not starting in multi-GPU setting
|
|
1
|
1076
|
February 15, 2024
|
Difference between CausalLMWithValueHead vs ModelForCausalLM
|
|
2
|
3461
|
February 15, 2024
|
Using OWL-ViT embeddings with cosine similarity
|
|
1
|
566
|
February 15, 2024
|
Trainer freezes after all steps are complete (multi-GPU setting)
|
|
4
|
1589
|
February 14, 2024
|