| Topic | Replies | Views | Activity |
| Alternating between batches of different datasets | 0 | 222 | February 8, 2024 |
| Trying to recreate `model.greedy_search()` for custom decoding of LLM output, but I am getting a different decoded output | 3 | 354 | February 8, 2024 |
| Change from resnet152 to resnet50 | 0 | 135 | February 7, 2024 |
| Generate token by token for m2m100_418 | 0 | 400 | February 6, 2024 |
| Finetuning of conversational model without train data in conversation style | 1 | 1762 | February 2, 2024 |
| Identifying most useful domain-specific tokens for adding to the existing tokenizer | 1 | 487 | February 2, 2024 |
| Huggingface endpoint with chat agents for conversational NLU | 0 | 178 | February 1, 2024 |
| Common practice, using the hidden state associated with [cls] as an input feature for a classification task? | 3 | 5887 | January 31, 2024 |
| Peft following bits and bytes seems to have no effect on LLM | 0 | 507 | January 31, 2024 |
| How to load quantized LLM to CPU only device | 0 | 1959 | January 28, 2024 |
| Hyperparameter-Search while adding Special tokens | 1 | 509 | January 28, 2024 |
| Evaluation and compute_metrics slowdown | 0 | 797 | August 29, 2023 |
| How to finetune any transformer custom layers using tf | 0 | 205 | January 27, 2024 |
| NER - aggregation_strategy | 1 | 1421 | January 24, 2024 |
| Hugging Face Library 'Call Home' / self-update feature | 0 | 136 | January 23, 2024 |
| Issues when using `accelerate` with `fp16` | 4 | 12103 | January 22, 2024 |
| How is it possible to get GPU memory errors when increasing the gradient_accumulation steps? | 1 | 1394 | January 22, 2024 |
| Model validation failed - Target is multiclass but average='binary' | 2 | 2781 | January 21, 2024 |
| How to finetune/instruction-tune a large language model on a QA corpus? | 1 | 1987 | January 20, 2024 |
| Autogen AI Issues | 0 | 288 | January 19, 2024 |
| Scaling Mistral-7B on AWS SageMaker With Multiple Replica Endpoints | 0 | 621 | January 19, 2024 |
| AI LLM model bias | 0 | 144 | January 16, 2024 |
| Error while training Mixtral in 8bit | 0 | 298 | January 16, 2024 |
| Inference Endpoints creation | 1 | 469 | January 14, 2024 |
| Discriminator of GAN performing poorly for network anomaly detection | 0 | 152 | January 13, 2024 |
| Pipelines, Whisper and how to set parameters | 1 | 755 | January 12, 2024 |
| BLIP How to combine embeddings for multimodal search? | 1 | 2038 | January 11, 2024 |
| DeBERTa-v3: How to keep ELECTRA-style task-head? | 5 | 2282 | January 10, 2024 |
| Multiple Classification Heads (For two tier labelling) | 0 | 214 | January 10, 2024 |
| ASR on multilingual audio data (code-switching) | 0 | 182 | January 10, 2024 |