| Topic | Replies | Views | Activity |
|---|---|---|---|
| How to choose optimal batch size for training LLMs? | 3 | 848 | March 27, 2023 |
| _find_timestamp_sequence algorithm used in Whisper Pipeline | 0 | 26 | March 24, 2023 |
| Interpreting train_loss/val_loss Plot | 3 | 45 | March 24, 2023 |
| Inference with Multi-Step Reasoning | 0 | 31 | March 23, 2023 |
| DeepSpeed Zero3 and Peft LoRA fp16 issue | 1 | 219 | March 23, 2023 |
| What is the correct way to create a feature extractor for a hugging face (HF) ViT model? | 0 | 56 | March 22, 2023 |
| Probsparse_attention in Informer | 2 | 43 | March 22, 2023 |
| Combining OpenAI Embeddings and OpenAI CLIP embeddings? | 0 | 85 | March 22, 2023 |
| Calculating perplexity from hidden_states | 2 | 229 | March 21, 2023 |
| Finetune Donut with new tokenizer | 4 | 201 | March 21, 2023 |
| OSError: Unable to load weights from pytorch checkpoint file | 19 | 23104 | March 21, 2023 |
| Issues when using `accelerate` with `fp16` | 0 | 101 | March 21, 2023 |
| Saving/Loading custom model build from varying HF models | 1 | 50 | March 20, 2023 |
| Does higher work with huggingface (hugging face, HF) models? e.g. ViT? | 1 | 78 | March 19, 2023 |
| Fine-tuning LLM model for E-commerce Chatbot recomendation | 0 | 54 | March 17, 2023 |
| Inference Endpoint - Simultaneous Generations taking a long time | 0 | 41 | March 14, 2023 |
| FAQ question generation and answering using few shot learning | 1 | 373 | March 14, 2023 |
| TypeError: Repository.__init__() got an unexpected keyword argument 'token' | 2 | 203 | March 10, 2023 |
| Custom bert embedding cause "RuntimeError: Trying to backward through the graph a second time" | 0 | 79 | March 10, 2023 |
| HF Dataset as a Replay Buffer for RL applications | 6 | 151 | March 9, 2023 |
| Encoding/decoding NLP model in tensorflow lite (fine-tuned GPT2) | 2 | 558 | March 9, 2023 |
| Write user-inputted data from app to csv in space directory | 0 | 42 | March 7, 2023 |
| Prompt loss weight instead of masking in generative models | 0 | 68 | March 7, 2023 |
| Multi-GPU support lost when overwriting functions for Custom Trainer | 1 | 354 | March 5, 2023 |
| Combining tokenizer.decode and model.generate scores for probability prediction | 2 | 73 | March 1, 2023 |
| The model did not return a loss from the inputs, only the following keys: logits. For reference, the inputs it received are input_values | 3 | 2908 | February 6, 2023 |
| How to correct TypeError: zip argument #1 must support iteration training in multiple GPU | 1 | 168 | February 28, 2023 |
| What's a low enough perplexity value | 0 | 49 | February 28, 2023 |
| Example of hyper-parameter search of fine tuned fill mask model | 0 | 42 | February 27, 2023 |
| Training Fails after multiple passes: ValueError: The model did not return a loss from the inputs | 1 | 158 | February 27, 2023 |