Order of execution of Top-K, Top-P sampling along with temperature
|
|
1
|
3890
|
October 31, 2023
|
Evaluation of the gradients of class probabilities and logits with respect to attention layer and hidden states
|
|
0
|
363
|
October 30, 2023
|
BEiT Semantic Segmentation Model Performance Low
|
|
0
|
148
|
October 30, 2023
|
How to override model.generate()
|
|
1
|
991
|
October 30, 2023
|
Batching large csv for embedding
|
|
0
|
552
|
October 30, 2023
|
Output token lengths of smaller models
|
|
0
|
505
|
October 30, 2023
|
'T5ForConditionalGeneration' object has no attribute '_prune_heads'
|
|
1
|
2492
|
October 30, 2023
|
Fine-tuning google/tapas-base-finetuned-wtq on an Italian dataset
|
|
2
|
1440
|
October 28, 2023
|
Tensorboard files not uploading
|
|
0
|
188
|
October 28, 2023
|
Error finding processor's image class. Loading based on pattern matching with feature extractor
|
|
11
|
12642
|
October 27, 2023
|
Will the model learn properly if a one-dimensional list of tokens is reshaped to (batch, seq_len) and fed into the Transformer-XL model?
|
|
0
|
169
|
October 27, 2023
|
The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers
|
|
4
|
5107
|
October 27, 2023
|
Does anyone have an idea how we can run llama2 with multiple GPUs?
|
|
1
|
1280
|
October 26, 2023
|
Tensor size mismatch when using Informer
|
|
0
|
631
|
October 25, 2023
|
What is the correct way to provide sequence bias to the pipeline for the automatic-speech-recognition task using the Whisper model?
|
|
0
|
454
|
October 25, 2023
|
Security Policy
|
|
0
|
186
|
October 25, 2023
|
How to build email subject and body generation like ChatGPT
|
|
1
|
486
|
October 25, 2023
|
Training loss is not decreasing using TFBertModel
|
|
4
|
5798
|
October 24, 2023
|
Tokenizers: How do you extract concatenated entity words from B-ORG, I-ORG, etc.?
|
|
0
|
108
|
October 24, 2023
|
SpecAugment on Wav2Vec2 feature encoder outputs
|
|
0
|
419
|
October 24, 2023
|
Pretrain own model
|
|
0
|
271
|
October 23, 2023
|
Track number of tokens seen during training in wandb with Trainer API
|
|
2
|
1278
|
October 23, 2023
|
How can I load opt-175b model
|
|
0
|
196
|
October 23, 2023
|
No CUDA support for ASR pipeline
|
|
0
|
265
|
October 20, 2023
|
Does loading in 4bit override an 8bit model?
|
|
0
|
697
|
October 20, 2023
|
Unable to use Constrained beam search with google/flan-t5-base
|
|
1
|
381
|
October 20, 2023
|
Fine Tuning Segformer on Custom Dataset, getting negative loss
|
|
1
|
537
|
October 20, 2023
|
Trainer: log token count
|
|
0
|
247
|
October 19, 2023
|
Potential bug with beam search + eos_token_id
|
|
1
|
654
|
October 19, 2023
|
XLA Integration for TensorFlow Models
|
|
0
|
142
|
October 19, 2023
|