KerasMetricCallback | 0 | 297 | July 6, 2023
BLIP2 GreedySearchDecoderOnlyOutput, how can I extract the activations of a certain hidden layer? | 0 | 145 | July 5, 2023
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead | 2 | 8807 | July 6, 2023
Getting Q, K, V matrices of a ViT | 0 | 158 | July 5, 2023
T5 tokenizer's post-processor is suboptimal for truncated sequences for seq2seq finetuning | 0 | 334 | July 5, 2023
Single Node Multi GPU FlanT5 fine-tuning using HF Dataset and HF Trainer | 4 | 2067 | July 5, 2023
Finetune on Titan X Pascal | 0 | 236 | July 5, 2023
Difference in trainer.predict() and model.generate() for LM | 0 | 1806 | July 5, 2023
torch.cuda.OutOfMemoryError | 0 | 2063 | July 5, 2023
Error in Model.prepare_tf_dataset() | 1 | 706 | July 5, 2023
What is the purpose of 'use_cache' in decoder? | 5 | 24022 | July 4, 2023
Pip install transformers[tf-cpu] fails due to virus | 0 | 467 | July 4, 2023
Pre - Train model with inputs_embeds | 0 | 376 | July 4, 2023
Finetuning T5-small delivers incorrect outputs after finetuning | 1 | 370 | July 4, 2023
ObjectDetectionOutput | 0 | 137 | July 4, 2023
Question about GPT's data preprocess for training | 0 | 301 | July 4, 2023
Explicitly set number of training steps using Trainer | 5 | 9475 | September 16, 2020
Do trainer.save_model saves the best model? | 3 | 6410 | July 3, 2023
Fine-tuning a 16B CodeGen model with 256GB RAM+2xA6000s? | 2 | 1653 | July 3, 2023
When using `auto_find_batch_size` and a new batch size is used, output seems to indicate training examples are left off from before. Not the case? | 0 | 2214 | July 2, 2023
How to extract gradient during training in pytorch with Trainer module? | 4 | 4388 | July 2, 2023
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/bert/models/bert | 5 | 12183 | July 1, 2023
Printing generations periodically during training | 0 | 194 | June 30, 2023
CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)` | 2 | 2311 | June 30, 2023
Estimate training compute for 150B LLM | 0 | 536 | June 30, 2023
Difference between vocab_size in model T5forConditionalGeneration "t5-small" and its corresponding Tokenizer "t5-small" | 1 | 634 | June 30, 2023
ImportError: cannot import name 'InstructBlipProcessor' from 'transformers' | 1 | 6159 | June 29, 2023
License for models on huggingface | 5 | 5589 | May 2, 2023
Accepted model_kwargs for a Huggingface model | 0 | 181 | June 29, 2023
Will Trainer loss functions automatically ignore -100? | 2 | 2252 | June 29, 2023