🤗Transformers

Topic	Replies	Views	Activity
KerasMetricCallback 🤗Transformers	0	297	July 6, 2023
BLIP2 GreedySearchDecoderOnlyOutput, how can I extract the activations of a certain hidden layer? 🤗Transformers	0	145	July 5, 2023
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead 🤗Transformers	2	8807	July 6, 2023
Getting Q, K, V matrices of a ViT 🤗Transformers	0	158	July 5, 2023
T5 tokenizer's post-processor is suboptimal for truncated sequences for seq2seq finetuning 🤗Transformers	0	334	July 5, 2023
Single Node Multi GPU FlanT5 fine-tuning using HF Dataset and HF Trainer 🤗Transformers	4	2067	July 5, 2023
Finetune on Titan X Pascal 🤗Transformers	0	236	July 5, 2023
Difference in trainer.predict() and model.generate() for LM 🤗Transformers	0	1806	July 5, 2023
torch.cuda.OutOfMemoryError 🤗Transformers	0	2063	July 5, 2023
Error in Model.prepare_tf_dataset() 🤗Transformers	1	706	July 5, 2023
What is the purpose of 'use_cache' in decoder? 🤗Transformers	5	24022	July 4, 2023
Pip install transformers[tf-cpu] fails due to virus 🤗Transformers	0	467	July 4, 2023
Pre-train model with inputs_embeds 🤗Transformers	0	376	July 4, 2023
T5-small delivers incorrect outputs after finetuning 🤗Transformers	1	370	July 4, 2023
ObjectDetectionOutput 🤗Transformers	0	137	July 4, 2023
Question about GPT's data preprocessing for training 🤗Transformers	0	301	July 4, 2023
Explicitly set number of training steps using Trainer 🤗Transformers	5	9475	September 16, 2020
Does trainer.save_model save the best model? 🤗Transformers	3	6410	July 3, 2023
Fine-tuning a 16B CodeGen model with 256GB RAM+2xA6000s? DeepSpeed	2	1653	July 3, 2023
When using `auto_find_batch_size` and a new batch size is used, output seems to indicate training examples are left off from before. Not the case? 🤗Transformers	0	2214	July 2, 2023
How to extract gradient during training in pytorch with Trainer module? 🤗Transformers	4	4388	July 2, 2023
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/bert/models/bert 🤗Transformers	5	12183	July 1, 2023
Printing generations periodically during training 🤗Transformers	0	194	June 30, 2023
CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)` 🤗Transformers	2	2311	June 30, 2023
Estimate training compute for 150B LLM DeepSpeed	0	536	June 30, 2023
Difference between vocab_size in model T5ForConditionalGeneration "t5-small" and its corresponding Tokenizer "t5-small" 🤗Transformers	1	634	June 30, 2023
ImportError: cannot import name 'InstructBlipProcessor' from 'transformers' 🤗Transformers	1	6159	June 29, 2023
License for models on huggingface 🤗Transformers	5	5589	May 2, 2023
Accepted model_kwargs for a Huggingface model 🤗Transformers	0	181	June 29, 2023
Will Trainer loss functions automatically ignore -100? 🤗Transformers	2	2252	June 29, 2023