🤗Transformers

Topic	Replies	Views	Activity
Llama 2 & 8K Training 🤗Transformers	0	734	August 4, 2023
Is llama2 supported by the Hugging Face Text Generation Inference (TGI) Deep Learning Container on Amazon SageMaker? 🤗Transformers	0	537	August 3, 2023
Probabilistic One Hot Encoding 🤗Transformers	0	297	August 3, 2023
How can I train an MLM without labels? 🤗Transformers	0	256	August 3, 2023
Which version should I fine-tune? 🤗Transformers	0	375	August 2, 2023
Audio Spectrogram Transformer in tensorflow 🤗Transformers	0	121	August 2, 2023
meta-llama/Llama-2-70b-hf filling up my disk 🤗Transformers	0	352	August 2, 2023
Created exe file not getting executed 🤗Transformers	0	561	August 2, 2023
In Donut, where is the Swin output fused with the text: 1. at the start of the BART encoder, 2. in the cross-attention (K, V from Swin, Q from attention) of the second attention of the BART encoder, or 3. directly in the decoder part of BART? 🤗Transformers	0	171	August 2, 2023
How can I load an LLM in 4-bits 🤗Transformers	0	486	August 2, 2023
Error with gpt2 training 🤗Transformers	0	362	August 1, 2023
Speech to Speech Generative AI system 🤗Transformers	0	206	August 1, 2023
Training Roberta for RAG 🤗Transformers	0	577	August 1, 2023
Diff between GPTQ and NF4 with bitsandbytes 🤗Transformers	0	1253	August 1, 2023
GPT-NeoX inference OOM with plenty of available memory 🤗Transformers	2	896	August 1, 2023
Falcon for translation 🤗Transformers	0	257	August 1, 2023
Fine Tune text generation Model using different type of data 🤗Transformers	0	355	August 1, 2023
How to implement custom vision encoder-decoder? 🤗Transformers	1	706	August 1, 2023
Issues with fine tuning an Encoder Decoder Model 🤗Transformers	0	813	July 31, 2023
NCCL timeout + corrupts checkpoint/latest DeepSpeed	1	2610	July 31, 2023
Soft prompt learning for BERT and GPT using Transformers 🤗Transformers	4	3823	July 31, 2023
Which summarization model of huggingface supports more than 1024 tokens? Which model is more suitable for programming related articles? 🤗Transformers	1	1774	July 31, 2023
PubMedQA, Preprocessing 🤗Transformers	0	198	July 30, 2023
RuntimeError: tensors must be contiguous when finetuning GPT-J-6B using PEFT Lora DeepSpeed	0	881	July 29, 2023
Class weights for Segformer loss function 🤗Transformers	1	932	July 28, 2023
Reproduce RoBERTa Using Huggingface Transformers 🤗Transformers	0	241	July 28, 2023
Training a model on a CSV 🤗Transformers	1	1005	July 28, 2023
Deepspeed inference and infinity offload with bitsandbytes 4bit loaded models DeepSpeed	2	3860	July 27, 2023
Can not understand the sequence length and hidden size of the BEiT model 🤗Transformers	0	227	July 27, 2023
AttributeError: module 'fsspec' has no attribute 'asyn' 🤗Transformers	6	3671	July 27, 2023