| Topic | Replies | Views | Date |
|---|---|---|---|
| Issues when using `accelerate` with `fp16` | 4 | 11832 | January 22, 2024 |
| Token per second calculations | 2 | 2501 | April 20, 2025 |
| Build an end2end nlp toolkit with transformers and dataset | 0 | 421 | October 9, 2020 |
| Giving a personality to a bot using a LLM | 0 | 2097 | April 11, 2023 |
| API inference limit changed? | 7 | 1865 | March 24, 2025 |
| Document Object Model (DOM) similarity learning | 3 | 810 | May 20, 2024 |
| Question Answering for generating long answers | 2 | 2859 | June 4, 2021 |
| Error Training Vision Encoder Decoder for Image Captioning | 8 | 2899 | June 8, 2024 |
| Running into cuda out of memory when running llama2-13b-chat model on multi-gpu machine | 5 | 11009 | December 21, 2023 |
| I can't understand why generative models make repetitions | 2 | 4716 | August 26, 2022 |
| Fine-tuning LLM model for E-commerce Chatbot recomendation | 0 | 1440 | March 17, 2023 |
| ERROR: vars() argument must have __dict__ attribute when trying to use trainer.train()? | 5 | 18478 | December 26, 2022 |
| Using Tensorboard SummaryWriter with HuggingFace TrainerAPI | 4 | 11169 | August 24, 2023 |
| How to implement bind_tools to custom LLM from huggingface pipeline(Llama-3) for a custom agent | 3 | 1233 | June 9, 2025 |
| API: Quota exceeded for machine error | 0 | 1385 | June 22, 2023 |
| TypeError: Repository.__init__() got an unexpected keyword argument 'token' | 8 | 14591 | August 9, 2023 |
| Stopping `model.generate()` based on custom token | 2 | 4372 | October 18, 2021 |
| Typical sampling decoding technique | 1 | 1670 | April 28, 2023 |
| Tokenizer deprecating in ORPO | 6 | 2796 | October 25, 2024 |
| How to choose optimal batch size for training LLMs? | 4 | 18429 | December 18, 2023 |
| An easy way to make huggingface PRs | 3 | 1140 | July 28, 2020 |
| How to save model in S3 with Trainer? | 5 | 5004 | May 26, 2023 |
| How to fine-tune with unsloth using multiple GPUs as I'm getting out-of-memory error after running os.environ["CUDA_VISIBLE_DEVICES"] | 3 | 3259 | December 4, 2024 |
| Out of index error when using pre-trained Pegasus model | 2 | 1986 | April 1, 2021 |
| Replace special [unusedX] tokens in a tokenizer to add domain-specific words | 0 | 1087 | October 12, 2023 |
| Inference after QLoRA fine-tuning | 8 | 6164 | June 7, 2024 |
| How to set the padding configuration with Huggingface's GenerateMixin's generate method? | 7 | 11082 | September 26, 2023 |
| Is 'autoplay' possible for an audio file in gradio? | 1 | 2178 | April 7, 2023 |
| Ignore numbers while generation | 3 | 849 | April 12, 2022 |
| Docker image: transformers-all-latest-gpu not running | 0 | 898 | March 30, 2024 |
| Understanding zero-shot classification in one-shot ;-) | 3 | 2300 | August 2, 2021 |
| Getting hidden states from the "automatic-speech-recognition" pipeline | 0 | 786 | July 15, 2022 |
| How to use SentenceTransformers for contrastive learning? | 5 | 5699 | June 30, 2022 |
| Evaluation and compute_metrics slowdown | 0 | 784 | August 29, 2023 |
| Problem installing using conda | 4 | 10965 | June 13, 2021 |
| The Correct Attention Mask For Examples Packing | 6 | 2882 | January 8, 2025 |
| Implementation of NER model with relationship extraction? | 3 | 6544 | September 25, 2024 |
| How to use `inputs_embed` and `attention_mask` together? | 1 | 911 | May 19, 2024 |
| What is an embedding? | 4 | 972 | July 22, 2024 |
| Need Help with Reliable Cross-Sentence Coreference Resolution for Document Summarization | 0 | 120 | October 26, 2024 |
| Pre-training LayoutLMv2 | 0 | 660 | November 16, 2021 |
| Huggingface token returning an invalid token | 1 | 1446 | May 17, 2024 |
| Baffling performance issue on most NVidia GPUs with simple transformers + pytorch code | 5 | 4491 | April 9, 2024 |
| 502 server error when running model | 3 | 5358 | July 4, 2023 |
| How to customize behavior of added special tokens in a pretrained tokenizer? | 0 | 602 | May 5, 2021 |
| Past_key_value with multiple new tokens | 1 | 1326 | August 10, 2023 |
| Deepspeed integration with Trainer in Colab crashing: TypeError: object.__init__() takes exactly one argument (the instance to initialize) | 2 | 1924 | October 1, 2023 |
| How can I evaluate a fine tuned LLM? | 4 | 833 | January 7, 2025 |
| Retraining peft model | 3 | 2915 | March 1, 2024 |
| Saving model per some step when using Trainer | 3 | 9163 | December 11, 2023 |