Topic | Replies | Views | Activity
Generate 'continuation' for seq2seq models | 1 | 1871 | February 22, 2021
AttributeError: 'NoneType' object has no attribute 'dtype' | 8 | 24537 | January 17, 2023
GPT2Tokenizer not putting bos/eos token | 3 | 5542 | March 31, 2024
How to find closest embedding vectors? | 2 | 1763 | July 26, 2022
Issues when using `accelerate` with `fp16` | 4 | 12139 | January 22, 2024
Token per second calculations | 2 | 2604 | April 20, 2025
Build an end2end nlp toolkit with transformers and dataset | 0 | 424 | October 9, 2020
Giving a personality to a bot using a LLM | 0 | 2142 | April 11, 2023
Document Object Model (DOM) similarity learning | 4 | 858 | July 11, 2025
How to implement bind_tools to custom LLM from huggingface pipeline(Llama-3) for a custom agent | 3 | 1405 | June 9, 2025
Question Answering for generating long answers | 2 | 2875 | June 4, 2021
Error Training Vision Encoder Decoder for Image Captioning | 8 | 2942 | June 8, 2024
Running into cuda out of memory when running llama2-13b-chat model on multi-gpu machine | 5 | 11114 | December 21, 2023
I can't understand why generative models make repetitions | 2 | 4867 | August 26, 2022
Fine-tuning LLM model for E-commerce Chatbot recomendation | 0 | 1451 | March 17, 2023
Using Tensorboard SummaryWriter with HuggingFace TrainerAPI | 4 | 11531 | August 24, 2023
ERROR: vars() argument must have __dict__ attribute when trying to use trainer.train()? | 5 | 18636 | December 26, 2022
Tokenizer deprecating in ORPO | 6 | 3012 | October 25, 2024
TypeError: Repository.__init__() got an unexpected keyword argument 'token' | 8 | 14720 | August 9, 2023
API: Quota exceeded for machine error | 0 | 1390 | June 22, 2023
Stopping `model.generate()` based on custom token | 2 | 4427 | October 18, 2021
How to fine-tune with unsloth using multiple GPUs as I'm getting out-of-memory error after running os.environ["CUDA_VISIBLE_DEVICES"] | 3 | 3829 | December 4, 2024
How to choose optimal batch size for training LLMs? | 4 | 19195 | December 18, 2023
An easy way to make huggingface PRs | 3 | 1200 | July 28, 2020
Typical sampling decoding technique | 1 | 1680 | April 28, 2023
How to save model in S3 with Trainer? | 5 | 5144 | May 26, 2023
Exceeded your monthly included credits for Inference Providers | 8 | 1226 | April 17, 2025
Replace special [unusedX] tokens in a tokenizer to add domain-specific words | 0 | 1117 | October 12, 2023
Out of index error when using pre-trained Pegasus model | 2 | 1995 | April 1, 2021
Inference after QLoRA fine-tuning | 8 | 6376 | June 7, 2024
How to set the padding configuration with Huggingface's GenerateMixin's generate method? | 7 | 11414 | September 26, 2023
How can I evaluate a fine tuned LLM? | 4 | 1442 | January 7, 2025
Is 'autoplay' possible for an audio file in gradio? | 1 | 2188 | April 7, 2023
Huggingface on Databricks | 0 | 971 | November 12, 2021
Using TRL on TPU | 1 | 216 | February 11, 2025
Ignore numbers while generation | 3 | 857 | April 12, 2022
Docker image: transformers-all-latest-gpu not running | 0 | 951 | March 30, 2024
Training a model to autocomplete for a niche domain and a specific style | 2 | 897 | February 19, 2025
Fine-Tuning + RAG based Chatbot: Dataset Structure & Instruction Adherence Issues | 7 | 533 | March 11, 2025
Understanding zero-shot classification in one-shot ;-) | 3 | 2372 | August 2, 2021
The Correct Attention Mask For Examples Packing | 6 | 3157 | January 8, 2025
Evaluation and compute_metrics slowdown | 0 | 798 | August 29, 2023
Need Help with Reliable Cross-Sentence Coreference Resolution for Document Summarization | 0 | 143 | October 26, 2024
Getting hidden states from the "automatic-speech-recognition" pipeline | 0 | 792 | July 15, 2022
Lora: missing adapter keys while loading the checkpoint | 2 | 1423 | January 6, 2025
Problem installing using conda | 4 | 10995 | June 13, 2021
How to use `inputs_embed` and `attention_mask` together? | 1 | 964 | May 19, 2024
Implementation of NER model with relationship extraction? | 3 | 6688 | September 25, 2024
Format Reward Function in GRPO Training Doesn't Stabilise | 0 | 721 | February 12, 2025
What is an embedding? | 4 | 1018 | July 22, 2024