| Topic | Replies | Views | Activity |
|---|---|---|---|
| Flan-T5 - Finetuning to a Longer Sequence Length (512 -> 2048 tokens): Will it work? | 3 | 4270 | January 9, 2024 |
| Inference is slow on M1 Mac despite MPS Torch backend | 4 | 3793 | May 26, 2024 |
| Output_attention = True after downloading a model | 2 | 4859 | August 29, 2020 |
| Checkpoint vs model weight | 2 | 4859 | October 12, 2020 |
| How to calculate embeddings with Llama-2 model | 3 | 13283 | October 31, 2023 |
| Using GPU with transformers | 4 | 11839 | November 3, 2020 |
| Accessing uncontextualized BERT word embeddings | 2 | 1505 | October 30, 2020 |
| ValueError: not enough values to unpack (expected 2, got 1) | 7 | 9189 | July 5, 2021 |
| BART Paraphrasing | 6 | 3095 | February 18, 2022 |
| Inference Endpoints fail to start | 1 | 1830 | August 3, 2023 |
| What does EvalPrediction.predictions contain exactly? | 8 | 8625 | August 3, 2023 |
| ValueError: Expected input batch_size (16) to match target batch_size (64) | 7 | 5082 | November 7, 2023 |
| Model Parallelism, how to parallelize transformer? | 3 | 12775 | June 18, 2021 |
| Resources for using custom models with trainer | 6 | 5413 | April 6, 2021 |
| Decoder vs Encoder-decoder clarification | 3 | 12580 | August 1, 2023 |
| Truncation strategy for long text documents | 4 | 3556 | October 27, 2023 |
| Unable to load 8bit model in Kaggle with dual GPU | 5 | 1822 | April 3, 2023 |
| Ideal loss and training values? | 1 | 315 | May 20, 2025 |
| Proxy Issues While Accessing Hugging Face | 1 | 3125 | July 30, 2024 |
| Missing keys "model.embeddings.position_ids" when loading model using state_dict | 4 | 11103 | August 21, 2020 |
| Fine-Tune BART using "Fine-Tuning Custom Datasets" doc | 6 | 9372 | October 28, 2020 |
| Custom Training Loss Function for Seq2Seq BART | 1 | 1743 | July 21, 2023 |
| [Solved] TypeError: Object of type int64 is not JSON serializable | 1 | 9722 | August 28, 2024 |
| How can I get advantage using multi-GPUs | 5 | 3150 | February 3, 2021 |
| Fine tuning gpt2 for question answering | 3 | 12174 | March 29, 2024 |
| Model Suggestion on Text correction | 0 | 769 | April 2, 2021 |
| How to download dataset on Huggingface? | 4 | 10849 | October 19, 2023 |
| How do I change the cache default folder for "hub"? | 6 | 9174 | October 10, 2024 |
| 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte | 3 | 12120 | August 23, 2023 |
| Could not locate the configuration_hf_nomic_bert.py inside nomic-ai/nomic-bert-2048 | 3 | 681 | September 30, 2024 |
| Number of words | 6 | 9057 | February 22, 2021 |
| Reshaping logits when using Trainer | 1 | 5357 | May 23, 2022 |
| How to get probabilities per label in finetuning classification task? | 5 | 5491 | February 18, 2022 |
| Bert ner classifier | 5 | 5480 | May 3, 2021 |
| How to evaluate the performance of test set in Trainer class | 1 | 2988 | July 3, 2022 |
| How to use hugging face to fine-tune ollama's local model | 7 | 8397 | August 28, 2024 |
| How to set multiple files in `LineByLineTextDataset`? | 1 | 1676 | February 28, 2023 |
| Error calling custom tool - smolagents library | 6 | 893 | January 16, 2025 |
| Cannot Log In – 403 Error and Account Possibly Blocked Without Cause | 5 | 172 | July 10, 2025 |
| Learn About GPU Throttling Quota (Another Stupid Guy) :D | 3 | 1164 | November 26, 2024 |
| Runtime error on huggingface spaces | 8 | 7716 | June 16, 2024 |
| Speaker diarization with Whisper? | 1 | 5167 | January 31, 2023 |
| Load fine tuned model from local | 4 | 10298 | October 20, 2020 |
| RAG LLM Generating the Prompt also at the response | 8 | 4309 | September 25, 2024 |
| Need to create tags out of text | 2 | 746 | April 19, 2024 |
| Llama-2 on colab | 3 | 11445 | November 28, 2023 |
| Fine-tuning GPT2 for text-generation with TensorFlow | 4 | 5709 | July 24, 2022 |
| Tokenizing two sentences with the tokenizer | 1 | 2852 | October 18, 2021 |
| Cannot use apply_chat_template() because tokenizer.chat_template is not set | 6 | 4815 | December 1, 2024 |
| How should a Absolute Beginners Start learning ML/LLM in 2024 | 6 | 8512 | December 2, 2024 |