Falcon-7b-instruct ALWAYS returns SHORT ANSWERS on inference endpoint | 1 | 906 | September 5, 2023
How do I finetune Blip2 model on a custom dataset? | 1 | 508 | October 1, 2024
Fine tune model='facebook/bart-large-mnli' | 0 | 1270 | May 16, 2022
HF Dataset as a Replay Buffer for RL applications | 6 | 480 | March 9, 2023
How to correct TypeError: zip argument #1 must support iteration training in multiple GPU | 1 | 893 | February 28, 2023
Text classification on small dataset (8K) | 1 | 893 | July 27, 2021
Repeatedly decoding tokens multiple times after PEFT fine-tuning whisper | 2 | 726 | September 20, 2023
Opinion: Training Argument Fine Tuning MLM RoBERTa | 1 | 157 | January 9, 2025
Reduced inference f1 score with QLoRA finetuned model | 1 | 880 | September 6, 2023
Optimal methods to monitor attention matrices when doing training/inference using BERT-type models | 2 | 709 | September 11, 2021
How to implement Key Query Layer Normalized Transformers/LLMs in Huggingface? | 0 | 1225 | February 18, 2023
How to load my own pretrained model to huggingface code | 1 | 859 | January 31, 2023
Fine tune gpt2 language model error - unexpected keyword argument 'cache_dir' | 0 | 1205 | September 8, 2020
How to get a model on patent data for question answering | 1 | 851 | October 15, 2021
Adapter Training - Merging Weights | 0 | 1189 | April 6, 2023
Pyannotate pipeline() not working | 6 | 254 | January 9, 2025
How to implement LoRA with Pytorch? | 0 | 1185 | November 30, 2023
Getting completely different performance when trying to write a custom model | 1 | 837 | June 9, 2022
TokenClassification pipeline doing batch processing over a sequence of already tokenised messages | 1 | 831 | July 6, 2022
Inference Endpoints creation | 1 | 467 | January 14, 2024
Sequence Length in Continued Pretraining (MLM) & Masking Strategies | 0 | 1173 | January 6, 2022
Starting my first pull request, but transformers tests stall | 3 | 584 | November 24, 2021
Parameters for evaluation loop of a Seq2SeqTrainer model | 0 | 1165 | November 26, 2021
I Made a simple CLI for playing with BLOOM | 2 | 663 | September 24, 2022
Issues with Pythia model finetuning | 1 | 811 | October 3, 2023
Using same instructions for fine-tuning: Is this bad for the model? | 1 | 456 | March 26, 2024
QLoRA memory requirement with 3B model loads GPU with 10GB of memory with 4bit quantization | 0 | 1143 | December 19, 2023
Meta-Llama-3-8B-Instruct: "max_new_tokens" is not working for /v1/chat/completions | 1 | 808 | July 2, 2024
Inference Endpoints 401 Error | 2 | 370 | July 15, 2024
Sagemaker model parallelism- running the model results in Maximum recursion limit | 7 | 400 | July 24, 2023
Advice Needed for Training an Imbalanced Dataset AI Model: lr, Epochs, and neuronal Architecture | 3 | 565 | November 27, 2023
Positional Encoding error, Protein Bert Model | 2 | 652 | October 25, 2020
Is native Pytorch training loop much slower than Trainer? | 4 | 504 | November 11, 2024
Finetuning using Raytune: Failed to unpickle serialized exception | 0 | 200 | November 21, 2024
Tokenizer from a GGUF file in Python? | 1 | 788 | May 6, 2024
How to increase tokens text generation API | 1 | 753 | August 28, 2022
Can run_clm.py do early stopping? | 2 | 614 | August 25, 2022
GPU memory usage is twice (2x) what I calculated based on number of parameters and floating point precision | 5 | 435 | May 18, 2024
Does it ever make sense to finetune w fp32 if the base model was trained w fp16? | 1 | 745 | July 8, 2022
Llamma index Saving and Loading | 1 | 743 | January 2, 2024
Embeddings of added words | 1 | 740 | September 9, 2022
Experience with and extending LLM for software engineering | 4 | 470 | August 15, 2024
PydanticUserError: The `__modify_schema__` method is not supported in Pydantic v2. Use `__get_pydantic_json_schema__` instead in class `SecretStr` | 1 | 414 | January 22, 2025
Trainer with fp8 - what to use in accel CLI vs. TrainingArguments | 1 | 731 | April 24, 2024
Fine-tuning with LoRA; can't learn | 0 | 1033 | May 7, 2023
Forward and reverse detokinizing | 1 | 728 | December 26, 2021
How to add EOS when training T5? | 1 | 129 | October 21, 2024
Pipelines, Whisper and how to set parameters | 1 | 724 | January 12, 2024
Nested named entity recognition | 2 | 587 | March 19, 2024
Loading two models onto two gpus | 0 | 1015 | March 31, 2023