Need help using transformers GPT2 for training | 0 | 106 | May 29, 2024
How to use GPT4 with trl PPO script | 0 | 125 | May 28, 2024
Freezing layers with SFTTrainer | 0 | 121 | May 28, 2024
Way to fine-tune a pre-trained model & get the embeddings | 2 | 2562 | May 28, 2024
Security of the LLM applications | 1 | 139 | May 26, 2024
Forward method inconsistent for time series transformer | 0 | 86 | May 26, 2024
Evaluating RAG only with open-source | 1 | 525 | May 24, 2024
Accessing certain hidden layer outputs | 0 | 117 | May 22, 2024
VisEncoderDecoderModel generates incomplete text when predicting an image with a long text label | 0 | 85 | May 21, 2024
Inference time in TGI quantization | 0 | 124 | May 21, 2024
Document Object Model (DOM) similarity learning | 3 | 667 | May 20, 2024
Create continuous set of generated images - same style and characters | 0 | 100 | May 20, 2024
What model and approach should I use for my use case | 2 | 148 | May 20, 2024
Build pretrained huggingface whisper for tensorrt-llm | 0 | 191 | May 20, 2024
Windows 11 does not see my 2nd GPU (4090 + 4080) | 0 | 109 | May 19, 2024
How to use `inputs_embed` and `attention_mask` together? | 1 | 664 | May 19, 2024
What is the correct way to parse data for DPO? Do you separate out prompt or not? | 0 | 89 | May 19, 2024
GPU memory usage is twice (2x) what I calculated based on number of parameters and floating point precision | 5 | 219 | May 18, 2024
Change saving metric in Trainer | 2 | 865 | May 18, 2024
Huggingface token returning an invalid token | 1 | 1158 | May 17, 2024
Can't change max_input_length of Text Generation Inference | 0 | 109 | May 15, 2024
Questions about Mistral and apply_chat_template with Text Generation Inference, openai API and messages API | 0 | 133 | May 15, 2024
Question regarding adding a 4080 (and 3080?) to a 4090 rig for AI | 2 | 155 | May 15, 2024
Getting nan while fine-tuning Blip 2 and weird output | 0 | 113 | May 14, 2024
Push model to hugging face hub without Trainer | 7 | 1236 | May 14, 2024
Implement few-shot inference for question-answering with DistilBERT | 0 | 98 | May 13, 2024
Negative KL-divergence RLHF implementation | 1 | 1159 | May 13, 2024
Llama2 tools instruction weird response | 2 | 127 | May 8, 2024
The model did not return a loss from the inputs, only the following keys: logits. For reference, the inputs it received are input_values | 19 | 29369 | May 8, 2024
Adding another head to Vision encoder decoder model | 4 | 188 | May 7, 2024