XLA Integration for TensorFlow Models | 0 | 142 | October 19, 2023
Using 3 GPUs for training with Trainer() of transformers | 2 | 2327 | October 18, 2023
IA3 vs LoRA for summarization | 0 | 571 | October 18, 2023
Query endpoint for LLM | 0 | 424 | October 18, 2023
Using decoder only part of pretrained MarianMT (Encoder-Decoder Translation model) | 2 | 531 | October 18, 2023
Choose number of labels in WhisperForAudioClassification | 2 | 569 | October 18, 2023
Fine-tuning TrOCR on custom dataset | 1 | 2745 | October 18, 2023
Summarization pipeline | 0 | 198 | October 17, 2023
Unable to train model (Loss is 0.000000) | 2 | 1097 | October 17, 2023
What HF task is best for text-to-SQL? | 0 | 301 | October 16, 2023
Why is Code LLama token for prefix, suffix, etc weird underscore character | 4 | 1185 | October 16, 2023
How to build a Resume matcher to increase the probability of passing an ATS system with huggingface pipelines | 0 | 642 | October 15, 2023
How to train blended_skill_talk with transformers.trainer? | 1 | 394 | October 15, 2023
Pass CausalLM KV cache into the next inference batch | 0 | 568 | October 14, 2023
Query pertaining to differentiability of CLIPProcessor | 0 | 143 | October 14, 2023
Time Series Forecasting on positive AND negative Examples | 0 | 395 | October 14, 2023
Fine tunning t5: Too many values to unpack (expected 2) | 0 | 214 | October 14, 2023
Getting error when running inference in multiple GPUs | 0 | 656 | October 13, 2023
DistilBert tokenization does not work as expected | 0 | 234 | October 13, 2023
Target size (torch.Size([8])) must be the same as input size (torch.Size([8, 2])) | 5 | 5545 | October 13, 2023
Why async gradient update doesn't get popular in LLM community? | 3 | 328 | October 13, 2023
When does transformers support pipeline parallelism? | 0 | 203 | October 13, 2023
Using proxy to upload models | 4 | 13663 | October 13, 2023
Generating text from pretrained-bert based decoder | 1 | 254 | October 12, 2023
Retrieval Augmented Generation using Transformer Eco System | 0 | 474 | October 12, 2023
Zero-shot Classification With Generative Language Models | 0 | 711 | October 12, 2023
How to set generate parameters in fine-tuning | 1 | 751 | October 12, 2023
Peft model from pretrained load in 8/4 bit | 6 | 17706 | October 12, 2023
MFU number with Finetuning llama 70b using fsdp | 0 | 666 | October 11, 2023
mT5-base model summary language issue | 1 | 350 | October 11, 2023