How to Port or Convert facebook/fairseq models to Hugging Face for Fine-Tuning and Inference | 1 | 436 | February 27, 2023
How can I use diverse beam-search? (it isn't working in my code) | 1 | 2233 | February 26, 2023
Conversion of B to I tokens didn't update the number of labels for the model | 0 | 231 | February 26, 2023
No weights have been used to initialize the model | 0 | 353 | February 26, 2023
How to prune a transformer? | 1 | 1662 | February 24, 2023
Visualizing attention heatmaps of LayoutLMv3 | 0 | 1116 | February 25, 2023
Image Captioning fine-tuning | 0 | 440 | February 25, 2023
Any way for the QnA model to highlight the content of an answer? | 1 | 178 | February 24, 2023
Inference time gets slower as dataset size increases | 0 | 435 | February 23, 2023
How to write a custom configuration for a Hugging Face model for Token Classification | 1 | 2159 | February 23, 2023
Fine-tune Language Models with Layer Freezing in run_clm.py | 0 | 331 | February 23, 2023
Always only a single Linear layer as the classification head? | 0 | 344 | February 23, 2023
Save bert-base-uncased model as checkpoint | 0 | 262 | February 22, 2023
Gradient_checkpointing = True results in error | 3 | 8719 | February 22, 2023
CLIP model incorporated in CLIPSeg | 0 | 743 | February 22, 2023
How to extract encoding before classification layer? | 0 | 574 | February 21, 2023
Storage Full while finetuning with 8gpu 1tb and s3 bucket | 1 | 249 | February 20, 2023
Multi node CPU to train transformer GPT-JT-6B-v1 (moved) | 0 | 424 | February 20, 2023
Model for high-dim numerical input that supports MAE? | 0 | 254 | February 20, 2023
TypeError: _wrap_model() got an unexpected keyword argument 'dataloader' | 0 | 374 | February 17, 2023
How can I skip GPT2LMHeadModel embedding layers? | 4 | 1030 | February 17, 2023
Fine-tune Bloom model but getting mlflow error | 0 | 811 | February 17, 2023
How to create a custom decoding strategy in the GenerationMixin class? | 2 | 1271 | February 16, 2023
How to provide a target and input separately for Trainer? | 0 | 357 | February 16, 2023
How to fine-tune gpt2 for Chinese sentences | 0 | 276 | February 16, 2023
How is the "sequences_scores" field in the "generate()" method calculated? | 1 | 1521 | February 15, 2023
LogitsProcessor vs LogitsWarper | 1 | 1333 | February 15, 2023
How to construct a Chinese dataset for gpt2 fine-tuning | 0 | 258 | February 14, 2023
Dynamic decoder token masking | 0 | 242 | February 13, 2023
Fine-tuning a sentence transformer model for [single_sentence, label] format? | 0 | 508 | February 13, 2023