Topic | Replies | Views | Activity
Self correction by model | 7 | 96 | September 30, 2024
I want to fine-tune the KoGPT2 model using Trainer | 0 | 482 | December 7, 2020
Regarding adding an extra class to a fine-tuned model | 0 | 481 | March 7, 2022
How to ensure a new instance of a Language Model (LLM) agent is created, or a specific function is executed, on every refresh of a web application, as demonstrated in the provided Python code | 0 | 477 | August 26, 2023
Calibrating a transformers model with scipy CalibratedClassifierCV? | 0 | 475 | July 5, 2023
[deepspeed] bigscience/T0* multi-gpu text generation | 0 | 475 | September 8, 2022
Loading adapter merged models | 0 | 473 | May 29, 2023
Get UnicodeEncodeError while using pipeline for question answering | 0 | 472 | October 12, 2022
Tokens in vector space | 0 | 471 | March 24, 2022
Evaluating the model during the run | 0 | 471 | December 29, 2021
Forge synthetic past_key_value batch from multiple outputs | 0 | 470 | May 12, 2021
Web parsing in HuggingChat | 0 | 469 | October 10, 2023
Is there a way to use mean_pooling with Roberta? | 0 | 467 | April 6, 2022
Can't locate the error in my dataset | 3 | 234 | September 30, 2024
4-bit quantization | 0 | 465 | November 18, 2023
VAE for Motion Sequence Generation - Convergence Issue with Scheduled Sampling | 0 | 465 | September 27, 2023
Finetuning BERT on TPU is very slow | 0 | 462 | August 11, 2022
Token Classification Model making mistakes outside of training dataset | 0 | 461 | October 30, 2021
Seq2SeqTrainer Error | 0 | 459 | June 12, 2023
When an LLM gives a wrong answer, is it more likely to give a wrong answer on subsequent unrelated questions? | 2 | 149 | December 17, 2024
DataCollator for list of inputs? | 0 | 456 | November 1, 2022
Best Model for Question + Answer Embeddings | 0 | 455 | March 15, 2024
I want to use EncoderDecoderModel.from_pretrained() with time-series-transformer as the encoder and gpt2 as the decoder | 3 | 227 | February 20, 2024
Transliterating European languages | 0 | 453 | March 10, 2022
What's the maths behind padding_to_longest vs padding_to_model_max_len? | 1 | 319 | July 20, 2022
Alibi and Extrapolation | 0 | 450 | May 29, 2023
Blenderbot 1.0B Distilled eats up memory over many inferences | 0 | 450 | March 7, 2022
Is it possible to disassemble a zero-shot model? | 0 | 449 | March 3, 2022
Train new VisionEncoderDecoder model for new languages | 0 | 446 | February 3, 2022
Can an EncoderModel be trained on top of a concatenation of BertModel [CLS] embeddings with additional input data using the transformers library? | 0 | 445 | December 9, 2022
Finetuning a Large Language Model | 0 | 79 | October 23, 2024
Creating a custom loss function in BART for token appearance based on the input | 0 | 440 | February 11, 2022
How to get sentences from embeddings | 0 | 440 | February 3, 2022
Fine-tuning code embedding model for multilingual query-code pairs | 2 | 45 | March 25, 2025
Extracting attention weights of summarization model | 0 | 436 | August 12, 2021
Create DPR Tokenizer for non-Bert model | 1 | 308 | September 7, 2021
Trainer code for token-wise prediction model | 0 | 435 | June 6, 2022
How to create a Custom Feature Extractor that can be published to Huggingface | 0 | 435 | July 5, 2022
How to fine-tune BERT with customized classifier and loss function? | 0 | 434 | March 12, 2021
Adding Entity Tags to Transformer Input Embedding for Text Summarization | 0 | 429 | January 6, 2023
How does Huggingface Trainer handle Iterable dataset on TPU? | 0 | 428 | February 16, 2022
Preventing Toxic Outputs | 1 | 302 | April 1, 2021
How to get words from subwords? Model used to get subwords: T5 model - subwords got by using multihead attention | 0 | 425 | April 19, 2022
Save CamemBert model wrapped in keras | 0 | 423 | November 2, 2020
Fine-tuning: Under the hood | 0 | 421 | July 11, 2023
How to restrict T5 model to generate tokens only from the input text? | 0 | 420 | June 6, 2023
Uploading model with negative prompt? | 0 | 420 | December 19, 2022
Creating distilled version of gelectra-base model | 0 | 419 | April 5, 2022
Fill multiple masks | 0 | 413 | October 10, 2022
Any tutorials on creating Open QA system? | 2 | 238 | January 16, 2023