Intermediate

Topic	Replies	Views	Activity
Self correction by model	7	96	September 30, 2024
I want to fine tune the KoGPT2 model using Trainer	0	482	December 7, 2020
Regarding add extra class in fine-tune model	0	481	March 7, 2022
How to ensure a new instance of a Language Model (LLM) agent is created, or a specific function is executed, on every refresh of a web application, as demonstrated in the provided Python code	0	477	August 26, 2023
Calibrating a transformers model with scipy CalibratedClassifierCV?	0	475	July 5, 2023
[deepspeed] bigscience/T0* multi-gpu text generation	0	475	September 8, 2022
Loading adapter merged models	0	473	May 29, 2023
Get UnicodeEncodeError while using pipeline for question answering	0	472	October 12, 2022
Tokens in vector space	0	471	March 24, 2022
Evaluating the model during the run	0	471	December 29, 2021
Forge synthetic past_key_value batch from multiple outputs	0	470	May 12, 2021
Web parsing in HuggingChat	0	469	October 10, 2023
Is there a way to use mean_pooling with Roberta?	0	467	April 6, 2022
Can't locate the error in my dataset	3	234	September 30, 2024
4-bit quantization	0	465	November 18, 2023
VAE for Motion Sequence Generation - Convergence Issue with Scheduled Sampling	0	465	September 27, 2023
Finetuning BERT on TPU is very slow	0	462	August 11, 2022
Token Classification Model making mistake outside of training dataset	0	461	October 30, 2021
Seq2SeqTrainer Error	0	459	June 12, 2023
When an LLM gives a wrong answer, is it more likely to give a wrong answer on subsequent unrelated questions?	2	149	December 17, 2024
DataCollator for list of inputs?	0	456	November 1, 2022
Best Model for Question + Answer Embeddings	0	455	March 15, 2024
I want to use EncoderDecoderModel.from_pretrained() where time-series-transformer is the encoder and gpt2 as decoder	3	227	February 20, 2024
Transliterating european languages	0	453	March 10, 2022
What's the maths behind padding_to_longest vs padding_to_model_max_len?	1	319	July 20, 2022
Alibi and Extrapolation	0	450	May 29, 2023
Blenderbot 1.0B Distilled eats up memory over many inferences	0	450	March 7, 2022
Is it possible to disassemble a zero-shot model?	0	449	March 3, 2022
Train new VisionEncoderDecoder model for new languages	0	446	February 3, 2022
Can an EncoderModel be trained on top of a concatenation of BertModel [CLS] embeddings with additional input data using the transformers library?	0	445	December 9, 2022
Finetuning a Large Language Model	0	79	October 23, 2024
Creating a custom loss function for token appearance based in BART on the input	0	440	February 11, 2022
How to get sentences from embeddings	0	440	February 3, 2022
Fine-tuning code embedding model for multilingual query-code pairs	2	45	March 25, 2025
Extracting attention weights of summarization model	0	436	August 12, 2021
Create DPR Tokenizer for non-Bert model	1	308	September 7, 2021
Trainer code for token-wise prediction model	0	435	June 6, 2022
How to create a Custom Feature Extractor that can be published to Huggingface	0	435	July 5, 2022
How to fine tune BERT with customized classifier and loss function?	0	434	March 12, 2021
Adding Entity Tags to Transformer Input Embedding for Text Summarization	0	429	January 6, 2023
How does Huggingface Trainer handle Iterable dataset on TPU?	0	428	February 16, 2022
Preventing Toxic Outputs	1	302	April 1, 2021
How to get words from subwords? Model used to get subwords: T5 model (subwords obtained using multi-head attention)	0	425	April 19, 2022
Save CamemBert model wrapped in keras	0	423	November 2, 2020
Fine-tuning: Under the hood	0	421	July 11, 2023
How to restrict T5 model to generate tokens only from the input text?	0	420	June 6, 2023
Uploading model with negative prompt?	0	420	December 19, 2022
Creating distilled version of gelectra-base model	0	419	April 5, 2022
Fill multiple masks	0	413	October 10, 2022
Any tutorials on creating Open QA system?	2	238	January 16, 2023