| Training a model to autocomplete for a niche domain and a specific style | 0 | 155 | February 24, 2024 |
| How to implement early stopping in bert fine tuning for token classification | 0 | 291 | February 24, 2024 |
| Train loss is not decreasing on siamese model based on xlm-roberta | 1 | 396 | February 22, 2024 |
| How to install latest version of transformers on Inference Endpoint dedicated server? | 0 | 73 | February 22, 2024 |
| Getting self-attention values of the GPT2LMHead model before softmax | 0 | 107 | February 22, 2024 |
| Unable to load checkpoint after finetuning | 5 | 1902 | February 21, 2024 |
| OSError: dggokul21/Testcase_Generator does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack | 0 | 98 | February 21, 2024 |
| OSError: Unable to load weights from pytorch checkpoint file | 21 | 58730 | February 21, 2024 |
| I want to use EncoderDecoderModel.from_pretrained() where time-series-transformer is the encoder and gpt2 as decoder | 3 | 152 | February 20, 2024 |
| Use embeddings stored in vector db to reduce work for LLM generating response | 0 | 802 | February 19, 2024 |
| Create Custom Loss function for transformers using a diffusion model and CLIP | 0 | 250 | February 19, 2024 |
| What's a **fair** way to compute similarities for Contrastive Learning? | 0 | 102 | February 18, 2024 |
| Hugging Face and Distributed Training: DDP/DP Implementation Help Needed | 0 | 272 | February 14, 2024 |
| ASR on inference endpoints | 1 | 253 | February 11, 2024 |
| HF transformers run a process parallel to LLM generation | 0 | 147 | February 10, 2024 |
| How to make multiple async calls to AsyncOpenAI and return results to Gradio UI | 0 | 667 | February 9, 2024 |
| Alternating between batches of different datasets | 0 | 125 | February 8, 2024 |
| Trying to recreate `model.greedy_search()` for custom decoding of LLM output, but I am getting a different decoded output | 3 | 188 | February 8, 2024 |
| Change from resnet152 to resnet50 | 0 | 93 | February 7, 2024 |
| Generate token by token for m2m100_418 | 0 | 111 | February 6, 2024 |
| Run_backward: expected dtype Float but got dtype Long | 2 | 342 | February 5, 2024 |
| Finetuning of conversational model without train data in conversation style | 1 | 567 | February 2, 2024 |
| Identifying most useful domain-specific tokens for adding to the existing tokenizer | 1 | 326 | February 2, 2024 |
| Huggingface endpoint with chat agents for conversational NLU | 0 | 114 | February 1, 2024 |
| Common practice, using the hidden state associated with [cls] as an input feature for a classification task? | 3 | 2249 | January 31, 2024 |
| Peft following bits and bytes seems to have no effect on LLM | 0 | 275 | January 31, 2024 |
| Text classification training on long text | 1 | 1915 | January 29, 2024 |
| How to load quantized LLM to CPU only device | 0 | 934 | January 28, 2024 |
| Hyperparameter-Search while adding Special tokens | 1 | 466 | January 28, 2024 |
| Evaluation and compute_metrics slowdown | 0 | 547 | August 29, 2023 |