| Topic | Replies | Views | Activity |
|---|---|---|---|
| Compatibility Issue of Transformers Library with TensorFlow 2.18 | 2 | 404 | November 12, 2024 |
| Speed issues using tokenizer.train_new_from_iterator on ~50GB dataset | 7 | 2212 | November 11, 2024 |
| T5 variants return Training Loss 0 and Validation loss nan while fine tuning | 8 | 5370 | November 10, 2024 |
| Runtime error Exit code: 0. Reason: application does not seem to be initialized | 1 | 353 | November 10, 2024 |
| CUDA Out Of Memory when training a DETR Object detection model with compute_metrics | 0 | 89 | November 9, 2024 |
| RuntimeError When Saving Phi 3.5 Vision Due to Shared Tensors | 1 | 228 | November 9, 2024 |
| Error ValueError: too many values to unpack (expected 2) in model training | 1 | 74 | November 9, 2024 |
| ValueError: Unable to create tensor, you should probably activate truncation... but only for training on multiple GPUs or with multi-batch | 3 | 460 | November 8, 2024 |
| ImportError: cannot import name '_expand_mask' from 'transformers.models.bloom.modeling_bloom' | 1 | 1341 | November 8, 2024 |
| How can I stop text generation naturally in an LLM running locally with Hugging Face, without using a hard MAX TOKEN limit? | 1 | 351 | November 8, 2024 |
| Guidance on Optimizing Text Similarity and Reporting with Transformers and Advanced NLP Techniques | 0 | 33 | November 7, 2024 |
| Best way to do multi- to univariate time series prediction | 3 | 83 | November 7, 2024 |
| LLama2-7b QA gives unwanted characters in text_output during inference | 0 | 9 | November 7, 2024 |
| Increasing pretrained CLIP max possible text sequence length | 2 | 1467 | November 7, 2024 |
| Supervised Fine-tuning Trainer - Custom Loss Function | 3 | 4471 | November 7, 2024 |
| Do not save runs (TensorBoard) after the epoch has ended | 3 | 17 | November 6, 2024 |
| Bitsandbytes quantization and QLORA fine-tuning | 1 | 262 | November 5, 2024 |
| BERT token classification / regression question | 0 | 34 | November 5, 2024 |
| Fine-Tuning a Language Model with Data Extracted from Multiple PDFs for a Chat Interface | 2 | 2567 | November 5, 2024 |
| Beam search does not reach the stopping criteria and causes cuda oom | 1 | 279 | November 5, 2024 |
| Help Needed: Converting OpenNMT Model to Hugging Face Format | 1 | 88 | November 5, 2024 |
| Why can padding tokens attend to other tokens in masked self attention? | 0 | 67 | November 4, 2024 |
| What is the classification head doing exactly? | 16 | 24197 | November 4, 2024 |
| CUDA Out of Memory Error When Training Specific Layers | 6 | 351 | November 2, 2024 |
| Bug in gradient accumulation training_step in huggingface Trainer? | 3 | 751 | November 2, 2024 |
| Llama32-11b inferencing took 6 minutes to answer | 7 | 357 | November 2, 2024 |
| /home/user/app/llama32-omran.ipynb disappeared | 2 | 30 | November 1, 2024 |
| RuntimeError: CUDA error: device-side assert triggered | 2 | 947 | November 1, 2024 |
| How to avert 'loading checkpoint shards'? | 4 | 12235 | November 1, 2024 |
| How to cache common instruction prompt | 16 | 2222 | October 31, 2024 |