| Topic | Replies | Views | Activity |
|---|---|---|---|
| Anywhere where I can read more about the `device_map` kwarg in `from_pretrained`? | 2 | 14981 | January 5, 2024 |
| TokenClassification vs SequenceClassification | 3 | 4071 | March 16, 2021 |
| My kernel keeps crashing when importing pipeline | 3 | 1270 | May 27, 2025 |
| Source and target vs input and labels for causal autoregressive language models | 1 | 1789 | July 27, 2022 |
| CUDA error: device-side assert triggered after a certain steps | 7 | 15869 | July 24, 2024 |
| Exception: Helsinki-NLP/opus-mt-no-en is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models' | 3 | 2227 | March 22, 2024 |
| Can t5 be used to text-generation? | 7 | 8835 | April 26, 2023 |
| Anyone have advice on best methods to cluster BERT-embedded documents? | 2 | 2540 | August 31, 2021 |
| Could Qwen Be the Best Alternative to Claude Code? | 4 | 1104 | August 5, 2025 |
| How To Fine-Tune Models for Better NSFW AI Detection? | 7 | 2750 | January 23, 2025 |
| Tokenizer decoding using BERT, RoBERTa, XLNet, GPT2 | 7 | 8585 | September 21, 2020 |
| Langchain-huggingface | 7 | 8544 | September 6, 2024 |
| Payment method in India | 5 | 978 | January 11, 2025 |
| Create tags / keywords from text | 3 | 3784 | July 3, 2025 |
| Huggingface DecisionTransformer - Reward Calculation | 0 | 237 | September 15, 2022 |
| LLama pad token | 3 | 2103 | February 18, 2025 |
| Fine Tune BERT Models | 5 | 16775 | June 25, 2021 |
| Hyperparameters for lr_scheduler_type in Trainer Arguments | 2 | 13259 | March 5, 2024 |
| How can I batch LLaVa inference, so that I can use all of my GPU memory? | 0 | 1287 | January 8, 2024 |
| Change Gemma tokenizer unused token | 1 | 511 | January 9, 2025 |
| Emergent LLM abilities between training sessions | 5 | 93 | September 2, 2025 |
| Wav2Vec2ForCTC and Wav2Vec2Tokenizer | 4 | 5704 | July 15, 2021 |
| What is the best way to fine-tune ViT with a custom dataset? | 2 | 4136 | January 12, 2025 |
| Internal server error / bool not iterable | 8 | 2383 | April 9, 2025 |
| Train loss is decreasing, but accuracy remain the same | 4 | 17915 | August 25, 2021 |
| Gradio Curl for Image input Not working | 1 | 159 | December 9, 2024 |
| Early stopping training using Validation loss as the metric for best model | 1 | 8938 | February 9, 2023 |
| TypeError: '>' not supported between instances of 'NoneType' and 'int' - Error while training distill bert | 6 | 8450 | April 22, 2024 |
| Check Vocabulary of a model | 1 | 4991 | April 8, 2022 |
| How do you manually create a paged optimizer 32 bit object in HF? | 4 | 3153 | December 18, 2024 |
| How do I evaluate a pretrained model on a test dataset? | 1 | 8856 | February 24, 2022 |
| Use custom loss function for training ML task | 2 | 7201 | March 17, 2022 |
| Why do I get no validation loss and why are metrics not calculated? | 4 | 5558 | February 28, 2025 |
| `get_peft_model` or `model.add_adapter` | 2 | 1272 | February 17, 2025 |
| Unable to import faiss | 4 | 9835 | March 13, 2024 |
| Coreference Resolution | 2 | 3999 | May 19, 2025 |
| The point of using pretrained model if I don't freeze layers | 1 | 8683 | May 31, 2023 |
| Instruction tuning llm | 8 | 12909 | May 8, 2024 |
| How to create Q&A chatbot with CSV file | 5 | 4993 | July 21, 2025 |
| Model = model.to(args.device) AttributeError: 'function' object has no attribute 'to' | 6 | 8215 | September 22, 2023 |
| Prerequisite to run bloom locally? | 8 | 12871 | September 12, 2022 |
| How to register Transformer Model in MLFLOW Model Registry | 0 | 1220 | April 4, 2022 |
| TypeError: __init__() got an unexpected keyword argument 'checkpoint_callback' | 4 | 17173 | October 7, 2022 |
| How to calculate the effective batch size on TPU? | 2 | 2186 | September 1, 2021 |
| Can I pretrain LLaMA from scratch? | 7 | 13341 | November 17, 2023 |
| Different results predicting from trainer and model | 6 | 8016 | December 20, 2021 |
| SSLCertVerificationError when loading a model | 8 | 12515 | April 29, 2025 |
| How to reset a layer? | 2 | 3849 | November 30, 2021 |
| You have exceeded your free GPU quota (60s requested vs. 53s left) | 3 | 1868 | November 12, 2024 |
| Unable to create tensor, you should probably activate padding with 'padding=True' to have batched tensors with the same length | 1 | 1482 | November 6, 2024 |