| Topic | Replies | Views | Activity |
|---|---|---|---|
| DeepSpeed MII pipeline issue | 1 | 35 | September 30, 2024 |
| Using huggingface CLI with a certificate | 3 | 3759 | September 30, 2024 |
| What are the best AMD Ryzen 5 laptops for running machine learning models and libraries like TensorFlow or PyTorch? | 4 | 776 | September 29, 2024 |
| [MarianTokenizer] Clarify the use of the vocab parameter | 3 | 818 | September 29, 2024 |
| Generate mock but realistic data using NLP | 5 | 189 | September 29, 2024 |
| Convert pre-trained MHA weights to GQA weights | 1 | 384 | September 29, 2024 |
| Training from a checkpoint and freezing some of the model's parameters | 2 | 857 | September 29, 2024 |
| DeepSpeed MII library issues | 1 | 79 | September 29, 2024 |
| How can I use TPUs on AutoTrain? | 1 | 74 | September 28, 2024 |
| Meta-llama / Meta-Llama-3-70B-Instruct is not available as a serverless API | 10 | 1638 | September 28, 2024 |
| How to use Qwen2-VL on multiple GPUs? | 2 | 1515 | September 28, 2024 |
| Using gradient checkpointing and KV caching when generation happens in a no-grad context | 2 | 326 | September 28, 2024 |
| What is the best multilingual NLI model out there? Not able to find many in the open-source space | 1 | 392 | September 28, 2024 |
| Multi-GPU inference with LLM produces gibberish | 14 | 6620 | September 28, 2024 |
| How exactly is Llama accessed? | 2 | 448 | September 28, 2024 |
| How is memory managed in the model.generate() method? | 2 | 50 | September 27, 2024 |
| Problem with cloning due to token authentication | 2 | 837 | September 28, 2024 |
| [Feature Request] Add Solara support to Spaces | 5 | 478 | September 27, 2024 |
| LayoutLMv3 processor error | 4 | 121 | September 27, 2024 |
| Need help deploying a HF model to AWS SageMaker | 3 | 169 | September 27, 2024 |
| Download models or datasets in zip? | 2 | 2670 | September 27, 2024 |
| Why are some weights FP32 in Llama 3.1 405B FBGEMM FP8 Quantization? | 7 | 534 | September 27, 2024 |
| How to set a secret file or make such a file inaccessible to others | 3 | 278 | September 27, 2024 |
| How to use a gated model in inference | 3 | 284 | September 27, 2024 |
| Cannot pin 'torch.cuda.LongTensor' only dense CPU tensors can be pinned | 1 | 1175 | September 26, 2024 |
| Training a language model on Arabic data - handling right-to-left text direction | 4 | 2178 | September 26, 2024 |
| Storing Browser Cookies from Streamlit Space | 3 | 175 | September 26, 2024 |
| How do I run an HF Space in Flutter? | 4 | 137 | September 26, 2024 |
| Wav2vec2 config -- why is mask_time_prob=0.05 and not 0.5? | 1 | 587 | September 26, 2024 |
| Seeking uncensored ChatGPT for Creative Writing | 1 | 4381 | September 26, 2024 |