| Llama 2 fine-tuning giving RuntimeError: mat1 and mat2 shapes cannot be multiplied (33x4096 and 1x8388608) | 0 | 496 | November 17, 2023 |
| In diffusers, do I need many venvs to get around version conflicts? | 0 | 135 | November 17, 2023 |
| CORS Issue with HuggingFace Spaces and Netlify-hosted React App | 0 | 668 | November 17, 2023 |
| Is there a small (<5GB) dataset for general-purpose LLMs? | 0 | 382 | November 17, 2023 |
| Load question and response separately | 0 | 414 | November 17, 2023 |
| Training with class weights | 5 | 2788 | November 18, 2023 |
| Using diffusers to create denoising models | 0 | 1000 | November 18, 2023 |
| Using LoRA for inference | 1 | 680 | November 18, 2023 |
| I got this problem and I can't solve it. What should I do? | 0 | 269 | November 18, 2023 |
| How to make a translation dataset | 3 | 2772 | November 18, 2023 |
| IterableDataset.from_generator with iterator | 2 | 1508 | November 18, 2023 |
| How to plot models using torchviz or hiddenlayer | 3 | 8388 | November 18, 2023 |
| Highlighting important tokens for input into LLM | 0 | 238 | November 18, 2023 |
| 4-bit quantization | 0 | 465 | November 18, 2023 |
| Can anyone recommend a good STT model that is well suited to work with tensorflow-metal on the M1 Mac? | 0 | 318 | November 19, 2023 |
| How to train Wav2Vec2 in LoRA? | 1 | 1248 | November 19, 2023 |
| `<|nospeech|>` tokens in seq2seq/whisper | 0 | 413 | November 19, 2023 |
| A standard way to have the `generate` method of the `GenerateMixin` only output the generated tokens | 0 | 617 | November 19, 2023 |
| Fine-tune for question answering tasks on personal laptop | 0 | 312 | November 19, 2023 |
| Batch Transform with strategy='MultiRecord' returns only one line | 0 | 389 | November 19, 2023 |
| WavLM ECAPA-TDNN embeddings for Speaker verification | 0 | 557 | November 19, 2023 |
| Sharding Models for Inference | 0 | 160 | November 19, 2023 |
| Do Kaggle competitions interest you? | 0 | 245 | November 19, 2023 |
| Is there a place for paid help on transformers | 3 | 201 | November 19, 2023 |
| AutoTrain Advanced Cost | 0 | 442 | November 20, 2023 |
| VisualBERT lower accuracy in validation dataset | 0 | 184 | November 20, 2023 |
| Transformers.js in React-Native Application | 1 | 2758 | November 20, 2023 |
| Use Start/Stop button to record live audio using Gradio app | 2 | 2733 | November 20, 2023 |
| Runtime error trying to create AutoTrain space | 2 | 786 | November 20, 2023 |
| I have a problem with deploying my first model | 0 | 279 | November 20, 2023 |