| Topic | Replies | Views | Activity |
|---|---|---|---|
| Pruning a model embedding matrix for memory efficiency | 7 | 3489 | July 27, 2022 |
| Extracting HuBERT hidden units | 1 | 1156 | July 26, 2022 |
| Network is Unreachable Error | 0 | 1576 | July 26, 2022 |
| Fused Kernel Operations | 0 | 629 | July 26, 2022 |
| How to find closest embedding vectors? | 2 | 1758 | July 26, 2022 |
| How to correctly measure inference time? | 0 | 939 | July 25, 2022 |
| DeBERTaV3 ONNX conversion error | 2 | 2055 | July 25, 2022 |
| Why is it so slow to access data through iteration with a Hugging Face dataset? | 2 | 2862 | July 21, 2022 |
| Hugging Face Infinity-based inference server vs AWS Inferentia | 0 | 383 | July 21, 2022 |
| What's the math behind padding_to_longest vs padding_to_model_max_len? | 1 | 323 | July 20, 2022 |
| LogitsProcessor guide | 0 | 493 | July 18, 2022 |
| Correct way to define outputs for an Image Model | 0 | 656 | July 17, 2022 |
| Getting hidden states from the "automatic-speech-recognition" pipeline | 0 | 791 | July 15, 2022 |
| CLM repeats tokenization when distributed | 5 | 1324 | July 15, 2022 |
| Infilling multiple mask spans with BartForConditionalGeneration | 0 | 410 | July 12, 2022 |
| Does it ever make sense to fine-tune with fp32 if the base model was trained with fp16? | 1 | 759 | July 8, 2022 |
| StoppingCriteria "scores" always None | 1 | 449 | July 7, 2022 |
| TokenClassification pipeline doing batch processing over a sequence of already tokenised messages | 1 | 832 | July 6, 2022 |
| Using GPT-Neo-125M with ONNX | 3 | 1362 | July 5, 2022 |
| How to create a Custom Feature Extractor that can be published to Hugging Face | 0 | 435 | July 5, 2022 |
| Why use `val_transforms()` function in image classification example instead of `feature_extractor`? | 0 | 386 | July 4, 2022 |
| ZeRO 2 and 3 with Tensor Parallelism | 0 | 1174 | July 3, 2022 |
| Different sentiments when texts processed in batches vs singles | 1 | 447 | July 3, 2022 |
| Looking for a simple tutorial on how to fine-tune a model for relation extraction | 0 | 499 | July 2, 2022 |
| `TypeError: __init__() got an unexpected keyword argument 'hub_token'` | 2 | 5233 | July 1, 2022 |
| How to use SentenceTransformers for contrastive learning? | 5 | 5909 | June 30, 2022 |
| Load a single GPU checkpoint to 2 GPUs (DeepSpeed) | 0 | 2031 | June 29, 2022 |
| BERT model not showing up as trainable in Flax | 0 | 376 | June 27, 2022 |
| T5 Fine-Tuning for summarization with multiple GPUs | 0 | 846 | June 28, 2022 |
| Ranking model poor results, looking for improvement | 0 | 620 | June 25, 2022 |