Continue pre-training GPT2
|
|
0
|
26
|
March 26, 2023
|
NLP: Infer intent of finalising a transaction in a dialogue/chat system
|
|
0
|
32
|
March 22, 2023
|
Conversational Budget Analytics
|
|
1
|
115
|
March 19, 2023
|
TRL loss blowing up
|
|
2
|
115
|
March 16, 2023
|
Diffusion models for environmental sound generation
|
|
0
|
51
|
March 13, 2023
|
Has anyone fine-tuned the bloom7b model with peft?
|
|
0
|
49
|
March 13, 2023
|
Minimize number of transformers checkpoints for serving multiple clients
|
|
3
|
171
|
March 9, 2023
|
How to approach NLG problem, mainly generating summaries from a table/chart using transformer-based models
|
|
0
|
50
|
March 6, 2023
|
Forward-Forward algorithm by Geoffrey Hinton
|
|
4
|
506
|
February 22, 2023
|
Carrying Gradients Through Generate
|
|
5
|
1390
|
January 29, 2023
|
Model Adaptation
|
|
0
|
93
|
January 24, 2023
|
Swapping out self-attention layer in BERT
|
|
0
|
114
|
January 11, 2023
|
Why are huge batch sizes used for pretraining and small ones for finetuning?
|
|
3
|
2480
|
January 10, 2023
|
How to load only a few parameters
|
|
0
|
140
|
January 7, 2023
|
Domain-specific word similarity problem
|
|
1
|
222
|
January 6, 2023
|
Encoder-Decoder vs Decoder Only Architecture Models
|
|
0
|
434
|
December 18, 2022
|
Train BERT with sentence embeddings
|
|
0
|
208
|
December 14, 2022
|
Is the evaluate-metric/accuracy the same as macro-accuracy?
|
|
0
|
215
|
December 13, 2022
|
Understanding FLOPs-per-token estimates from OpenAI's scaling laws
|
|
5
|
1938
|
December 13, 2022
|
ConformerCTC for streaming
|
|
1
|
244
|
December 12, 2022
|
Sequence classification
|
|
0
|
211
|
December 11, 2022
|
Individually Logging All The Layer/Neuron Outputs
|
|
0
|
296
|
December 1, 2022
|
Incremental decoding with T5
|
|
0
|
418
|
November 29, 2022
|
Is it possible to split a BERT-like model's output into different tasks?
|
|
0
|
315
|
November 28, 2022
|
Privacy enhancing technologies in model development
|
|
0
|
345
|
November 22, 2022
|
Conversational QA pretrained model?
|
|
0
|
382
|
November 21, 2022
|
Composition Training/Validation Split of AutoTrain
|
|
0
|
278
|
November 18, 2022
|
Do the common tricks in transformers help with RNNs?
|
|
0
|
357
|
November 10, 2022
|
Rust applications
|
|
3
|
832
|
November 10, 2022
|
Metadata of NLP datasets
|
|
0
|
410
|
November 5, 2022
|