Are dynamic padding and smart batching in the library?

deathcrush · March 1, 2023, 6:32pm

@sgugger This is completely off topic, but do you think we could implement grouping by length inside a pipeline to prevent slowdowns caused by large differences in sequence lengths? It would only apply to users who run the pipeline on a Dataset object. I’d be happy to contribute this. What would be an appropriate forum to discuss the details?
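
To make the idea concrete, here is a minimal sketch of grouping by length combined with per-batch (dynamic) padding. It is plain Python and not tied to the pipeline API; `length_grouped_batches`, `total_slots`, and `PAD_ID` are hypothetical names chosen for illustration, and a real implementation inside a pipeline would likely look different.

```python
from typing import List, Sequence, Tuple

PAD_ID = 0  # hypothetical padding token id, chosen only for this sketch

Batch = Tuple[List[int], List[List[int]]]  # (original indices, padded rows)


def length_grouped_batches(
    sequences: Sequence[Sequence[int]],
    batch_size: int,
    group_by_length: bool = True,
) -> List[Batch]:
    """Split token-id sequences into batches, padding each batch to its own longest row.

    With group_by_length=True the sequences are first sorted by length, so each
    batch holds sequences of similar length and needs little padding.
    """
    order = list(range(len(sequences)))
    if group_by_length:
        # Stable sort of the indices by sequence length.
        order.sort(key=lambda i: len(sequences[i]))

    batches: List[Batch] = []
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        longest = max(len(sequences[i]) for i in idx)
        # Dynamic padding: pad only up to the longest sequence in this batch.
        rows = [
            list(sequences[i]) + [PAD_ID] * (longest - len(sequences[i]))
            for i in idx
        ]
        batches.append((idx, rows))
    return batches


def total_slots(batches: List[Batch]) -> int:
    """Count every token position (real or padding) the model would process."""
    return sum(len(row) for _, rows in batches for row in rows)
```

For four sequences of lengths 2, 9, 3, and 8 with a batch size of 2, batching in the original order processes 34 token positions, while grouping by length processes 24. Because the batches no longer follow the dataset order, the original indices are returned with each batch so outputs can be put back in order afterwards.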
