Using HF to train a custom PyTorch architecture

curt-tigges · July 29, 2022, 6:22pm

I’ve created a basic Seq2Seq transformer from scratch in PyTorch (mainly so that I can learn the architecture), and I was wondering whether it’s possible to train this kind of model using the Hugging Face Tokenizers and Datasets libraries together with the Trainer class. I’d rather not hand-code the tokenization and the training loop as well.

Is there a way to do this with HF? Existing tutorials seem to exclusively use models that are already in the library.

If it is possible, what parts of the training would I still need to take care of (loss, masking, label smoothing, greedy decode, etc.)?
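To make the question concrete, here is a minimal sketch of the contract Trainer appears to expect from a plain `nn.Module`, going by the Trainer documentation: `forward` accepts a `labels` argument and returns a dict (or tuple) that carries the loss. Everything named here (`TinySeq2Seq`, the vocabulary size, the pad id, the toy token ids) is hypothetical and only stands in for the hand-written model. The sketch uses PyTorch's built-in `nn.Transformer` in place of a from-scratch implementation and leaves out positional encodings. Under these assumptions, the loss, the padding and causal masks, and label smoothing all stay inside the model, and greedy decoding would still have to be written separately.

```python
import torch
from torch import nn


class TinySeq2Seq(nn.Module):
    """Hypothetical stand-in for a hand-written Seq2Seq transformer."""

    def __init__(self, vocab_size=100, d_model=32, pad_id=0):
        super().__init__()
        self.pad_id = pad_id
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=pad_id)
        # Positional encodings are omitted to keep the sketch short.
        self.core = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=1, num_decoder_layers=1,
            dim_feedforward=64, dropout=0.0, batch_first=True,
        )
        self.lm_head = nn.Linear(d_model, vocab_size)
        # Label positions set to -100 are skipped; label smoothing is applied here.
        self.loss_fn = nn.CrossEntropyLoss(ignore_index=-100, label_smoothing=0.1)

    def forward(self, input_ids, decoder_input_ids, labels=None):
        tgt_len = decoder_input_ids.size(1)
        # Causal mask: -inf above the diagonal blocks attention to future tokens.
        causal = torch.triu(
            torch.full((tgt_len, tgt_len), float("-inf"), device=input_ids.device),
            diagonal=1,
        )
        # True marks padded source tokens that attention should ignore.
        src_pad = input_ids == self.pad_id
        hidden = self.core(
            self.embed(input_ids),
            self.embed(decoder_input_ids),
            tgt_mask=causal,
            src_key_padding_mask=src_pad,
            memory_key_padding_mask=src_pad,
        )
        logits = self.lm_head(hidden)
        out = {"logits": logits}
        if labels is not None:
            # Trainer reads outputs["loss"] when the model returns a dict.
            out["loss"] = self.loss_fn(
                logits.reshape(-1, logits.size(-1)), labels.reshape(-1)
            )
        return out


# Toy batch with made-up token ids, shaped like what a data collator would pass.
model = TinySeq2Seq()
batch = {
    "input_ids": torch.tensor([[5, 6, 7, 0]]),       # last source token is padding
    "decoder_input_ids": torch.tensor([[1, 8, 9]]),  # shifted-right targets
    "labels": torch.tensor([[8, 9, -100]]),          # -100 is ignored by the loss
}
out = model(**batch)
print(out["loss"].item(), tuple(out["logits"].shape))
```

If this contract holds, the dataset columns only need to match the keyword arguments of `forward` (`input_ids`, `decoder_input_ids`, `labels`), since the batch is passed as `model(**batch)`.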
