Hello everyone.
Since I first came across Hugging Face, I have spent a while exploring it through its NLP course and other documentation. But I haven't seen any instructions on training an architecture that I designed myself; the closest thing I found is loading an existing architecture without pretrained weights and training it from scratch.
So, after I design my own architecture (in TensorFlow, since I don't think Transformers supports this directly), how can I make use of the Datasets, Tokenizers, or even the Trainer and Accelerate libraries?
My model class will look like:
```python
class Transformer(tf.keras.Model):
    def __init__(self, *, arguments):
        super().__init__()
        self.layer1 = something
        self.layer2 = something
        self.final_layer = something

    def call(self, inputs):
        ...
```
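For concreteness, here is a minimal runnable version of that skeleton (the layer choices and sizes are just placeholders I picked, not part of my actual model):

```python
import tensorflow as tf

# Hypothetical minimal stand-in for the skeleton above; the layers and
# dimensions are illustrative placeholders only.
class Transformer(tf.keras.Model):
    def __init__(self, *, d_model=64, vocab_size=1000):
        super().__init__()
        self.embed = tf.keras.layers.Embedding(vocab_size, d_model)
        self.dense = tf.keras.layers.Dense(d_model, activation="relu")
        self.final_layer = tf.keras.layers.Dense(vocab_size)

    def call(self, inputs):
        x = self.embed(inputs)       # (batch, seq_len, d_model)
        x = self.dense(x)
        return self.final_layer(x)   # per-token logits over the vocab

model = Transformer()
logits = model(tf.constant([[1, 2, 3, 4]]))
print(logits.shape)  # (1, 4, 1000)
```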
and after preprocessing the data, I want to use `prepare_tf_dataset` and then start training. It would be great if Accelerate, Trainer, and the other tools could apply as well.
Thank you guys so much.