Custom training with PyTorch, without any pretraining

The tokenizer options in HuggingFace are extremely useful and easy to work with.
I’d like to build a custom pipeline that:

  1. inputs custom data, not something in Datasets (I know how to do this in HF),
  2. tokenizes it with an HF tokenizer pipeline (this I know how to do, too),
  3. feeds the tokenized output into a custom PyTorch model (no pretrained weights, no fixed architecture),
  4. optimizes with any viable PyTorch loss function.
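For what it's worth, here is a minimal sketch of what such a pipeline could look like. The model, checkpoint name, and hyperparameters are placeholders I picked for illustration, not part of the original question:

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer

# 1-2. Custom in-memory data, tokenized with an HF tokenizer.
texts = ["first custom example", "second custom example"]
labels = torch.tensor([0, 1])
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder tokenizer
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# 3. A from-scratch PyTorch model (no pretrained weights, arbitrary architecture).
class TinyClassifier(nn.Module):
    def __init__(self, vocab_size, dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.fc = nn.Linear(dim, num_classes)

    def forward(self, input_ids, attention_mask):
        x = self.embed(input_ids)                       # (B, T, dim)
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (x * mask).sum(1) / mask.sum(1)        # mean over real tokens only
        return self.fc(pooled)

model = TinyClassifier(tokenizer.vocab_size)

# 4. Any standard PyTorch loss and optimizer.
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

logits = model(batch["input_ids"], batch["attention_mask"])
loss = loss_fn(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The key point is that the tokenizer's output (with `return_tensors="pt"`) is just a dict of ordinary `torch.Tensor`s, so everything downstream is plain PyTorch.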

So basically, I want to use the HF framework for the text's preprocessing and tokenization, while still doing whatever I wish in PyTorch afterwards…

Is this feasible?