Train the best ever transformer-VAE

Hi Patrick, thanks for the feedback!

I’ve linked a revised plan below.

The idea here is just to take an existing flax-T5 model and stick an autoencoder between the encoder & decoder, as seen here.
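A minimal sketch of what such a bottleneck could look like in plain JAX, assuming a variational autoencoder with a diagonal Gaussian latent applied per token. The function names, the weight names, and the sizes (`d_model=512`, `d_latent=32`) are made up for illustration and are not taken from the linked plan or from the flax-T5 code. The dummy `encoder_hidden` array stands in for the real T5 encoder output.

```python
import jax
import jax.numpy as jnp


def init_bottleneck(key, d_model, d_latent):
    """Hypothetical parameter init: three small linear maps, no biases."""
    k1, k2, k3 = jax.random.split(key, 3)
    scale = 0.02
    return {
        "w_mu": scale * jax.random.normal(k1, (d_model, d_latent)),
        "w_logvar": scale * jax.random.normal(k2, (d_model, d_latent)),
        "w_out": scale * jax.random.normal(k3, (d_latent, d_model)),
    }


def vae_bottleneck(params, encoder_hidden, key):
    """Compress encoder states to a latent, sample, and expand back.

    encoder_hidden: (batch, seq_len, d_model) array of encoder outputs.
    Returns the array to feed the decoder and the mean KL penalty.
    """
    mu = encoder_hidden @ params["w_mu"]
    logvar = encoder_hidden @ params["w_logvar"]

    # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
    eps = jax.random.normal(key, mu.shape)
    z = mu + jnp.exp(0.5 * logvar) * eps

    # Project the latent back to the width the decoder expects.
    decoder_input = z @ params["w_out"]

    # KL(N(mu, sigma^2) || N(0, I)), summed over latent dims, averaged over tokens.
    kl = -0.5 * jnp.sum(1.0 + logvar - mu**2 - jnp.exp(logvar), axis=-1)
    return decoder_input, jnp.mean(kl)


# Dummy stand-in for the T5 encoder output: batch of 2, 8 tokens, width 512.
params = init_bottleneck(jax.random.PRNGKey(0), d_model=512, d_latent=32)
encoder_hidden = jnp.ones((2, 8, 512))
decoder_input, kl_loss = vae_bottleneck(params, encoder_hidden, jax.random.PRNGKey(1))
```

In training, `kl_loss` would be added to the usual reconstruction loss from the decoder, typically with a weight that is ramped up over time. That weighting is a design choice and not something this sketch settles.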

So far I’ve had calls with 3 other team members and we’re really excited to see what this produces!