I am trying to fine-tune a model's embeddings without supervision. Specifically, I mean the output layer of the attention model, the one used to generate predictions. In this tutorial, they show how to generate labels from unsupervised text via masking, but they do it with a masked-LM model, which exposes only word-level logits, not its attention output layer; that is one abstraction level higher than what I need. In this tutorial, on the other hand, they do fine-tune the output-layer embeddings, but they do so with labeled data.
I know I can implement this myself by simply generating labels through masking, but I would like to know whether there is already a package that does this. More importantly, how is this not addressed in either of these very well-thought-out tutorials? It seems like a no-brainer to me.
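For concreteness, this is roughly the label generation I have in mind (a minimal sketch in plain Python; `MASK_ID` and the 15% masking rate are placeholder values mirroring BERT's defaults, and I am skipping the usual 80/10/10 replacement split, where `-100` is the loss-ignore index used by most frameworks' cross-entropy):

```python
import random

MASK_ID = 103        # hypothetical [MASK] token id (103 happens to be BERT's)
IGNORE_INDEX = -100  # positions with this label are excluded from the loss


def mask_tokens(token_ids, mask_id=MASK_ID, p=0.15, rng=None):
    """Build (inputs, labels) for masked-LM training from raw token ids.

    Each position is masked with probability p. Labels keep the original
    id at masked positions and IGNORE_INDEX everywhere else, so the loss
    is computed only on the masked tokens.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tid in token_ids:
        if rng.random() < p:
            inputs.append(mask_id)   # replace the token with [MASK]
            labels.append(tid)       # remember what was there
        else:
            inputs.append(tid)       # leave the token unchanged
            labels.append(IGNORE_INDEX)
    return inputs, labels
```

The resulting `(inputs, labels)` pairs could then feed an ordinary training loop over the embedding layer. This is exactly the kind of boilerplate I would expect a library to provide.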