Shot in the dark here - anyone know of a tutorial for how to take a dataset of N unlabeled documents and:
(a) load a pretrained transformer model (e.g., BERT)
(b) fine-tune the language model with the masked-language-modeling (MLM) objective for a small number of epochs
(c) extract the final hidden-state embeddings so that the output is an N (# documents) by k (# hidden units) matrix
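Not a full tutorial, but the three steps above can be sketched with Hugging Face `transformers` and `datasets`. This is a minimal sketch, not a vetted recipe: the model name (`bert-base-uncased`), output dir (`mlm_out`), sequence length, epoch count, and the mean-pooling step are all my assumptions - swap in whatever fits your data (CLS pooling is another common choice for step (c)).

```python
import torch

def mean_pool(last_hidden, attention_mask):
    """Average token vectors, ignoring padding -> one k-dim vector per document."""
    mask = attention_mask.unsqueeze(-1).float()                  # (N, T, 1)
    return (last_hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

def fine_tune_and_embed(texts, model_name="bert-base-uncased", epochs=2):
    # Heavy imports kept inside the function so the pooling helper
    # above only needs torch.
    from datasets import Dataset
    from transformers import (
        AutoModel, AutoModelForMaskedLM, AutoTokenizer,
        DataCollatorForLanguageModeling, Trainer, TrainingArguments,
    )

    tok = AutoTokenizer.from_pretrained(model_name)
    ds = Dataset.from_dict({"text": texts}).map(
        lambda b: tok(b["text"], truncation=True, max_length=128),
        batched=True, remove_columns=["text"],
    )

    # (a) load a pretrained model; (b) fine-tune with 15% random masking
    mlm = AutoModelForMaskedLM.from_pretrained(model_name)
    collator = DataCollatorForLanguageModeling(tok, mlm_probability=0.15)
    Trainer(
        model=mlm,
        args=TrainingArguments("mlm_out", num_train_epochs=epochs,
                               per_device_train_batch_size=8),
        train_dataset=ds,
        data_collator=collator,
    ).train()
    mlm.save_pretrained("mlm_out")
    tok.save_pretrained("mlm_out")

    # (c) reload the fine-tuned weights as a bare encoder (the MLM head
    # is dropped) and pull out the last hidden states
    enc = AutoModel.from_pretrained("mlm_out")
    enc.eval()
    batch = tok(texts, padding=True, truncation=True,
                max_length=128, return_tensors="pt")
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state   # (N, T, k), k=768 for BERT-base
    return mean_pool(hidden, batch["attention_mask"])  # (N, k)
```

Calling `fine_tune_and_embed(list_of_strings)` should give you the N x k tensor from step (c); mean pooling over non-padding tokens is just one way to collapse the (N, T, k) hidden states down to one vector per document.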