Using a pre-trained BERT model, how do I extract the final hidden unit scores (the outputs of the last hidden layer) for a dataset of N examples, so that each example has 768 values and the result is N-by-768? Basically, I want to take a pandas DataFrame of texts, compute the final BERT hidden unit scores for each example, and get the result back as another pandas DataFrame.
So, if I have a starting dataset with a text column,
df[["text"]], which is an N-by-1 pandas DataFrame, is there a quick way to do this?
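
To make the target concrete, here is a minimal sketch of the kind of thing I am after. It assumes the Hugging Face transformers library with PyTorch, and it assumes that "one 768-value row per example" means the final-layer vector of the [CLS] token (mean pooling over tokens would be another reasonable choice). The names bert_features and load_bert, the bert-base-uncased checkpoint, and the batch_size and max_length defaults are my own placeholders.

```python
import pandas as pd
import torch
from transformers import AutoModel, AutoTokenizer


def load_bert(name="bert-base-uncased"):
    """Load a tokenizer/model pair; the checkpoint name is an assumption."""
    return AutoTokenizer.from_pretrained(name), AutoModel.from_pretrained(name)


def bert_features(df, tokenizer, model, text_col="text",
                  batch_size=32, max_length=128):
    """Return a DataFrame with one final-layer [CLS] vector per row of df."""
    model.eval()  # disable dropout so the outputs are deterministic
    texts = df[text_col].astype(str).tolist()
    chunks = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # Pad to the longest text in the batch, truncate anything too long.
        enc = tokenizer(batch, padding=True, truncation=True,
                        max_length=max_length, return_tensors="pt")
        with torch.no_grad():  # inference only, no gradients needed
            out = model(**enc)
        # last_hidden_state is (batch, seq_len, hidden); position 0 is [CLS].
        chunks.append(out.last_hidden_state[:, 0, :])
    features = torch.cat(chunks).cpu().numpy()
    # Keep the original index so the rows line up with df.
    return pd.DataFrame(features, index=df.index)
```

With tokenizer, model = load_bert(), calling bert_features(df, tokenizer, model) should give a DataFrame with one row per text and one column per hidden unit (768 for the base-size checkpoints), indexed like df.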