Hi All,
I am trying to take the last 4 hidden layers from a RoBERTa model, concatenate them, and then add linear ==> softmax layers on top to see how the model performs. Basically, I am experimenting with the model.
In one of my experiments I was able to take the last 4 hidden layers, apply max_pool/avg_pool over them, and train the model.
But when I take the last 4 layers and try to concatenate them, I get an error. Does anyone have a clue? I am trying to reproduce this architecture from the BERT paper.
I have tried some of the code from here, but it is not working: Sequence Classification pooled output vs last hidden state · Issue #1328 · huggingface/transformers · GitHub
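
The intended architecture (concatenate the last 4 hidden layers, then linear ==> softmax) can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the class name `ConcatLastFourHead` is hypothetical, random tensors stand in for the `hidden_states` tuple that a Hugging Face model returns when called with `output_hidden_states=True`, and classifying from the first token's vector is an assumed choice, not something taken from the code that raises the error.

```python
import torch
import torch.nn as nn


class ConcatLastFourHead(nn.Module):
    """Hypothetical head: concat the last N hidden layers, then linear + softmax."""

    def __init__(self, hidden_size: int, num_labels: int, num_layers: int = 4):
        super().__init__()
        self.num_layers = num_layers
        # Concatenating along the feature axis multiplies the input width by N.
        self.classifier = nn.Linear(hidden_size * num_layers, num_labels)

    def forward(self, hidden_states):
        # hidden_states: tuple of tensors, each (batch, seq_len, hidden_size)
        last_layers = hidden_states[-self.num_layers:]
        # Result shape: (batch, seq_len, hidden_size * num_layers)
        concatenated = torch.cat(last_layers, dim=-1)
        # Assumption: use the first token (<s> in RoBERTa) as the sequence summary.
        first_token = concatenated[:, 0, :]
        logits = self.classifier(first_token)
        return torch.softmax(logits, dim=-1)


# Stand-in for outputs.hidden_states: roberta-base yields 13 tensors
# (embedding output + 12 encoder layers), each with hidden size 768.
batch, seq_len, hidden = 2, 8, 768
fake_hidden_states = tuple(torch.randn(batch, seq_len, hidden) for _ in range(13))

head = ConcatLastFourHead(hidden_size=hidden, num_labels=3)
probs = head(fake_hidden_states)
print(probs.shape)
```

Note that concatenating along the last dimension makes the linear layer's input size 4 * hidden_size rather than hidden_size. Also, if the model is trained with `nn.CrossEntropyLoss`, the raw logits should go to the loss instead of the softmax output, since that loss applies log-softmax internally.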