Hello.
I’m doing code classification with codebert-small-v1, but its maximum sequence length of 512 tokens limits me when a code sample is long. On the other hand, I’ve noticed that T5 allows longer input sequences. Is it possible to use the T5 model for this sort of code classification and get the same kind of output as codebert-small-v1, that is, the probability of each vulnerability class for the code?
I’m not familiar with it, but it seems possible.
But I’m a bit surprised: when I try to classify with “TFAutoModelForSequenceClassification”, I get an error telling me that the T5 model is not compatible. With CodeBERT small, however, there is no problem. I want to try another model because my prediction performance is lacking. My current model classifies code well by CWE (around 8 classes), but not when it only has to decide whether the code is vulnerable (two classes). Do you have any idea what to do?
Hmm…
Even though T5 can be used very well for text classification, it remains a text-to-text-only model. So you can only load the model via:
# For T5 checkpoints this auto class resolves to T5ForConditionalGeneration
from transformers import AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
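One possible way to get a probability per class out of a text-to-text model is to run a single decoding step and apply a softmax over the logits of the label words only. The sketch below is a rough illustration, not a tested recipe: the function names, the "classify: " prefix, the label words, and the `max_length` default are all assumptions made for the example, and it assumes each label starts with a different token. The probabilities only become meaningful after the model has been fine-tuned to generate those label words.

```python
import math


def label_probabilities(first_step_logits, label_token_ids):
    """Softmax restricted to the logits of each label's first token."""
    scores = {label: first_step_logits[tok] for label, tok in label_token_ids.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def classify_with_t5(code, labels, model_name="t5-small", max_length=1024):
    """Hypothetical helper: score each label word at the first decoding step."""
    # Imported lazily so label_probabilities works without these packages.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    model.eval()

    # First token id of each label word (assumed to differ between labels).
    label_token_ids = {
        label: tokenizer(label, add_special_tokens=False).input_ids[0]
        for label in labels
    }

    # "classify: " is an assumed task prefix; max_length is an assumed budget.
    inputs = tokenizer(
        "classify: " + code,
        return_tensors="pt",
        truncation=True,
        max_length=max_length,
    )

    # Feed only the decoder start token to get the first-step distribution.
    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    with torch.no_grad():
        output = model(**inputs, decoder_input_ids=decoder_input_ids)
    first_step_logits = output.logits[0, -1].tolist()

    return label_probabilities(first_step_logits, label_token_ids)
```

A call such as `classify_with_t5(source_code, ["safe", "vulnerable"])` would return a dictionary mapping each label to a probability, with the values summing to 1 over the given labels.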
Thank you!