Going by the docs, I think the second classification head is randomly initialised, which could be the reason for this. Pinging @patrickvonplaten.