Is there a GRPO/RL/PPO implementation for text classification tasks using encoder-only models like BERT/RoBERTa?
Any GitHub repo, example, or help would be really appreciated. Thanks in advance.
This may be an unresolved issue. The following article may be helpful for general information about GRPO, but it is not specific to classification tasks…
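In case it helps as a starting point, here is a rough sketch of how the group-relative part of GRPO could be mapped onto classification. It is not taken from any existing repo. It assumes the classifier's softmax over labels is treated as the policy, that G labels are sampled per input, and that the reward is simply 1 for the gold label and 0 otherwise. The function names (`group_relative_advantages`, `grpo_classification_step`) are made up for illustration, and the clipped probability ratio and KL penalty of full GRPO are left out. Plain Python is used so the arithmetic is visible; with a real BERT/RoBERTa head the same loss would be computed on tensors so it can be backpropagated.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def grpo_classification_step(logits, gold_label, group_size=8, rng=None):
    """One simplified GRPO-style step for a single input (hypothetical helper).

    Assumptions: the label distribution is the policy, and the reward is
    1.0 for the gold label, 0.0 otherwise. Clipping and the KL term of
    full GRPO are omitted.
    """
    rng = rng or random.Random(0)
    probs = softmax(logits)
    # Sample a group of candidate labels from the current policy.
    sampled = rng.choices(range(len(probs)), weights=probs, k=group_size)
    rewards = [1.0 if y == gold_label else 0.0 for y in sampled]
    advantages = group_relative_advantages(rewards)
    # Policy-gradient surrogate: -mean(A_i * log pi(y_i | x)).
    loss = -sum(a * math.log(probs[y]) for a, y in zip(advantages, sampled)) / group_size
    return sampled, rewards, advantages, loss


if __name__ == "__main__":
    sampled, rewards, advantages, loss = grpo_classification_step(
        logits=[0.2, 1.0, -0.5], gold_label=1, group_size=8
    )
    print(sampled, rewards, advantages, loss)
```

One thing to watch for: if every sampled label in a group gets the same reward (all right or all wrong), the advantages are all zero and that input contributes no gradient, which is probably common with few classes and a confident classifier.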