GRPO, PPO, or other RL for text classification with encoder-only models?

Is there a GRPO/PPO (or other RL) implementation for text classification tasks using encoder-only models like BERT/RoBERTa?
Any GitHub repo, example, or help would be really appreciated. Thanks in advance.

This may be an unresolved issue. The following article may be helpful for general information about GRPO, but it is not specific to classification tasks…
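
In case it helps as a starting point, here is a minimal sketch of how the group-relative advantage used in GRPO could be computed for a classifier. It is only an illustration under my own assumptions, not an established recipe: the classifier's softmax over labels is treated as the policy, a group of labels is sampled per input, and the reward is 1 for a correct label and 0 otherwise. The function names (`sample_group`, `group_relative_advantages`) and the example logits are hypothetical, and the logits would come from your BERT/RoBERTa classification head in practice.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group: (r - mean) / (std + eps).

    Uses the population standard deviation; some implementations
    use the sample standard deviation instead.
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def sample_group(logits, gold_label, group_size, rng):
    """Sample `group_size` labels from the classifier's softmax (the assumed
    policy) and score each with a 0/1 correctness reward (an assumption)."""
    probs = softmax(logits)
    labels = rng.choices(range(len(probs)), weights=probs, k=group_size)
    rewards = [1.0 if y == gold_label else 0.0 for y in labels]
    return labels, rewards


# Hypothetical logits for one input with three classes; gold label is 1.
rng = random.Random(0)
logits = [0.2, 1.0, -0.5]
labels, rewards = sample_group(logits, gold_label=1, group_size=8, rng=rng)
advantages = group_relative_advantages(rewards)
```

Each advantage would then weight the log-probability of its sampled label in a policy-gradient loss. Note that if every sample in a group gets the same reward, all advantages are zero and that input contributes no gradient.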