Stable Baselines3 - Different method for training a model

pcpoj · March 10, 2023, 11:03am

Hi,
I’m very new to this topic.
I would like to use the Stable Baselines3 library in my project.
For now, I want to write code that teaches an agent to play CartPole-v1, but not by using the learn() method.

I want to do the following steps:

1. Use model.predict()
2. Use env.step(action)
3. Compute log_prob
4. Add all this stuff (obs, action_prev, reward, log_prob, etc.) to the rollout_buffer
5. Use model.train()

In theory, this should work. My code runs, but my PPO model doesn’t learn: the mean reward doesn’t change significantly.

Do you have any advice on how I should do this?
