I'd like to understand how to train a neural net with agents and evolution

It might be easier to think of this as a game world, though I don't create games.
The training will run inside Jupyter.
Say I have 10 inputs, and the agent has some value (a reward store).
The meaning of the 10 values is unknown; the NN has to learn to interpret them.
The agent needs to improve its reward, though its output is just one of 3 options, like left/forward/right.
Not every move results in a reward, so training will likely take some time.
Depending on their reactions, agents might end up in different scenarios, but at some point one selects the best (n) agents (those with high reward).
Then one trains again, until a really good agent is able to interpret the input values.
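
To make the setup concrete, here is a minimal sketch of the loop described above: a population of small networks, each scored by accumulated reward, with the best n kept and mutated to form the next generation. Everything specific in it is an assumption for illustration only: the hidden layer size, the mutation scale `sigma`, the population size, and above all the toy reward rule inside `evaluate`, which is a stand-in for whatever world the agents really live in. No backprop is involved; the weights only change through random mutation and selection.

```python
import numpy as np

# 10 unknown inputs, 3 possible outputs (left/forward/right).
# The hidden size is an arbitrary assumption.
N_INPUTS, N_HIDDEN, N_ACTIONS = 10, 16, 3


def init_agent(rng):
    """An agent is just the weights of a small feed-forward net."""
    return {
        "w1": rng.normal(0.0, 0.5, (N_INPUTS, N_HIDDEN)),
        "b1": np.zeros(N_HIDDEN),
        "w2": rng.normal(0.0, 0.5, (N_HIDDEN, N_ACTIONS)),
        "b2": np.zeros(N_ACTIONS),
    }


def act(agent, obs):
    """Forward pass; the chosen action is the index of the largest output."""
    hidden = np.tanh(obs @ agent["w1"] + agent["b1"])
    logits = hidden @ agent["w2"] + agent["b2"]
    return int(np.argmax(logits))


def evaluate(agent, rng, steps=200):
    """Hypothetical toy world, NOT a real environment.

    A move is rewarded only when input 3 is positive (so many moves earn
    nothing) and the action matches the largest of inputs 0..2.
    """
    total = 0.0
    for _ in range(steps):
        obs = rng.uniform(-1.0, 1.0, N_INPUTS)
        action = act(agent, obs)
        if obs[3] > 0 and action == int(np.argmax(obs[:3])):
            total += 1.0
    return total


def mutate(agent, rng, sigma=0.1):
    """Return a copy of the agent with Gaussian noise added to every weight."""
    return {k: v + rng.normal(0.0, sigma, v.shape) for k, v in agent.items()}


def evolve(generations=30, pop_size=50, n_best=10, seed=0):
    rng = np.random.default_rng(seed)
    population = [init_agent(rng) for _ in range(pop_size)]
    history = []
    for gen in range(generations):
        # Every agent in a generation sees the same sequence of observations.
        scores = [evaluate(a, np.random.default_rng(1000 + gen)) for a in population]
        order = np.argsort(scores)[::-1]            # highest reward first
        elite = [population[i] for i in order[:n_best]]
        history.append(scores[order[0]])
        # Next generation: keep the elite, fill the rest with mutated copies.
        children = [mutate(elite[rng.integers(n_best)], rng)
                    for _ in range(pop_size - n_best)]
        population = elite + children
    return elite[0], history


if __name__ == "__main__":
    best, history = evolve()
    print("best reward per generation:", history)
```

In a notebook one would call `evolve()` in a cell and plot `history` to see whether the best reward climbs. Swapping in a real world means replacing only `evaluate`; the selection and mutation steps stay the same.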

How does one create and train such a network?
Normally one trains a neural network toward a certain goal using backprop, i.e. several inputs to resolve something, as with a DNN or an LSTM. But the rules here are so different.

Does anyone know of a Jupyter sample for training a DNN like that?