How can I know which dataset fits my model?

I want to train Llama 2 to know a little more Python programming.
I saw that there are many Python datasets, but how can I tell which one is suitable for training my model?