Training dataset management

Hi,

I've already built an application that summarizes programming tasks. I use llama3.1 and have just learned how to fine-tune the model.

Now I would like to prepare good-quality training data.

But I don't know how to manage large amounts of training data. What software should I use?

So far they look like this

If I have 100 rows and want to fix or edit 15 of them, it's going to be hard to do in an IDE.

Can you recommend software for developing and improving training sets?
