Using AI to generate training data to fill gaps

Wokka77 · December 11, 2024, 10:26am

So at work recently they unveiled an ML app that takes long-winded descriptions of machine parts (usually the result of the 1000 different ways someone can describe a specific widget) and distills them down to a core description. All good.

My question is: is it possible to have some ML code that looks at what's been used so far for training and, based on some human guidance, generates plausible missing permutations of words to fill the gaps, so that the new synthetic training data pushes accuracy to 90-95% or higher? You could then stick it in a self-tuning loop to keep things tight.
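To make the idea concrete, here is a rough sketch of the "fill the gaps" step, not what the app at work actually does. Everything in it is made up for illustration: the `SYNONYMS` table stands in for the human guidance, the part names and labels are invented, and `generate_variants` / `fill_gaps` are hypothetical helper names. It uses plain word swaps and reordering rather than a model.

```python
import itertools

# Hypothetical human-curated guidance: interchangeable wordings per token.
SYNONYMS = {
    "bolt": ["bolt", "screw bolt"],
    "hex": ["hex", "hexagonal", "6-sided"],
    "stainless": ["stainless", "stainless steel", "ss"],
}

def generate_variants(tokens, synonyms):
    """Yield every wording obtained by swapping synonyms and reordering tokens."""
    options = [synonyms.get(t, [t]) for t in tokens]
    for choice in itertools.product(*options):
        for order in itertools.permutations(choice):
            yield " ".join(order)

def fill_gaps(seen, tokens, label, synonyms=SYNONYMS):
    """Return (description, label) pairs that are not already in the training set."""
    seen_norm = {s.lower() for s in seen}
    new = sorted(set(generate_variants(tokens, synonyms)) - seen_norm)
    return [(text, label) for text in new]

# Invented example data: two descriptions we already have for one part.
existing = ["hex bolt stainless", "stainless hexagonal bolt"]
synthetic = fill_gaps(existing, ["hex", "bolt", "stainless"], "BOLT, HEX, STAINLESS")
print(len(synthetic), synthetic[:3])
```

The output pairs could be appended to the training set. Whether that actually moves accuracy would still need to be measured on real, held-out descriptions rather than on the synthetic ones.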

It's just a thought at the moment. I don't always know all the details, but I can see the problem and just need the right tools to fix it.

Thoughts welcome.
