Best way to find a segment of code (output) that matches a given input segment?

I need to develop an application where I give an LLM a piece of code, such as a function, and the LLM finds the closest match that does the same thing. It would look in one or more source files. The match may be written differently from the input. If the search finds identical code, it should treat that as the match. I assume the LLM needed would be the same as a good coding LLM.

Is this feasible at all? How hard would this be to develop? Thanks in advance.

I think the difficulty of development depends on how much accuracy you want in the results.

For example, if you just want code completion, you can probably achieve this with existing demos. Try Hugging Chat with Qwen 2.5 Coder 32B and the search function turned on. Depending on the instructions in the prompt, I think you'll get pretty close.

And if that gives you a reasonable level of success, you can achieve almost the same thing with Smolagents. Apart from the search results differing between Google and DuckDuckGo, the LLM is almost the same at the API level…
You can also plug in components and LLMs that you or other users have created.

I think this is generally called the RAG or agent approach (I may be wrong). Searching for those terms could be a good starting point.
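
To make the retrieval half of that idea concrete, here is a minimal sketch of the "find the closest function" step. It rests on assumptions that are mine, not the original poster's: the source files are Python, candidates are whole functions, and similarity is a plain token-sequence ratio from the standard library. That score is only lexical, so it is a stand-in; for code that "does the same thing" but is written very differently, you would replace it with an embedding model or an LLM judgment. The names `extract_functions`, `token_strings`, and `best_match` are hypothetical helpers made up for this example.

```python
import ast
import difflib
import io
import tokenize


def extract_functions(source: str) -> dict[str, str]:
    """Map each function name found in `source` to its source text."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def token_strings(code: str) -> list[str]:
    """Tokenize code, dropping comments and layout so formatting is ignored."""
    skip = {
        tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
    }
    tokens = tokenize.generate_tokens(io.StringIO(code).readline)
    return [tok.string for tok in tokens if tok.type not in skip]


def best_match(query: str, candidates: dict[str, str]) -> tuple[str, float]:
    """Return (name, score) of the candidate closest to `query`.

    Identical token streams short-circuit with a score of 1.0, so an
    exact copy always wins. Otherwise the score is a lexical similarity
    in [0, 1]; a real system would likely use embeddings or an LLM here.
    """
    query_tokens = token_strings(query)
    best_name, best_score = "", -1.0
    for name, code in candidates.items():
        cand_tokens = token_strings(code)
        if cand_tokens == query_tokens:
            return name, 1.0
        score = difflib.SequenceMatcher(None, query_tokens, cand_tokens).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Made-up sample data for illustration only.
SOURCE = '''
def total(values):
    result = 0
    for v in values:
        result += v
    return result

def greet(name):
    return "Hello, " + name
'''

QUERY = '''
def add_all(items):
    acc = 0
    for x in items:
        acc += x
    return acc
'''

name, score = best_match(QUERY, extract_functions(SOURCE))
print(f"closest function: {name} (similarity {score:.2f})")
```

In a RAG-style setup you would use something like this to shortlist a few candidates, then pass the input code and the shortlist to the coding LLM and ask it which candidate behaves the same. The exact-match check comes first so that an identical copy is always returned as the match, as the question asks.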