Fine-tuning an LLM on a codebase

I want to fine-tune an LLM locally to act as an intelligent code reviewer: a tool that, given a developer's natural-language description of a change, identifies and highlights the specific locations in a C# codebase where changes are needed. The goal is to streamline code review by pointing developers precisely to where modifications should be made, based on their high-level descriptions. There are suitable LLMs for the task, but I can’t figure out how to feed my C# codebase to the LLM (that is, a way for the LLM to read my code files).

Are you looking to train a specific LLM? I used GPT-2 for a similar task and it worked decently well.

Yes, I was thinking of Code Llama or Mistral 7B (I can use any open-source LLM that supports a C# codebase). How did you feed your codebase into the LLM to fine-tune it on the code?

I wrote a data loader function and used Hugging Face’s Trainer class. I used GPT-2, not Mistral or Code Llama.


Can you explain the function, or maybe share the code?
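
Not the code from the post above, but a minimal sketch of what such a data-preparation step could look like, assuming plain Python and the standard library only. It walks a C# project folder, skips the `bin` and `obj` build folders, and splits each `.cs` file into overlapping windows of lines that can be written out as JSONL. The function names (`iter_csharp_files`, `chunk_lines`, `build_records`, `write_jsonl`), the record fields, and the 40-line window with a 5-line overlap are all illustrative assumptions, not anything specified in this thread.

```python
import json
from pathlib import Path

# Folders that usually hold build output rather than source (assumption).
SKIP_DIRS = {"bin", "obj"}


def iter_csharp_files(root):
    """Yield every .cs file under root, skipping build output folders."""
    root = Path(root)
    for path in sorted(root.rglob("*.cs")):
        if not SKIP_DIRS.intersection(path.relative_to(root).parts):
            yield path


def chunk_lines(text, max_lines=40, overlap=5):
    """Yield (first_line_number, chunk_text) windows over the lines of text."""
    if not 0 <= overlap < max_lines:
        raise ValueError("overlap must be >= 0 and smaller than max_lines")
    lines = text.splitlines()
    step = max_lines - overlap
    for start in range(0, len(lines), step):
        yield start + 1, "\n".join(lines[start:start + max_lines])
        if start + max_lines >= len(lines):
            break  # the window already reached the end of the file


def build_records(root, max_lines=40, overlap=5):
    """Collect one record per chunk, keeping the file path and start line."""
    root = Path(root)
    records = []
    for path in iter_csharp_files(root):
        text = path.read_text(encoding="utf-8", errors="replace")
        for first_line, chunk in chunk_lines(text, max_lines, overlap):
            records.append({
                "path": str(path.relative_to(root)),
                "start_line": first_line,
                "text": chunk,
            })
    return records


def write_jsonl(records, out_path):
    """Write one JSON object per line, a format most dataset loaders accept."""
    with open(out_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
```

Calling `write_jsonl(build_records("path/to/repo"), "codebase.jsonl")` would produce the file. It could then be loaded with the `datasets` library (for example `load_dataset("json", data_files="codebase.jsonl")`), tokenized with the chosen model's tokenizer, and handed to Hugging Face’s `Trainer`; that part depends on the model and is left out here. Keeping `path` and `start_line` in each record is a design choice so that a model's output can be mapped back to a location in the codebase.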