Hello, can someone help me? I am working on a project where I am training a model for my VSCode plugin that generates commit messages automatically. The plugin aims to streamline version control by providing contextual and meaningful commit messages based on code changes.
Currently, I'm facing challenges with training time and some library compatibility issues. The dataset I'm using is large, and training takes far too long on my hardware. I'm using the Hugging Face `transformers` library along with the `datasets` package, and would appreciate guidance on how to optimize the model and reduce the training time.
Here’s my current setup:
- I’m training a seq2seq encoder-decoder model.
- The dataset is a reduced version of CommitBench.
- I'm using a BERT-based tokenizer.
- Training happens on a GPU (NVIDIA GeForce MX350), but it’s still very slow.
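For context, here's a stripped-down sketch of roughly what my training setup looks like. The dataset path and the `diff` / `message` column names are placeholders standing in for my reduced CommitBench export, and the hyperparameters shown aren't my real ones:

```python
# Simplified sketch of my setup; dataset path and column names are placeholders.
from datasets import load_dataset
from transformers import (
    BertTokenizerFast,
    DataCollatorForSeq2Seq,
    EncoderDecoderModel,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# BERT encoder + BERT decoder warm-started as a single seq2seq model
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Placeholder file: in my script this is the reduced CommitBench export
dataset = load_dataset("json", data_files="commitbench_reduced.jsonl", split="train")

def preprocess(batch):
    # "diff" / "message" are stand-ins for my actual column names
    model_inputs = tokenizer(batch["diff"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["message"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="commit-msg-model",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    logging_steps=100,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```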
Any advice on how to optimize the model, adjust training parameters, or any other tips would be much appreciated!
Thank you!