From Commit Message Generation to History-Aware Commit Message Completion

Aleksandra Eliseeva1, Yaroslav Sokolov2, Egor Bogomolov1, Yaroslav Golubev1, Danny Dig1,3, Timofey Bryksin1
1JetBrains Research, 2JetBrains, 3University of Colorado Boulder

Overview

    We explore two ideas for personalizing the output of commit message generation (CMG) approaches:
  • Shifting the focus to commit message completion: a prefix provided by the user might steer CMG approaches toward more relevant predictions and might even capture some of the commit message conventions of the current user or project.
  • Utilizing commit message history as additional context: previous commit messages might guide CMG approaches to follow the commit message conventions of the current user or project.
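The two settings above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's actual pipeline: the word-boundary split ratio, the separator, and the ordering of history messages are all assumptions made for the example.

```python
# Illustrative sketch of the two input settings (all formatting
# choices here are assumptions, not the paper's implementation).

def split_for_completion(message: str, prefix_ratio: float = 0.5) -> tuple[str, str]:
    """Split a commit message into a user-written prefix and the
    remainder a model should complete (split on a word boundary)."""
    words = message.split()
    cut = max(1, int(len(words) * prefix_ratio))
    return " ".join(words[:cut]), " ".join(words[cut:])

def build_history_input(diff: str, history: list[str], sep: str = "\n") -> str:
    """Prepend previous commit messages to the diff as extra context."""
    return sep.join([*history, diff])

# Completion setting: the model sees the diff plus the prefix.
prefix, target = split_for_completion("Fix NPE in commit message parser")
# History setting: the model sees prior messages plus the diff.
context = build_history_input("<diff>", ["Fix typo in README", "Add parser tests"])
```

Here `prefix` is "Fix NPE in" and `target` is "commit message parser"; the history-aware input simply concatenates earlier messages before the diff.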

Motivational example for the ideas proposed in our paper. CMG = commit message generation; CMC = commit message completion; CMG + history = commit message generation with commit message history as additional context.

    To evaluate our two ideas, we set the following criteria for the dataset:
  • Suitable for experiments with commit history: provides the necessary metadata for commits; preserves commit history to a reasonable extent.
  • Diverse: incorporates commit messages with a variety of different conventions and writing styles.

We observed that most of the existing CMG datasets do not meet these criteria: they either lack diversity due to the extensive filtering of commit diffs and messages, or significantly alter the original commit history.

Hence, we built a novel large-scale dataset that overcomes these issues – 📜 CommitChronicle 🔮, available on Zenodo and on HuggingFace Hub!

Experimental Results

  • Models. We experiment with three CMG approaches (CodeT5, CodeReviewer, RACE) and an LLM (GPT-3.5-turbo).
  • Data. We use two subsets of our CommitChronicle dataset for evaluation:
    • \(CMG_{test}\) – around \(200\)k examples; used for experiments with CMG approaches
    • \(LLM_{test}\) – around \(4\)k examples; used for experiments with an LLM

For further details, refer to our paper.

RQs and Key Findings

⬇️ Click on the buttons to expand corresponding subsections!

RQ A1. How do state-of-the-art CMG approaches perform in the completion setting?

RQ A2. How do LLMs perform in comparison with state-of-the-art CMG approaches?

RQ B1. How does using commit message history as an additional input affect the models’ quality?

RQ B2. How do state-of-the-art CMG approaches perform with and without common data filtering steps?

Full Results

Due to the variety of models and configurations in our experiments, we share only a selected subset of the results in our paper. You can find the comprehensive results in this section (also available in our repository as JSONLines files).


\(CMG_{test}\) Experiments



\(LLM_{test}\) Experiments



Filters Experiments

Citation

TODO