Extracting the top contributor

Git has a few built-in statistics that you can get instantly. The git log command has options, such as --numstat, that show the number of lines added and deleted for each file in each commit. However, to find the top committer in the repository, we can simply use the git shortlog command.
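
As a rough illustration of the --numstat option, the following sketch totals the lines added and deleted per author. The function name lines_by_author and the @ marker used to recognize the per-commit header line are assumptions made for this example; they are not part of Git.

```shell
#!/bin/sh
# Sketch: total lines added and deleted per author, based on git log --numstat.
# lines_by_author is a hypothetical helper name; run it inside a Git working tree.
lines_by_author() {
    # --format='@%aN' prints one marker line per commit holding the author name.
    # --numstat then prints "added<TAB>deleted<TAB>path" for each changed file.
    git log --numstat --format='@%aN' "$@" |
    awk -F'\t' '
        /^@/ { author = substr($0, 2); next }    # new commit: remember the author
        NF == 3 && $1 != "-" {                   # skip blank lines and binary files
            added[author]   += $1
            deleted[author] += $2
        }
        END {
            for (a in added)
                printf "%s\t+%d\t-%d\n", a, added[a], deleted[a]
        }'
}
```

Any extra arguments are passed on to git log, so the count can be limited in the usual way:

    $ lines_by_author --since="6 months ago"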

Getting ready

As with all the examples throughout the book, we are using the jgit repository; you can either clone it or go to one of the clones you might already have.

Clone the jgit repository as follows:

$ git clone https://git.eclipse.org/r/jgit/jgit chapter6
$ cd chapter6

How to do it...

The git shortlog command is simple and has only a handful of options. It shows the log in a boiled-down form, grouped by author, and can then summarize it for us:

  1. Start by showing the last five commits with shortlog. We can use -5 to limit the amount of output:
    $ git shortlog -5
    Jonathan Nieder (1):
          Update commons-compress to 1.6
    
    Matthias Sohn (2):
          Update com.jcraft.jsch to 0.1.50 in Kepler target platform
          Update target platforms to use latest orbit build
    
    SATO taichi (1):
          Add git checkout --orphan implementation
    
    Stefan Lay (1):
          Fix fast forward rebase with rebase.autostash=true
    
  2. As you can see, the output is very different from the git log output; you can compare it yourself with git log -5. The number in parentheses is the number of commits by that person, and the commit titles are listed below each name. Note that no commit hashes are shown. Finding the top committer among just these five commits is easy, but when you run git shortlog without -5, it is hard to spot that person. To sort the output by the number of commits, we can use the -n or --numbered option. The top committer is then listed first:
    $ git shortlog -5 --numbered
    Matthias Sohn (2):
          Update com.jcraft.jsch to 0.1.50 in Kepler target platform
          Update target platforms to use latest orbit build
    
    Jonathan Nieder (1):
          Update commons-compress to 1.6
    
    SATO taichi (1):
          Add git checkout --orphan implementation
    
    Stefan Lay (1):
          Fix fast forward rebase with rebase.autostash=true
    
  3. As you can see, the output is nicely sorted. If we don't care about the commit subjects, we can use -s or --summary to only show the commit count for each developer:
    $ git shortlog -5 --numbered --summary
         2  Matthias Sohn
         1  Jonathan Nieder
         1  SATO taichi
         1  Stefan Lay
    
  4. Finally, we have what we want, except that we don't have the e-mail addresses of the committers. The -e or --email option adds them to the list. This time, we will run the command on the entire repository. So far, we have only counted commits reachable from HEAD. To cover the whole repository, we add --all to the command so that it walks the commits reachable from all refs, which includes all branches:
    $ git shortlog  --numbered --summary --email --all
       765  Shawn O. Pearce <[email protected]>
       399  Matthias Sohn <[email protected]>
       360  Robin Rosenberg <[email protected]>
       181  Chris Aniszczyk <[email protected]>
       172  Shawn Pearce <[email protected]>
       160  Christian Halstrick <[email protected]>
       114  Robin Stocker <[email protected]>
    
  5. We now know who has contributed the most commits, but this picture can be a little skewed: the top committer may simply be the creator of the project and may no longer contribute actively to the repository. To list the top committers for the last six months, we can add --since="6 months ago" to the git shortlog command:
    $ git shortlog  --numbered --summary --email --all --since="6 months ago"
        73  Matthias Sohn <[email protected]>
        15  Robin Stocker <[email protected]>
        14  Robin Rosenberg <[email protected]>
        13  Shawn Pearce <[email protected]>
        12  Stefan Lay <[email protected]>
         8  Christian Halstrick <[email protected]>
         7  Colby Ranger <[email protected]>
    
  6. As you can see, the picture has changed since the start of the repository.

    Tip

    You can use "n weeks ago", "n days ago", "n months ago", "n hours ago", and so on to specify time periods. You can also use specific dates such as "1 october 2013".

    You can also list the top committers for a specific month by combining --since with the --until option, which specifies the date up to which commits are listed. This can be done as follows:

    $ git shortlog  --numbered --summary --email --all --since="30 september 2013" --until="1 november 2013"
        15  Matthias Sohn <[email protected]>
         4  Kaloyan Raev <[email protected]>
         4  Robin Rosenberg <[email protected]>
         3  Colby Ranger <[email protected]>
         2  Robin Stocker <[email protected]>
         1  Christian Halstrick <[email protected]>
         1  Michael Nelson <[email protected]>
         1  Rüdiger Herrmann <[email protected]>
         1  Tobias Pfeifer <[email protected]>
         1  Tomasz Zarna <[email protected]>
    
  7. As you can see, we get another list, and Matthias now appears to be the main contributor, in contrast to the initial all-time result. This kind of data can also be used to visualize the shift of responsibility in a repository by collecting it for each month since the repository was created.
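
The month-by-month collection described in the last step could be sketched as follows. Instead of running git shortlog once per month, this example reads git log in a single pass and groups the commits by month and author. The function name top_contributor_per_month is hypothetical, the sketch groups by author date (whereas --since and --until filter on committer date), and it assumes a Git version that supports --date=format: (2.6 or later).

```shell
#!/bin/sh
# Sketch: print the author with the most commits for every month in the history.
# top_contributor_per_month is a hypothetical helper name; run it inside a clone.
top_contributor_per_month() {
    # Each output line of git log is "YYYY-MM<TAB>Author Name" (author date).
    git log --all --date=format:%Y-%m --format='%ad%x09%aN' |
    awk -F'\t' '
        { count[$1 FS $2]++ }                    # commits per month and author
        END {
            for (key in count) {
                split(key, part, FS)
                month = part[1]
                # Keep the highest count per month; ties are resolved arbitrarily.
                if (count[key] > best[month]) {
                    best[month] = count[key]
                    who[month]  = part[2]
                }
            }
            for (month in best)
                printf "%s\t%d\t%s\n", month, best[month], who[month]
        }' |
    sort    # the month comes first on each line, so this is chronological
}
```

Each output line holds the month, the commit count, and the name of the top author for that month, which is a convenient starting point for plotting.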

There's more...

While working with code, it is often useful to know who to ask when you need to fix something in the software, especially in an area where you are inexperienced. So, it would be nice to figure out who the code owner is for the file or files you are changing. The obvious reason is to get some input on the code, but it also tells you who to ask for a code review. You can again use git shortlog to figure this out, because the command also works on individual files:

  1. To do this, we simply add the file to the end of the git shortlog command:
    $ git shortlog  --numbered --summary --email ./pom.xml
        86  Matthias Sohn <[email protected]>
        21  Shawn O. Pearce <[email protected]>
         4  Chris Aniszczyk <[email protected]>
         4  Jonathan Nieder <[email protected]>
         3  Igor Fedorenko <[email protected]>
         3  Kevin Sawicki <[email protected]>
         2  Colby Ranger <[email protected]>
    
  2. So pom.xml also has a top committer. Because git shortlog accepts path arguments in the same way as git log, we can also do this for a directory:
    $ git shortlog  --numbered --summary --email ./org.eclipse.jgit.console/
        57  Matthias Sohn <[email protected]>
        11  Shawn O. Pearce <[email protected]>
         9  Robin Rosenberg <[email protected]>
         2  Chris Aniszczyk <[email protected]>
         1  Robin Stocker <[email protected]>
    
  3. As you can see, it is fairly simple to get an indication of who to ask about the different files or directories in a Git repository.
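
If you look up code owners often, the steps above can be wrapped in a small helper. This is only a sketch: the function name top_contributor_for is hypothetical, and "most commits touching the path" is just a rough proxy for ownership.

```shell
#!/bin/sh
# Sketch: print "Name <email>" of the author with the most commits on a path.
# top_contributor_for is a hypothetical helper name; run it inside a clone.
top_contributor_for() {
    # HEAD is named explicitly because git shortlog reads the log from standard
    # input when no revision is given and stdin is not a terminal, which is
    # often the case inside scripts.
    git shortlog --numbered --summary --email HEAD -- "$1" |
    # Keep only the first (top) line and strip the leading commit count.
    sed -n '1s/^[[:space:]]*[0-9][0-9]*[[:space:]]*//p'
}
```

It can then be used on a file or a directory:

    $ top_contributor_for ./pom.xml
    $ top_contributor_for ./org.eclipse.jgit.console/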