Chain of trust

An important part of collaborative development is ensuring the quality of a project's code. This includes protecting the repository against accidental corruption and, unfortunately, against malicious intent; the version control system can help with both. Git needs to ensure trust in the repository contents: your own and those of other developers (including, especially, the canonical repository of the project).

Content-addressed storage

In Chapter 2, Exploring Project History, we learned that Git uses SHA-1 hashes as the native identifiers of commit objects (which represent revisions of the project and form its history). This mechanism makes it possible to generate commit identifiers in a distributed way: the identifier is the SHA-1 cryptographic hash of the commit object, which includes the link to the previous commit (the SHA-1 identifier of the parent commit).

Moreover, all other data stored in the repository (including the file contents of a revision, represented by blob objects, and the file hierarchy, represented by tree objects) uses the same mechanism. All types of objects are addressed by their contents or, to be more accurate, by the hash of their contents. You could say that the foundation of a Git repository is a content-addressed object database.

Thus Git provides a built-in trust chain through secure SHA-1 hashes. In one dimension, the SHA-1 of a commit depends on its contents, which include the SHA-1 of the parent commit, which in turn depends on the contents of the parent commit, and so forth down to the initial root commit. In the other dimension, the contents of a commit object include the SHA-1 of the tree representing the top directory of the project, which in turn depends on its contents, and these contents include the SHA-1 of subdirectory trees and blobs of file contents, and so forth down to the individual files.

All of this allows SHA-1 hashes to be used to verify that objects obtained from a (potentially untrusted) source are correct, and that they have not been modified since they were created.
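
To make content addressing concrete, here is a minimal sketch of how a blob identifier can be reproduced outside Git. It assumes a POSIX shell and the sha1sum tool from GNU coreutils (on other systems a different SHA-1 utility may be needed); the blob_id function name and the temporary file are made up for illustration. Git hashes a short header (the word blob, the size in bytes, and a NUL byte) followed by the raw file contents:

```shell
# blob_id FILE: print the SHA-1 that Git assigns to FILE's contents as a blob.
# Hypothetical helper; assumes sha1sum from GNU coreutils is installed.
blob_id() {
    size=$(wc -c < "$1" | tr -d ' ')
    # The object header is "blob <size>" plus a NUL byte, then the raw contents.
    { printf 'blob %s\000' "$size"; cat "$1"; } | sha1sum | cut -d ' ' -f 1
}

# Illustrative input file with a single line of text.
demo=$(mktemp)
printf 'hello world\n' > "$demo"
blob_id "$demo"
```

Running git hash-object on the same file should print the same identifier. Changing a single byte of the file changes the identifier, which is what lets a mismatch reveal corruption or tampering.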

Lightweight, annotated, and signed tags

The trust chain allows us to verify contents, but it does not verify the identity of the person who created that content (the author and committer names are fully configurable). This is the task of GPG/PGP signatures: signed tags, signed commits, and signed merges.

Lightweight tags

Git uses two types of tags: lightweight and annotated. A lightweight tag is very much like a branch that doesn't change: it's just a pointer (reference) to a specific commit in the graph of revisions, though it lives in the refs/tags/ namespace rather than in refs/heads/.

Annotated tags

Annotated tags, however, involve tag objects. Here the tag reference (in refs/tags/) points to a tag object, which in turn points to a commit. Tag objects contain a creation date, the tagger identity (name and e-mail), and a tagging message. You create an annotated tag with git tag -a (or --annotate). If you don't specify a message for an annotated tag on the command line (for example, with -m "<message>"), Git will launch your editor so you can enter it.

You can view the tag data along with the tagged commit by using the git show command, as follows (the rest of the commit output is omitted):

$ git show v0.2
tag v0.2
Tagger: Joe R Hacker <[email protected]>
Date:   Sun Jun 1 03:10:07 2014 -0700

random v0.2

commit 5d2584867fe4e94ab7d211a206bc0bc3804d37a9
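
The difference between the two kinds of tag can be checked directly. The following sketch creates both in a throwaway repository and asks Git what type of object each name refers to. The repository location, the scratch-label tag name, and the example.com e-mail address are placeholders chosen for illustration:

```shell
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name 'Joe R Hacker'
git -C "$repo" config user.email 'joe@example.com'
git -C "$repo" commit -q --allow-empty -m 'initial commit'

git -C "$repo" tag scratch-label               # lightweight: only a reference
git -C "$repo" tag -a v0.2 -m 'random v0.2'    # annotated: creates a tag object

# A lightweight tag resolves straight to a commit; an annotated tag
# resolves to a tag object, which in turn points to the commit.
light_type=$(git -C "$repo" cat-file -t scratch-label)
annotated_type=$(git -C "$repo" cat-file -t v0.2)
echo "scratch-label is a $light_type"
echo "v0.2 is a $annotated_type"
```

To reach the commit behind an annotated tag explicitly, you can peel it with the v0.2^{commit} notation.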

Signed tags

Signed tags are annotated tags with a clear text GnuPG signature of the tag data attached. You can create one with git tag -s (which uses your committer identity to select the signing key, or user.signingKey if set), or with git tag -u <key-id>; both variants assume that you have a private GPG key (created, for example, with gpg --gen-key).

Note

Annotated or signed tags are meant for marking a release, while lightweight tags are meant for private or temporary revision labels. For this reason, some Git commands (such as git describe) will ignore lightweight tags by default.

Of course in collaborative workflows it is important that the signed tag is made public, and that there is a way to verify it.

Publishing tags

Git does not push tags by default: you need to do it explicitly. One solution is to push a tag individually with git push <remote> tag <tag-name> (here tag <tag-name> is equivalent to the longer refspec refs/tags/<tag-name>:refs/tags/<tag-name>); in most cases you can omit the tag keyword. Another solution is to push tags en masse: either all tags, both lightweight and annotated, with the --tags option, or only the annotated tags that point to pushed commits with --follow-tags. This explicitness allows you to re-tag (using git tag -f) with impunity if it turns out that you tagged the wrong commit or there is a need for a last-minute fix, but only if the tag has not yet been made public.
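
As a rough illustration of the --follow-tags behavior, the sketch below sets up a throwaway bare repository to stand in for the public one, then pushes a branch carrying one lightweight and one annotated tag. All paths, the scratch-label tag, and the example.com address are assumptions made for the demonstration:

```shell
# Throwaway "remote" (bare) and working repository; names are illustrative.
base=$(mktemp -d)
git init -q --bare "$base/public.git"
git init -q "$base/work"
work="$base/work"
git -C "$work" config user.name 'Joe R Hacker'
git -C "$work" config user.email 'joe@example.com'
git -C "$work" remote add public-repo "$base/public.git"
git -C "$work" commit -q --allow-empty -m 'release candidate'

git -C "$work" tag scratch-label               # lightweight, meant to stay private
git -C "$work" tag -a v0.2 -m 'project v0.2'   # annotated, meant to be published

# Push the current branch plus the annotated tags pointing into its history.
git -C "$work" push -q --follow-tags public-repo HEAD

# List what actually arrived on the remote side.
remote_tags=$(git -C "$work" ls-remote --tags public-repo)
echo "$remote_tags"
```

Only v0.2 should appear in the listing. Publishing scratch-label as well would take an explicit git push public-repo tag scratch-label, or a push with --tags.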

When fetching changes, Git automatically follows tags, downloading annotated tags that point to fetched commits. This means that downstream developers will automatically get signed tags, and will be able to verify releases.

Tag verification

To verify a signed tag, use git tag -v <tag-name>. For this you need the signer's public GPG key in your keyring (imported, for example, with gpg --import or gpg --keyserver <key-server> --recv-key <key-id>), and of course the tagger's key needs to be vetted in your chain of trust.

$ git tag -v v0.2
object 1085f3360e148e4b290ea1477143e25cae995fdd
type commit
tag signed
tagger Joe Random <[email protected]> 1411122206 +0200

project v0.2
gpg: Signature made Fri Jul 19 12:23:33 2014 CEST using RSA key ID A0218851
gpg: Good signature from "Joe Random <[email protected]>"
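
In scripts, such as a release or deployment job, the exit status of git tag -v can be used instead of reading the output. The wrapper below is a minimal sketch; the require_signed_tag name and its two-argument interface are hypothetical. Note that a successful exit only reflects what GnuPG reported about the signature; whether the key itself deserves trust remains your decision.

```shell
# require_signed_tag REPO TAG: succeed only if `git tag -v` accepts TAG.
# Hypothetical wrapper; it relies solely on the exit status of git tag -v.
require_signed_tag() {
    if git -C "$1" tag -v "$2" >/dev/null 2>&1; then
        echo "tag $2: signature verified"
    else
        # Unsigned tags, lightweight tags, and bad signatures all end up here.
        echo "tag $2: no valid signature, refusing to continue" >&2
        return 1
    fi
}
```

A call such as require_signed_tag . v0.2 && make dist would then build a release only from a tag whose signature checks out (the make dist step is just an example of a follow-up command).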

Signed commits

Signed tags are a good solution for users and developers to verify that the tagged release was created by the maintainer. But how do we make sure that a commit purporting to be by somebody named Jane Doe, with the [email protected] e-mail, is actually a commit from her? And how can anybody check it?

One possible solution, available since Git version 1.7.9, is to GPG-sign individual commits. You can do this with git commit --gpg-sign[=<keyid>] (or -S in short form). The key identifier is optional; without it, Git uses your committer identity to select the key. Note that -S (capital S) is different from -s (small s); the latter adds a Signed-off-by line at the end of the commit message, as used for the Developer Certificate of Origin.

$ git commit -a --gpg-sign

You need a passphrase to unlock the secret key for
user: "Jane Doe <[email protected]>"
2048-bit RSA key, ID A0218851, created 2014-03-19

[master 1085f33] README: eol at eof
 1 file changed, 1 insertion(+), 1 deletion(-)
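
If you sign commits routinely, the key and the signing behavior can be stored in the configuration rather than typed each time. The sketch below uses a scratch repository so that it can run anywhere; in practice you would run the two git config commands inside your own repository (or with --global). The key ID A0218851 is the one from the transcript above, and commit.gpgSign is assumed to be supported by your Git version:

```shell
# Scratch repository for demonstration; substitute your own repository and key.
repo=$(mktemp -d)
git init -q "$repo"

git -C "$repo" config user.signingKey A0218851   # key used by -S and by git tag -s
git -C "$repo" config commit.gpgSign true        # sign every commit, as if -S were given

# Show the resulting settings.
git -C "$repo" config --get user.signingKey
git -C "$repo" config --get commit.gpgSign
```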

To make commits available for verification, just push them. Anyone can then verify them with the --show-signature option to git log (or git show), or with one of the %Gx placeholders in git log --format=<format>.

$ git log -1 --show-signature
commit 1085f3360e148e4b290ea1477143e25cae995fdd
gpg: Signature made Wed Mar 19 11:53:49 2014 CEST using RSA key ID A0218851
gpg: Good signature from "Jane Doe <[email protected]>"
Author: Jane Doe <[email protected]>
Date:   Wed Mar 19 11:53:48 2014 +0200

    README: eol at eof

Since Git version 2.1.0, you can also use the git verify-commit command for this.
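
For an overview of a whole branch, the %G? placeholder is convenient: it prints a single status letter per commit, for example G for a good signature, B for a bad one, and N for no signature. The helper below is a small sketch; the signature_report name and its argument order are made up for illustration:

```shell
# signature_report REPO [LOG-ARGS...]: one line per commit with its signature status.
# Hypothetical helper; %G? expands to a status letter (G, B, N, ...).
signature_report() {
    repo=$1
    shift
    # Remaining arguments are passed to git log unchanged (for example: -5, a range).
    git -C "$repo" log --format='%h %G? %an: %s' "$@"
}
```

For instance, signature_report . origin/master..HEAD would list the commits you are about to publish, so that unsigned ones (marked N) stand out.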

Merging signed tags (merge tags)

The signed commit mechanism, described in the previous section, may be useful in some workflows, but it is inconvenient in an environment where you push commits out early—for example, to your own public repository—and only after a while do you decide whether they are worth including in the upstream (worth sending to the main repository). This situation can happen if you follow the recommendations of Chapter 8, Keeping History Clean; you know only after the fact (long after the commit was created), that the given iteration of the commit series passes code review.

You can deal with this issue by rewriting the whole commit series after its shape is finalized (after passing the review), signing each rewritten commit; or by amending and signing only the top commit. Both of those solutions would require a forced push to replace the old unsigned history. Alternatively, you can create an empty commit (with --allow-empty), sign it, and push it on top of the series. But there is a better solution: requesting the pull of a signed tag (available since Git version 1.7.9).

In this workflow, you work on changes and, when they are ready, you create and push a signed tag (tagging the last commit in the series). You don't have to push your working branch—pushing the tag is enough. If the workflow involves sending a pull request to the integrator, you create it using a tag as the end commit:

$ git tag -s 1252-for-maintainer
$ git request-pull origin/master public-repo 1252-for-maintainer \
  >msg.txt

The signed tag message is shown between the dashed lines in the pull request, which means that you may want to explain your work in the tag message when creating the signed tag. The maintainer, after receiving such a pull request, can copy the repository line from it to fetch and integrate the named tag. When recording the merge result of pulling the named tag, Git will open an editor and ask for a commit message. The integrator will see the template starting with:

Merge tag '1252-for-maintainer'

Work on task tsk-1252

# gpg: Signature made Wed Mar 19 12:23:33 2014 CEST using RSA key ID A0218851
# gpg: Good signature from "Jane Doe <[email protected]>"

This commit template includes the commented out output of the GPG verification of the signed tag object being merged (so it won't be in the final merge commit message). The tag message helps describe the merge better.

The signed tag being pulled is not stored in the integrator's repository as a tag object. Instead, its content is stored, hidden, in the merge commit. This is done so as not to pollute the tag namespace with a large number of such working tags. The developer can safely delete the tag (git push public-repo --delete 1252-for-maintainer) after it has been integrated.

Recording the signature inside the merge commit allows for after-the-fact verification with the --show-signature option:

$ git log -1 --show-signature
commit 0507c804e0e297cd163481d4cb20f3f48ceb87cb
merged tag '1252-for-maintainer'
gpg: Signature made Wed Mar 19 12:23:33 2014 CEST using RSA key ID A0218851
gpg: Good signature from "Jane Doe <[email protected]>"
Merge: 5d25848 1085f33
Author: Jane Doe <[email protected]>
Date:   Wed Mar 19 12:25:08 2014 +0200

    Merge tag '1252-for-maintainer'
    
    Work on task tsk-1252
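
The hidden tag content lives in the header of the merge commit, in an entry named mergetag, which you can see in the raw commit with git cat-file -p. The helper below is a minimal sketch for telling such merges apart from ordinary commits; the has_mergetag name and its interface are hypothetical:

```shell
# has_mergetag REPO COMMIT: succeed if COMMIT's header carries a "mergetag" entry.
# Hypothetical helper built on git cat-file.
has_mergetag() {
    # The header ends at the first blank line; stop reading there so that
    # a commit message mentioning "mergetag" cannot produce a false match.
    git -C "$1" cat-file commit "$2" | sed '/^$/q' | grep -q '^mergetag '
}
```

For example, has_mergetag . HEAD && echo "HEAD records a signed tag" would print the message only when the latest commit is a merge of a signed tag.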