Git
Chapters ▾ 2nd Edition

A2.2 Appendix B: Embedding Git in your Applications - Libgit2

Libgit2

© Another option at your disposal is to use Libgit2. Libgit2 is a dependency-free implementation of Git, with a focus on having a nice API for use within other programs. You can find it at http://libgit2.github.com.

First, let’s take a look at what the C API looks like. Here’s a whirlwind tour:

// Open a repository
git_repository *repo;
int error = git_repository_open(&repo, "/path/to/repository");

// Dereference HEAD to a commit
git_object *head_commit;
error = git_revparse_single(&head_commit, repo, "HEAD^{commit}");
git_commit *commit = (git_commit*)head_commit;

// Print some of the commit's properties
printf("%s", git_commit_message(commit));
const git_signature *author = git_commit_author(commit);
printf("%s <%s>\n", author->name, author->email);
const git_oid *tree_id = git_commit_tree_id(commit);

// Cleanup
git_commit_free(commit);
git_repository_free(repo);

The first couple of lines open a Git repository. The git_repository type represents a handle to a repository with a cache in memory. This is the simplest method, for when you know the exact path to a repository’s working directory or .git folder. There’s also the git_repository_open_ext which includes options for searching, git_clone and friends for making a local clone of a remote repository, and git_repository_init for creating an entirely new repository.

The second chunk of code uses rev-parse syntax (see Branch References for more on this) to get the commit that HEAD eventually points to. The type returned is a git_object pointer, which represents something that exists in the Git object database for a repository. git_object is actually a “parent” type for several different kinds of objects; the memory layout for each of the “child” types is the same as for git_object, so you can safely cast to the right one. In this case, git_object_type(commit) would return GIT_OBJ_COMMIT, so it’s safe to cast to a git_commit pointer.

The next chunk shows how to access the commit’s properties. The last line here uses a git_oid type; this is Libgit2’s representation for a SHA-1 hash.

From this sample, a couple of patterns have started to emerge:

  • If you declare a pointer and pass a reference to it into a Libgit2 call, that call will probably return an integer error code. A 0 value indicates success; anything less is an error.

  • If Libgit2 populates a pointer for you, you’re responsible for freeing it.

  • If Libgit2 returns a const pointer from a call, you don’t have to free it, but it will become invalid when the object it belongs to is freed.

  • Writing C is a bit painful.

That last one means it isn’t very probable that you’ll be writing C when using Libgit2. Fortunately, there are a number of language-specific bindings available that make it fairly easy to work with Git repositories from your specific language and environment. Let’s take a look at the above example written using the Ruby bindings for Libgit2, which are named Rugged, and can be found at https://github.com/libgit2/rugged.

repo = Rugged::Repository.new('path/to/repository')
commit = repo.head.target
puts commit.message
puts "#{commit.author[:name]} <#{commit.author[:email]}>"
tree = commit.tree

As you can see, the code is much less cluttered. Firstly, Rugged uses exceptions; it can raise things like ConfigError or ObjectError to signal error conditions. Secondly, there’s no explicit freeing of resources, since Ruby is garbage-collected. Let’s take a look at a slightly more complicated example: crafting a commit from scratch

blob_id = repo.write("Blob contents", :blob) (1)

index = repo.index
index.read_tree(repo.head.target.tree)
index.add(:path => 'newfile.txt', :oid => blob_id) (2)

sig = {
    :email => "bob@example.com",
    :name => "Bob User",
    :time => Time.now,
}

commit_id = Rugged::Commit.create(repo,
    :tree => index.write_tree(repo), (3)
    :author => sig,
    :committer => sig, (4)
    :message => "Add newfile.txt", (5)
    :parents => repo.empty? ? [] : [ repo.head.target ].compact, (6)
    :update_ref => 'HEAD', (7)
)
commit = repo.lookup(commit_id) (8)
  1. Create a new blob, which contains the contents of a new file.

  2. Populate the index with the head commit’s tree, and add the new file at the path newfile.txt.

  3. This creates a new tree in the ODB, and uses it for the new commit.

  4. We use the same signature for both the author and committer fields.

  5. The commit message.

  6. When creating a commit, you have to specify the new commit’s parents. This uses the tip of HEAD for the single parent.

  7. Rugged (and Libgit2) can optionally update a reference when making a commit.

  8. The return value is the SHA-1 hash of a new commit object, which you can then use to get a Commit object.

The Ruby code is nice and clean, but since Libgit2 is doing the heavy lifting, this code will run pretty fast, too. If you’re not a rubyist, we touch on some other bindings in Other Bindings.

Advanced Functionality

Libgit2 has a couple of capabilities that are outside the scope of core Git. One example is pluggability: Libgit2 allows you to provide custom “backends” for several types of operation, so you can store things in a different way than stock Git does. Libgit2 allows custom backends for configuration, ref storage, and the object database, among other things.

Let’s take a look at how this works. The code below is borrowed from the set of backend examples provided by the Libgit2 team (which can be found at https://github.com/libgit2/libgit2-backends). Here’s how a custom backend for the object database is set up:

git_odb *odb;
int error = git_odb_new(&odb); (1)

git_odb_backend *my_backend;
error = git_odb_backend_mine(&my_backend, /*…*/); (2)

error = git_odb_add_backend(odb, my_backend, 1); (3)

git_repository *repo;
error = git_repository_open(&repo, "some-path");
error = git_repository_set_odb(odb); (4)

(Note that errors are captured, but not handled. We hope your code is better than ours.)

  1. Initialize an empty object database (ODB) “frontend,” which will act as a container for the “backends” which are the ones doing the real work.

  2. Initialize a custom ODB backend.

  3. Add the backend to the frontend.

  4. Open a repository, and set it to use our ODB to look up objects.

But what is this git_odb_backend_mine thing? Well, that’s the constructor for your own ODB implementation, and you can do whatever you want in there, so long as you fill in the git_odb_backend structure properly. Here’s what it could look like:

typedef struct {
    git_odb_backend parent;

    // Some other stuff
    void *custom_context;
} my_backend_struct;

int git_odb_backend_mine(git_odb_backend **backend_out, /*…*/)
{
    my_backend_struct *backend;

    backend = calloc(1, sizeof (my_backend_struct));

    backend->custom_context = …;

    backend->parent.read = &my_backend__read;
    backend->parent.read_prefix = &my_backend__read_prefix;
    backend->parent.read_header = &my_backend__read_header;
    // …

    *backend_out = (git_odb_backend *) backend;

    return GIT_SUCCESS;
}

The subtlest constraint here is that my_backend_struct's first member must be a git_odb_backend structure; this ensures that the memory layout is what the Libgit2 code expects it to be. The rest of it is arbitrary; this structure can be as large or small as you need it to be.

The initialization function allocates some memory for the structure, sets up the custom context, and then fills in the members of the parent structure that it supports. Take a look at the include/git2/sys/odb_backend.h file in the Libgit2 source for a complete set of call signatures; your particular use case will help determine which of these you’ll want to support.

Other Bindings

Libgit2 has bindings for many languages. Here we show a small example using a few of the more complete bindings packages as of this writing; libraries exist for many other languages, including C++, Go, Node.js, Erlang, and the JVM, all in various stages of maturity. The official collection of bindings can be found by browsing the repositories at https://github.com/libgit2. The code we’ll write will return the commit message from the commit eventually pointed to by HEAD (sort of like git log -1).

LibGit2Sharp

If you’re writing a .NET or Mono application, LibGit2Sharp (https://github.com/libgit2/libgit2sharp) is what you’re looking for. The bindings are written in C#, and great care has been taken to wrap the raw Libgit2 calls with native-feeling CLR APIs. Here’s what our example program looks like:

new Repository(@"C:\path\to\repo").Head.Tip.Message;

For desktop Windows applications, there’s even a NuGet package that will help you get started quickly.

objective-git

If your application is running on an Apple platform, you’re likely using Objective-C as your implementation language. Objective-Git (https://github.com/libgit2/objective-git) is the name of the Libgit2 bindings for that environment. The example program looks like this:

GTRepository *repo =
    [[GTRepository alloc] initWithURL:[NSURL fileURLWithPath: @"/path/to/repo"] error:NULL];
NSString *msg = [[[repo headReferenceWithError:NULL] resolvedTarget] message];

Objective-git is fully interoperable with Swift, so don’t fear if you’ve left Objective-C behind.

pygit2

The bindings for Libgit2 in Python are called Pygit2, and can be found at http://www.pygit2.org/. Our example program:

pygit2.Repository("/path/to/repo") # open repository
    .head                          # get the current branch
    .peel(pygit2.Commit)           # walk down to the commit
    .message                       # read the message

Further Reading

Of course, a full treatment of Libgit2’s capabilities is outside the scope of this book. If you want more information on Libgit2 itself, there’s API documentation at https://libgit2.github.com/libgit2, and a set of guides at https://libgit2.github.com/docs. For the other bindings, check the bundled README and tests; there are often small tutorials and pointers to further reading there.