Git and libgit2
Mike Slinn

Divergent Libgit2 Library Wrappers

Published 2023-05-12. Last modified 2023-05-30.

My Introduction to Libgit2 article mentioned several language bindings (wrappers) for libgit2.

My Merge and Pull: git CLI vs. Libgit2 Wrappers article discusses the mechanics of git merging and related concepts.

My libgit2 Examples article explores the C code examples written by the libgit2 team to demonstrate how to use their library.

Each Language Binding Provides A Different API

Libgit2 is a low-level API, providing the ‘plumbing’ that is common to all of the language-specific wrappers. However, the git CLI also provides a set of high-level (‘porcelain’) commands, which are not implemented by libgit2. Examples are git-push, git-pull and git-merge.

These high-level commands are user-friendly and would also be very useful if libgit2 implemented them. That is not the case, unfortunately, and there does not seem to be a high level of interest by the libgit2 developers in providing them.

Many of the language wrapper libraries have some degree of support for the high-level functionality. Unfortunately, there is no coordination among the development communities that have formed around the language-specific libraries. This means that the work now in progress by the various groups of developers, which aims to eventually provide functionality equivalent to the git CLI, does not share a common API. In some cases, the approaches are completely different.

As a result, progress made by a team working on one libgit2 language wrapper is not very helpful to teams working on other language wrappers.

2023-05-30 Update

A few weeks after publishing this article I found the libgit2 example code.

These examples are a mixture of basic emulation of core Git command line functions and simple snippets demonstrating libgit2 API usage (for use with Docurium). As a whole, they are not vetted carefully for bugs, error handling, and cross-platform compatibility in the same manner as the rest of the code in libgit2, so copy with caution.

That being said, you are welcome to copy code from these examples as desired when using libgit2. They have been released to the public domain, so there are no restrictions on their use.

 – From libgit2 examples README

I have not had time to see how these code examples compare to the production code for other language bindings, discussed below.

Examples: Git-Merge and Git-Pull

For example, the git-merge implementation in pygit2, the wrapper library for Python, uses a completely different concept from that of rugged, the wrapper library for Ruby.

Neither library provides a git-pull work-alike. Although many have stated that “git-pull is just git-fetch followed by git-merge”, that is a radical oversimplification that does not properly describe the work git-pull actually does.

Michael Boselowitz implemented git-pull for pygit2. He naturally used pygit2’s Repository.merge method; look for the call to repo.merge(remote_master_id) in the following code.

Python
def pull(repo, remote_name='origin', branch='master'):
    for remote in repo.remotes:
        if remote.name == remote_name:
            remote.fetch()
            # Use remote_name rather than a hard-coded 'origin'
            remote_master_id = repo.lookup_reference('refs/remotes/%s/%s' % (remote_name, branch)).target
            merge_result, _ = repo.merge_analysis(remote_master_id)
            # Up to date, do nothing
            if merge_result & pygit2.GIT_MERGE_ANALYSIS_UP_TO_DATE:
                return
            # We can just fastforward
            elif merge_result & pygit2.GIT_MERGE_ANALYSIS_FASTFORWARD:
                repo.checkout_tree(repo.get(remote_master_id))
                try:
                    master_ref = repo.lookup_reference('refs/heads/%s' % (branch))
                    master_ref.set_target(remote_master_id)
                except KeyError:
                    repo.create_branch(branch, repo.get(remote_master_id))
                repo.head.set_target(remote_master_id)
            elif merge_result & pygit2.GIT_MERGE_ANALYSIS_NORMAL:
                repo.merge(remote_master_id)

                if repo.index.conflicts is not None:
                    for conflict in repo.index.conflicts:
                        print('Conflicts found in:', conflict[0].path)
                    raise AssertionError('Conflicts, ahhhhh!!')

                user = repo.default_signature
                tree = repo.index.write_tree()
                commit = repo.create_commit('HEAD',
                                            user,
                                            user,
                                            'Merge!',
                                            tree,
                                            [repo.head.target, remote_master_id])
                # We need to do this or git CLI will think we are still merging.
                repo.state_cleanup()
            else:
                raise AssertionError('Unknown merge analysis result')
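
The order of the tests in the code above matters, because the merge analysis result is a bitmask, not a single value. The following is a minimal sketch of that dispatch, written without pygit2 so it can be run anywhere. MergeAnalysis and choose_pull_action are hypothetical names; the bit values are assumed to mirror libgit2's git_merge_analysis_t. As far as I can tell, libgit2 can report a fast-forwardable merge with both the fast-forward and normal bits set, which is why fast-forward is tested first.

```python
from enum import IntFlag


class MergeAnalysis(IntFlag):
    """Stand-in for libgit2's merge analysis bit flags (values are an assumption)."""
    NONE = 0
    NORMAL = 1 << 0
    UP_TO_DATE = 1 << 1
    FASTFORWARD = 1 << 2
    UNBORN = 1 << 3


def choose_pull_action(analysis: MergeAnalysis) -> str:
    """Map a merge-analysis bitmask to the branch that pull() above would take."""
    if analysis & MergeAnalysis.UP_TO_DATE:
        return "nothing"
    # Tested before NORMAL: a fast-forwardable result may also carry the NORMAL bit.
    if analysis & MergeAnalysis.FASTFORWARD:
        return "fast-forward"
    if analysis & MergeAnalysis.NORMAL:
        return "merge-commit"
    # Anything else (for example UNBORN) is not handled by the prototype.
    raise ValueError(f"Unhandled merge analysis result: {analysis!r}")


if __name__ == "__main__":
    print(choose_pull_action(MergeAnalysis.FASTFORWARD | MergeAnalysis.NORMAL))
```

Note that an unborn HEAD (a repository with no commits yet) ends up in the error branch here, just as it does in the prototype above.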

I wrote a literal translation of Michael Boselowitz’s Python code to Ruby, using rugged. As mentioned, rugged for Ruby implements merge differently. The problematic line is the call to repo.merge in the :normal branch of the case statement; scroll the source listing to see it.

Ruby
require 'rainbow/refinement'
require 'rugged'
require_relative 'credentials'
require_relative 'repository'

class GitUpdate
  using Rainbow

  abort "Error: Rugged was not built with ssh support. Please see https://www.mslinn.com/git/4400-rugged.html".red \
    unless Rugged.features.include? :ssh

  # Just update the default branch
  def pull(repo, remote_name = 'origin') # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
    remote = repo.remotes[remote_name]
    unless remote.respond_to? :url
      puts "  Remote '#{remote_name}' has no url defined. Skipping this repository."
      return
    end
    puts "  remote.url=#{remote.url}".yellow
    default_branch = repo.head.name.split('/')[-1]
    refspec_str = "refs/remotes/#{remote_name}/#{default_branch}"
    begin
      success = remote.check_connection(:fetch, credentials: select_credentials)
      unless success
        puts "  Error: remote.check_connection failed.".red
        return
      end
      remote.fetch(refspec_str, credentials: select_credentials)
    rescue Rugged::NetworkError => e
      puts "  Error: #{e.full_message}".red
      return # Nothing was fetched, so there is nothing to merge
    end
    abort "Error: repo.ref(#{refspec_str}) for #{remote} is nil".red if repo.ref(refspec_str).nil?
    remote_master_id = repo.ref(refspec_str).target
    merge_result, = repo.merge_analysis remote_master_id

    case merge_result
    when :up_to_date
      # Nothing needs to be done
      puts "  Repo at '#{repo.workdir}' was already up to date.".blue.bright

    when :fastforward
      repo.checkout_tree(repo.get(remote_master_id))
      master_ref = repo.lookup_reference "refs/heads/#{default_branch}"
      master_ref.set_target remote_master_id
      repo.head.set_target remote_master_id

    when :normal
      repo.merge remote_master_id # rugged does not have this method
      raise "Problem: merging updates for #{repo.name} encountered conflicts".red if repo.index.conflicts?

      user = repo.default_signature
      tree = repo.index.write_tree
      repo.create_commit 'HEAD', user, user, 'Merge', tree, [repo.head.target, remote_master_id]
      repo.state_cleanup
    else
      # Ruby has no AssertionError class; raise a RuntimeError instead
      raise "Unknown merge analysis result: #{merge_result}".red
    end
  end

  def update_via_rugged(dir_name)
    repo = Rugged::Repository.new dir_name
    pull repo
  rescue StandardError => e
    puts "Ignoring #{dir_name} due to error: #{e.full_message}".red
  end
end

Unfortunately, rugged does not offer a method similar to pygit2’s Repository.merge. Instead, it provides Tree.merge (written in C), which accepts completely different parameter types.

Here is an example of how to merge two trees using rugged:

Ruby
tree = Rugged::Tree.lookup(repo, "d70d245ed97ed2aa596dd1af6536e4bfdb047b69")
diff = tree.diff(repo.index)
diff.merge!(tree.diff)

This means the code I wrote above (the literal translation) does not work, because these two language wrappers for libgit2 have diverged.
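
To make the difference in shape concrete, here is a toy model of a tree-level, three-way merge. It is only a sketch under stated assumptions: a tree is modelled as a dict that maps a path to a blob id, and merge_trees is a hypothetical function, not the algorithm used by rugged or libgit2. The point is the signature. A tree-level merge takes trees (ancestor, ours, theirs) and returns a result, whereas the Python prototype above passes a single commit id to repo.merge and then reads the outcome back from repo.index.

```python
from typing import Dict, Set, Tuple

Tree = Dict[str, str]  # path -> blob id (toy model of a git tree)


def merge_trees(ancestor: Tree, ours: Tree, theirs: Tree) -> Tuple[Tree, Set[str]]:
    """Three-way merge of toy trees; returns (merged tree, conflicting paths)."""
    merged: Tree = {}
    conflicts: Set[str] = set()
    for path in sorted(set(ancestor) | set(ours) | set(theirs)):
        base, a, b = ancestor.get(path), ours.get(path), theirs.get(path)
        if a == b:
            result = a          # both sides agree (including both deleted)
        elif a == base:
            result = b          # only theirs changed the path
        elif b == base:
            result = a          # only ours changed the path
        else:
            conflicts.add(path)  # both sides changed the path differently
            continue
        if result is not None:   # None means the path was deleted
            merged[path] = result
    return merged, conflicts


if __name__ == "__main__":
    base = {"a": "1", "b": "1", "c": "1"}
    mine = {"a": "2", "b": "1", "c": "3"}
    yours = {"a": "1", "b": "1", "c": "4", "d": "9"}
    print(merge_trees(base, mine, yours))
```

A caller of a function shaped like this must find the merge base and look up the three trees itself, and must then decide what to do with the returned result. A repository-level merge hides those steps, which is why code written against one style does not translate line for line to the other.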

Michael Boselowitz’s Python code does not handle all possible scenarios. He provided valuable prototype code, not production code. I have attempted to illustrate important issues in the articles I write, but I do not pretend to publish production code in these articles either. For more information, including a discussion of the missing scenarios, please read the aforementioned article Merge and Pull: git CLI vs. Libgit2 Wrappers.

I will have to significantly rewrite my Ruby code to accommodate the differences between implementations of higher-level git APIs for the various language binding libraries.

Progress would happen much faster if the communities downstream from libgit2 co-operated in a standards effort, and/or the libgit2 community authorized an architect/evangelist to help standardize higher-level APIs.

Progress made by any project downstream would be available, after translation, to all other projects. A lucid dream worthy of taking action on.
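
As a rough illustration of what a shared higher-level API might look like, the sketch below defines the pull logic once, in terms of a few primitives that each binding would supply. Everything here is hypothetical: GitPorcelain, its method names, and InMemoryBackend are invented for this example and do not exist in libgit2, pygit2 or rugged. The pull method deliberately follows the same simplified fetch, analyze and dispatch flow as the prototypes above, so it shares their missing scenarios.

```python
from abc import ABC, abstractmethod


class GitPorcelain(ABC):
    """Hypothetical shared high-level API; each binding supplies the primitives."""

    @abstractmethod
    def fetch(self, remote: str, branch: str) -> str:
        """Fetch and return the id of the remote branch tip."""

    @abstractmethod
    def analyze(self, their_id: str) -> str:
        """Return 'up_to_date', 'fastforward' or 'normal'."""

    @abstractmethod
    def fast_forward(self, branch: str, their_id: str) -> None:
        """Move the local branch to their_id."""

    @abstractmethod
    def merge_commit(self, branch: str, their_id: str) -> None:
        """Merge their_id into the local branch and record a merge commit."""

    def pull(self, remote: str = "origin", branch: str = "master") -> str:
        """Written once; identical for every backend."""
        their_id = self.fetch(remote, branch)
        outcome = self.analyze(their_id)
        if outcome == "fastforward":
            self.fast_forward(branch, their_id)
        elif outcome == "normal":
            self.merge_commit(branch, their_id)
        elif outcome != "up_to_date":
            raise ValueError(f"Unknown merge analysis result: {outcome}")
        return outcome


class InMemoryBackend(GitPorcelain):
    """Toy backend used only to exercise pull(); not a real binding."""

    def __init__(self, local_tip: str, remote_tip: str, can_fast_forward: bool):
        self.local_tip = local_tip
        self.remote_tip = remote_tip
        self.can_fast_forward = can_fast_forward

    def fetch(self, remote: str, branch: str) -> str:
        return self.remote_tip

    def analyze(self, their_id: str) -> str:
        if their_id == self.local_tip:
            return "up_to_date"
        return "fastforward" if self.can_fast_forward else "normal"

    def fast_forward(self, branch: str, their_id: str) -> None:
        self.local_tip = their_id

    def merge_commit(self, branch: str, their_id: str) -> None:
        self.local_tip = f"merge({self.local_tip},{their_id})"


if __name__ == "__main__":
    backend = InMemoryBackend("a1", "b2", can_fast_forward=True)
    print(backend.pull(), backend.local_tip)
```

With an agreed set of primitives like these, a fix to pull made by one community would only need to be translated, not redesigned, for the others.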