Calculating the GitHub API's sha representation of a local file
I've recently been doing some work to commit files directly to a GitHub repository using the API, as part of my work on dependency-management-data.
When you're updating a file with GitHub, you need to specify the sha of the file that you're updating, which can be retrieved in one of two ways:
org=deepmap
repo=oapi-codegen
filepath=README.md
# via the REST API
gh api \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
/repos/$org/$repo/contents/$filepath --jq '.sha'
# Outputs:
# 2d7636c681adda0369171751f8e908576b99b431
# alternatively via the GraphQL API
gh api graphql -f query='
{
repository(owner: "deepmap", name: "oapi-codegen") {
object(expression: "HEAD:README.md") {
... on Blob {
oid
}
}
}
}
'
# Outputs:
# {
# "data": {
# "repository": {
# "object": {
# "oid": "2d7636c681adda0369171751f8e908576b99b431"
# }
# }
# }
# }
But what happens if you want to compare this against your locally constructed file contents, for instance to avoid pushing a commit that's going to make no changes to the file?
As it wasn't initially straightforward, I decided to write it up as a form of blogumentation, in this case using Go.
Adapting this Ruby snippet, this Bash snippet and this comment, it turns out that GitHub doesn't do anything fancy, and instead relies upon Git's own SHA-1 blob hash, which is slightly different to just running sha1sum on the file: Git hashes the header blob <size in bytes>, followed by a NULL byte, and then the file contents.
Therefore we get the following function:
import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// calculateHashOfFile returns the Git blob SHA-1 of contents, as a hex string
func calculateHashOfFile(contents []byte) (string, error) {
	h := sha1.New()
	// the blob header: the object type, then the content length in bytes
	_, err := h.Write([]byte(fmt.Sprintf("blob %d", len(contents))))
	if err != nil {
		return "", fmt.Errorf("failed to calculate hash: %w", err)
	}
	// a NULL byte https://stackoverflow.com/a/55779582
	_, err = h.Write([]byte{byte(0)})
	if err != nil {
		return "", fmt.Errorf("failed to calculate hash: %w", err)
	}
	// and finally the file contents themselves
	_, err = h.Write(contents)
	if err != nil {
		return "", fmt.Errorf("failed to calculate hash: %w", err)
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}
When fed the contents of the same README.md, this returns the same hash as the API.
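To tie this back to the original motivation of avoiding no-op commits, here is a minimal sketch of how the comparison might look. The names gitBlobSHA and needsUpdate are hypothetical, the hash helper is a condensed variant of the function above, and the hard-coded sha stands in for a value you would fetch from the API: it is the well-known Git blob hash of a file containing just hello followed by a newline.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// gitBlobSHA (hypothetical name) hashes "blob <size>\x00<contents>" with
// SHA-1, which is how Git derives a blob's object ID.
func gitBlobSHA(contents []byte) string {
	h := sha1.New()
	// hash.Hash documents that Write never returns an error, so it is not checked here
	fmt.Fprintf(h, "blob %d\x00", len(contents))
	h.Write(contents)
	return hex.EncodeToString(h.Sum(nil))
}

// needsUpdate (hypothetical helper) reports whether the local contents differ
// from the blob that the remote sha identifies.
func needsUpdate(local []byte, remoteSHA string) bool {
	return gitBlobSHA(local) != remoteSHA
}

func main() {
	// Stand-in for the sha returned by the API: the Git blob hash of "hello\n"
	const remoteSHA = "ce013625030ba8dba906f756967f9e9ca394464a"

	fmt.Println(needsUpdate([]byte("hello\n"), remoteSHA))   // false: nothing to commit
	fmt.Println(needsUpdate([]byte("goodbye\n"), remoteSHA)) // true: contents changed
}
```

If needsUpdate returns false, the commit can be skipped entirely, as the file on the remote already has the same contents.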