8000 Hook support by tiennou · Pull Request #3824 · libgit2/libgit2 · GitHub
[go: up one dir, main page]

Skip to content

Hook support #3824

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 44 commits into from
Closed

Hook support #3824

wants to merge 44 commits into from

Conversation

tiennou
Copy link
Contributor
@tiennou tiennou commented Jun 16, 2016

This is a first pass at supporting hooks. It aims to provide the basic blocks (as per PROJECTS.md), so enumeration, loading and saving is supported, and there's a quick skeleton of how I'm planning to handle hook execution.

A few points/questions :

  • I was under the impression there had already been some discussion around the design, but I haven't been able to find them.
  • The list of hooks is hardcoded. This is completely debatable, but given variable argument count, I thought something that acts as a "we support these hooks" was preferable to YMMV.
  • I seem to remember bindings don't like callbacks (and me too, given my current enumeration test is human-based 😉). Should I provide an alternate way (e.g. git_hook_iterator_new()) ?
  • The plan for actually executing the hooks is to have something like git_hook__execute(hook_name, ...) and sprinkle those everywhere, but it might or might not be doable — I haven't been that far yet.

@whoisj
Copy link
8000
whoisj commented Jun 16, 2016

Some hook support already exists. The trick with hooks is that they're basically shell scripts and require the platform to support executing then.

@tiennou
Copy link
Contributor Author
tiennou commented Jun 16, 2016

Yep, I'm aware that direct hook execution does not belong in the library. The current plan is to have clients use git_hook_register to say they're interested in hook notifications, and then (maybe ?) provide the full path to the script to execute if applicable, or even no path at all if we want to support that.

Also, what did you meant by "already exists" ? I haven't been able to find anything other than some references in tests, but maybe I'm blind 😉 .

@whoisj
Copy link
whoisj commented Jun 16, 2016

Also, what did you meant by "already exists" ?

I'm sorry, the hook support is in the C# wrapper library, not in LibGit2 core. My mistake.

@ethomson
Copy link
Member
ethomson commented Jun 17, 2016

It feels to me like you're trying to design the support in libgit2 that you think you want before you've tried to implement any hooks in your application itself. I feel like trying to build "hooks support" into libgit2 before trying to get your application to do anything is actually the wrong direction to attack this problem. Instead, I think that you should get your application to exec some hooks so that you know what you can get libgit2 to help you with.

I realize it seems counterintuitive to do it in the application first, since generally you're coding a functionality or a UI on top of libgit2 and in that case the libgit2 part must already exist. But without having actually succeeded in running a hook from your application, I think that we will merely be making wild guesses as to what we want the hooks support (if any) to look like and will very likely design this wrong.

As a concrete example, it's unclear to me how git_hook_load would be a useful addition to the library. You want to run a hook, you never want to load a hook into memory. It's an executable - probably (but not necessarily) a shell script, but it could be an ELF binary (or COFF, or a.out, or whatever). Heck, it could be a Java class and you're relying on binfmt_misc to know how to execute it for you. The point is, you need to exec it, you don't want to have a copy of the image in memory.

Similarly, most hooks don't make sense in the context of libgit2. Things like the prepare-commit-msg and commit-msg hooks are things that could never possibly be run, since we don't launch an editor to get the commit message.

We think that we have callbacks in the appropriate places that you can map those to hook execution if that's something that you're interested in (though they are not called "hooks" callbacks, they are generally more generic than that.) I think the best first place to start is to identify hooks that you want to support that are lacking callbacks (that can be supported, that is, unlike the aforementioned hooks around commit messages). That's a clear improvement.

libgit2 certainly can provide some support for hooks, like providing the path. This seems of limited use to me, since it's mostly just string concatenation, (.git/hooks/commit-msg) But I can understand wanting (say) to be able to query for that path to reduce the amount of git repository-type business logic you have in your application. Similarly, I can imagine asking for a map of the hooks that exist in .hooks. Again, this is a relatively trivial readdir but I can imagine wanting an API to provide some abstraction here.

Anyway. There are probably some things that libgit2 can help you with here, but I would like to be conservative with these APIs.

@tiennou
Copy link
Contributor Author
tiennou commented Jun 17, 2016

Thanks for the thorough reply @ethomson. Just so we're clear, the API I wrote is inspired from what's in PROJECTS.md, and git_hook_load/save is suggested there. As you point out (and I do had a pause when I sketched the API), it might not make sense every time, and I'm now tempted to drop it completely (maybe keeping just a few helpers for path-wrangling)...

Looking at that as an app developer, what I want is a way to be told when I need to exec a hook, because I want to be compatible with git/not break expectations. For example, I would expect (and I might be wrong 😉) libgit2 to detect that there's a commit-msg hook set up, trigger a callback to my hook execution function*, and abort or proceed with committing based on what was reported by the hook. It should not allow the commit to go through without telling me about it. I do see how the prepare-commit-msg would be weird to handle though (I'm not even sure if we support commit.template).

  • This is the wacky part, because we could either pass data that makes sense to us (like in that case the commit message), or helpfully write a file and pass that so the execution function doesn't have to do it (which IMHO reduces duplication).

But then, maybe that whole responsibility could indeed be moved down to the application (provided the same effect can be achieved through the generic callbacks). I might be worrying too much about duplicated code 😉.

@ethomson
Copy link
Member

Just so we're clear, the API I wrote is inspired from what's in PROJECTS.md, and git_hook_load/save is suggested there.

Thanks for the clarification - I didn't realize that was there. :P

For example, I would expect (and I might be wrong 😉) libgit2 to detect that there's a commit-msg hook set up, trigger a callback to my hook execution function*, and abort or proceed with committing based on what was reported by the hook.

Right - so the thing is that we don't have an API that maps to the git commit command. We have a handful of git_commit_create commands, but these correspond more closely to the git-commit-tree plumbing command.

This is important because if we were to (say) fire the commit-msg command in the git_commit_create functions, then we would be firing it every time a commit was created, not every time the user was "performing a commit" (like what git commit does). So, during merges and rebases. If you had some commit-msg hook that actually validated the contents of a message, then we would fire it when we create recursive merge bases (which have totally bogus internal messages, since they're not intended to leak out). It would be quite disastrous. :/

This is why we've asked the callers to handle hooks-like things: you know when you're going through a commit workflow like the git commit command line, versus creating commits that are (say) part of a merge or a cherry-pick operation.

Of course, the problem here is that various other libraries (say, LibGit2Sharp) may provide a commit like workflow as API. So this makes the problem a little bit more difficult.

Anyway. I'm sorry that I don't have clearer answers here yet. This is something that's been discussed a few times, and I think that all the vendors who are building UI on top of libgit2 should absolutely be supporting. As a result, I would love to give as much help as we can to you (and the rest) so that it's less of a slog.

@tiennou
Copy link
Contributor Author
tiennou commented Jun 27, 2016

Thanks, that makes it clear why hook support is a hard problem. IIUC, if libgit2 had a git_perform_commit (that would be considered "porcelain"/high-level) it would make sense.

I'll go ponder what the bindings are currently up to, to see if there's any common infrastructure we could bring in, but arguably this API is bound to be a simple wrapper around exec/popen/Windows-y stuff with some sugar on top, like git_hook_exec("update", "the-ref", "bda34...", "eff453...", my_execution_callback).

@tiennou
Copy link
Contributor Author
tiennou commented Jun 28, 2016

#964 (I knew there had been some discussion already).

@tiennou tiennou closed this Jun 28, 2016
@tiennou tiennou deleted the hooks branch June 28, 2016 19:18
@tiennou tiennou restored the hooks branch June 28, 2016 19:18
@tiennou tiennou reopened this Jun 28, 2016
@tiennou
Copy link
Contributor Author
tiennou commented Jun 28, 2016

Here's a first pass at hook execution. It allows the client to register a callback for hook execution and a wrapper that we (or, for some hooks, the client) can call to cause that callback to be executed with the expected arguments after the checks are done.

This still needs to tackle the case of the few hooks that do I/O, but I think I'll just pass a git_buf in the git_hook_env and let the client do its thing.

I'm saddened by the fact that this is so STRINGS EVERYWHERE, with additional NULL SENTINELS 😅 . There might be a way to make it cleaner, but as "conservative" is a keyword, let's keep it simple for now.

It's related to the point above, but there will be times where the client has OIDs and git_bufs, and the hook expects SHAs and filenames. I was thinking that there could be some wrapper (for example git_hook_execute_pre_push("local-ref", "remote-ref", ...) and have that automatically build the required strings for execution. But again, "conservative".

Right now the code has a list of hardcoded hook names, but there might be a way to make that more generic (just found out about git_path_diriter). And the generic execution call could be used by clients who which to call those (with the paincaveats of argument-building bestowed upon them).

git_repository *repo,
git_hook_foreach_cb callback,
void *payload
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our coding style has the closing brace on the same line as the last argument. Also applies for the following functions.

@pks-t
Copy link
Member
pks-t commented Nov 1, 2016

Hummm... I think we shouldn't actually worry about git.git deprecating certain hooks. For one, I think it rather unlikely that they'd deprecate hooks as there will always be use-cases which are broken by removing a hook. And furthermore, git.git always pays strong attention not to remove existing features.

And even in the case that git.git does deprecate a hook I think there's a big benefit in having an enum here. If the client will just compare strings, he cannot easily know when a hook has been deprecated and may fail silently. But in the case of an enum which subsequently got removed due to deprecation, the client will fail to compile. While I agree it is not a nice thing to break API I deem it better still than causing a client to silently fail.

Copy link
Member
@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just skipped the changes. There's a compilation error on MSVC currently

clar__assert_equal(file, line, "Vector length mismatch: %" PRIuZ " != %" PRIuZ, 1, PRIuZ, git_vector_length(a), git_vector_length(b));

size_t i;
char *string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is illegal in C90. Declarations have to come before all statements.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And by the way, there's some compilation error on Windows with MSVC here. Seems to be related to size_t, but I didn't really take a look into it.

const char * test_str = "a test string";
char **malloc_str = git__malloc(sizeof(*malloc_str));
*malloc_str = git__calloc(strlen(test_str), sizeof(*malloc_str));
strcpy(*malloc_str, test_str);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just have a normal char *malloc_str and pass that to the callback. No need to allocate a pointer to a string here. The callback would then directly get the string and cast it to char *.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I seem to remember having a hard time not freeing a stack-allocated string, hence the mess. I'll try to clean that up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, looking at it, I'm not even sure what's the point of building something to pass to the destructor. The test is only concerned with making sure its called, freeing things is not its responsibility.

cl_git_pass(git_config_set_string(cfg, "core.hooksPath", ALT_HOOK_DIR));

cl_git_pass(git_hook_dir(&alt_hook_str, g_repo));
cl_git_pass(git_buf_sets(&alt_hook, alt_hook_str));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're leaking alt_hook_str. You probably want git_buf_attach(&alt_hook, alt_hook_str, strlen(alt_hook_str) + 1) here, which will also free the string later on.

@tiennou
Copy link
Contributor Author
tiennou commented Nov 7, 2016

I've added sys/hook.h, with the goal of having the "higher-level" versions of hooks there, as a bunch of git_hook_call_* functions that the client can use to trigger hook execution (a missing hook is ignored, but it could be changed to GIT_ENOTFOUND if that's better). Now that I think about it, I feel like the current headers should be reversed — hook.h contains the nice, supported versions, and sys/hook.h has the low-level git_hook_execute function and friends...

I hooked up (pun intended) pre-rebase and added a quick test to see how it works and check that a hook can prevent a rebase. I'm completely abusing the executor to do that but well...

As for the enum vs strings discussion, providing those higher-level functions make the enum useless from the executor standpoint. As I see it, it's only concerned with executing the given hook, with the given arguments and optional input.

If you (still) like where this is going, I'll add more of the high-level functions. For th E5FC e record, here is my current FIXME comment 😉 :

/**
 * FIXME: Unsupported hooks :
 *
 * - applypatch-msg, pre-applypatch, post-applypatch: no git-am.
 * - pre-commit, prepare-commit-msg, commit-msg, post-commit: our "commit workflow"
 * doesn't match git.
 * - post-checkout: the "normal checkout" looks like a "no matching workflow" too,
 * as our checkout doesn't work on references, but directly on HEAD/index/tree.
 * The "clone checkout" looks ok though.
 * - post-merge: no git-pull.
 * - pre-push: looks ok-ish.
 * - pre-receive, update, post-receive, post-update, push-to-checkout:
 * no git-receive-pack.
 * - pre-auto-gc: no git-gc.
 * - post-rewrite: looks ok-ish.
 */

@tiennou
Copy link
Contributor Author
tiennou commented Apr 11, 2018

Closing in favor of #4620.

@tiennou tiennou closed this Apr 11, 2018
@tiennou tiennou deleted the hooks branch June 3, 2018 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0