10000 Clean up some error handling and change callback error behavior by arrbee · Pull Request #1986 · libgit2/libgit2 · GitHub
[go: up one dir, main page]

Skip to content

Clean up some error handling and change callback error behavior #1986

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Dec 13, 2013

Conversation

arrbee
Copy link
Member
@arrbee arrbee commented Dec 5, 2013

This PR has a few error handling improvements that I've been playing with. You folks may consider them to be too intrusive, but I figured I'd open it up and get the conversation started.

There are two main areas changed here:

  1. Creating internal APIs for looking up config values without a GIT_ENOTFOUND error, and
  2. Changing callback to pass the return code back to the user instead of converting it to GIT_EUSER.

For handling config value access, I added a new internal-only function:

extern int git_config__lookup_entry(
  const git_config_entry **out,
  const git_config *cfg,
  const char *key,
  bool no_errors);

This looks up a specific key (no normalization is applied, so it must already be in normal form) and returns the found entry or NULL if it was not found. Not finding the key is not considered an error. There is an extra no_errors parameter that suppresses all errors completely to silently continue in cases where there is config file corruption and an operation where accessing a config value is non-critical is performed. (This differs from core Git, but until we have a "warning" mechanism, I prefer it.)

There are also three helper functions to get a string, bool, and int from the config with a fallback value to use if the value is not found or if an error is encountered. These are named git_config__get_string_force, git_config__get_bool_force, and git_config__get_int_force. I'm not sure about those names, so opinions are welcome -- I'm was trying to imagine a --force flag being applied to config lookups.

In places where a known configuration setting is being looked up (i.e. the name of the config is not coming from user input) and a GIT_ENOTFOUND is to be ignored, I inserted this new function which has a streamlined path and leaves no error to be cleared.


I experimented with a number of different cleanups for callback functions and GIT_EUSER, initially trying to make sure we were reliably clearing the last error, stopping the operation in progress, and returning GIT_EUSER. However, after some discussion (see below in this PR), I ended up changing the fundamental behavior with callbacks so that we would instead stop the operation in progress and pass the user-returned error code back to the original caller. In these cases, I still put a call to a new function giterr_set_after_callback that sets a generic error message if the callback did not use giterr_set_str to set on explicitly. This way, there should always be some message set when a function returns non-zero, even if the source was a user callback.

I also added a couple of convenience functions giterr_capture and giterr_restore for internal use when we need to take a snapshot of the current error state on a thread and then restore it after making some function calls that could mess it up. This is convenient when you need to perform some cleanup after an error and/or cancel is detected.


There are a variety of other unrelated error handling fixes that I put in as I was going through the code. There were a number of functions that were either not checking allocations or were replacing error return codes with -1. Those fixes are scattered around.

I also fixed an apparent bug in git_branch_move where the config was updated before the reference was altered. As a result, if the reference change encountered a conflict, the config settings would be trashed and not restored. I swapped the order in which the steps are executed to fix this and added some new tests to cover this case.

I also fixed an inefficiency in reference_matches_remote_head inside the clone code where once a match was found, we continued to iterate over all reference names (just immediately returning from the callback) instead of returning something like GIT_ITEROVER to stop the iteration.


Anyhow, hopefully this hasn't gone too far off the tracks and there is some value in it. There are a number of other fix ups I'd still like to do (such as dealing with the GIT_ENOTFOUND from looking up environment files -- i.e. the system, global, and XDG files) and there are definitely more places where we are not propagating error codes correctly (I saw several in the networking code). However, I'd prefer not to let this grow to be too large (particularly since it gets conflicts easily) and see if we can get something mergeable that will be an incremental improvement for some of the error issues.

@carlosmn
Copy link
Member
carlosmn commented Dec 5, 2013

I'm not sure what core Git does in cases where there is config file corruption and an operation where accessing a config value is non-critical is performed.

Git will happily abort for non-critical failures. If your push.default is something that your version doesn't understand, you won't even be able to run git receive-pack.

@ethomson
Copy link
Member
ethomson commented Dec 5, 2013

So, I'm going to argue that we should go the other way and let callbacks return richer error codes than 0 or GIT_EUSER. Why should a callback not be able to signal us that it ran out of memory? Further, we have a strange dichotomy now between user-settable functions that can provide richer error codes (odb, refdb, transports) and the ones that cannot.

I think it may have made sense to me why we convert everything to GIT_EUSER, and I forgot, so maybe this is a dumb suggestion.

@arrbee
Copy link
Member Author
arrbee commented Dec 5, 2013

My original belief was that we should preserve user error codes and that they had to be carefully passed back to the caller. A long time ago I started to look at that (i.e. making sure we were passing them back & not converting to -1). That was when we ended up going to EUSER route as I recall.

When I'm not on mobile, I'll try to dig up the old PRs / Issues and we can revisit the discussion. Personally, I kind of like preserving the users return value and if the user wants to know for certain that their function generated the error and not libgit2, they can explicitly return GIT_EUSER. @vmg had good reasoning for the route we went I seem to remember (he won me over, at least, but that may have just been charm and compliments).

@vmg
Copy link
Member
vmg commented Dec 5, 2013

The argument back then is that custom-returned error codes are dangerous because the user couldn't giterr_set, so you could get back e.g. a GIT_ENOTFOUND with an empty global error, and you'd segfault.

Since then, we've started exporting giterr_set, so I maybe it would be worth it to revisit allowing users to return custom errors, if we document the fact that they need to manually set the corresponding error message.

@arrbee
Copy link
Member Author
arrbee commented Dec 5, 2013

One option is that I can alter the purpose giterr_user_cancel to that instead of returning GIT_EUSER and clearing the error, instead it preserves the user's error code and checks if the user set an error message. If they did not (i.e. we would return an error without a message), it can set the message to a generic "git_reference_foreach callback returned %d" sort of message.

@ethomson
Copy link
Member
ethomson commented Dec 5, 2013

@vmg, @arrbee this is glorious.

@arrbee
Copy link
Member Author
arrbee commented Dec 7, 2013

Okay, I think this is shaping up. There are really three main things left to do:

  • update the description of this PR to accurately describe the changes so it can be more meaningfully reviewed, and
  • update the documentation on functions taking callbacks describing the new behavior of reliably propagating user errors
  • test propagation a little more carefully because there are probably some places that still have it wrong

I think we could probably ship this without the third item and just fix those issues as they come up.

@arrbee
Copy link
Member Author
arrbee commented Dec 11, 2013

Can you tell that I'm going alphabetically through the callback-using APIs? :-)

@arrbee arrbee mentioned this pull request Dec 11, 2013
4 tasks
This adds `git_config__lookup_entry` which will look up a key in
a config and return either the entry or NULL if the key was not
present.  Optionally, it can either suppress all errors or can
return them (although not finding the key is not an error for this
function).  Unlike other accessors, this does not normalize the
config key string, so it must only be used when the key is known
to be in normalized form (i.e. all lower-case before the first dot
and after the last dot, with no invalid characters).

This also adds three high-level helper functions to look up config
values with no errors and a fallback value.  The three functions
are for string, bool, and int values, and will resort to the
fallback value for any error that arises.  They are:

* `git_config__get_string_force`
* `git_config__get_bool_force`
* `git_config__get_int_force`

None of them normalize the config `key` either, so they can only
be used for internal cases where the key is known to be in normal
format.
This adds giterr_user_cancel to return GIT_EUSER and clear any
error message that is sitting around.  As a result of using that
in places, we need to be more thorough with capturing errors that
happen inside a callback when used internally.  To help with that,
this also adds giterr_capture and giterr_restore so that when we
internally use a foreach-type function that clears errors and
converts them to GIT_EUSER, it is easier to restore not just the
return value, but the actual error message text.
This continues auditing all the places where GIT_EUSER is being
returned and making sure to clear any existing error using the
new giterr_user_cancel helper.  As a result, places that relied
on intercepting GIT_EUSER but having the old error preserved also
needed to be cleaned up to correctly stash and then retrieve the
actual error.

Additionally, as I encountered places where error codes were not
being propagated correctly, I tried to fix them up.  A number of
those fixes are included in the this commit as well.
There are a lot of places that we call git__free on each item in
a vector and then call git_vector_free on the vector itself.  This
just wraps that up into one convenient helper function.
This changes the behavior of callbacks so that the callback error
code is not converted into GIT_EUSER and instead we propagate the
return value through to the caller.  Instead of using the
giterr_capture and giterr_restore functions, we now rely on all
functions to pass back the return value from a callback.

To avoid having a return value with no error message, the user
can call the public giterr_set_str or some such function to set
an error message.  There is a new helper 'giterr_set_callback'
that functions can invoke after making a callback which ensures
that some error message was set in case the callback did not set
one.

In places where the sign of the callback return value is
meaningful (e.g. positive to skip, negative to abort), only the
negative values are returned back to the caller, obviously, since
the other values allow for continuing the loop.

The hardest parts of this were in the checkout code where positive
return values were overloaded as meaningful values for checkout.
I fixed this by adding an output parameter to many of the internal
checkout functions and removing the overload.  This added some
code, but it is probably a better implementation.

There is some funkiness in the network code where user provided
callbacks could be returning a positive or a negative value and
we want to rely on that to cancel the loop.  There are still a
couple places where an user error might get turned into GIT_EUSER
there, I think, though none exercised by the tests.
I find this easier to read...
Okay, I've decided I like the readability of this style much
better so I used it everywhere.
The callback to supply data chunks could return a negative value
to stop creation of the blob, but we were neither using GIT_EUSER
nor propagating the return value.  This makes things use the new
behavior of returning the negative value back to the user.
The checkout notify callback behavior on non-zero return values
was not being tested.  This adds tests, fixes a bug with positive
values, and clarifies the documentation to make it clear that the
checkout can be canceled via this mechanism.
Clone callbacks can return non-zero values to cancel the clone.
This adds some tests to verify that this actually works and updates
the documentation to be clearer that this can happen and that the
return value will be propagated back by the clone function.
This time actually checking return values in diff notify tests and
actually testing callbacks for the index all-all/update-all/etc
functions.
@arrbee
Copy link
Member Author
arrbee commented Dec 11, 2013

A little rebasing and a few more tests... Test coverage for a lot of the callback usage cases, particularly testing what happens when a non-zero value is returned, it pretty lacking.

This adds tests that try canceling an indexer operation from
within the progress callback.

After writing the tests, I wanted to run this under valgrind and
had a number of errors in that situation because mmap wasn't
working.  I added a CMake option to force emulation of mmap and
consolidated the Amiga-specific code into that new place (so we
don't actually need separate Amiga code now, just have to turn on
-DNO_MMAP).

Additionally, I made the indexer code propagate error codes more
reliably than it used to.
@arrbee
Copy link
Member Author
arrbee commented Dec 12, 2013

Testing cancellation from inside an indexer progress callback turned out to be a bit painful. I wanted a flag that would using fake mmap for testing in valgrind, and since that code already existed in the Amiga support, I just decided to make it into a "supported" option -DNO_MMAP which could be activated for the Amiga platform, but also on any platform if requested (e.g. during a valgrind run where I don't want mmap to come into play). As a result, I deleted the src/amiga stuff and just made the Amiga build use the new flag. (I don't actually have an Amiga to test on, but I tested the -DNO_MMAP flag in my local valgrind testing.)

@ethomson
Copy link
Member

Until you can confirm that we build and test cleanly on Amiga, I'm sort of lukewarm on merging this. I hate to accrue technical debt on that platform.

@@ -178,34 +168,30 @@ static int update_head_to_new_branch(

static int update_head_to_remote(git_repository *repo, git_remote *remote)
{
int retcode = -1;
int error = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THANK YOU.

@ethomson
Copy link
Member

So... you missed one. And I'm not going to tell you where it is.

No, but for real: this looks amazing to me. I wouldn't mind taking one more scan through diff_patch because holy cow that's over my head. But I don't think I'm going to find anything surprising.

@arrbee I owe you a beer (or several) next semester.

@arrbee
Copy link
Member Author
arrbee commented Dec 12, 2013

I wouldn't mind taking one more scan through diff_patch because holy cow that's over my head.

That's a good sign that the code wasn't written too well! 😭

Thanks for going through this all! I've got a couple more tests that I've written and I agree that the git_vector_free_all -> git_vector_free_deep is a good rename. Thanks!!! 😻

@ethomson
Copy link
Member

Wait, I just got out of my meeting and looked again, and I don't know what I was talking about with diff_patch. I think I saw that we added an assert and wanted to better understand that. I dunno.

So ignore me, it looks great.

This covers diff print, push, and ref foreach.  This also has a
fix for a small memory leak in the push tests.
This renames git_vector_free_all to the better git_vector_free_deep
and also contains a couple of memory leak fixes based on valgrind
checks.  The fixes are specifically: failure to free global dir
path variables when not compiled with threading on and failure to
free filters from the filter registry that had not be initialized
fully.
This wasn't being tested and since it has a callback, I fixed it
even though the return value of this callback is not treated like
any of the other callbacks in the API.
@arrbee
Copy link
Member Author
arrbee commented Dec 12, 2013

Okay, so I went through all the callbacks in the headers and tried to make sure we had some test coverage for almost all of them especially cases where an error return value would cancel the operation in progress. I then went through those cancelation cases with valgrind to make sure that canceling mid-operation didn't result in memory leaks, and I fixed up the small number of leaks that I found.

I don't have anything else I plan to add to this PR at this point.

@ethomson
Copy link
Member

@arrbee you are a madman. I think you just rewrote the entirely library. This is just crazy great.

Note that `git_pool_strdup` cannot really return any error codes,
 because the pool doesn't set errors on OOM.

 The only place where `giterr_set_oom` is called is in
 `git_pool_strndup`, in a conditional check that is always optimized
 away. `n + 1` cannot be zero if `n` is unsigned because the compiler
 doesn't take wraparound into account.

 This check has been removed altogether because `size_t` is not
 particularly going to overflow.
@vmg
Copy link
Member
vmg commented Dec 13, 2013

I've committed a refactoring for the only thing that was bothering me. ^^

Does anybody have any other complaints? I think this looks great. :)

@vmg
Copy link
Member
vmg commented Dec 13, 2013

Aw, I've just seen @arrbee is taking today off. Fuck it, not gonna leave this open over the weekend, this is too good not to merge. :)

vmg added 2 commits December 13, 2013 12:41
Ok, scrap the previous commit. This is the right overflow check that
takes care of 64 bit overflow **and** 32-bit overflow, which needs to be
considered because the pool malloc can only allocate 32-bit elements in
one go.
The size_t is 32-bit already, so it overflows before going into the
function. The `-1` test should handle this gracefully in both cases
anyway.
@vmg
Copy link
Member
vmg com A84B mented Dec 13, 2013

Man, this was quite a fight with the overflows. Fuck itttt

vmg pushed a commit that referenced this pull request Dec 13, 2013
Clean up some error handling and change callback error behavior
@vmg vmg merged commit 79194bc into development Dec 13, 2013
@arrbee arrbee deleted the rb/error-handling-cleanups branch February 3, 2014 18:27
phatblat pushed a commit to phatblat/libgit2 that referenced this pull request Sep 13, 2014
Clean up some error handling and change callback error behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0