-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Add example program to emulate 'git log' and create pathspec matching API #1711
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This adds a new public API for compiling pathspecs and matching them against the working directory, the index, or a tree from the repository. This also reworks the pathspec internals to allow the sharing of code between the existing internal usage of pathspec matching and the new external API. While this is working and the new API is ready for discussion, I think there is still an incorrect behavior in which patterns are always matched against the full path of an entry without taking the subdirectories into account (so "s*" will match "subdir/file" even though it wouldn't with core Git). Further enhancements are coming, but this was a good place to take a functional snapshot.
This adds more command line processing to the example version of log. In particular, this adds the funky command line processing that allows an arbitrary series of revisions followed by an arbitrary number of paths and/or glob patterns. The actual logging part still isn't implemented.
This fixes the way the example log program decides if a merge commit should be shown when a pathspec is given. Also makes it easier to use the pathspec API to just check "does a tree match anything in the pathspec" without allocating a match list.
if (parent_count < 1) | ||
parent_count = 1; | ||
|
||
if (git_vector_init(&commit->parent_ids, parent_count, NULL) < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be an ideal place to use the typedef array
you created. The vector here has always felt overkill.
This converts the array of parent SHAs from a git_vector where each SHA has to be separately allocated to a git_array_t where all the SHAs can be kept in one block. Since the two collections have almost identical APIs, there isn't much involved in making the change. I did add an API to git_array_t so that it could be allocated at a precise initial size.
@vmg I updated the commit object to keep a |
@nulltoken This work includes a new public API for "compiling" a pathspec and matching it against the working directory, the index, or a given tree - see https://github.com/libgit2/libgit2/pull/1711/files#diff-6 - I'd really love your feedback if you have a chance. I tried to cover various use cases like "just tell if it matches or not", "tell me the list of files that match", and "tell me the list of patterns that had no match." I'm don't think that the various case-insensitivity issues are really addressed at this point, but I think that fixing them will be part of an eventual fix for #1689 and not part of this PR. |
* Compile a pathspec | ||
* | ||
* @param out Output of the compiled pathspec | ||
* @param flags Combination of git_pathspec_flag_t values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused parameter?
Thanks @nulltoken - I fixed that problem and a number of other doc issues in the header (and tried to make the explanations clearer in general). The docs could probably still be even better, but I think at least there are no outright errors now. |
There is one more change I'd like to make to this PR based on looking at some of the old issues that are open related to implement log behavior. I'd like to add a Anyhow, I'm already working on this extension, but I'd like to hold off on merging this PR until I have it ready! |
Adding the function to match a pathspec to a diff has led to a lot of refactoring so that new function won't have to duplicate all the code in the old function. It's fine but it's just a little bigger project that I was initially thinking. Still, I think it will improve the code in the long run. |
This is a simple bit vector object that is not resizable after the initial allocation but can be of arbitrary size. It will keep the bti vector entirely on the stack for vectors 64 bits or less, and will allocate the vector on the heap for larger sizes. The API is uniform regardless of storage location. This is very basic right now and all the APIs are inline functions, but it is useful for storing an array of boolean values.
This adds an additional pathspec API that will match a pathspec against a diff object. This is convenient if you want to handle renames (so you need the whole diff and can't use the pathspec constraint built into the diff API) but still want to tell if the diff had any files that matched the pathspec. When the pathspec is matched against a diff, instead of keeping a list of filenames that matched, instead the API keeps the list of git_diff_deltas that matched and they can be retrieved via a new API git_pathspec_match_list_diff_entry. There are a couple of other minor API extensions here that were mostly for the sake of convenience and to reduce dependencies on knowing the internal data structure between files inside the library.
Okay, I added in: GIT_EXTERN(int) git_pathspec_match_diff(
git_pathspec_match_list **out,
git_diff_list *diff,
uint32_t flags,
git_pathspec *ps); This is designed to be useful if you want to apply rename detection (so you need all the deltas in the diff object and can't use the pathspec built in to the diff options) but you still want to understand if the diff results contain any items that match your pathspec. Along with this, I also reorganized the code in pathspec.c so there could be more code sharing between the two inner function bodies (i.e. the diff match and all the other match function which use iterators). As part of the code cleanup, instead of allocating an array of |
MERGED MANUALLY |
I started this a little while ago as an attempt to write an example program that would mimic
git log
behavior. I ended up with that (albeit only supporting a small fraction of the options), plus new public APIs for "compiling" a pathspec and matching against the working directory, the index, or an arbitrary tree. I also added a couple of small adjustments that probably won't make much difference - a way to get the raw text of a commit header (which changed the commit parsing internals slightly since I now keep the commit header text around) and aGIT_INDEX_STAGE_ANY
constant that can be used during index lookups if you just want to find out if a given file is mentioned anywhere in the index regardless of the stage.The log example is far from complete. In particular, I didn't really do any formatting, including no support currently for
--oneline
or anything like that. Also, there is no color output or pagination, so it's kind of a pain to use. However, once the basiclog
example is in place, it becomes more of a side project or newbie project to add support for various command line options.The log example does support several of the simple options (
--max-count
and--skip
,--min-parents
and--max-parents
) and it supports two of the more complicated options: pathname matching and patch printing (-p
). The pathname matching was particularly interesting - it is the reason that I created the public pathspec API (although in the end, it generally works off of diff, not off of the pathspec API) and because it took me a while to get the rules right for merge commits. The diff display was pretty easy given that we have a working example of diff.Comments and criticism welcome! There are probably some parts of the example that are a little odd, so I'd love feedback.