10000 Shallow (#6396) with some fixes from review by ethomson · Pull Request #6557 · libgit2/libgit2 · GitHub
[go: up one dir, main page]

Skip to content

Shallow (#6396) with some fixes from review #6557

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 98 commits into from
May 9, 2023
Merged

Shallow (#6396) with some fixes from review #6557

merged 98 commits into from
May 9, 2023

Conversation

ethomson
Copy link
Member
@ethomson ethomson commented May 8, 2023

I made some quick fixes while reviewing #6396. Pushing this up to get a CI run before merging it.

Super excited about landing this. Thanks so much to all the people who worked on this throughout the years to get it landed. 🎉

tiennou and others added 30 commits June 27, 2020 14:33
This represents (old-style) grafted commits, a.k.a an array of
overridden parents for a commit's OID.
This wires git_repository to open the .git/info/grafts file and load its
contents as git_commit_grafts objects.
In order to increase maintainability in the future, we should try to
make structures as self-contained and opaque to its users as possible.
Thus it is probably not a good idea to just typedef `git_graftmap` to
`git_oidmap`, as that will make it a lot harder in the future to extend
the API in the future, if this need ever arises.

Refactor the code to instead declare a real structure `git_grafts`,
which is completely opaque to its callers.
Instead of using the newly introduced `git_buf_foreach_line`, which
modifies the buffer itself, we should try to use our existing parsing
infrastructure in "parse.h".

Convert the grafts parsing code to make use of `git_parse_ctx`. Remove
the `git_buf_foreach_line` macro, as grafts have been its sole user.
Parsing of grafts files is currently contained in the repository code.
To make grafts-related logic more self-contained, move it into
"grafts.c" instead.
The shallow roots are in fact another user of the grafting mechanism,
and in essence they do use the same file format for grafted commits.
Thus, instead of hand-coding the parsing logic a second time, we can
just reuse the `git_grafts` structure for shallow commits, as well.
When loading shallow grafts, we add each of the grafting commits to the
grafts backed by ".git/info/grafts". Keeping track of both grafts
separately, but partially storing them in a common grafts structure is
likely to lead to inconsistencies. In fact, there already are
inconsistencies if refreshing shallow grafts at a later point, as we
only refresh the shallows, but not the normal grafts in that case.

Disentangle both grafting stores and instead check both separately when
parsing commits.
The refresh logic for both "normal" and shallow grafts are currently
part of the repository code and implemented twice. Unify them into the
grafts code by introducing two new functions to create grafts from a
file and to refresh a grafts structure.
If replacing an already existing graft in the grafts map, then we need
to free the previous `git_commit_graft` structure.
Currently, we expose the function `git_repository_shallow_roots` to get
all grafted roots of the repository. This already paints us into a
corner, though, as we certainly need to experiment with some
functionality of the grafting mechanism before we can happily expose
some of its functionality. Most importantly, we need to get right when
to refresh grafts and when not.

Thus, this commit removes the public function with no public
replacement. We should first try and see what usecases people come up
with to e.g. expose the `git_grafts` mechanism directly in the future
or do something different altogether. Instead, we provide an internal
interface to get weak pointers to the grafting structs part of the
repository itself.
The opt mechanism isn't _really_ meant to be for feature flags, and it's
weird to feature flag shallow / unshallow at all.
Teach the smart transport more about oid types, instead of assuming SHA1.
@ethomson ethomson force-pushed the ethomson/shallow branch 4 times, most recently from 2660c2c to 8351502 Compare May 8, 2023 13:14
ethomson added 8 commits May 8, 2023 15:06
Looks like a double-free here.
Use SHA256 for file checksums. SHA1 makes no sense as a default in 2023.

Given that we're just looking at a file checksum to see if it's changed,
this does not need to take repository's OID type into account or
otherwise be configurable.
Don't mix parsing by hand and using `git_parse` to parse.
Depth of `0` should indicate full depth. Disallow negative values (they
may have a future meaning) and use `0` as the default.
The semantics of `from_file` are weird - it looks like a function that
just opens a file, but it actually inspects the pointer, which is
unexpected and could make things very crashy.

Make an `open` function that just does an open, and move the magic to
`open_or_refresh` whose name better indicates that it may do weird
stuff.
@ethomson ethomson force-pushed the ethomson/shallow branch 2 times, most recently from 1627ea6 to b166186 Compare May 9, 2023 09:40
@ethomson
Copy link
Member Author
ethomson commented May 9, 2023

OK -- I added just a few cleanups on top of #6396 since I couldn't push to that branch directly. On the whole, this work was 💯. I'm super excited to land this. Thank you @tiennou, @pks-t, @lya001, @lrm29, @Qix- and anybody else I'm missing -- for all the hard work here.

@Qix-
Copy link
Contributor
Qix- commented May 9, 2023

Amazing, can't wait for this to land! Thanks for getting this in the pipeline @ethomson :)

8000

@ethomson ethomson force-pushed the ethomson/shallow branch from b166186 to b6cb1b3 Compare May 9, 2023 16:02
ethomson added 2 commits May 9, 2023 17:14
Users should provide us an array of object ids; we don't need a separate
type. And especially, we should not be mutating user-providing values.
Instead, use `git_oid *` in the shallow code.
The `depth` field is suitable to specify unshallowing; provide an enum
to aide in specifying the `unshallow` value.
@ethomson ethomson force-pushed the ethomson/shallow branch from b6cb1b3 to 437c5f5 Compare May 9, 2023 16:14
@ethomson ethomson merged commit 2bbcdee into main May 9, 2023
@ethomson ethomson deleted the ethomson/shallow branch May 9, 2023 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0