-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Move git_off_t
to git_object_size_t
#5123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
88fc191
to
f22e3fc
Compare
Aha, we do actually treat |
5d18fc5
to
33ff267
Compare
a2a587d
to
b871167
Compare
Rebased to fix conflicts. |
4f033a1
to
54c9fb4
Compare
Rebased. |
@pks-t any thoughts here? I'd love to think about a release 🔜 and include this. |
On Fri, Aug 16, 2019 at 02:41:53AM -0700, Edward Thomson wrote:
@pks-t any thoughts here? I'd love to think about a release 🔜
and include this.
I hope to be able to find enought time to pick up some loose ends
in libgit2 this Friday. And I fully agree that we should start
thinking about a release soon, it's been quite some time.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks mostly good to me. There's two APIs where I'm sceptical about using git_object_size_t
with the diff API and the futils
interfaces, though
include/git2/diff.h
Outdated
@@ -603,7 +603,7 @@ typedef struct { | |||
int new_lineno; /**< Line number in new file or -1 for deleted line */ | |||
int num_lines; /**< Number of newline characters in content */ | |||
size_t content_len; /**< Number of bytes of data */ | |||
git_off_t content_offset; /**< Offset in the original file to the content */ | |||
git_object_size_t content_offset; /**< Offset in the original file to the content */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it really make sense to use git_object_size_t
here? Diffs aren't necessary referring to any objects at all but, so it seems a bit dirty to me to mix these up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, perhaps not. The rare good use of git_off_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/diff_file.c
Outdated
@@ -46,7 +46,7 @@ static int diff_file_content_init_common( | |||
{ | |||
fc->opts_flags = opts ? opts->flags : GIT_DIFF_NORMAL; | |||
|
|||
if (opts && opts->max_size >= 0) | |||
if (opts && (int64_t)opts->max_size >= 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mh. This comparison doesn't make sense anymore, as max_size
is now unsigned. If it becomes negative due to casting then I'd consider this a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, ignore my last comment. You're right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
include/git2/diff.h
Outdated
* Defaults to 512MB. | ||
*/ | ||
git_off_t max_size; | ||
git_object_size_t max_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, this changes our API, doesn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice, if callers ever provided a negative value, it was caught in diff_file_content_init_common
, and ignored.
#5123 (diff) was doing the same. So as-is we are API compatible, but if we remove that odd-looking check, then we are in fact changing the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this back to a signed int for sanity.
src/diff_xdiff.c
Outdated
@@ -160,7 +160,7 @@ static int git_xdiff_cb(void *priv, mmbuffer_t *bufs, int len) | |||
else if (line.origin == GIT_DIFF_LINE_DELETION) | |||
line.content_offset = bufs[1].ptr - info->xd_old_data.ptr; | |||
else | |||
line.content_offset = -1; | |||
line.content_offset = GIT_OBJECT_SIZE_MAX; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes semantics too, as we pass these to the callers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, can you give me more information about what you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're populating the git_diff_line
struct here, which we directly pass to the user via the data_cb
callback. So previously callers would've seen content_offset = -1
, now he sees content_offset = UINT64_MAX
if a line was unchanged. I could imagine people having checks for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this back to a signed int to support this.
src/futils.c
Outdated
@@ -112,7 +112,7 @@ int git_futils_truncate(const char *path, int mode) | |||
return 0; | |||
} | |||
|
|||
git_off_t git_futils_filesize(git_file fd) | |||
int git_futils_filesize(git_object_size_t *out, git_file fd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how I feel about these. off_t
directly maps to the mmap functions of the OS, using git_object_size_t
feels weird to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, mmap
does take an off_t
but only for the offset into the file. That doesn't correspond to the length of the file or memory-mapped segment.
I definitely don't think that we should use a signed type for file length, it simply can't be negative. We wouldn't want to use size_t
here, since on a 32-bit system, we want to be able to get the size of a file larger than 4 GB.
We could call this the more direct uint64_t
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum. Some platforms handle negative offsets in mmap
just fine and treat it as valid. It's questionable though whether we're going to ever need that, especially considering that we'd have to fight with cross-platform behaviour here. So 👍 for uint64_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated for uint64_t
c689857
to
284d1d9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM. I think I'd like it better as a single global: deprecate git_off_t
commit, since it's mostly about typechanging things around, but that's minor. Thanks for fixing this @ethomson !
src/futils.h
Outdated
@@ -330,7 +330,7 @@ extern int git_futils_fake_symlink(const char *new, const char *old); | |||
*/ | |||
typedef struct { | |||
struct timespec mtime; | |||
git_off_t size; | |||
git_object_size_t size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some file APIs now use uint64_t
, some use git_object_size_t
. Shouldn't this one here use uint64_t
, as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this probably makes sense.
@@ -235,7 +235,7 @@ int git__mmap_alignment(size_t *alignment) | |||
} | |||
|
|||
|
|||
int p_mmap(git_map *out, size_t len, int prot, int flags, int fd, git_off_t offset) | |||
int p_mmap(git_map *out, size_t len, int prot, int flags, int fd, off64_t offset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick question, no need to change: why did you opt to convert to off64_t
instead of just typedeffing git_off_t
to off64_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that it was clearer for internal uses to use something Very Obviously 64 bit, and something that is a Standard Type. git_off_t
feels like the sort of thing that is perhaps useful for our external API, but less useful for our internal.
Your comment did make me rethink a few things, so I actually kept git_off_t
in the external type places that it was used. I think this is a pretty reasonable compromise.
src/win32/map.c
Outdated
@@ -99,7 +99,7 @@ int p_mmap(git_map *out, size_t len, int prot, int flags, int fd, git_off_t offs | |||
return -1; | |||
} | |||
|
|||
assert(sizeof(git_off_t) == 8); | |||
assert(sizeof(off64_t) == 8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make sense? I mean it's very unlikely that off64_t
is ever going to be something else than 8 bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😬 Yes, that's very true. Removed entirely.
Introduce `git_object_size_t`, an unsigned type that we can use for the maximum size of git objects.
Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
Instead of using a signed type (`off_t`) use `uint64_t` for the maximum size of files.
Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
Instead of using a signed type (`off_t`) use an unsigned `uint64_t` for the size of the files.
994e41e
to
6c13cf6
Compare
64 bit types are always 64 bit.
Prefer `off64_t` to `git_off_t` for internal visibility.
Prefer `off64_t` to `git_off_t` internally for visibility.
Use int64_t internally for type visibility.
Prefer `off64_t` internally.
f48d720
to
6460e8a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, Ed!
if (!fc->file->size) | ||
error = git_futils_filesize(&fc->file->size, fd); | ||
|
||
if (error < 0 || !fc->file->size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does look a bit weird. Seems like we do not allow loading empty files from disk here. Your change only makes code a lot clearer here, so I'm fine with this as is.
git_off_t
is problematic: first, it's used to represent the size ofa blob, but that's represented in git as an
unsigned long
, meaningthat we've lost a bit (on 64-bit machines). Second,
git_off_t
is apoor name, suggesting that it's an offset, not a size (which makes
sense, given that it's a signed type).
Migrate
git_off_t
togit_object_size_t
, an unsigned 64-bit type.