`gix-worktree` and `gix-index` (checkout, status, commit) · Issue #301 · GitoxideLabs/gitoxide

gix-worktree and gix-index (checkout, status, commit) #301

Open
28 of 44 tasks
Byron opened this issue Jan 20, 2022 · 0 comments
Labels
C-tracking-issue An issue to track the progress of multiple PRs or issues

Comments

@Byron (Member) commented Jan 20, 2022

Note that we handle both crates here as they are very much intertwined: gix-index provides the data structure that accelerates operations in gix-worktree, which in turn actually manipulates the working copy.

Tasks for checkout

Reset

  • gix reset with soft/mixed/hard/merge/keep semantics with pathspecs as well. Submodule support should be possible, too.
  • gix-worktree-state reset to reset a working tree according to an index, with pathspec support.
  • reset index to match tree based on pathspecs.

Out of scope

  • hunk support (i.e. git reset -p)
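
As a rough illustration of the reset modes listed above, the sketch below records which of the three states (HEAD, index, working tree) each of the soft/mixed/hard modes touches in `git reset`. The names `ResetMode`, `ResetPlan` and `plan` are hypothetical and not part of any gitoxide crate; the merge and keep modes, as well as pathspec handling, are left out because their behavior depends on the state of individual files.

```rust
/// Hypothetical enum for illustration only; not a gitoxide type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ResetMode {
    Soft,
    Mixed,
    Hard,
}

/// Which parts of the repository state a reset would update.
#[derive(Debug, PartialEq, Eq)]
struct ResetPlan {
    move_head: bool,
    reset_index: bool,
    reset_worktree: bool,
}

/// Map a mode to the states it touches, following `git reset` semantics.
fn plan(mode: ResetMode) -> ResetPlan {
    match mode {
        // Only the branch tip moves; index and working tree stay as they are.
        ResetMode::Soft => ResetPlan { move_head: true, reset_index: false, reset_worktree: false },
        // The index is made to match the target commit as well.
        ResetMode::Mixed => ResetPlan { move_head: true, reset_index: true, reset_worktree: false },
        // Index and working tree are both made to match the target commit.
        ResetMode::Hard => ResetPlan { move_head: true, reset_index: true, reset_worktree: true },
    }
}
```

Keeping the decision in a plain data structure like this would let the index-only and working-tree steps be tested separately, though whether the actual implementation is structured this way is not stated in this issue.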

Tasks for add

Add files to the index.

Tasks for commit

  • create tree from index
  • create commit
  • round-trippable reads and writes (write all index extensions so that no information is lost)
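
To make the "create tree from index" task more concrete, here is a minimal sketch of how a flat list of index paths can be folded into a nested tree structure. Everything here is an assumption for illustration: the `Tree` type, the `tree_from_index` function and the use of plain strings as object ids are hypothetical, and the hashing and git-specific entry ordering needed to write real tree objects are omitted.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory tree; real tree objects also carry modes and hashes.
#[derive(Debug, Default, PartialEq)]
struct Tree {
    /// File name -> blob id (a plain string stands in for an object id).
    blobs: BTreeMap<String, String>,
    /// Directory name -> subtree.
    trees: BTreeMap<String, Tree>,
}

/// Fold flat `(path, blob id)` index entries into a nested `Tree`.
fn tree_from_index(entries: &[(&str, &str)]) -> Tree {
    let mut root = Tree::default();
    for (path, id) in entries {
        let mut components = path.split('/').peekable();
        let mut current = &mut root;
        while let Some(name) = components.next() {
            if components.peek().is_some() {
                // Intermediate component: descend into (or create) the subtree.
                current = current.trees.entry(name.to_string()).or_default();
            } else {
                // Last component: this is the file itself.
                current.blobs.insert(name.to_string(), id.to_string());
            }
        }
    }
    root
}
```

Calling `tree_from_index(&[("README.md", "id1"), ("src/lib.rs", "id2")])` yields a root with one blob and one `src` subtree; each subtree could then be serialized and hashed bottom-up to obtain the root tree for the commit.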

Tasks for fetch/clone

  • create index from tree
    • can there be an optimization that keeps what didn't change?

Tasks for status

The difference between an index and the work tree. Analysis TBD.

See this blog post for incredible details on how git does things, related to fs-monitor as well.

There is also an alternative implementation which provides a lot of details on how to be better.
@pascalkuthe did a first analysis and concluded that most of the speedup came from congestion-free multi-threading and the use of something like the untracked-cache. On Linux, it's also possible to speed up syscalls by using more specific versions of them, but that should definitely be left as a last resort for performance improvements.

Stages

  • determine unstaged changes (Diff between worktree and index #805)
    • changes between worktree and index
    • needs one stat call per file one way or another.
    • Question: what's faster: walkdir or symlink_metadata per index entry? Note that walkdir doesn't use ``
    • rename/copy tracking - should be based on tree-tree rename tracking, can it be generalized?
  • assure status works with file_size >= u32::MAX
    • currently it's acknowledged in the documentation but there is no test for that, nor is it clear how this works in git.
  • determine staged changes
    • compare tree entries with index entries
    • Question: is there a way to avoid having to traverse a tree recursively? Yes, use the TREE extension to know the tree ids of all directories, which allows us to reproduce the trees and see if they changed; only if so do we look up the tree itself.
  • find untracked files
    • can use the untracked-cache to be faster. This could come 'for free' if walkdir were used
  • fast is-dirty checks - and wiring that up to describe
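
The "one stat call per file" step for unstaged changes can be sketched as follows, using only the standard library. This is a simplified illustration and not the `gix-index` or `gix-worktree` API: `EntryStat`, `Change`, `truncated_size` and `unstaged_change` are hypothetical names, only size and modification time are compared, and the size is assumed to be stored truncated to 32 bits, which is why files of `u32::MAX` bytes or more need special attention.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical, heavily simplified stand-in for the stat data of an index entry.
#[derive(Debug, Clone)]
struct EntryStat {
    path: PathBuf,
    /// Assumption: the size is stored truncated to 32 bits.
    size: u32,
    mtime: SystemTime,
}

#[derive(Debug, PartialEq, Eq)]
enum Change {
    Unchanged,
    /// Stat data differs; a real implementation would now compare content.
    StatChanged,
    Removed,
}

/// Truncate a file length to 32 bits, so sizes >= 2^32 wrap around.
fn truncated_size(len: u64) -> u32 {
    len as u32
}

impl EntryStat {
    /// Record the current stat data of `path`, as staging the file would.
    fn from_disk(path: &Path) -> io::Result<Self> {
        let meta = fs::symlink_metadata(path)?;
        Ok(EntryStat {
            path: path.to_owned(),
            size: truncated_size(meta.len()),
            mtime: meta.modified()?,
        })
    }
}

/// One `symlink_metadata` call per entry decides whether it needs a closer look.
fn unstaged_change(entry: &EntryStat) -> io::Result<Change> {
    let meta = match fs::symlink_metadata(&entry.path) {
        Ok(meta) => meta,
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(Change::Removed),
        Err(err) => return Err(err),
    };
    let same_size = truncated_size(meta.len()) == entry.size;
    let same_mtime = meta.modified()? == entry.mtime;
    Ok(if same_size && same_mtime { Change::Unchanged } else { Change::StatChanged })
}
```

Because the comparison uses the truncated size, a file that grows by exactly 2^32 bytes would look unchanged by size alone, so the modification time (or a content comparison) has to catch it. Running `unstaged_change` per entry is also trivially parallelizable, which fits the observation above that much of the speedup comes from multi-threading.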

Checkout Research

Follow Ups

  • symlink wait for 1.0 release with additional fixes (see thread on MR)
    • need to use remove_symlink() from this crate, but can't use it for relative paths due to the filename check
@Byron Byron added C-tracking-issue An issue to track the progress of multiple PRs or issues and removed C-tracking-issue An issue to track the progress of multiple PRs or issues labels Jan 22, 2022
@Byron Byron added the C-tracking-issue An issue to track the progress of multiple PRs or issues label Jan 23, 2022
Byron added a commit that referenced this issue Feb 7, 2022
Byron added a commit that referenced this issue Feb 7, 2022
Byron added a commit that referenced this issue Feb 7, 2022
Byron added a commit that referenced this issue Feb 7, 2022
Byron added a commit that referenced this issue Feb 24, 2022
Byron added a commit that referenced this issue Feb 27, 2022
Byron added a commit that referenced this issue Feb 28, 2022
The latter should be useful when fully implementing all required
baseline capabilities of doing a correct checkout
Byron added a commit that referenced this issue Feb 28, 2022
Byron added a commit that referenced this issue Feb 28, 2022
Byron added a commit that referenced this issue Feb 28, 2022
Byron added a commit that referenced this issue Feb 28, 2022
Byron added a commit that referenced this issue Feb 28, 2022
Byron added a commit that referenced this issue Mar 1, 2022
Byron added a commit that referenced this issue Mar 1, 2022
Byron added a commit that referenced this issue Mar 2, 2022
For now they are unused but that should change when doing collision
checks.
Byron added a commit that referenced this issue May 18, 2022
Byron added a commit that referenced this issue May 18, 2022
This also works if the work-tree can't be found but it is otherwise
a valid git dir.
Byron added a commit that referenced this issue May 18, 2022
Byron added a commit that referenced this issue Jun 30, 2022
Really just an excuse to start a new PR for additional attribute work
without investing much time.
@Byron Byron changed the title git-worktree git-worktree and git-index Aug 3, 2022
Byron added a commit that referenced this issue Aug 7, 2022
@Byron Byron changed the title git-worktree and git-index git-worktree and git-index Nov 4, 2022
@Byron Byron changed the title git-worktree and git-index git-worktree and git-index (checkout, status, commit) Nov 4, 2022
Byron added a commit that referenced this issue Feb 17, 2023
Really just an excuse to start a new PR for additional attribute work
without investing much time.
Byron added a commit that referenced this issue Mar 17, 2023
Really just an excuse to start a new PR for additional attribute work
without investing much time.
Byron added a commit that referenced this issue Mar 20, 2023
Really just an excuse to start a new PR for additional attribute work
without investing much time.
Byron added a commit that referenced this issue Apr 2, 2023
Really just an excuse to start a new PR for additional attribute work
without investing much time.
Byron added a commit that referenced this issue Apr 4, 2023
Really just an excuse to start a new PR for additional attribute work
without investing much time.
Byron added a commit that referenced this issue Apr 12, 2023
Really just an excuse to start a new PR for additional attribute work
without investing much time.
@Byron Byron mentioned this issue Jul 5, 2023
1 task
@Byron Byron changed the title git-worktree and git-index (checkout, status, commit) gix-worktree and gix-index (checkout, status, commit) Sep 4, 2023
EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Feb 15, 2025
The `to_unix_separators` and `to_windows_separators` functions in
`gix_path::convert` had TODO comments saying they should use the
`path-slash` crate "to handle escapes". These comments were added
as part of e4f4c4b (GitoxideLabs#397) but the context there and in the broader
related issue GitoxideLabs#301 does not seem to clarify the basis for this.

It is not really clear what handling escapes would entail here, and
it seems like there is not a way to do it without substantially
changing the interface of these conversion functions in `gix-path`,
which currently take a single argument representing a path and
return a single string-like value also representing a path. If
escape sequences appear in the input to such a path conversion
function, it would not have a way to know if they are meant
literally or as escape sequences. (An analogous concern applies if
a function is to add escape sequences in its return value; it would
have no way to know if the caller expects them.)

Furthermore, while `path-slash` can convert some `\` paths to use
`/` instead, it does not appear to do anything related to handling
escape sequences or distinguishing which occurrences of `\` or any
other character may be intended as part of an escape sequence.
Its documentation does prominently mention that `\` in escape
sequences should not be converted to `/`:

> On Unix-like OS, the path separator is `/`. So any conversion is
> not necessary. But on Windows, the file path separator is `\`,
> and needs to be replaced with `/` for converting the paths to
> "slash paths". Of course, `\`s used for escaping characters
> should not be replaced.

But it looks like the part about `\` characters used for escaping
is meant as advice on how and when to use `path-slash`, rather than
meaning that `path-slash` would itself be able to distinguish
between `\` characters meant as directory separators and `\`
characters that perform quoting/escaping.
EliahKagan added a commit to EliahKagan/gitoxide that referenced this issue Feb 15, 2025
The `to_unix_separators` and `to_windows_separators` functions in
`gix_path::convert` had TODO comments saying they should use the
`path-slash` crate "to handle escapes". These comments were added
as part of e4f4c4b (GitoxideLabs#397) but the context there and in the broader
related issue GitoxideLabs#301 does not seem to clarify the basis for this.

It is not really clear what handling escapes would entail here, and
it seems like there is not a way to do it without substantially
changing the interface of these conversion functions in `gix-path`,
which currently take a single argument representing a path and
return a single string-like value also representing a path. If
escape sequences appear in the input to such a path conversion
function, it would not have a way to know if they are meant
literally or as escape sequences. (An analogous concern applies if
a function is to add escape sequences in its return value; it would
have no way to know if the caller expects them.)

Furthermore, while `path-slash` can convert some `\` paths to use
`/` instead, it does not appear to do anything related to handling
escape sequences or distinguishing which occurrences of `\` or any
other character may be intended as part of an escape sequence. Its
documentation (https://docs.rs/path-slash/latest/path_slash/) does
prominently mention that `\` in escape sequences should not be
converted to `/`:

> On Unix-like OS, the path separator is `/`. So any conversion is
> not necessary. But on Windows, the file path separator is `\`,
> and needs to be replaced with `/` for converting the paths to
> "slash paths". Of course, `\`s used for escaping characters
> should not be replaced.

But it looks like the part about `\` characters used for escaping
is meant as advice on how and when to use `path-slash`, rather than
meaning that `path-slash` would itself be able to distinguish
between `\` characters meant as directory separators and `\`
characters that perform quoting/escaping.
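
The ambiguity described in the commit message above can be shown with a small sketch. The function below is a hypothetical, `str`-based stand-in and does not reproduce the signature or implementation of the `gix_path::convert` functions; it only shows that a conversion taking a single path argument has no information with which to tell a separator from an escaping backslash.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in: replace every `\` with `/`, borrowing when nothing changes.
fn to_unix_separators_sketch(path: &str) -> Cow<'_, str> {
    if path.contains('\\') {
        // Every backslash is treated as a separator; the function cannot know
        // whether the caller meant some of them as escape characters.
        Cow::Owned(path.replace('\\', "/"))
    } else {
        Cow::Borrowed(path)
    }
}
```

With this sketch, `dir\file.txt` becomes `dir/file.txt` as intended, but an input such as `a\*b`, where the backslash might have been meant to escape the `*`, becomes `a/*b` as well. Telling the two cases apart would require extra input from the caller, which is the interface change the commit message refers to.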