gix-index towards 1.0 #293

Open
17 of 32 tasks
Byron opened this issue Jan 8, 2022 · 0 comments
Labels
C-tracking-issue An issue to track the progress of multiple PRs or issues


Byron commented Jan 8, 2022

The scope is limited to interpreting all values and making them accessible, ideally bridging the different formats in a single implementation.

Reading

  • V3 reading + V4 reading with multi-threading
  • read extensions
    • needed for multi-threading
    • TREE
    • mandatory link and sdir
    • remaining optional extensions
  • use bitflags instead of u32 for type safety
  • test for long paths and extended flags
  • publish v 0.1 of index and bitmap crates
  • improve performance - there seems to be a bottleneck when reading large indices, reading a 53MB file takes 2s!
  • evaluate using std::thread::scope() instead of statically scoped threads, which should make things easier (needs an MSRV of Rust 1.63)
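
To illustrate the `std::thread::scope()` item above, here is a minimal sketch of scoped threads borrowing a shared buffer without `'static` bounds. The function `sum_chunks_in_parallel` and its chunking scheme are hypothetical stand-ins for entry decoding, not how gix-index reads an index.

```rust
use std::thread;

/// Hypothetical stand-in for decoding: sums the bytes of each chunk on its own thread.
/// The threads borrow `data` directly, which `std::thread::scope()` (stable since
/// Rust 1.63) allows without `Arc` or `'static` bounds.
fn sum_chunks_in_parallel(data: &[u8], num_chunks: usize) -> Vec<u64> {
    // Round up so all bytes are covered, and never use a chunk size of zero,
    // which would make `chunks()` panic.
    let n = num_chunks.max(1);
    let chunk_size = ((data.len() + n - 1) / n).max(1);
    thread::scope(|scope| {
        // Spawn one scoped thread per chunk; each closure only borrows its chunk.
        let handles: Vec<_> = data
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(|b| u64::from(*b)).sum::<u64>()))
            .collect();
        // Join in spawn order so results line up with the chunks.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread must not panic"))
            .collect()
    })
}
```

All threads are joined before `thread::scope()` returns, so the borrow of `data` never outlives the call.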

Instantiation

  • verification/warm up
    • visualize/print index content
    • implement verification first (to double check what we produce from a tree)
    • handle verification on linked indices
  • index from tree
    • it looks like a basic tree traversal to create entries, maybe it's smart by creating them sorted right away.
    • git can actually traverse multiple trees at the same time, maybe taking advantage of them being sorted. It seems to unpack the same entry on each level for comparison.
    • it inserts without special knowledge of tree traversal, and typically does a binary search unless it naturally inserts past the end
      • It will invalidate the cache tree and update the untracked cache.
      • It checks every path and rejects .git (on macOS with case-insensitivity, which should be a configuration flag). On Windows it also handles backslashes in paths by rejecting them.
      • '.git' is always 'outlawed' without case checking, everywhere.
    • The tree cache (extension) is created after inserting all entries, and thus not created on the fly
  • create cache tree
    • see here - it is interesting to see promisor objects and 'quick-fetches' without negotiation.
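
The insertion strategy described above (append when the new entry naturally lands past the end, otherwise binary search) could be sketched as follows. The `Entry` type and `insert_sorted` function are hypothetical and heavily simplified; the sort key of path bytes followed by stage follows git's documented index entry order.

```rust
/// Hypothetical, simplified entry: real index entries also carry stat data,
/// flags and an object id.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Entry {
    path: Vec<u8>,
    stage: u8,
}

/// Insert `entry` so that `entries` stays sorted by (path bytes, stage).
/// Returns `false` if an entry with the same key existed and was replaced.
fn insert_sorted(entries: &mut Vec<Entry>, entry: Entry) -> bool {
    // Fast path: input that arrives in sorted order always lands past the end.
    let past_end = entries.last().map_or(true, |last| {
        (last.path.as_slice(), last.stage) < (entry.path.as_slice(), entry.stage)
    });
    if past_end {
        entries.push(entry);
        return true;
    }
    // Slow path: find the position with a binary search.
    match entries.binary_search_by(|probe| {
        (probe.path.as_slice(), probe.stage).cmp(&(entry.path.as_slice(), entry.stage))
    }) {
        Ok(pos) => {
            entries[pos] = entry;
            false
        }
        Err(pos) => {
            entries.insert(pos, entry);
            true
        }
    }
}
```

Feeding entries from an already sorted tree traversal would only ever hit the fast path, which is presumably why the binary search matters mostly for ad-hoc additions.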

Writing

Definitely round-trip tests with what we are reading.
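
As a small illustration of what such a round-trip test could look like, the sketch below reads and re-writes only the fixed 12-byte index header (the `DIRC` signature, a big-endian version and a big-endian entry count, as documented in git's index format). The names `Header`, `read_header` and `write_header` are hypothetical and not the gix-index API.

```rust
/// The fixed 12-byte header at the start of a git index file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Header {
    version: u32,
    num_entries: u32,
}

const SIGNATURE: &[u8; 4] = b"DIRC";

/// Parse the header, returning `None` for short input, a wrong signature
/// or a version outside of 2, 3 and 4.
fn read_header(data: &[u8]) -> Option<Header> {
    if data.len() < 12 || !data.starts_with(SIGNATURE) {
        return None;
    }
    let version = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);
    let num_entries = u32::from_be_bytes([data[8], data[9], data[10], data[11]]);
    if (2..=4).contains(&version) {
        Some(Header { version, num_entries })
    } else {
        None
    }
}

/// Serialize the header in the same layout that `read_header` expects.
fn write_header(header: &Header, out: &mut Vec<u8>) {
    out.extend_from_slice(SIGNATURE);
    out.extend_from_slice(&header.version.to_be_bytes());
    out.extend_from_slice(&header.num_entries.to_be_bytes());
}
```

A round-trip test then asserts that writing the parsed header reproduces the original bytes exactly; the same pattern would extend to entries and extensions.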

  • when writing trees, ensure these do not contain relative path components or entries named .git
  • write V4
  • write entire tree after clone
    • 'tree' extension
    • what about racy timestamps? Needs modification dates after checkout to learn more
    • parallelization support (via extension)
    • … see if other extensions should also get write support, even though there isn't much use in them after clone in particular.
  • generalized writing - an index after various modifications
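
The check on tree entry names mentioned in the first item of this list might look roughly like the sketch below. It assumes that `.git` is compared ASCII case-insensitively on every platform and that empty components are rejected as well; both are assumptions of this sketch, and the function names are hypothetical.

```rust
/// Returns `true` if `name` is acceptable as a single tree entry name.
/// Assumption: `.git` is rejected regardless of ASCII case on all platforms.
fn is_valid_entry_name(name: &[u8]) -> bool {
    let is_relative = name == b"." || name == b"..";
    let is_dot_git = name.eq_ignore_ascii_case(b".git");
    let has_separator = name.contains(&b'/');
    !name.is_empty() && !is_relative && !is_dot_git && !has_separator
}

/// Validates every `/`-separated component of a repository-relative path.
fn is_valid_path(path: &[u8]) -> bool {
    path.split(|b| *b == b'/').all(is_valid_entry_name)
}
```

Names that merely start with `.git`, such as `.gitignore`, pass the check because only the exact component is compared.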

Plumbing

  • entries
  • info, with details for extensions…
    • TREE
    • …others

Research

  • There is an fs-monitor implementation in Rust, it might be a basis for a pure-Rust notify based implementation, avoiding the watchman tool.
  • the ignore crate is probably interesting for .gitignore/exclude files. And it's a big dependency but most definitely worth it.
Byron added a commit that referenced this issue Jan 8, 2022
Byron added a commit that referenced this issue Jan 8, 2022
Note that in-code we must make sufficiently clear where a particular
fixture is coming from, or we name it after the test and file right
away.
@Byron Byron mentioned this issue Jan 8, 2022
11 tasks
Byron added a commit that referenced this issue Jan 8, 2022
It deals with comparing items from the work tree and the index,
and is generally what makes use of exclude specifications.
Byron added a commit that referenced this issue Jan 8, 2022
Byron added a commit that referenced this issue Jan 8, 2022
It should be easy enough to learn from git tests to generate whichever
kind of index we need.
Byron added a commit that referenced this issue Jan 9, 2022
Byron added a commit that referenced this issue Jan 9, 2022
Byron added a commit that referenced this issue Jan 9, 2022
Byron added a commit that referenced this issue Jan 9, 2022
Byron added a commit that referenced this issue Jan 9, 2022
Byron added a commit that referenced this issue Jan 9, 2022
This is now sufficiently well implemented in the standard library.
Byron added a commit that referenced this issue Jan 9, 2022
It's sufficiently well supported using the standard library now.
Byron added a commit that referenced this issue Jan 9, 2022
Byron added a commit that referenced this issue Jan 9, 2022
Byron added a commit that referenced this issue Jan 9, 2022
Byron added a commit that referenced this issue Jan 10, 2022
Byron added a commit that referenced this issue Jan 10, 2022
Byron added a commit that referenced this issue Jan 10, 2022
Byron added a commit that referenced this issue Jan 10, 2022
For now the data structure is just 'as-written' and we see what
needs to change there as we have to maintain it.
Byron added a commit that referenced this issue Jan 25, 2022
Byron added a commit that referenced this issue Jan 27, 2022
This also fixes an issue with the node-id seemingly being optional,
even though it is not.
Byron added a commit that referenced this issue Jan 27, 2022
Byron added a commit that referenced this issue Jan 27, 2022
…until we know what to do with their data, which is when we can
assert more.
Byron added a commit that referenced this issue Jan 27, 2022
Byron added a commit that referenced this issue Jan 27, 2022
Byron added a commit that referenced this issue Jan 27, 2022
Byron added a commit that referenced this issue Jan 27, 2022
Byron added a commit that referenced this issue Jan 27, 2022
Either it's the way these are loaded, maybe they are missing a step,
or it's the way we think they are sorted, as they definitely aren't
sorted the way we think they are.
Byron added a commit that referenced this issue Jan 28, 2022
This is based on the way git is reading back the TREE extension,
which subtly inserts the read items into their position based
on alphabetical order (not the index sort order for entries, or
tree entries).

See https://github.com/git/git/blob/main/cache-tree.c#L604:L604
Byron added a commit that referenced this issue Jan 28, 2022
This method makes the index of the default workspace available.
Byron added a commit that referenced this issue Jan 28, 2022
Will need more work to deal with multi-workspace repositories
Byron added a commit that referenced this issue Jan 28, 2022
…nsion() (#293)

One cannot use `Option` as intermediary as this breaks closure inference
entirely. There should be an issue for this, let's reproduce it
and see what to watch.
Byron added a commit that referenced this issue Jan 28, 2022
Byron added a commit that referenced this issue Feb 1, 2022
It's mainly for completeness to provide people with a `FullNameRef`
of HEAD.
Byron added a commit that referenced this issue Feb 22, 2022
The current implementation could possibly be improved by
implementing sorting along with duplication checks, but
right now it seems fast enough.
Byron added a commit that referenced this issue Feb 23, 2022
@Byron Byron changed the title git-index git-index Sep 4, 2023
@Byron Byron changed the title git-index gix-index Sep 4, 2023
@Byron Byron changed the title gix-index gix-index towards 1.0 Sep 4, 2023