Node/FS wrapper for nanotar #14

Open
1 task
benmccann opened this issue Aug 28, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@benmccann

Describe the feature

There is a group of folks from https://e18e.dev going around the ecosystem and swapping out heavy libraries for lighter ones. It seems like it'd be great to switch some libraries to nanotar to reduce their dependencies. However, the APIs of tar and nanotar are fairly different, which makes it somewhat difficult to do the swap in a bunch of places and get people to quickly adopt nanotar.

E.g. the first instance I went to look at was tar.extract({ file: tarball, cwd: to, strip: 1, onentry: filter_func }) and it's not super obvious how to swap it for nanotar.parseTar(data).
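
For illustration, the `strip` and `onentry`-style filtering from that call could be emulated on top of a parsed entry list roughly as follows. This is a minimal sketch, not nanotar's API: the `TarEntry` shape and the `planExtraction` helper are assumptions made for this example, and actually writing the resulting files into `cwd` is left out.

```ts
// Hypothetical entry shape, assumed to resemble what a tar parser returns.
interface TarEntry {
  name: string;
  data?: Uint8Array;
}

interface PlanOptions {
  // Number of leading path segments to drop (in the spirit of `strip: 1`).
  strip?: number;
  // Return false to skip an entry (in the spirit of a filter callback).
  filter?: (entry: TarEntry) => boolean;
}

// Returns the entries that would be written, with `name` rewritten
// relative to the target directory.
function planExtraction(entries: TarEntry[], options: PlanOptions = {}): TarEntry[] {
  const strip = options.strip ?? 0;
  const planned: TarEntry[] = [];
  for (const entry of entries) {
    if (options.filter && !options.filter(entry)) continue;
    // Drop empty and "." segments so "./package/" and "package" compare equal.
    const segments = entry.name.split("/").filter((s) => s !== "" && s !== ".");
    // Nothing is left after stripping (e.g. the top-level directory itself).
    if (segments.length <= strip) continue;
    planned.push({ ...entry, name: segments.slice(strip).join("/") });
  }
  return planned;
}
```

With `strip: 1`, an entry named `package/src/index.js` would be planned as `src/index.js`, and the bare `package/` directory entry would be dropped.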

A few things that might help here to accelerate that:

Additional information

  • Would you be willing to help implement this feature?
@benmccann benmccann added the enhancement New feature or request label Aug 28, 2024
@ayuhito

ayuhito commented Aug 29, 2024

For additional reference, we're talking about replacing tar and tar-fs, but also gunzip-maybe in the process, all of which have a lot of dependencies.

I have a branch to replace it in @remix-run/dev and create-remix. It's still incomplete, but it's a little bit of a heavy-handed migration.

Though I understand the APIs are completely different so it naturally won't be easy to migrate.

@pi0
Member

pi0 commented Aug 29, 2024

Hi, and thanks for your interest in this project.

If you have suggestions to make the API of nanotar more DX-friendly (without adding hugely to its complexity, bundle size, or runtime dependencies), please feel free to open discussions. I am certainly willing to hear them 👍🏼

Depending on runtime-specific APIs and mimicking the bigger alternatives is not a goal of this project (otherwise we will eventually become them). What I would suggest is that you might find it useful to make a wrapper library that adds functionality such as file-system support via export conditions, to make it a more obvious replacement for tar/tar-fs.

(i have some future plans re fs support but it can take time)

@pi0 pi0 changed the title Additional tar compatibility Node/FS wrapper for nanotar Jan 20, 2025
@benmccann
Author

From @bluwy:

The difficult part is interpreting each tar file's "file name" and writing that down to fs. You can download a big tarball from a github repo for example, and the names are quite inconsistent. Some are full paths, some are basenames expecting to be under the previous tar file's directory. Some are a chain of directories that should be nested, but sometimes they're not actually nested and should be on the same level, etc. A big headache, but more respect to the existing libraries.

@pi0
Member

pi0 commented Jan 21, 2025

@benmccann @bluwy Can you perhaps submit some examples (even more useful: tools or libs that generate inconsistent paths)? We can either export tree-shakable normalization plugins or handle some common patterns built in (like forcing paths to be relative).

One other hard thing to handle is platform-specific types such as hard links, which I expect the fs wrapper to handle, but nanotar should also convert them from their (inconsistent) numeric format, which is in my plan.
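
As a rough sketch of what such a conversion could look like, the snippet below maps the single-character type flag stored in a tar header to a readable name. The flag values follow the POSIX ustar/pax header format; the `EntryType` names and the `entryTypeFromFlag` helper are hypothetical and not part of nanotar.

```ts
// Hypothetical readable names for tar entry types.
type EntryType =
  | "file"
  | "hardlink"
  | "symlink"
  | "characterDevice"
  | "blockDevice"
  | "directory"
  | "fifo"
  | "contiguousFile"
  | "paxHeader"
  | "paxGlobalHeader"
  | "unknown";

// Type flags as defined by the ustar and pax header formats.
const TYPEFLAGS = new Map<string, EntryType>([
  ["0", "file"],
  ["\0", "file"], // older archives use a NUL byte for regular files
  ["1", "hardlink"],
  ["2", "symlink"],
  ["3", "characterDevice"],
  ["4", "blockDevice"],
  ["5", "directory"],
  ["6", "fifo"],
  ["7", "contiguousFile"],
  ["x", "paxHeader"],
  ["g", "paxGlobalHeader"],
]);

// Accepts either the flag character or the raw header byte (e.g. 0x31 for "1").
function entryTypeFromFlag(flag: string | number): EntryType {
  const key = typeof flag === "number" ? String.fromCharCode(flag) : flag;
  return TYPEFLAGS.get(key) ?? "unknown";
}
```

A file-system wrapper could then branch on `"hardlink"` or `"symlink"` instead of comparing raw flag values, and decide per platform how to handle each.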

@bluwy
Contributor

bluwy commented Jan 21, 2025

My tests were from installing the astro repo tarball from github, but any large repo should cause the same issue with the file names. @43081j mentioned that he's working on the fs implementation too though, so maybe he has some findings to share here.

@pi0
Member

pi0 commented Jan 21, 2025

Good pointer @bluwy, i could reproduce some issues ~> #29

@43081j
Member

43081j commented Jan 21, 2025

i think we should have a nanotar/extract.js entrypoint to allow us to avoid messy conditional exports and what not

we could also have an fs option to allow a custom file system impl to be passed

i'm not sure i'll get time to do this myself yet, but here's the rough usage i had in mind:

import type * as fs from "node:fs"; // type-only import so `typeof fs` resolves

interface ExtractOptions {
  fs?: typeof fs; // if you want to bring your own FS API

  // the various options that `tar` supports. see the man page:
  // https://man7.org/linux/man-pages/man1/tar.1.html
  // (all optional, so callers only set what they need)
  keepOldFiles?: boolean;
  keepNewerFiles?: boolean;
  // ...
}

declare function extract(
  path: string,
  options?: ExtractOptions
): Promise<void>;

i wouldn't expect us to implement every possible tar option though

when extracting files, you should strip leading / and throw on .. appearing. everything is written relative to cwd then

if you set absoluteNames: true, skip the stripping of /. that means a tar containing /im_at_the_root will write that file literally to /im_at_the_root rather than in cwd
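
The name handling described above could be sketched as below. This is only an illustration under stated assumptions: `resolveEntryName` and `NameOptions` are hypothetical names, entry names are assumed to use `/` as the separator, and backslash-separated names are not handled.

```ts
interface NameOptions {
  // When true, keep a leading "/" so the entry is written to its literal path.
  absoluteNames?: boolean;
}

// Returns the path an entry should be written to (relative to cwd unless
// `absoluteNames` is set), or throws if the name tries to climb out via "..".
function resolveEntryName(name: string, options: NameOptions = {}): string {
  const hasParentSegment = name.split("/").some((segment) => segment === "..");
  if (hasParentSegment) {
    throw new Error(`Refusing to extract "${name}": path contains ".."`);
  }
  if (options.absoluteNames) return name;
  // Strip every leading "/" so the result is always relative.
  return name.replace(/^\/+/, "");
}
```

With the defaults, `/im_at_the_root` resolves to `im_at_the_root`; with `absoluteNames: true` it is returned unchanged. Only whole `..` segments are rejected, so a name such as `a..b/c` still passes.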

@pi0
Member

pi0 commented Jan 21, 2025

thanks for the ideas @43081j.

I believe fs-integration should be out of this lib's scope to keep it agnostic.

We can make a wrapper lib that uses export conditions for platform-specific fs implementations and possibly adds the set of options related to file-system write behavior, for the extract util for example (it is also a good chance to maximize node-tar API compatibility if we aim for ecosystem replacement).
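
To make the wrapper idea more concrete, here is a minimal sketch of a write step that takes its file-system implementation as an argument, so the core stays runtime-agnostic. Everything in it is hypothetical: `MinimalFS`, `writeEntries`, and `createMemoryFS` are names invented for this example, the API is synchronous for brevity, and treating an existing target as an error under `keepOldFiles` is this sketch's assumption about the option's meaning.

```ts
// Hypothetical minimal subset of a file-system API needed by the wrapper.
interface MinimalFS {
  existsSync(path: string): boolean;
  mkdirSync(path: string, options: { recursive: true }): unknown;
  writeFileSync(path: string, data: Uint8Array): void;
}

interface FileEntry {
  name: string; // assumed to be already sanitized and relative
  data: Uint8Array;
}

interface WriteOptions {
  fs: MinimalFS;
  cwd: string;
  keepOldFiles?: boolean; // assumption: fail instead of overwriting
}

function writeEntries(entries: FileEntry[], options: WriteOptions): void {
  const { fs, cwd } = options;
  for (const entry of entries) {
    const target = `${cwd}/${entry.name}`;
    if (options.keepOldFiles && fs.existsSync(target)) {
      throw new Error(`Refusing to overwrite existing file: ${target}`);
    }
    // Create the parent directory chain before writing the file.
    fs.mkdirSync(target.slice(0, target.lastIndexOf("/")), { recursive: true });
    fs.writeFileSync(target, entry.data);
  }
}

// In-memory stand-in, useful for tests or runtimes without a real file system.
function createMemoryFS(): MinimalFS & { files: Map<string, Uint8Array> } {
  const files = new Map<string, Uint8Array>();
  return {
    files,
    existsSync: (path: string) => files.has(path),
    mkdirSync: () => undefined, // directories are implicit in this stand-in
    writeFileSync: (path: string, data: Uint8Array) => {
      files.set(path, data);
    },
  };
}
```

A platform-specific entry point could pass a real file-system module where this example passes the in-memory one, which is one way the export-conditions approach might be wired up.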
