Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RemoveAll to Basic Interface? #77

Open
cipherboy opened this issue Aug 13, 2024 · 4 comments
Open

Add RemoveAll to Basic Interface? #77

cipherboy opened this issue Aug 13, 2024 · 4 comments

Comments

@cipherboy
Copy link

I was wanting to call the equivalent of os.RemoveAll(...) on a worktree's filesystem and ended up implementing support for it here and in go-git (the latter was a trivial change only in RepositoryFilesystem; would this change be welcomed here?

I suspect ideally this would be implemented as an optional additional interface to not break backwards compatibility (existing user would have to implement the interface on upgrade), but maybe it would be best for a v6 to simplify the interface? Thoughts?


I also ended up implementing a non-chroot'd version (similar to osfs/os_bound.go, but taking into account an arbitrary $PWD different from the actual $PWD at the time of execution), but I'm not sure that changeset has relevance outside of my use case. :-) Let me know if it is also of interest though!

@pjbgf
Copy link
Member

pjbgf commented Aug 19, 2024

Hey @cipherboy!

The util package contains a RemoveAll(fs billy.Basic, path string) error function that can be used to that effect.

However, a native RemoveAll within the Basic interface would come handy, but I don't see that working before v6 as you mentioned. Feel free to submit a PR to add that targeting the v6-exp branch, which is where we are keeping the changes for v6.

As for the os_bound implementation, go-billy is primarily focused on supporting go-git. So I am not sure your specific use case would fit that bill. The os_bound already has an arbitrary $PWD, which is immutable. I am not sure how different that is from your version.

@cipherboy
Copy link
Author

\o hey @pjbgf -- regarding the last point, I ran across this when working across multiple repos: go-git/go-git#1155 (comment)

It seems this only works if dir == . and $PWD is the root of the desired Git repository. Otherwise, when I had say, dir=repos/some-repo, every operation in the repository involving simlinks resulted in them being prefixed with dir, rewriting links from say lib->lib64 on checkout to become lib->repos/some-repo/lib64. I was working across multiple repositories, so continually updating the app's $PWD for the desired repo to edit would've maybe been possible but likely would've been a bit of work.

I think the issue with this is it's unsafe for sandbox execution/bypass; you don't want to necessarily allow or follow symlinks to arbitrary places on disk if you're a git repo provider. But if you're using known repositories it probably isn't as much an issue.

Let me see if I can create a better reproducer.

cipherboy added a commit to cipherboy/testbed that referenced this issue Aug 20, 2024
Signed-off-by: Alexander Scheel <[email protected]>
@cipherboy
Copy link
Author

cipherboy commented Aug 20, 2024

@pjbgf Hmm, looks like it was simpler than I thought. Check out this code:

https://github.com/cipherboy/testbed/tree/master/go-git-repo-symlink

Script and code

Script:

#!/bin/bash

create_repo() {(
	rm -rf repos/ repos.bak/

	mkdir -p repos/example
	cd repos/example
	git init .

	git status
	echo "initial" > README.md
	git add README.md && git commit -m "initial"

	git status
	echo "lib symlink" > README.md
	mkdir lib64
	echo "whoami" > lib64/code.sh
	ln -s lib64 lib
	git add lib64 lib README.md && git commit -m "lib symlink"

	git status
	echo "absolute symlink" > README.md
	ln -s /bin/bash lib64/shell
	git add lib64/shell README.md && git commit -m "absolute symlink"

	git status
	echo "latest change" > README.md
	git add README.md && git commit -m "latest change"

	git log | cat -
)}

backup_repo() {(
	cp -pr repos/ repos.bak
)}

create_repo
backup_repo

go mod tidy
# go run ./main.go

(
	cd repos/example && go run ../../main.go .
)

Code:

package main

import (
	"fmt"
	"os"

	"github.com/go-git/go-billy/v5/osfs"
	"github.com/go-git/go-git/v5"
	"github.com/go-git/go-git/v5/plumbing/cache"
	"github.com/go-git/go-git/v5/plumbing/object"
	"github.com/go-git/go-git/v5/storage/filesystem"
)

func getRepo(dir string) (*git.Repository, error) {
	// See https://github.com/go-git/go-git/issues/1155#issuecomment-2242778023
	wt := osfs.New(dir, osfs.WithBoundOS())
	dotfs, err := wt.Chroot(git.GitDirName)
	if err != nil {
		return nil, err
	}
	store := filesystem.NewStorage(dotfs, cache.NewObjectLRUDefault())

	// Clone -> Open
	return git.Open(store, wt)
}

func walkCommits(repo *git.Repository) error {
	iter, err := repo.CommitObjects()
	if err != nil {
		return fmt.Errorf("failed getting commit objects: %v", err)
	}

	repoW, err := repo.Worktree()
	if err != nil {
		return fmt.Errorf("failed to get repo worktree: %v", err)
	}

	if err := iter.ForEach(func(commit *object.Commit) error {
		if err := repoW.Checkout(&git.CheckoutOptions{Hash: commit.Hash, Force: true}); err != nil {
			return fmt.Errorf("failed to checkout commit (%#v): %v", commit.Hash.String(), err)
		}

		return nil
	}); err != nil {
		return fmt.Errorf("failed iterating over commit object: %v", err)
	}

	return nil
}

func main() {
	dir := "repos/example"
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}

	fmt.Printf("using dir=%v\n", dir)

	repo, err := getRepo(dir)
	if err != nil {
		panic(fmt.Sprintf("failed getting repository: %v", err))
	}

	if err := walkCommits(repo); err != nil {
		panic(fmt.Sprintf("failed walking commits: %v", err))
	}
}

In particular, the current mode (dir=. with $PWD=repo/example) fails with:

panic: failed walking commits: failed iterating over commit object: failed to checkout commit ("97f28a73331c05fe8bfda6561f7e3786dc31c622"): remove bin/bash: no such file or directory

or similar depending on what file gets executed first.

But the other one has the error I remember:

panic: failed walking commits: failed iterating over commit object: failed to checkout commit ("36f532a4ff85e64af9683daa8dfbe157df3c8dae"): open repos/example/lib64: no such file or directory

Both of these hashes are (the freshly generated) repo's initial commit. lib is a symlink to lib64; both were added in the second commit.

With my hacky filesystem, both methods pass.

The implementation essentially looks like this:

Implementation
//go:build !js
// +build !js

package osfs

import (
	"os"
	"path/filepath"
	"strings"

	securejoin "github.com/cyphar/filepath-securejoin"
	"github.com/go-git/go-billy/v5"
)

// WorkingDirOS is a legacyfilesystem without a chroot of the OS filesystem,
// but instead treats relative directories as if they were specific to a
// given underlying path, even if CWD changes.
type WorkingDirOS struct {
	baseDir string
}

func newWorkingDirOS(baseDir string) billy.Filesystem {
	return &WorkingDirOS{baseDir}
}

func (fs *WorkingDirOS) Chroot(path string) (billy.Filesystem, error) {
	joined, err := securejoin.SecureJoin(fs.baseDir, path)
	if err != nil {
		return nil, err
	}
	return New(joined), nil
}

func (fs *WorkingDirOS) Root() string {
	return fs.baseDir
}

func (fs *WorkingDirOS) Create(filename string) (billy.File, error) {
	if !filepath.IsAbs(filename) && !strings.HasPrefix(filename, fs.baseDir) {
		filename = filepath.Join(fs.baseDir, filename)
	}
	return fs.OpenFile(filename, os.O_RDWR|os.O_CREATE|os.O_TRUNC, defaultCreateMode)
}

func (fs *WorkingDirOS) OpenFile(filename string, flag int, perm os.FileMode) (billy.File, error) {
	if !filepath.IsAbs(filename) && !strings.HasPrefix(filename, fs.baseDir) {
		filename = filepath.Join(fs.baseDir, filename)
	}
	return openFile(filename, flag, perm, fs.createDir)
}

func (fs *WorkingDirOS) createDir(fullpath string) error {
	if !filepath.IsAbs(fullpath) && !strings.HasPrefix(fullpath, fs.baseDir) {
		fullpath = filepath.Join(fs.baseDir, fullpath)
	}
	dir := filepath.Dir(fullpath)
	if dir != "." {
		if err := os.MkdirAll(dir, defaultDirectoryMode); err != nil {
			return err
		}
	}

	return nil
}

func (fs *WorkingDirOS) ReadDir(dir string) ([]os.FileInfo, error) {
	if !filepath.IsAbs(dir) && !strings.HasPrefix(dir, fs.baseDir) {
		dir = filepath.Join(fs.baseDir, dir)
	}
	return readDir(dir)
}

func (fs *WorkingDirOS) Rename(from, to string) error {
	if !filepath.IsAbs(from) && !strings.HasPrefix(from, fs.baseDir) {
		from = filepath.Join(fs.baseDir, from)
	}
	if err := fs.createDir(to); err != nil {
		return err
	}

	if !filepath.IsAbs(to) && !strings.HasPrefix(to, fs.baseDir) {
		to = filepath.Join(fs.baseDir, to)
	}

	return rename(from, to)
}

func (fs *WorkingDirOS) MkdirAll(path string, perm os.FileMode) error {
	if !filepath.IsAbs(path) && !strings.HasPrefix(path, fs.baseDir) {
		path = filepath.Join(fs.baseDir, path)
	}
	return os.MkdirAll(path, defaultDirectoryMode)
}

func (fs *WorkingDirOS) Open(filename string) (billy.File, error) {
	return fs.OpenFile(filename, os.O_RDONLY, 0)
}

func (fs *WorkingDirOS) Stat(filename string) (os.FileInfo, error) {
	if !filepath.IsAbs(filename) && !strings.HasPrefix(filename, fs.baseDir) {
		filename = filepath.Join(fs.baseDir, filename)
	}
	return os.Stat(filename)
}

func (fs *WorkingDirOS) Remove(filename string) error {
	if !filepath.IsAbs(filename) && !strings.HasPrefix(filename, fs.baseDir) {
		filename = filepath.Join(fs.baseDir, filename)
	}
	return os.Remove(filename)
}

func (fs *WorkingDirOS) TempFile(dir, prefix string) (billy.File, error) {
	if err := fs.createDir(dir + string(os.PathSeparator)); err != nil {
		return nil, err
	}

	return tempFile(dir, prefix)
}

func (fs *WorkingDirOS) Join(elem ...string) string {
	return filepath.Join(elem...)
}

func (fs *WorkingDirOS) RemoveAll(path string) error {
	if !filepath.IsAbs(path) && !strings.HasPrefix(path, fs.baseDir) {
		path = filepath.Join(fs.baseDir, path)
	}
	return os.RemoveAll(filepath.Clean(path))
}

func (fs *WorkingDirOS) Lstat(filename string) (os.FileInfo, error) {
	if !filepath.IsAbs(filename) && !strings.HasPrefix(filename, fs.baseDir) {
		filename = filepath.Join(fs.baseDir, filename)
	}
	return os.Lstat(filepath.Clean(filename))
}

func (fs *WorkingDirOS) Symlink(target, link string) error {
	if err := fs.createDir(link); err != nil {
		return err
	}

	if !filepath.IsAbs(link) && !strings.HasPrefix(link, fs.baseDir) {
		link = filepath.Join(fs.baseDir, link)
	}
	return os.Symlink(target, link)
}

func (fs *WorkingDirOS) Readlink(link string) (string, error) {
	if !filepath.IsAbs(link) && !strings.HasPrefix(link, fs.baseDir) {
		link = filepath.Join(fs.baseDir, link)
	}
	return os.Readlink(link)
}

// Capabilities implements the Capable interface.
func (fs *WorkingDirOS) Capabilities() billy.Capability {
	return billy.DefaultCapabilities
}

i.e., a very simple one with a bunch of:

if !filepath.IsAbs(link) && !strings.HasPrefix(link, fs.baseDir) {
	link = filepath.Join(fs.baseDir, link)
}

in relevant places.

@pjbgf
Copy link
Member

pjbgf commented Aug 21, 2024

@cipherboy thanks for clarifying and sharing the code. I think the current implementation is mistakenly trying to remove the target instead of the symlink. However, the safeguards in place to restrict deletion of things outside of the repository dir are blocking that from taking place. The bug here is that we are doing a Stat instead of a Lstat here.

Would you be keen to propose a PR to fix this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants