fs.put() writes the object under a different target path if it is called multiple times #659

Closed
tilan7663 opened this issue Nov 3, 2022 · 2 comments

Comments

@tilan7663

Hello,

We are seeing very strange behavior with the fs.put API after upgrading s3fs from 0.4.2 to 2022.10.0. The object path changes when fs.put() is called more than once with the same arguments.

How to reproduce

Local filesystem

mkdir -p /home/user/foo_dir
echo "text" > /home/user/foo_dir/hello.txt

Python

import s3fs
import uuid

s3 = s3fs.S3FileSystem()
session_id = uuid.uuid4().hex

s3_path = f"s3://bucket/{session_id}/"  # s3_path doesn't exist yet in S3

s3.put("/home/user/foo_dir", s3_path, recursive=True)  # first attempt

# After the first s3.put() call, the object is created under
# s3://bucket/{session_id}/hello.txt, which is the expected behavior.

s3.put("/home/user/foo_dir", s3_path, recursive=True)  # second attempt

# After the second attempt, the object path changes: it is created under
# s3://bucket/{session_id}/foo_dir/hello.txt. The old version of the library
# does not produce this behavior.

After some code tracing, it seems the object path changes on the second attempt because of a patch that was made previously to address a somewhat related issue. Interestingly, after reverting this patch, we couldn't reproduce the behavior described in that original issue, and the resulting behavior is the same as in s3fs==0.4.2. We are wondering whether something has changed since this patch was merged that makes it behave differently.

Please let me know if you have any questions.

Thanks,
Tian

@ianthomas23
Contributor

I cannot reproduce this locally. I've submitted a PR (#666) containing a test for it to see if it passes CI.

@ianthomas23
Contributor

@tilan7663 Since fsspec/filesystem_spec#1148 and fsspec/filesystem_spec#1163 this is now expected behaviour as it is consistent with command-line cp and scp. To obtain the behaviour that you would like you just need to append a slash to your source directory, i.e.

s3.put("/home/user/foo_dir/", s3_path, recursive=True)
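
To make the cp-like rule concrete, here is a minimal sketch that models only the behaviour reported in this thread. It is not fsspec's or s3fs's actual implementation: the function `put_destination` and its parameters are hypothetical names invented for illustration, and the rule is an assumption inferred from the three cases discussed above (target missing, target present, source with a trailing slash).

```python
import posixpath


def put_destination(source_dir, rel_file, target, target_exists):
    """Return where source_dir/rel_file would land under the cp-like rule.

    Hypothetical model of the behaviour described in this thread only;
    it does not call or reproduce any fsspec/s3fs code.
    """
    target = target.rstrip("/")
    # Trailing slash on the source, or a target that does not exist yet:
    # the directory's contents are copied directly into the target.
    if source_dir.endswith("/") or not target_exists:
        return posixpath.join(target, rel_file)
    # Otherwise the source directory itself is placed inside the target.
    base = posixpath.basename(source_dir)
    return posixpath.join(target, base, rel_file)


target = "s3://bucket/session/"

# First put(): the target does not exist yet.
first = put_destination("/home/user/foo_dir", "hello.txt", target, target_exists=False)
# Second put(): the target now exists, so foo_dir is nested inside it.
second = put_destination("/home/user/foo_dir", "hello.txt", target, target_exists=True)
# Suggested fix: trailing slash on the source copies the contents only.
fixed = put_destination("/home/user/foo_dir/", "hello.txt", target, target_exists=True)

print(first)   # s3://bucket/session/hello.txt
print(second)  # s3://bucket/session/foo_dir/hello.txt
print(fixed)   # s3://bucket/session/hello.txt
```

Under this model, the trailing slash makes the result independent of whether the target already exists, which is why repeated calls then write to the same path.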

I am going to close this issue as I think no further action is necessary. Feel free to re-open it if you would like to discuss it further.
