-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ceph compatibility - is_dir() fails for directories when "/" included at end of path #148
Comments
Hi @remi-braun, I don't reproduce this based on what you specified. Here's an example with a public asset on S3: >>> file = S3Path('s3://drivendata-public-assets/nested_dir/test.dot/a_file.txt')
>>> file.is_file()
True
>>> file.is_dir()
False
>>> directory = S3Path('s3://drivendata-public-assets/nested_dir/test.dot/')
>>> directory.is_file()
False
>>> directory.is_dir()
True A few questions:
Here's the logic that checks directories/files on S3. It's not too complicated, but bsaically we check if S3 lists the exact path when we ask it for files. If it does, we assume it is a file. |
Hi ! Thanks for your quick reply :)
>>> B
S3Path('s3://sertit-eoreader-ci/optical/S2B_MSIL2A_20200114T065229_N0213_R020_T40REQ_20200114T094749.SAFE/')
This directory is full and the glob function works: >>> [path.name for path in B.glob("**/*")]
['S2B_MSIL2A_20200114T065229_N0213_R020_T40REQ_20200114T094749.SAFE', 'AUX_DATA', 'DATASTRIP', 'DATASTRIP', 'DS_MTI__20200114T094749_S20200114T065245', 'DS_MTI__20200114T094749_S20200114T065245', 'MTD_DS.xml', 'QI_DATA', 'QI_DATA', 'FORMAT_CORRECTNESS.xml', 'GENERAL_QUALITY.xml', 'GEOMETRIC_QUALITY.xml', 'RADIOMETRIC_QUALITY.xml', 'SENSOR_QUALITY.xml', 'GRANULE', 'GRANULE', 'L2A_T40REQ_A014918_20200114T065245', 'L2A_T40REQ_A014918_20200114T065245', 'AUX_DATA', 'AUX_DATA', 'AUX_ECMWFT', 'IMG_DATA', 'IMG_DATA', 'R10m', 'R10m', 'T40REQ_20200114T065229_AOT_10m.jp2', 'T40REQ_20200114T065229_B02_10m.jp2', 'T40REQ_20200114T065229_B03_10m.jp2', 'T40REQ_20200114T065229_B04_10m.jp2', 'T40REQ_20200114T065229_B08_10m.jp2', 'T40REQ_20200114T065229_TCI_10m.jp2', 'T40REQ_20200114T065229_WVP_10m.jp2', 'R20m', 'R20m', 'T40REQ_20200114T065229_AOT_20m.jp2', 'T40REQ_20200114T065229_B02_20m.jp2', 'T40REQ_20200114T065229_B03_20m.jp2', 'T40REQ_20200114T065229_B04_20m.jp2', 'T40REQ_20200114T065229_B05_20m.jp2', 'T40REQ_20200114T065229_B06_20m.jp2', 'T40REQ_20200114T065229_B07_20m.jp2', 'T40REQ_20200114T065229_B11_20m.jp2', 'T40REQ_20200114T065229_B12_20m.jp2', 'T40REQ_20200114T065229_B8A_20m.jp2', 'T40REQ_20200114T065229_SCL_20m.jp2', 'T40REQ_20200114T065229_TCI_20m.jp2', 'T40REQ_20200114T065229_WVP_20m.jp2', 'R60m', 'R60m', 'T40REQ_20200114T065229_AOT_60m.jp2', 'T40REQ_20200114T065229_B01_60m.jp2', 'T40REQ_20200114T065229_B02_60m.jp2', 'T40REQ_20200114T065229_B03_60m.jp2', 'T40REQ_20200114T065229_B04_60m.jp2', 'T40REQ_20200114T065229_B05_60m.jp2', 'T40REQ_20200114T065229_B06_60m.jp2', 'T40REQ_20200114T065229_B07_60m.jp2', 'T40REQ_20200114T065229_B09_60m.jp2', 'T40REQ_20200114T065229_B11_60m.jp2', 'T40REQ_20200114T065229_B12_60m.jp2', 'T40REQ_20200114T065229_B8A_60m.jp2', 'T40REQ_20200114T065229_SCL_60m.jp2', 'T40REQ_20200114T065229_TCI_60m.jp2', 'T40REQ_20200114T065229_WVP_60m.jp2', 'MTD_TL.xml', 'QI_DATA', 'QI_DATA', 'FORMAT_CORRECTNESS.xml', 'GENERAL_QUALITY.xml', 'GEOMETRIC_QUALITY.xml', 'MSK_CLDPRB_20m.jp2', 'MSK_CLDPRB_60m.jp2', 'MSK_CLOUDS_B00.gfs', 'MSK_CLOUDS_B00.gml', 'MSK_DEFECT_B01.gfs', 'MSK_DEFECT_B01.gml', 'MSK_DEFECT_B02.gfs', 'MSK_DEFECT_B02.gml', 'MSK_DEFECT_B03.gfs', 'MSK_DEFECT_B03.gml', 'MSK_DEFECT_B04.gfs', 'MSK_DEFECT_B04.gml', 'MSK_DEFECT_B05.gfs', 'MSK_DEFECT_B05.gml', 'MSK_DEFECT_B06.gfs', 'MSK_DEFECT_B06.gml', 'MSK_DEFECT_B07.gml', 'MSK_DEFECT_B08.gfs', 'MSK_DEFECT_B08.gml', 'MSK_DEFECT_B09.gml', 'MSK_DEFECT_B10.gml', 'MSK_DEFECT_B11.gfs', 'MSK_DEFECT_B11.gml', 'MSK_DEFECT_B12.gfs', 'MSK_DEFECT_B12.gml', 'MSK_DEFECT_B8A.gfs', 'MSK_DEFECT_B8A.gml', 'MSK_DETFOO_B01.gfs', 'MSK_DETFOO_B01.gml', 'MSK_DETFOO_B02.gfs', 'MSK_DETFOO_B02.gml', 'MSK_DETFOO_B03.gfs', 'MSK_DETFOO_B03.gml', 'MSK_DETFOO_B04.gfs', 'MSK_DETFOO_B04.gml', 'MSK_DETFOO_B05.gfs', 'MSK_DETFOO_B05.gml', 'MSK_DETFOO_B06.gfs', 'MSK_DETFOO_B06.gml', 'MSK_DETFOO_B07.gml', 'MSK_DETFOO_B08.gfs', 'MSK_DETFOO_B08.gml', 'MSK_DETFOO_B09.gml', 'MSK_DETFOO_B10.gml', 'MSK_DETFOO_B11.gfs', 'MSK_DETFOO_B11.gml', 'MSK_DETFOO_B12.gfs', 'MSK_DETFOO_B12.gml', 'MSK_DETFOO_B8A.gfs', 'MSK_DETFOO_B8A.gml', 'MSK_NODATA_B01.gfs', 'MSK_NODATA_B01.gml', 'MSK_NODATA_B02.gfs', 'MSK_NODATA_B02.gml', 'MSK_NODATA_B03.gfs', 'MSK_NODATA_B03.gml', 'MSK_NODATA_B04.gfs', 'MSK_NODATA_B04.gml', 'MSK_NODATA_B05.gfs', 'MSK_NODATA_B05.gml', 'MSK_NODATA_B06.gfs', 'MSK_NODATA_B06.gml', 'MSK_NODATA_B07.gml', 'MSK_NODATA_B08.gfs', 'MSK_NODATA_B08.gml', 'MSK_NODATA_B09.gml', 'MSK_NODATA_B10.gml', 'MSK_NODATA_B11.gfs', 'MSK_NODATA_B11.gml', 'MSK_NODATA_B12.gfs', 'MSK_NODATA_B12.gml', 'MSK_NODATA_B8A.gfs', 'MSK_NODATA_B8A.gml', 'MSK_SATURA_B01.gfs', 'MSK_SATURA_B01.gml', 'MSK_SATURA_B02.gfs', 'MSK_SATURA_B02.gml', 'MSK_SATURA_B03.gfs', 'MSK_SATURA_B03.gml', 'MSK_SATURA_B04.gfs', 'MSK_SATURA_B04.gml', 'MSK_SATURA_B05.gfs', 'MSK_SATURA_B05.gml', 'MSK_SATURA_B06.gfs', 'MSK_SATURA_B06.gml', 'MSK_SATURA_B07.gml', 'MSK_SATURA_B08.gfs', 'MSK_SATURA_B08.gml', 'MSK_SATURA_B09.gml', 'MSK_SATURA_B10.gml', 'MSK_SATURA_B11.gfs', 'MSK_SATURA_B11.gml', 'MSK_SATURA_B12.gfs', 'MSK_SATURA_B12.gml', 'MSK_SATURA_B8A.gfs', 'MSK_SATURA_B8A.gml', 'MSK_SNWPRB_20m.jp2', 'MSK_SNWPRB_60m.jp2', 'MSK_TECQUA_B01.gfs', 'MSK_TECQUA_B01.gml', 'MSK_TECQUA_B02.gfs', 'MSK_TECQUA_B02.gml', 'MSK_TECQUA_B03.gfs', 'MSK_TECQUA_B03.gml', 'MSK_TECQUA_B04.gfs', 'MSK_TECQUA_B04.gml', 'MSK_TECQUA_B05.gfs', 'MSK_TECQUA_B05.gml', 'MSK_TECQUA_B06.gfs', 'MSK_TECQUA_B06.gml', 'MSK_TECQUA_B07.gml', 'MSK_TECQUA_B08.gfs', 'MSK_TECQUA_B08.gml', 'MSK_TECQUA_B09.gml', 'MSK_TECQUA_B10.gml', 'MSK_TECQUA_B11.gfs', 'MSK_TECQUA_B11.gml', 'MSK_TECQUA_B12.gfs', 'MSK_TECQUA_B12.gml', 'MSK_TECQUA_B8A.gfs', 'MSK_TECQUA_B8A.gml', 'RADIOMETRIC_QUALITY.xml', 'SENSOR_QUALITY.xml', 'T40REQ_20200114T065229_PVI.jp2', 'HTML', 'HTML', 'UserProduct_index.html', 'UserProduct_index.xsl', 'banner_1.png', 'banner_2.png', 'banner_3.png', 'star_bg.jpg', 'INSPIRE.xml', 'MTD_MSIL2A.xml', 'manifest.safe', 'rep_info', 'rep_info', 'S2_PDI_Level-2A_Datastrip_Metadata.xsd', 'S2_PDI_Level-2A_Tile_Metadata.xsd', 'S2_User_Product_Level-2A_Metadata.xsd']
This folder has been uploaded to a S3 compatible storage as usual, and it can be used without the final >>> # WORKAROUND
>>> if str(path).endswith("/"):
>>> path = AnyPath(str(path)[:-1])
I'm sorry, I've not found any public S3 enabling |
What library/implementation? We do test It does seem that the root cause is that our |
Ouch I don't know, it is my university's storage... I have no clue on this question 😢 If you want me to run additional tests on my side, just ask ! |
🙇 thanks, I'll see if we can find a Ceph server to test against. |
I have the same issue, but I'm using ordinary S3 (I think it is ordinary S3 at least) |
@remi-braun @karolzlot The issues you were seeing should (hopefully) be fixed by #190. I'm not sure about ceph since we test on MinIO and AWS S3. If you get a chance, could you test the latest on
|
Thanks @remi-braun. The recursion problem looks like a separate issue that we can investigate. Can you confirm that the |
Do I need to open an issue on that ? Or you are investigating it ? |
@pjbull @remi-braun I also faced the same issue fixed it like this |
Thanks @GeorgeSabu that fix looks good, and we'd accept a PR if you want to contribute! |
@remi-braun @GeorgeSabu ^ filed #198 to track the recursion issue and the fix you pointed out. If you can contribute referencing that issue, it would be great. Thanks! |
Hello,
For directories that have a suffix such as .SAFE directories for Sentinel satellites data,
pathlib
recognizes the directories correctly, althoughcloudpathlib
fails:However, it works when removing the final
\
!The text was updated successfully, but these errors were encountered: