"No Such File" when crawling a directory that ends with space("ThisDirHasSpaceAtEnd ") #1952

Open
jens-idoer opened this issue Oct 4, 2024 · 5 comments
Assignees
Labels
bug For confirmed bugs

Comments

@jens-idoer

Describe the bug

When running FSCrawler (docker image) with a target directory on an SFTP server we get an error when we try to crawl a directory that has a space (" ") at the end of the name. The error we get is this:
WARN [f.p.e.c.f.FsParserAbstract] Error while crawling /xx: SFTP error (SSH_FX_NO_SUCH_FILE): No such file
Please note that the "/xx" here is the "root" directory from the _settings.json file, not the directory that has the space at the end of its name.

Job Settings

Logs

+ fscrawler --debug --loop 1 fscrawler_job
12:11:21,106 WARN  [f.p.e.c.f.c.FsCrawlerCli] --debug option has been deprecated. Use FS_JAVA_OPTS="-DLOG_LEVEL=debug" instead.
12:11:21,150 INFO  [f.console] ,----------------------------------------------------------------------------------------------------.
|       ,---,.  .--.--.     ,----..                                     ,--,           2.10-SNAPSHOT |
|     ,'  .' | /  /    '.  /   /   \                                  ,--.'|                         |
|   ,---.'   ||  :  /`. / |   :     :  __  ,-.                   .---.|  | :               __  ,-.   |
|   |   |   .';  |  |--`  .   |  ;. /,' ,'/ /|                  /. ./|:  : '             ,' ,'/ /|   |
|   :   :  :  |  :  ;_    .   ; /--` '  | |' | ,--.--.       .-'-. ' ||  ' |      ,---.  '  | |' |   |
|   :   |  |-, \  \    `. ;   | ;    |  |   ,'/       \     /___/ \: |'  | |     /     \ |  |   ,'   |
|   |   :  ;/|  `----.   \|   : |    '  :  / .--.  .-. | .-'.. '   ' .|  | :    /    /  |'  :  /     |
|   |   |   .'  __ \  \  |.   | '___ |  | '   \__\/: . ./___/ \:     ''  : |__ .    ' / ||  | '      |
|   '   :  '   /  /`--'  /'   ; : .'|;  : |   ," .--.; |.   \  ' .\   |  | '.'|'   ;   /|;  : |      |
|   |   |  |  '--'.     / '   | '/  :|  , ;  /  /  ,.  | \   \   ' \ |;  :    ;'   |  / ||  , ;      |
|   |   :  \    `--'---'  |   :    /  ---'  ;  :   .'   \ \   \  |--" |  ,   / |   :    | ---'       |
|   |   | ,'               \   \ .'         |  ,     .-./  \   \ |     ---`-'   \   \  /             |
|   `----'                  `---`            `--`---'       '---"                `----'              |
+----------------------------------------------------------------------------------------------------+
|                                        You know, for Files!                                        |
|                                     Made from France with Love                                     |
|                           Source: https://github.com/dadoonet/fscrawler/                           |
|                          Documentation: https://fscrawler.readthedocs.io/                          |
`----------------------------------------------------------------------------------------------------'

12:11:21,201 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Loading plugins
12:11:21,249 INFO  [f.p.e.c.f.c.BootstrapChecks] Memory [Free/Total=Percent]: HEAP [11.9mb/363.5mb=3.3%], RAM [1.3gb/1.4gb=95.22%], Swap [0b/0b=0.0].
12:11:21,251 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Copying [6/_settings.json]...
12:11:21,266 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] Copying [6/_settings_folder.json]...
12:11:21,267 DEBUG [f.p.e.c.f.c.FsCrawlerCli] Starting job [fscrawler_job]...
12:11:21,521 WARN  [f.p.e.c.f.s.Elasticsearch] username is deprecated. Use apiKey instead.
12:11:21,521 WARN  [f.p.e.c.f.s.Elasticsearch] password is deprecated. Use apiKey instead.
12:11:21,523 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Starting plugins
12:11:21,536 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Found FsCrawlerExtensionFsProvider extension for type [http]
12:11:21,536 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Found FsCrawlerExtensionFsProvider extension for type [s3]
12:11:21,536 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Found FsCrawlerExtensionFsProvider extension for type [local]
12:11:21,549 DEBUG [f.p.e.c.f.FsParserAbstract] creating fs crawler thread [fscrawler_job] for [/home/spacetest/cases/spacetestcase2] every [15m]
12:11:21,552 INFO  [f.p.e.c.f.FsCrawlerImpl] Starting FS crawler
12:11:21,645 DEBUG [f.p.e.c.f.c.ElasticsearchClient] get version
12:11:22,547 DEBUG [f.p.e.c.f.c.ElasticsearchClient] get version returns 8.9.0 and 8 as the major version number
12:11:22,547 INFO  [f.p.e.c.f.c.ElasticsearchClient] Elasticsearch Client connected to a node running version 8.9.0
12:11:22,547 DEBUG [f.p.e.c.f.c.ElasticsearchClient] is existing pipeline [ddl_fscrawler_pipeline]
12:11:22,559 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Pipeline [ddl_fscrawler_pipeline] was found
12:11:22,566 DEBUG [f.p.e.c.f.s.FsCrawlerManagementServiceElasticsearchImpl] Elasticsearch Management Service started
12:11:22,569 DEBUG [f.p.e.c.f.c.ElasticsearchClient] get version
12:11:22,662 DEBUG [f.p.e.c.f.c.ElasticsearchClient] get version returns 8.9.0 and 8 as the major version number
12:11:22,663 INFO  [f.p.e.c.f.c.ElasticsearchClient] Elasticsearch Client connected to a node running version 8.9.0
12:11:22,663 DEBUG [f.p.e.c.f.c.ElasticsearchClient] is existing pipeline [ddl_fscrawler_pipeline]
12:11:22,688 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Pipeline [ddl_fscrawler_pipeline] was found
12:11:22,689 DEBUG [f.p.e.c.f.s.FsCrawlerDocumentServiceElasticsearchImpl] Elasticsearch Document Service started
12:11:22,689 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Creating/updating component templates
12:11:22,695 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push component template [fscrawler_alias]
12:11:22,717 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push component template [fscrawler_settings_shards]
12:11:22,742 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push component template [fscrawler_settings_total_fields]
12:11:22,753 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push component template [fscrawler_mapping_attributes]
12:11:22,779 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push component template [fscrawler_mapping_file]
12:11:22,792 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push component template [fscrawler_mapping_path]
12:11:22,813 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push component template [fscrawler_mapping_attachment]
12:11:22,825 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push component template [fscrawler_mapping_content]
12:11:22,849 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push component template [fscrawler_mapping_meta]
12:11:22,861 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Creating/updating index templates
12:11:22,863 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push index template [fscrawler_docs_ddl-lab-cust-shared-files]
12:11:22,889 DEBUG [f.p.e.c.f.c.ElasticsearchClient] push index template [fscrawler_folders_ddl-lab-cust-shared-files_folder]
12:11:22,905 INFO  [f.p.e.c.f.FsParserAbstract] FS crawler started for [fscrawler_job] for [/home/spacetest/cases/spacetestcase2] every [15m]
12:11:22,906 DEBUG [f.p.e.c.f.FsParserAbstract] Fs crawler thread [fscrawler_job] is now running. Run #1...
12:11:22,907 DEBUG [f.p.e.c.f.c.s.FsCrawlerSshClient] Create and start SSH client
12:11:23,635 DEBUG [f.p.e.c.f.c.s.FsCrawlerSshClient] Opening SSH connection to u322501@null
12:11:24,061 DEBUG [f.p.e.c.f.c.s.FsCrawlerSshClient] SSH connection successful
12:11:24,141 DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/home/spacetest/cases/spacetestcase2] content
12:11:24,141 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Listing local files from [/home/spacetest/cases/spacetestcase2]
12:11:24,153 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads) = /uploads
12:11:24,153 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [true], filename = [/uploads], includes = [null], excludes = [null]
12:11:24,153 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads], excludes = [null]
12:11:24,154 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads], includes = [null]
12:11:24,154 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads], excludes = [null]
12:11:24,154 DEBUG [f.p.e.c.f.FsParserAbstract] [/uploads] can be indexed: [true]
12:11:24,154 DEBUG [f.p.e.c.f.FsParserAbstract]   - folder: uploads
12:11:24,155 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads) = /uploads
12:11:24,179 DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/home/spacetest/cases/spacetestcase2/uploads] content
12:11:24,179 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Listing local files from [/home/spacetest/cases/spacetestcase2/uploads]
12:11:24,186 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails) = /uploads/emails
12:11:24,186 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [true], filename = [/uploads/emails], includes = [null], excludes = [null]
12:11:24,186 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails], excludes = [null]
12:11:24,186 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails], includes = [null]
12:11:24,186 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails], excludes = [null]
12:11:24,186 DEBUG [f.p.e.c.f.FsParserAbstract] [/uploads/emails] can be indexed: [true]
12:11:24,186 DEBUG [f.p.e.c.f.FsParserAbstract]   - folder: emails
12:11:24,187 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails) = /uploads/emails
12:11:24,187 DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/home/spacetest/cases/spacetestcase2/uploads/emails] content
12:11:24,188 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Listing local files from [/home/spacetest/cases/spacetestcase2/uploads/emails]
12:11:24,194 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]) = /uploads/emails/[email protected]
12:11:24,194 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [true], filename = [/uploads/emails/[email protected]], includes = [null], excludes = [null]
12:11:24,194 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]], excludes = [null]
12:11:24,194 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]], includes = [null]
12:11:24,195 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]], excludes = [null]
12:11:24,195 DEBUG [f.p.e.c.f.FsParserAbstract] [/uploads/emails/[email protected]] can be indexed: [true]
12:11:24,195 DEBUG [f.p.e.c.f.FsParserAbstract]   - folder: [email protected]
12:11:24,195 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]) = /uploads/emails/[email protected]
12:11:24,196 DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]] content
12:11:24,196 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Listing local files from [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]]
12:11:24,202 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Sent_350) = /uploads/emails/[email protected]/INBOX.Sent_350
12:11:24,203 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [true], filename = [/uploads/emails/[email protected]/INBOX.Sent_350], includes = [null], excludes = [null]
12:11:24,203 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Sent_350], excludes = [null]
12:11:24,203 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Sent_350], includes = [null]
12:11:24,203 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Sent_350], excludes = [null]
12:11:24,203 DEBUG [f.p.e.c.f.FsParserAbstract] [/uploads/emails/[email protected]/INBOX.Sent_350] can be indexed: [true]
12:11:24,203 DEBUG [f.p.e.c.f.FsParserAbstract]   - folder: INBOX.Sent_350
12:11:24,203 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Sent_350) = /uploads/emails/[email protected]/INBOX.Sent_350
12:11:24,204 DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Sent_350] content
12:11:24,204 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Listing local files from [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Sent_350]
12:11:24,210 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx) = /uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx
12:11:24,210 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [false], filename = [/uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx], includes = [null], excludes = [null]
12:11:24,210 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx], excludes = [null]
12:11:24,210 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx], includes = [null]
12:11:24,210 DEBUG [f.p.e.c.f.FsParserAbstract] [/uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx] can be indexed: [true]
12:11:24,210 DEBUG [f.p.e.c.f.FsParserAbstract]   - file: /uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx
12:11:24,210 DEBUG [f.p.e.c.f.FsParserAbstract]     - not modified: creation date null , file date 2024-10-04T11:29:43, last scan date 2024-10-04T11:50:21.765268275
12:11:24,210 DEBUG [f.p.e.c.f.FsParserAbstract] Looking for removed files in [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Sent_350]...
12:11:24,250 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx) = /uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx
12:11:24,250 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [false], filename = [/uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx], includes = [null], excludes = [null]
12:11:24,250 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx], excludes = [null]
12:11:24,250 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Sent_350/Ansatte.xlsx], includes = [null]
12:11:24,251 DEBUG [f.p.e.c.f.FsParserAbstract] Looking for removed directories in [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Sent_350]...
12:11:24,277 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Sent_350.eml) = /uploads/emails/[email protected]/INBOX.Sent_350.eml
12:11:24,277 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [false], filename = [/uploads/emails/[email protected]/INBOX.Sent_350.eml], includes = [null], excludes = [null]
12:11:24,277 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Sent_350.eml], excludes = [null]
12:11:24,277 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Sent_350.eml], includes = [null]
12:11:24,277 DEBUG [f.p.e.c.f.FsParserAbstract] [/uploads/emails/[email protected]/INBOX.Sent_350.eml] can be indexed: [true]
12:11:24,278 DEBUG [f.p.e.c.f.FsParserAbstract]   - file: /uploads/emails/[email protected]/INBOX.Sent_350.eml
12:11:24,278 DEBUG [f.p.e.c.f.FsParserAbstract]     - not modified: creation date null , file date 2024-10-04T11:29:43, last scan date 2024-10-04T11:50:21.765268275
12:11:24,278 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671) = /uploads/emails/[email protected]/INBOX.Arkiv^2019_3671
12:11:24,278 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [true], filename = [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671], includes = [null], excludes = [null]
12:11:24,278 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671], excludes = [null]
12:11:24,278 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671], includes = [null]
12:11:24,278 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671], excludes = [null]
12:11:24,279 DEBUG [f.p.e.c.f.FsParserAbstract] [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671] can be indexed: [true]
12:11:24,279 DEBUG [f.p.e.c.f.FsParserAbstract]   - folder: INBOX.Arkiv^2019_3671
12:11:24,279 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671) = /uploads/emails/[email protected]/INBOX.Arkiv^2019_3671
12:11:24,280 DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671] content
12:11:24,280 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Listing local files from [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671]
12:11:24,286 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ) = /uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 
12:11:24,287 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [true], filename = [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ], includes = [null], excludes = [null]
12:11:24,287 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ], excludes = [null]
12:11:24,287 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ], includes = [null]
12:11:24,287 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] filename = [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ], excludes = [null]
12:11:24,287 DEBUG [f.p.e.c.f.FsParserAbstract] [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ] can be indexed: [true]
12:11:24,287 DEBUG [f.p.e.c.f.FsParserAbstract]   - folder: 0270064298 
12:11:24,288 DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/home/spacetest/cases/spacetestcase2, /home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ) = /uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 
12:11:24,288 DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ] content
12:11:24,288 DEBUG [f.p.e.c.f.c.s.FileAbstractorSSH] Listing local files from [/home/spacetest/cases/spacetestcase2/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ]
12:11:24,292 WARN  [f.p.e.c.f.FsParserAbstract] Error while crawling /home/spacetest/cases/spacetestcase2: SFTP error (SSH_FX_NO_SUCH_FILE): No such file
12:11:24,293 WARN  [f.p.e.c.f.FsParserAbstract] Full stacktrace
java.io.UncheckedIOException: SFTP error (SSH_FX_NO_SUCH_FILE): No such file
	at org.apache.sshd.sftp.client.impl.SftpIterableDirEntry.iterator(SftpIterableDirEntry.java:67) ~[sshd-sftp-2.13.2.jar:2.13.2]
	at org.apache.sshd.sftp.client.impl.SftpIterableDirEntry.iterator(SftpIterableDirEntry.java:35) ~[sshd-sftp-2.13.2.jar:2.13.2]
	at java.base/java.lang.Iterable.spliterator(Iterable.java:101) ~[?:?]
	at fr.pilato.elasticsearch.crawler.fs.crawler.ssh.FileAbstractorSSH.getFiles(FileAbstractorSSH.java:100) ~[fscrawler-crawler-ssh-2.10-SNAPSHOT.jar:?]
	at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:248) ~[fscrawler-core-2.10-SNAPSHOT.jar:?]
	at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:314) ~[fscrawler-core-2.10-SNAPSHOT.jar:?]
	at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:314) ~[fscrawler-core-2.10-SNAPSHOT.jar:?]
	at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:314) ~[fscrawler-core-2.10-SNAPSHOT.jar:?]
	at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:314) ~[fscrawler-core-2.10-SNAPSHOT.jar:?]
	at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:314) ~[fscrawler-core-2.10-SNAPSHOT.jar:?]
	at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.run(FsParserAbstract.java:162) [fscrawler-core-2.10-SNAPSHOT.jar:?]
	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: org.apache.sshd.sftp.common.SftpException: No such file
	at org.apache.sshd.sftp.client.impl.AbstractSftpClient.throwStatusException(AbstractSftpClient.java:277) ~[sshd-sftp-2.13.2.jar:2.13.2]
	at org.apache.sshd.sftp.client.impl.AbstractSftpClient.checkHandleResponse(AbstractSftpClient.java:299) ~[sshd-sftp-2.13.2.jar:2.13.2]
	at org.apache.sshd.sftp.client.impl.AbstractSftpClient.checkHandle(AbstractSftpClient.java:290) ~[sshd-sftp-2.13.2.jar:2.13.2]
	at org.apache.sshd.sftp.client.impl.AbstractSftpClient.openDir(AbstractSftpClient.java:887) ~[sshd-sftp-2.13.2.jar:2.13.2]
	at org.apache.sshd.sftp.client.impl.SftpDirEntryIterator.<init>(SftpDirEntryIterator.java:61) ~[sshd-sftp-2.13.2.jar:2.13.2]
	at org.apache.sshd.sftp.client.impl.SftpIterableDirEntry.iterator(SftpIterableDirEntry.java:65) ~[sshd-sftp-2.13.2.jar:2.13.2]
	... 11 more
12:11:24,298 INFO  [f.p.e.c.f.FsParserAbstract] Closing FS crawler file abstractor [FileAbstractorSSH].
12:11:24,298 DEBUG [f.p.e.c.f.c.s.FsCrawlerSshClient] Closing FsCrawlerSshClient
12:11:24,312 INFO  [f.p.e.c.f.FsParserAbstract] FS crawler is stopping after 1 run
12:11:24,312 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [fscrawler_job]
12:11:24,313 DEBUG [f.p.e.c.f.c.s.FsCrawlerSshClient] Closing FsCrawlerSshClient
12:11:24,313 DEBUG [f.p.e.c.f.FsCrawlerImpl] FS crawler thread is now stopped
12:11:24,313 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Closing Elasticsearch client manager
12:11:24,313 DEBUG [f.p.e.c.f.f.b.FsCrawlerBulkProcessor] Closing BulkProcessor
12:11:24,313 DEBUG [f.p.e.c.f.f.b.FsCrawlerBulkProcessor] BulkProcessor is now closed
12:11:24,313 DEBUG [f.p.e.c.f.f.b.FsCrawlerBulkProcessor] Executing [6] remaining actions
12:11:24,314 DEBUG [f.p.e.c.f.f.b.FsCrawlerSimpleBulkProcessorListener] Going to execute new bulk composed of 6 actions
12:11:24,326 DEBUG [f.p.e.c.f.c.ElasticsearchEngine] Sending a bulk request of [6] documents to the Elasticsearch service
12:11:24,326 DEBUG [f.p.e.c.f.c.ElasticsearchClient] bulk a ndjson of 2103 characters
12:11:24,378 DEBUG [f.p.e.c.f.f.b.FsCrawlerSimpleBulkProcessorListener] Executed bulk composed of 6 actions
12:11:24,382 DEBUG [f.p.e.c.f.s.FsCrawlerManagementServiceElasticsearchImpl] Elasticsearch Management Service stopped
12:11:24,382 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Closing Elasticsearch client manager
12:11:24,382 DEBUG [f.p.e.c.f.f.b.FsCrawlerBulkProcessor] Closing BulkProcessor
12:11:24,382 DEBUG [f.p.e.c.f.f.b.FsCrawlerBulkProcessor] BulkProcessor is now closed
12:11:24,383 DEBUG [f.p.e.c.f.s.FsCrawlerDocumentServiceElasticsearchImpl] Elasticsearch Document Service stopped
12:11:24,383 DEBUG [f.p.e.c.f.FsCrawlerImpl] ES Client Manager stopped
12:11:24,384 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler [fscrawler_job] stopped
12:11:24,384 DEBUG [f.p.e.c.f.FsCrawlerImpl] Closing FS crawler [fscrawler_job]
12:11:24,385 DEBUG [f.p.e.c.f.c.s.FsCrawlerSshClient] Closing FsCrawlerSshClient
12:11:24,385 DEBUG [f.p.e.c.f.FsCrawlerImpl] FS crawler thread is now stopped
12:11:24,385 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Closing Elasticsearch client manager
12:11:24,385 DEBUG [f.p.e.c.f.s.FsCrawlerManagementServiceElasticsearchImpl] Elasticsearch Management Service stopped
12:11:24,385 DEBUG [f.p.e.c.f.c.ElasticsearchClient] Closing Elasticsearch client manager
12:11:24,385 DEBUG [f.p.e.c.f.s.FsCrawlerDocumentServiceElasticsearchImpl] Elasticsearch Document Service stopped
12:11:24,386 DEBUG [f.p.e.c.f.FsCrawlerImpl] ES Client Manager stopped
12:11:24,386 INFO  [f.p.e.c.f.FsCrawlerImpl] FS crawler [fscrawler_job] stopped
12:11:24,387 DEBUG [f.p.e.c.p.FsCrawlerPluginsManager] Stopping plugins

Expected behavior

We expected the contents of the folder with the space (" ") at the end of its name to be indexed.

Versions:

  • Linux (running inside Kubernetes with the FSCrawler image as of 2 weeks ago)
  • Version 2.10-SNAPSHOT
@jens-idoer jens-idoer added the check_for_bug Needs to be reproduced label Oct 4, 2024
@jens-idoer jens-idoer changed the title "No Such File" when crawling a directory that ends en space("ThisDirHasSpaceAtEnd ") "No Such File" when crawling a directory that ends with space("ThisDirHasSpaceAtEnd ") Oct 4, 2024
@jens-idoer
Author

Just to be a bit more precise, the directory with the space at the end of its name is this: [/uploads/emails/[email protected]/INBOX.Arkiv^2019_3671/0270064298 ]
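
For anyone who wants to recreate a similar layout locally, here is a minimal sketch. It is not FSCrawler code and does not go through SFTP; it only assumes a POSIX file system (where a trailing space is a legal part of a name) and Java 11 or later. The class name `TrailingSpaceDirDemo`, the helper methods and `file.txt` are hypothetical names used for illustration. It creates a directory whose name ends with a space, lists it, and checks that the same name without the space is treated as a different path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class TrailingSpaceDirDemo {

    /** Creates root/name (name may end with a space), puts one file in it, returns the directory. */
    public static Path createDirWithFile(Path root, String name) throws IOException {
        Path dir = Files.createDirectory(root.resolve(name));
        Files.writeString(dir.resolve("file.txt"), "hello");
        return dir;
    }

    /** Returns the names of the entries directly inside dir. */
    public static List<String> list(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.forEach(p -> names.add(p.getFileName().toString()));
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a POSIX file system; Windows rejects names ending with a space.
        Path root = Files.createTempDirectory("spacetest");
        Path dir = createDirWithFile(root, "0270064298 ");

        // Brackets make the trailing space visible in the output.
        System.out.println("[" + dir.getFileName() + "] contains " + list(dir));

        // The name without the trailing space is a different path.
        System.out.println("without space exists: " + Files.exists(root.resolve("0270064298")));
    }
}
```

Pointing a crawl job at such a tree over SFTP should exercise the same kind of directory name as in the log above. Whether it triggers the same error will depend on the SFTP server and client library involved.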

@jens-idoer
Author

Is there any more information I can provide? Is there any way I can see if (or when) this is going to be fixed?
This is my first "issue" report for this fine application, so please let me know if there is more I need to do.

@dadoonet
Owner

It's all good. Thanks for the report. I'm just overloaded at the moment.

@dadoonet
Owner

dadoonet commented Nov 2, 2024

Thanks for the report. I'm able to reproduce it in an integration test. Which means that I can now work on a patch. ;)

@dadoonet
Owner

dadoonet commented Nov 3, 2024

I opened apache/mina-sshd#628 as the error seems to be coming from the mina-sshd library.

@dadoonet dadoonet added bug For confirmed bugs and removed check_for_bug Needs to be reproduced labels Nov 3, 2024
@dadoonet dadoonet self-assigned this Nov 3, 2024