Issues encountered with zip files - attached sample doc folder #1922

Open
sai26FS opened this issue Aug 20, 2024 · 3 comments
Labels
feature_request for feature request

Comments

@sai26FS

sai26FS commented Aug 20, 2024

doc.zip

I have attached the zip file here. The problem is described in this forum thread: https://discuss.elastic.co/t/issues-encountered-for-zip-files/365071

@sai26FS sai26FS added the check_for_bug Needs to be reproduced label Aug 20, 2024
@dadoonet dadoonet added feature_request for feature request and removed check_for_bug Needs to be reproduced labels Sep 13, 2024
@dadoonet
Owner

I'm considering this a valid feature request.
The first implementation I'd probably do is to apply the same includes/excludes "filters" we already set for files.
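
To illustrate the idea, here is a minimal sketch of include/exclude filtering applied to the entries of a zip archive. It is not FSCrawler's implementation: the function names (`is_indexable`, `filtered_entries`, `build_sample_zip`) are hypothetical, the patterns are plain shell-style globs, and it assumes that excludes win over includes and that an empty includes list means "accept everything".

```python
import fnmatch
import io
import zipfile


def is_indexable(name, includes, excludes):
    """Return True if an entry name passes the include/exclude filters."""
    # Assumption: an exclude match always rejects the entry.
    if any(fnmatch.fnmatchcase(name, pattern) for pattern in excludes):
        return False
    # Assumption: no include patterns means every remaining entry is accepted.
    if not includes:
        return True
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in includes)


def filtered_entries(zip_bytes, includes=(), excludes=()):
    """List the names of non-directory zip entries that pass the filters."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if not info.is_dir()
            and is_indexable(info.filename, includes, excludes)
        ]


def build_sample_zip():
    """Build a small in-memory archive with two hypothetical sample files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("abc.txt", "Lorem ipsum dolor sit amet.\n")
        archive.writestr("ab.js", "console.log('Hello World');\n")
    return buffer.getvalue()


if __name__ == "__main__":
    sample = build_sample_zip()
    print(filtered_entries(sample, excludes=["*.js"]))
```

With this sketch, excluding `*.js` keeps only the text file, while passing `includes=["*.js"]` keeps only the script. Whether the real filters would match on the entry name alone or on the full path inside the archive is left open here.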

@sai26FS
Author

sai26FS commented Sep 13, 2024

Thank you @dadoonet.

It would also be helpful if you could add a config option to skip the file names in the indexed content. In the example below, abc.txt and ab.js are the file names I would like to skip:

abc.txt
Lorem ipsum dolor sit amet.


ab.js
// the hello world program
console.log('Hello World');
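
As a rough illustration of the requested behaviour, the sketch below concatenates the text of each zip entry with or without the entry name in front of it. It uses only Python's standard library and does not reflect how Tika or FSCrawler extract content; `extract_text` and its `include_names` flag are hypothetical names for the option being asked for, and the entries are assumed to be UTF-8 text.

```python
import io
import zipfile


def extract_text(zip_bytes, include_names=True):
    """Join the text of all zip entries, optionally preceded by entry names."""
    parts = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            if include_names:
                # Behaviour shown in the example above: name, then content.
                parts.append(info.filename)
            # Assumption: every entry is UTF-8 encoded text.
            parts.append(archive.read(info).decode("utf-8").strip())
    return "\n".join(parts)


def build_sample_zip():
    """Recreate the two sample files from the example above in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("abc.txt", "Lorem ipsum dolor sit amet.\n")
        archive.writestr(
            "ab.js",
            "// the hello world program\nconsole.log('Hello World');\n",
        )
    return buffer.getvalue()


if __name__ == "__main__":
    print(extract_text(build_sample_zip(), include_names=False))
```

Calling `extract_text` with `include_names=False` yields only the three content lines, which is the output the requested option would produce for this sample.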

@dadoonet
Owner

I'm not sure it's doable with Tika, though. I just asked the Tika user mailing list...
