Detect automatically what is the type of document that will be parsed. #194

Open
Urahara opened this issue Jun 12, 2020 · 1 comment · May be fixed by #234


Urahara commented Jun 12, 2020

Today, when the user doesn't provide a type in the constructor, we always fall back to the html parser type.

    def _st(st):
        if st is None:
            return 'html'
        elif st in _ctgroup:
            return st
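
One possible shape for the detection step, as a rough sketch only: when no type is given, peek at the start of the document text and pick a type from it instead of always returning 'html'. The names `guess_type` and `resolve_type` are hypothetical, the regex heuristics are an assumption rather than anything the library does today, and 'xml' is assumed to be the only other supported type.

```python
import re

# Hypothetical heuristics (assumptions, not the library's behaviour):
# an XML declaration suggests 'xml'; an HTML doctype or <html> root suggests 'html'.
_XML_DECL = re.compile(r"^\s*<\?xml\b", re.IGNORECASE)
_HTML_HINT = re.compile(r"^\s*<(?:!doctype\s+html|html)\b", re.IGNORECASE)


def guess_type(text, default="html"):
    """Guess 'xml' or 'html' from the first characters of the document."""
    # Only inspect the head of the document, and ignore a leading BOM.
    head = text[:1024].lstrip("\ufeff")
    if _HTML_HINT.match(head):
        return "html"
    if _XML_DECL.match(head):
        return "xml"
    # Nothing conclusive: keep today's behaviour.
    return default


def resolve_type(st, text):
    """Hypothetical wrapper: an explicit type wins, otherwise guess from the text."""
    if st is None:
        return guess_type(text)
    return st
```

With this, `resolve_type(None, '<?xml version="1.0"?><root/>')` would pick 'xml', while a plain HTML fragment would still end up as 'html'. A known gap: XHTML served with an XML declaration would be classified as 'xml', so a real implementation would need something more careful than this.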

shivamshan linked a pull request on Mar 2, 2022 that will close this issue

Gallaecio (Member) commented

I wonder if we should be using https://github.com/scrapy/xtractmime to implement this.
