cloudflare iuam challenge [nhentai] '503 Service Temporarily Unavailable' #2636

Closed
kkuulol opened this issue May 29, 2022 · 18 comments


@kkuulol

kkuulol commented May 29, 2022

This only happens on nhentai. Is this temporary? How can I solve the reCAPTCHA? Thanks.

gallery-dl --chapter-range 1-25 -v https://nhentai.net/artist/hiroya/popular
[gallery-dl][debug] Version 1.21.1
[gallery-dl][debug] Python 3.10.2 - Windows-10-10.0.19044-SP0
[gallery-dl][debug] requests 2.27.1 - urllib3 1.26.8
[gallery-dl][debug] Starting DownloadJob for 'https://nhentai.net/artist/hiroya/popular'
[nhentai][debug] Using NhentaiTagExtractor for 'https://nhentai.net/artist/hiroya/popular'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): nhentai.net:443
[urllib3.connectionpool][debug] https://nhentai.net:443 "GET /artist/hiroya/popular?page=1 HTTP/1.1" 503 None
[nhentai][warning] Cloudflare IUAM challenge
[nhentai][error] HttpError: '503 Service Temporarily Unavailable' for 'https://nhentai.net/artist/hiroya/popular'
@ScottGriffin213

Looks like nhentai figured out what was going on and shut down the API.

I should note that I am able to reproduce this on Kubuntu 20.04 with Python 3.8.

@ScottGriffin213

And here's a log of the OP's original command line:

$ gallery-dl --chapter-range 1-25 -v https://nhentai.net/artist/hiroya/popular
[gallery-dl][debug] Version 1.22.0
[gallery-dl][debug] Python 3.8.10 - Linux-5.4.0-110-generic-x86_64-with-glibc2.29
[gallery-dl][debug] requests 2.27.1 - urllib3 1.26.4
[gallery-dl][debug] Starting DownloadJob for 'https://nhentai.net/artist/hiroya/popular'
[nhentai][debug] Using NhentaiTagExtractor for 'https://nhentai.net/artist/hiroya/popular'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): nhentai.net:443
[urllib3.connectionpool][debug] https://nhentai.net:443 "GET /artist/hiroya/popular?page=1 HTTP/1.1" 503 None
[nhentai][warning] Cloudflare IUAM challenge
[nhentai][error] HttpError: '503 Service Temporarily Unavailable' for 'https://nhentai.net/artist/hiroya/popular'

@mikf
Owner

mikf commented May 30, 2022

See #2537 (comment).

Instead of exporting your cookies manually, it is also possible to use --cookies-from BROWSERNAME.
(This option's full name is --cookies-from-browser, but there's a typo: #2630)

@ScottGriffin213

There's a much quicker solution I've found, as well:

https://pypi.org/project/cfscrape/

@mikf
Owner

mikf commented May 30, 2022

cfscrape's last commit was on Mar 23, 2020.
It does not work anymore.

There previously was a cloudflare module in gallery-dl itself, which had more or less the same functionality as cfscrape, but Cloudflare updated its challenge to one that requires a browser engine to solve: d656892

@kkuulol
Author

kkuulol commented May 31, 2022

I don't quite get these coding things, but it looks pretty doomed, I guess. Here is what I got after trying out the cookies method, with the repeated warnings cut down. Thanks for the help anyway.

gallery-dl --cookies-from chrome -v https://nhentai.net/g/***/
[gallery-dl][debug] Version 1.22.0
[gallery-dl][debug] Python 3.10.2 - Windows-10-10.0.19044-SP0
[gallery-dl][debug] requests 2.27.1 - urllib3 1.26.8
[gallery-dl][debug] Starting DownloadJob for 'https://nhentai.net/g/***/'
[cookies][debug] Extracting cookies from C:\Users\*****\AppData\Local\Google\Chrome\User Data\Default\Network\Cookies
[cookies][debug] Found local state file at 'C:\Users\*****\AppData\Local\Google\Chrome\User Data\Local State'
[cookies][warning] failed to decrypt cookie (AES-GCM) because MAC check failed. Possibly the key is wrong?
[cookies][warning] failed to decrypt cookie (AES-GCM) because MAC check failed. Possibly the key is wrong?
[cookies][info] Extracted 2525 cookies from chrome (343 could not be decrypted)
[cookies][debug] cookie version breakdown: {'v10': 2868, 'other': 0, 'unencrypted': 0}
[nhentai][debug] Using NhentaiGalleryExtractor for 'https://nhentai.net/g/***/'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): nhentai.net:443
[urllib3.connectionpool][debug] https://nhentai.net:443 "GET /api/gallery/*** HTTP/1.1" 503 None
[nhentai][warning] Cloudflare IUAM challenge
[nhentai][error] HttpError: '503 Service Temporarily Unavailable' for 'https://nhentai.net/api/gallery/***'

@kkuulol
Author

kkuulol commented May 31, 2022

i just realised what doujin it was that i used to test that method sigh i should've read it beforehand woops

@mikf
Owner

mikf commented May 31, 2022

You did not set the same user agent string for gallery-dl as the one your browser used when solving the challenge, did you? Try:

gallery-dl --cookies-from chrome -o user-agent="..." https://nhentai.net/g/***/

You can edit and/or delete your own posts, by the way.

@kkuulol
Author

kkuulol commented Jun 1, 2022

I have never heard of this user agent thing, so I don't think I did. How do I use it? Do I have to set it up on my end? I'm not sure what to put in the '...'. Do I have to format it like this: "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0"? Where can I find this information? Sorry, I don't know what those numbers mean or where to start. And thank you, now I know lol.

@mikf
Owner

mikf commented Jun 1, 2022

A User Agent string is a bit of information that your web browser sends with each request to identify itself. What this string is depends on what browser you are using.

To find out what your browser is sending, go to https://httpbin.org/user-agent
I'm getting

{
  "user-agent": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
}

so I would have to use the following command with gallery-dl:

gallery-dl --cookies-from firefox -o user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0" https://nhentai.net/g/***/

(replace my values with whatever you are getting)
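For anyone who prefers to script this, here is a minimal Python sketch that builds the same command as an argument list. Passing a list to `subprocess.run` sidesteps shell quoting problems with the spaces and parentheses in a user agent string. The helper name `build_gallery_dl_command` is made up for illustration and is not part of gallery-dl; the options are the ones shown in the command above.

```python
import shlex


def build_gallery_dl_command(url, browser, user_agent):
    """Return an argv list equivalent to the command shown above.

    Hypothetical helper for illustration; not part of gallery-dl.
    """
    return [
        "gallery-dl",
        "--cookies-from", browser,          # take cookies from this browser
        "-o", f"user-agent={user_agent}",   # must match the browser's own UA
        url,
    ]


if __name__ == "__main__":
    argv = build_gallery_dl_command(
        "https://nhentai.net/g/***/",
        "firefox",
        "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0",
    )
    # Print a shell-quoted version for inspection; to actually run it,
    # pass the list itself to subprocess.run(argv).
    print(shlex.join(argv))
```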

You can also put these options in a config file, so you don't have to type them every time you want to download from nhentai or any other Cloudflare Protected ™️ website.

{
    "cookies": ["firefox"],
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
}
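Since the user agent has to be replaced whenever the browser updates, a small script can rewrite it in the config file. The sketch below is only an illustration under a few assumptions: the config file is plain JSON, the two options sit under an `extractor` → `nhentai` section (the snippet above does not say where they go), and the helper name `set_user_agent` is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def set_user_agent(config_path, user_agent, browser="firefox", category="nhentai"):
    """Write cookies/user-agent options into a gallery-dl style JSON config.

    Assumption: options are nested as config["extractor"][category].
    Existing unrelated settings in the file are preserved.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    section = config.setdefault("extractor", {}).setdefault(category, {})
    section["cookies"] = [browser]
    section["user-agent"] = user_agent
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Demo against a throwaway file rather than a real config.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "config.json"
        set_user_agent(
            demo,
            "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0",
        )
        print(demo.read_text(encoding="utf-8"))
```

Point `config_path` at your real config file only after checking where it lives on your system and making a backup.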

@kkuulol
Author

kkuulol commented Jun 2, 2022

Well, I just tried this, and aside from the repeated cookies warnings, it works fine. I think I found the config files as well, so thank you very much for this fix, it makes things a lot easier lol. PS: thanks for this gallery-dl script, it's opened my interest to coding as a whole, because you can do pretty cool stuff with coding actually lol. I'm not sure if I should be pressing the 'close with comment' option, but yeah, I'd say this has been resolved, thanks.

@arisboch

arisboch commented Jun 2, 2022

@mikf Wouldn't it be useful to automatically extract the user agent from the browser, as well, so it wouldn't break when the browser gets an update?

@mikf
Owner

mikf commented Jun 3, 2022

This would certainly be useful, but I don't think that's possible.

Cookies are stored in an external file that can be accessed and read from, but the user agent string is embedded somewhere in the browser executable file itself.

@N3X15

N3X15 commented Jun 5, 2022

You could cheat by starting Firefox with the necessary CLI arguments to navigate it to a local web server, then grab the headers from that. It wouldn't need anything special other than a dependency on Flask.

Pretty invasive, but it'd work.
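A rough sketch of that idea, for illustration only: it uses Python's standard library `http.server` and `webbrowser` modules instead of Flask, and opens the system's default browser rather than launching Firefox with CLI arguments. The function name `capture_user_agent` and its parameters are made up here, and this is not how gallery-dl itself does anything.

```python
import threading
import webbrowser
from http.server import BaseHTTPRequestHandler, HTTPServer


def capture_user_agent(open_url=webbrowser.open, timeout=10.0):
    """Serve one page on localhost and return the visiting browser's User-Agent.

    open_url is whatever makes a browser visit the given URL (injectable so
    the function can be exercised without opening a real browser window).
    Returns None if nothing connected within `timeout` seconds.
    """
    captured = {}
    done = threading.Event()

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Keep the first value only; browsers may also ask for /favicon.ico.
            captured.setdefault("user_agent", self.headers.get("User-Agent"))
            body = b"You can close this tab."
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            done.set()

        def log_message(self, *args):
            pass  # keep the console quiet

    # Port 0 lets the OS pick any free port.
    server = HTTPServer(("127.0.0.1", 0), Handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        open_url(f"http://127.0.0.1:{server.server_port}/")
        done.wait(timeout)
    finally:
        server.shutdown()
        server.server_close()
    return captured.get("user_agent")


if __name__ == "__main__":
    print(capture_user_agent())
```

As noted, this is invasive: every call opens a tab in the user's browser.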

@mo-han
Contributor

mo-han commented Nov 3, 2022

I'm using "cookies": ["firefox"] in my config, but that alone is not enough to pass Cloudflare.
I need to ensure the "user-agent" is EXACTLY THE SAME as my Firefox's real UA, which changes whenever Firefox gets updated.
Just now, I got a Cloudflare 503 issue. The "old" Cloudflare cookie from Firefox was still valid, but the UA in the config file was no longer identical to Firefox's (it got an update recently, and the version number in the UA changed from 105 to 106). After updating the UA in the config file, Cloudflare let me through.

@mikf
suggestion: add an option, to extract the current real user-agent from the browser, firefox or chrome.

@mikf
Owner

mikf commented Nov 7, 2022

I mean, I could add such an option, but it would have to open a dummy page on localhost with your browser each time you run gallery-dl. Is this really something you'd want? Is it really this inconvenient to replace the user agent string when you update your browser?

@mo-han
Contributor

mo-han commented Nov 10, 2022

> but it would have to open a dummy page on localhost with your browser each time you run gallery-dl

wow that's complicated!

@mikf
Owner

mikf commented Nov 15, 2022

Commit 9f06e79 makes it possible to set user-agent to "browser" to have gallery-dl try to fetch the User-Agent string of the system's default browser. The result currently gets cached for 24h, but I'm not sure if this is a reasonable time.
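To make the 24h trade-off concrete, here is a minimal sketch of a time-based cache around an expensive lookup. It is not the code from commit 9f06e79; the function name `cached_user_agent`, the JSON cache file layout, and the `fetch` callback are all assumptions made for illustration.

```python
import json
import time
from pathlib import Path

CACHE_TTL_SECONDS = 24 * 60 * 60  # the 24h mentioned above


def cached_user_agent(cache_path, fetch, ttl=CACHE_TTL_SECONDS, now=time.time):
    """Return a cached User-Agent, calling fetch() only when the cache is stale.

    cache_path: JSON file holding {"user_agent": ..., "stored_at": ...}
    fetch:      callable that asks the browser for its User-Agent (slow/invasive)
    now:        clock function, injectable so expiry can be simulated
    """
    path = Path(cache_path)
    try:
        entry = json.loads(path.read_text(encoding="utf-8"))
        if now() - entry["stored_at"] < ttl:
            return entry["user_agent"]
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing, unreadable, or malformed cache: fall through and refetch

    user_agent = fetch()
    path.write_text(
        json.dumps({"user_agent": user_agent, "stored_at": now()}),
        encoding="utf-8",
    )
    return user_agent
```

A shorter `ttl` opens the browser more often; a longer one risks serving an outdated string right after a browser update, which is the 105 to 106 mismatch described earlier in the thread.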


6 participants