fails to work with tor/socks proxy #233
Comments
Hi @izidan, If it works with Replace your calls to const cloudscraper = require('request-promise') // <-- Does it work in this case? If the fault lies with |
it throws the following error when trying the following default settings even though it
|
same error as above when setup defaults with the following settings
|
and when called with default settings being
|
You may log the response that you're getting: const cloudscraper = require('cloudscraper')
cloudscraper.get(url).then(console.log, error => console.error(error.response.body.toString('utf-8')))
I'm guessing that those settings aren't working with |
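A slightly more defensive take on the logging call above, as a minimal sketch: it does not assume that `error.response` exists, since a connection failure through the proxy would reject without any response. The helper name `describeFailure` is hypothetical, and the check on the `CaptchaError` name assumes that is how the library names its CAPTCHA rejection; the body test is only a rough heuristic.

```javascript
// Hypothetical helper: summarize a rejected request without assuming
// that a response was ever received.
function describeFailure (error) {
  const name = (error && error.name) || 'Error'
  const message = (error && error.message) || String(error)
  const response = error && error.response

  // Connection-level failures (proxy refused, timeout) carry no response.
  if (!response) {
    return { name, message, status: null, captcha: false, preview: null }
  }

  // The body may be a Buffer or a string; both support toString().
  const body = response.body == null ? '' : response.body.toString('utf-8')
  const title = body.match(/<title>([\s\S]*?)<\/title>/i)

  return {
    name,
    message,
    status: typeof response.statusCode === 'number' ? response.statusCode : null,
    // Assumption: a CAPTCHA page is signalled by the error name or by the
    // word "captcha" appearing in the body.
    captcha: name === 'CaptchaError' || /captcha/i.test(body),
    preview: title ? title[1].trim() : body.slice(0, 77)
  }
}
```

It could be used as `cloudscraper.get(url).then(console.log, error => console.error(describeFailure(error)))`.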
using |
Yeah, unfortunately Cloudscraper can't solve those for you, but we do have some convenience methods to help solve them using a third-party service. It's mentioned in the README and there are examples. I hope that helps |
I am really confused by your reply above. Cloudscraper already resolves the CAPTCHA and passes when using a normal HTTP proxy, yet it fails with a SOCKS proxy. I believe this has nothing to do with request or request-promise; simply, this works
whilst the same page over socks proxy fails
I shall leave it to you to figure out why and please reopen the issue as it is a bug in the captcha handler itself. |
I apologize if my response wasn't clear but Cloudscraper has never handled CAPTCHA, has never solved CAPTCHA ever. If you have a CAPTCHA handler, please share the code so I'll know what we're dealing with here.
You're connecting to the target server using two different IPs. It's most probable that the target has blocked Tor users. I don't find it odd in the least that the web proxy gets through without a CAPTCHA but Tor gets flagged.
Please identify a problem with the library as I have not. I'll reopen this for a while. |
@pro-src You are right, it seems to be an issue with Cloudflare blocking the Tor proxy itself, even though it works via the Tor Browser. I shall close the issue and investigate further. Thanks for your time. |
If it works in the Tor Browser then I'd definitely like to have it working in Cloudscraper. The only Tor related conversation besides this issue is #202. I'd be glad to know what your investigation yields. |
To be clear, we want to imitate the browser to avoid the CAPTCHA being sent in the first place. This might be TLS/SSL related and could be pretty complicated... If you want, open another issue: "CAPTCHA when using Tor", the issue template would be great for this. The Tor exit node is more than likely changing which could give you different results. It'd help if you could configure your instance of Tor to only use one exit node, the same exit node as Tor Browser is currently using... Cheers |
@pro-src It's also failing on my end with regular proxies, with or without a user and password. Accessing your own website without a proxy works as expected; with a proxy, it fails with the same error as in the OP. The interesting part is that if I configure my whole network, or even just the browser, to use this proxy, it works fine. Do you have any pointers on where to look? I'm a tad new to CF bypassing, but I can manage. I just need a few pointers on where to look so I can also help you fix this :) Thanks! |
Hi @brunogaspar, Give this a couple of tries
issue-233.js
const ProxyLists = require('proxy-lists')
const cloudscraper = require('cloudscraper')
const jar = cloudscraper.jar

// cloudscraper.debug = true

const results = {}
const failing = '\u001B[0;31m\u001B[1m\u001B[5mx\u001B[0m'
const passing = '\u001B[0;32m\u001B[1m\u001B[5m✓\u001B[0m'

let id = -1, attempts = 0, max = 30, timeout = 60000

process.on('uncaughtException', console.error)
process.on('unhandledRejection', console.error)

ProxyLists.getProxies({ protocols: ['http'] })
  .on('data', proxies =>
    proxies.map(o => test(`http://${o.ipAddress}:${o.port}`)))
  .on('error', error => console.error(error.message))

async function test(proxy, url = 'https://pro-src.com') {
  if (attempts++ > max) return
  if (id === -1) id = setTimeout(stop, timeout)

  try {
    results[proxy] = 'timed out'
    const html = await cloudscraper.get({ proxy, url, jar: jar() })
    console.error(proxy, '\t\t', results[proxy] = passing)
    console.error('Result:\n', preview(html))
  } catch(error) {
    console.error(proxy, '\t\t', results[proxy] = failing)
    console.error(error.name + ':', error.message)
    if (error.response) {
      console.error('Result:\n', preview(error.response.body))
    }
  }
}

function preview(html) {
  try {
    return String(html).match(/<title>([\S\s]+)<\/title>/i)[0]
  }
  catch(e) {
    return html ? String(html).slice(0, 77) + '...' : html
  }
}

function stop() {
  for (let url in results) {
    // console.log(url, '\t\t', results[url])
    console.log(results[url] === passing ? 'pass:' : 'fail:', url)
  }
  process.exit(0)
}
results.txt
|
Based on those results, the problem is most likely the proxies. I had gotten much better results but I felt that output was the most realistic. Random proxies aren't reliable and they're bound to trigger a CAPTCHA at some point. If despite a couple of runs, all of them fail, you're most likely dealing with the TLS/SSL issue that I mentioned above. However, I seriously doubt that to be the case due to the fact that it's working when you don't use a proxy. If it's the worst case scenario, have a look at #229 Cheers |
I will see what I can come up with, but it is a tad weird that it works fine in the browser. One of the Python implementations sort of works; of course, the more requests you make, the higher the likelihood of getting a CAPTCHA, I suppose. But thanks for the pointers! |
@pro-src It works with anonymous proxies, for example https://free-proxy-list.net/anonymous-proxy.html. The reason is that once Cloudflare marks your IP as suspicious, it gets blocked and can't even reach the CAPTCHA page. @brunogaspar The reason it works in browsers is that Cloudflare sets cookies once the CAPTCHA page is passed, so browsers work fine even behind Tor. I tried it with Firefox and Opera. The problem remains of solving the CAPTCHA first to get the cookies so they can be reused later, regardless of whether the Node.js requests are made directly, via an HTTP proxy, or via Tor. When I came across cloudscraper I was under the impression that it takes care of solving the CAPTCHA, but when I tried it with Tor it failed, so I'm not sure the statement on the first page holds true here
Here is a simple one-line snippet to test it with. It would be really great to get cloudscraper to work with SOCKS proxies by just using The strategy in place now is to cycle the requests through a list of anonymous proxies to work around hitting the CAPTCHA page in the first place. |
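The proxy-cycling strategy mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: `fetchWithRotation` is a hypothetical helper, and `fetchFn` is an injected function (for instance something like `(proxy, url) => cloudscraper.get({ proxy, url })`), so the sketch itself does not depend on any particular library.

```javascript
// Hypothetical helper: try each proxy in order until one request succeeds.
// `fetchFn(proxy, url)` is supplied by the caller and must return a promise.
async function fetchWithRotation (proxies, url, fetchFn) {
  const failures = []

  for (const proxy of proxies) {
    try {
      const body = await fetchFn(proxy, url)
      // First success wins; earlier failures are returned for inspection.
      return { proxy, body, failures }
    } catch (error) {
      // Remember why this proxy failed (CAPTCHA, timeout, refused, ...).
      failures.push({ proxy, reason: `${error.name}: ${error.message}` })
    }
  }

  // Every proxy failed: surface all reasons on a single error.
  const error = new Error(`all ${proxies.length} proxies failed for ${url}`)
  error.failures = failures
  throw error
}
```

Trying proxies sequentially keeps the request rate per target low, which seems in keeping with the goal of not triggering the CAPTCHA page in the first place.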
On a side note, Cloudflare will never block an IP belonging to a mobile network operator no matter how many requests are made, as all of the operator's clients are likely to share a few internet gateways, so try testing via a mobile hotspot :) |
There seems to be some general unawareness of Cloudflare challenges, CAPTCHA, the browser, and this project. All I can say is learn to read. |
Related to #234 |
Would you please provide a sample showing how it should work using a Tor/SOCKS proxy?
I tried both of the following settings using .defaults() and still had no luck:
{ proxy: "socks://127.0.0.1:9050" }
{ agentClass: SocksProxyAgent, agentOptions: { protocol: "socks:", host: "localhost", port: 9050 } }
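For reference, the two settings above describe the same endpoint in two shapes. The sketch below is a hypothetical helper (`parseSocksProxy` is not part of cloudscraper or request) that turns the URL form into the `{ protocol, host, port }` form, so both variants can be generated from one string and compared. The list of accepted schemes and the default port of 1080 are assumptions of this sketch.

```javascript
// Assumption: these are the SOCKS URL schemes worth accepting.
const SOCKS_SCHEMES = new Set(['socks:', 'socks4:', 'socks4a:', 'socks5:', 'socks5h:'])

// Hypothetical helper: convert a SOCKS proxy URL into agent-style options.
function parseSocksProxy (proxyUrl) {
  const parsed = new URL(proxyUrl) // throws on malformed input

  if (!SOCKS_SCHEMES.has(parsed.protocol)) {
    throw new TypeError(`not a SOCKS proxy URL: ${proxyUrl}`)
  }

  return {
    protocol: parsed.protocol,
    host: parsed.hostname,
    // 1080 is the conventional SOCKS port; Tor's default listener is 9050.
    port: parsed.port ? Number(parsed.port) : 1080
  }
}
```

For example, `parseSocksProxy('socks://127.0.0.1:9050')` yields an object of the same shape as the `agentOptions` shown above. Whether either form is honored end to end still depends on the request layer and on the agent class in use.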