This package prevents Puppeteer from being detected as a bot by services like Cloudflare and lets you pass captchas without problems. It behaves like a real browser.
If you are only interested in Cloudflare WAF, please check this repo:
https://github.com/zfcsoftware/cf-clearance-scraper
If you are using a Linux operating system, xvfb must be installed for the library to work correctly.
npm i puppeteer-real-browser
If you are using Linux:
sudo apt-get install xvfb
const { connect } = require('puppeteer-real-browser');
const start = async () => {
    const { page, browser } = await connect()
}
import { connect } from 'puppeteer-real-browser'
const { page, browser } = await connect()
const { connect } = require("puppeteer-real-browser")
async function test() {
    const { browser, page } = await connect({
        headless: false,
        args: [],
        customConfig: {},
        turnstile: true,
        connectOption: {},
        disableXvfb: false,
        ignoreAllFlags: false,
        // proxy: {
        //     host: '<proxy-host>',
        //     port: '<proxy-port>',
        //     username: '<proxy-username>',
        //     password: '<proxy-password>'
        // }
    })
    await page.goto('<url>')
}
test()
headless: The default value is false. Values such as "new", true, or "shell" can also be passed, but false is the most stable.
args: Additional flags to pass when starting Chromium, given as an array of strings. Supported flags: https://github.com/GoogleChrome/chrome-launcher/blob/main/docs/chrome-flags-for-tools.md
customConfig: The browser is launched with chrome-launcher (https://github.com/GoogleChrome/chrome-launcher). Whatever you pass in this object is added directly to its launch options, so use the option names documented in that repo. Set userDataDir here, and set chromePath here if you want to use a custom Chrome path.
turnstile: If set to true, Cloudflare Turnstile captchas are clicked automatically.
connectOption: Options passed here are added to the puppeteer.connect call used to connect to the launched Chromium instance.
disableXvfb: On Linux, when headless is false, a virtual screen is created and the browser runs there. Set this value to true if you want to see the browser.
ignoreAllFlags: If true, all initialization arguments are overridden. This includes the "let's get started" page that appears on the first load.
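As a rough sketch of how these options fit together, the helper below assembles an options object with a persistent profile, a custom Chrome path, and an optional proxy. The names `buildConnectOptions` and `launch` are hypothetical and not part of the library; only the option names are taken from the list above, and any paths or proxy values you pass in are your own.

```javascript
// Hypothetical helper: assembles an options object for connect().
// Option names follow the list above; the helper itself is not part of the library.
function buildConnectOptions({ userDataDir, chromePath, proxy } = {}) {
  const customConfig = {};
  if (userDataDir) customConfig.userDataDir = userDataDir; // persistent browser profile
  if (chromePath) customConfig.chromePath = chromePath;    // custom Chrome binary

  const options = {
    headless: false,       // described above as the most stable setting
    args: [],
    customConfig,
    turnstile: true,
    connectOption: {},
    disableXvfb: false,
    ignoreAllFlags: false,
  };
  if (proxy) options.proxy = { ...proxy }; // only added when a proxy is supplied
  return options;
}

// Loads the library lazily so the helper above can be used on its own.
async function launch(settings) {
  const { connect } = require('puppeteer-real-browser');
  return connect(buildConnectOptions(settings));
}
```

Keeping the object construction separate from the `connect` call makes it easy to log or inspect the exact options before a browser is started.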
Some plugins, such as puppeteer-extra-plugin-anonymize-ua, may cause you to be detected. You can use the plugin installation test in the library's test file to see if it will cause you to be detected.
The following is an example of installing a plugin. You can install other plugins in the same way as this example.
npm i puppeteer-extra-plugin-click-and-wait
const test = require('node:test');
const assert = require('node:assert');
const { connect } = require('puppeteer-real-browser');
test('Puppeteer Extra Plugin', async () => {
    const { page, browser } = await connect({
        args: ["--start-maximized"],
        turnstile: true,
        headless: false,
        // disableXvfb: true,
        customConfig: {},
        connectOption: {
            defaultViewport: null
        },
        plugins: [
            require('puppeteer-extra-plugin-click-and-wait')()
        ]
    })
    await page.goto("https://google.com", { waitUntil: "domcontentloaded" })
    await page.clickAndWaitForNavigation('body')
    await browser.close()
})
You can use the Dockerfile in the main directory to run this library with Docker. It has been tested with Docker on Ubuntu Server.
To run a test, follow these steps:
git clone https://github.com/zfcsoftware/puppeteer-real-browser
cd puppeteer-real-browser
docker build -t puppeteer-real-browser-project .
docker run puppeteer-real-browser-project
This library is completely open source and is constantly being updated. Please star this repo to support this project. Starring and supporting the project will ensure that it receives updates. If you want to support it further, you can consider sponsoring me (https://github.com/sponsors/zfcsoftware)
This problem is probably caused by rebrowser, which closes the runtime. I created a branch for this: https://github.com/zfcsoftware/puppeteer-real-browser/tree/access-window. As done in success.js, you can access the value you want by adding JavaScript to the page source with puppeteer-intercept-and-modify-requests. If you are familiar with Chrome extensions, you can also use one.
As with the initialization arguments in the test module, you can set defaultViewport in connectOption. If you set it to null, the page takes up the full width of the browser window.
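The idea of adding JavaScript to the page source can be sketched as a plain string transformation on an HTML body. This is not the code from success.js: `injectScript` is a hypothetical helper, and wiring it into the response-modification hook of puppeteer-intercept-and-modify-requests is left out here.

```javascript
// Hypothetical helper: inserts an inline <script> right after the opening <head> tag
// so it runs before the page's own scripts. If there is no <head>, the script is
// prepended to the document instead.
function injectScript(html, source) {
  const tag = `<script>${source}</script>`;
  const headOpen = /<head(\s[^>]*)?>/i; // matches <head> and <head attr="...">, not <header>
  if (headOpen.test(html)) {
    // A replacer function is used so "$" sequences in the source are not interpreted.
    return html.replace(headOpen, (match) => match + tag);
  }
  return tag + html;
}
```

The injected script could, for example, copy the value you need onto an element attribute that you then read from the page.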
This library uses puppeteer-core patched with rebrowser. It was tested against the challenging sites in the test file with headless set to false and passed all of them. The only known issue was that the mouse screenX value did not match the mouse position; this has been patched in the library.
ghost-cursor is included in the library (https://github.com/zfcsoftware/puppeteer-real-browser/blob/2a5fba37a85c15625fb3c8d1f7cf8dcb109b9492/lib/cjs/module/pageController.js#L54). You can use it through page.realCursor. It is recommended to use page.realClick instead of page.click.
This library lets you launch and use Chrome in its most natural state. It tries to get the best results with minimal patching. Thanks to @nwebson, who fixed the Runtime.enable issue. If using rebrowser solves your problem, I don't recommend using real browser.
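A minimal sketch of that recommendation: the hypothetical helper `humanClick` below prefers `page.realClick` when the page object provides it and falls back to the standard `page.click` otherwise. It assumes `page.realClick` accepts a selector string as its first argument.

```javascript
// Hypothetical helper: uses page.realClick when available, otherwise falls back
// to Puppeteer's own page.click. Returns the name of the method that was used.
async function humanClick(page, selector) {
  if (typeof page.realClick === 'function') {
    await page.realClick(selector);
    return 'realClick';
  }
  await page.click(selector);
  return 'click';
}
```

With a page returned by `connect()`, a call would look like `await humanClick(page, '#submit')`, where `#submit` is a placeholder selector.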
Real browser does not give you full control over launching. It launches Chrome with Chrome launcher and connects to it with rebrowser.
https://stackoverflow.com/questions/52546045/how-to-pass-recaptcha-v3
Please see the answers in the link above. When there is no Google session, no matter how good your browser is, recaptcha identifies you as a bot. It is a common problem.
Distributed under the MIT License. See LICENSE for more information.
Contributions to the current version
- rebrowser™ - rebrowser™ - Created a patch pack for Runtime, which left many traces behind. Since Runtime is no longer used, most problems were solved. TargetFilter, which was used in the past and caused many problems, was replaced by this patch. The patched puppeteer-core library was added to this repo. Thanks to rebrowser, many good bot detection systems no longer detect the browser. Please star the rebrowser repo. Thank you. (https://github.com/rebrowser/rebrowser-patches)
- Skill Issue™ - TheFalloutOf76 - He realized that mouse movements could not be simulated accurately and created a solution for this. His solution is used in this library. (https://github.com/TheFalloutOf76/CDP-bug-MouseEvent-.screenX-.screenY-patcher)
No responsibility is accepted for the use of this software. This software is intended for educational and informational purposes only. Users should use this software at their own risk. The developer cannot be held liable for any damages that may result from the use of this software.
This software is not intended to bypass Cloudflare Captcha or any other security measure. It must not be used for malicious purposes. Malicious use may result in legal consequences.
This software is not officially endorsed or guaranteed. Users can visit the GitHub page to report bugs or contribute to the software, but they are not entitled to make any claims or request service fixes.
By using this software, you agree to this disclaimer.