Overview:
This script demonstrates how to set up a headless browser (Chromium via Playwright) for web automation tasks while routing traffic through a trusted proxy. Although the code currently uses Chromium, it can be adapted to Firefox by modifying the browser launch settings. The script reads a proxy server from the command-line arguments, fetches a random user agent from the Trusted Proxies user agent list, and launches a headless (or, for debugging, non-headless) browser session with these settings. The session navigates to a target URL (in this case, https://httpbin.org/ip) and removes the navigator.webdriver property to reduce the chance of automation detection.
Installation and Dependencies:
Prerequisites:
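Python 3.7 or higher must be installed (recent Playwright releases have dropped Python 3.7, so a newer interpreter may be needed; check the Playwright documentation for the current minimum).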
A stable internet connection is required.
Virtual Environment (Recommended):
From the command line, navigate to the project folder.
Create a virtual environment with:
python -m venv venv
Activate the environment:
source venv/bin/activate (Linux/macOS) or venv\Scripts\activate (Windows)
Required Python Packages:
Install Playwright and requests using pip:
pip install playwright requests
Install the necessary browsers for Playwright by running:
python -m playwright install
Script Configuration:
The script expects a proxy server URL to be passed with the --proxy-server flag. If the proxy is not provided, the script exits with an error message.
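The proxy check described above could look roughly like the following. This is a minimal sketch using argparse; the function name parse_args and the exact error text are assumptions, not taken from the script itself.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --proxy-server must be supplied."""
    parser = argparse.ArgumentParser(description="Playwright proxy demo")
    parser.add_argument(
        "--proxy-server",
        dest="proxy_server",
        help="Proxy URL, for example http://your-proxy:port",
    )
    args = parser.parse_args(argv)
    if not args.proxy_server:
        # parser.error prints the message to stderr and exits with status 2.
        parser.error("--proxy-server is required")
    return args
```

Calling parse_args() with no argument reads sys.argv, so the same function serves both the real command line and quick checks with an explicit list.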
User agents are fetched from the Trusted Proxies site:
https://customers.trustedproxies.com/downloads/desktop_useragents.txt
One is chosen randomly for each run.
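A possible shape for the user agent step is sketched below. It assumes the downloaded file holds one user agent per line, which the text above does not state; the helper names are hypothetical.

```python
import random

# URL taken from the text above; the one-entry-per-line format is an assumption.
USER_AGENTS_URL = "https://customers.trustedproxies.com/downloads/desktop_useragents.txt"


def parse_user_agents(text):
    """Return the non-empty lines of the downloaded list, stripped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def pick_user_agent(text, rng=random):
    """Choose one user agent at random from the downloaded text."""
    agents = parse_user_agents(text)
    if not agents:
        raise ValueError("user agent list is empty")
    return rng.choice(agents)


def fetch_random_user_agent(url=USER_AGENTS_URL, timeout=10):
    """Download the list and return one random entry."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return pick_user_agent(response.text)
```

Keeping the parsing separate from the download makes the random choice easy to check without network access.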
The script is designed to launch a headless Chromium session by default (headless can be set to False for debugging). It sets various Chromium arguments to disable sandboxing, disable infobars, and remove automation flags.
A JavaScript snippet is injected into every browser context to remove the navigator.webdriver property, helping to prevent detection.
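The launch settings and the injected snippet might be combined as in the sketch below. The three Chromium switches are assumptions chosen to match the description (the script's actual list is not shown here), and the JavaScript overrides navigator.webdriver so that it reads as undefined, which is one common way to approximate removing it. The playwright argument is the object yielded by async_playwright() from playwright.async_api.

```python
# Assumed snippet: shadows navigator.webdriver with a getter returning undefined.
HIDE_WEBDRIVER_JS = """
Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
"""

# Assumed switches matching "disable sandboxing, disable infobars,
# remove automation flags"; the real script may use a different set.
CHROMIUM_ARGS = [
    "--no-sandbox",
    "--disable-infobars",
    "--disable-blink-features=AutomationControlled",
]


def build_launch_options(proxy_server, headless=True):
    """Keyword arguments for playwright.chromium.launch()."""
    return {
        "headless": headless,
        "args": list(CHROMIUM_ARGS),
        "proxy": {"server": proxy_server},
    }


async def open_context(playwright, proxy_server, user_agent, headless=True):
    """Launch Chromium and return (browser, context) with the init script installed."""
    browser = await playwright.chromium.launch(
        **build_launch_options(proxy_server, headless)
    )
    context = await browser.new_context(user_agent=user_agent)
    # The init script runs in every page of this context before page scripts.
    await context.add_init_script(HIDE_WEBDRIVER_JS)
    return browser, context
```

Passing headless=False to open_context gives the visible window mentioned above for debugging.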
Usage:
Run the script from the command line, providing the proxy setting, for example:
python playwright-script.py --proxy-server=http://your-proxy:port
The script prints diagnostic messages indicating which proxy and user agent are in use, along with the target URL.
It then launches the browser, navigates to the target URL, and performs a scroll-down action to mimic user interaction.
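The navigate-and-scroll step could be sketched as follows, given a Playwright async Page object. Scrolling in several small wheel steps with a pause between them is an assumption about how user interaction is mimicked; the distances, pause, and timeout are illustrative values.

```python
import asyncio

TARGET_URL = "https://httpbin.org/ip"


def scroll_deltas(total_px, step_px):
    """Split a scroll distance into wheel steps that add up to total_px."""
    full, rest = divmod(total_px, step_px)
    return [step_px] * full + ([rest] if rest else [])


async def visit_and_scroll(page, url=TARGET_URL, total_px=2000,
                           step_px=400, pause_s=0.2, timeout_ms=30000):
    """Load the URL, then scroll down in small steps."""
    print(f"Navigating to {url}")
    await page.goto(url, timeout=timeout_ms)
    for delta in scroll_deltas(total_px, step_px):
        # mouse.wheel(delta_x, delta_y): a positive delta_y scrolls down.
        await page.mouse.wheel(0, delta)
        await asyncio.sleep(pause_s)
```

A single call such as page.evaluate("window.scrollBy(0, document.body.scrollHeight)") would also scroll down; the stepped version only spreads the movement over time.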
Testing Procedure:
Verify that you can access the user agent URL from your network.
Run the script with a valid proxy server.
Confirm via the output (and optionally by checking httpbin.org/ip in the browser) that the proxy is being used and the user agent is randomized.
Check the printed diagnostic messages for errors during navigation or proxy setup.
In non-headless mode, observe the browser window to ensure that it loads the target page and scrolls as intended.
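The proxy check in the steps above can also be done programmatically. httpbin.org/ip answers with a JSON object whose "origin" field holds the caller's IP address, so comparing the value seen through the proxy with the value from a direct request gives a quick sanity check. The helper names below are hypothetical.

```python
import json


def origin_from_httpbin(body):
    """Extract the caller IP from an httpbin.org/ip JSON body."""
    return json.loads(body)["origin"]


def proxy_in_use(origin_via_proxy, origin_direct):
    """True when the proxied request reports a different IP than a direct one."""
    return origin_via_proxy != origin_direct
```

With Playwright, the body can be obtained from the navigation response, for example response = await page.goto(url) followed by body = await response.text().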
Additional Notes:
The script uses asynchronous programming (async/await) with Playwright's asynchronous API for efficient browser automation.
Error handling is in place for issues such as a missing proxy parameter, failures when fetching user agents, or timeouts when loading the target URL.
To adapt the script for Firefox, or to switch between headless mode (for production) and a visible window (for debugging), update the browser launch options accordingly.
This script is suited to running automated tests and scraping data through trusted proxies while reducing the chance that the browser automation is detected.
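One way to make the browser selectable is sketched below. It relies on the Playwright object exposing chromium and firefox attributes; leaving the Chromium-specific switches out for Firefox is an assumption that they do not apply there, and the function names are hypothetical.

```python
SUPPORTED_BROWSERS = ("chromium", "firefox")


def launch_options_for(browser_name, proxy_server, headless=True, chromium_args=()):
    """Build launch() keyword arguments for the chosen browser."""
    if browser_name not in SUPPORTED_BROWSERS:
        raise ValueError(f"unsupported browser: {browser_name}")
    options = {"headless": headless, "proxy": {"server": proxy_server}}
    if browser_name == "chromium":
        # Chromium command-line switches are only passed to Chromium.
        options["args"] = list(chromium_args)
    return options


async def launch_browser(playwright, browser_name, proxy_server,
                         headless=True, chromium_args=()):
    """Launch the named browser from an async Playwright instance."""
    options = launch_options_for(browser_name, proxy_server, headless, chromium_args)
    browser_type = getattr(playwright, browser_name)  # playwright.chromium or playwright.firefox
    return await browser_type.launch(**options)
```

If Firefox is used, it must also be installed through python -m playwright install, which downloads all supported browsers by default.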
Sample Code: