New video: Web-Scrapers vs Anti-Scrapers is now on YouTube!
In summary, you’ll learn about various Web-Scrapers and Anti-Scrapers, including open-source frameworks and commercial options, with a focus on Python.
Open-source frameworks covered:
- undetected-chromedriver
- nodriver
- seleniumbase
- patchright
- puppeteer-real-browser
- scrapling
- pydoll
- botasaurus
- cloudscraper
- drissionpage
Commercial tools for scraping:
- BrightData
- ZenRows
Commercial tools for preventing scraping:
- Google reCAPTCHA
- hCaptcha
- Cloudflare Turnstile
- PerimeterX
- DataDome
- Imperva/Incapsula
- Kasada
- Akamai
Open-source tools for preventing scraping:
- Brotector CAPTCHA
As always, we keep things fun!
At the end of the video, expect ten minutes of live demos, where we’ll be bypassing all kinds of bot-detection systems in order to scrape data from various web pages. Live demos will also include the Python code used. We’ll certainly be bypassing various Cloudflare CAPTCHAs. We’ll also scrape data that includes:
- Items and prices from Walmart
- Hotel prices from Best Western
- Nike shoe prices
- ChatGPT queries
Additionally, we’ll even show you how to scrape data from Indeed so that you can automate your job search! The live demos use the Python programming language along with the SeleniumBase automation framework.
Also check out the SeleniumBase Playlist on YouTube!
If you’re new to SeleniumBase, check out the GitHub page. SeleniumBase is the professional toolkit for web automation activities. It’s built for testing websites, bypassing CAPTCHAs, enhancing productivity, completing tasks, and scaling your business. SeleniumBase includes lots of advanced tools (all open-source) such as the “Recorder”, which lets you instantly generate automation scripts after manually performing actions in a web browser. There are two different stealth modes, UC Mode and CDP Mode, each with its own special abilities for bypassing bot-detection. CDP Mode uses the Chrome DevTools Protocol for advanced stealth capabilities.
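To give a rough idea of what the two stealth modes look like in a script, here is a minimal sketch (not the code from the video). It assumes SeleniumBase is installed via pip and that Chrome is available. The helper names `parse_price` and `scrape_text`, the two-second wait, and the example URL and selector in the usage comment are hypothetical and only there for illustration.

```python
import re


def parse_price(text):
    """Hypothetical helper: return the first dollar amount in text, or None."""
    match = re.search(r"\$\s*(\d[\d,]*(?:\.\d{1,2})?)", text)
    if match is None:
        return None
    # Drop thousands separators before converting to a number.
    return float(match.group(1).replace(",", ""))


def scrape_text(url, selector):
    """Hypothetical helper: open url in a stealthy browser, return an element's text."""
    # Imported here so parse_price() can be used without SeleniumBase installed.
    from seleniumbase import SB

    # uc=True launches the browser in UC Mode.
    with SB(uc=True, test=True) as sb:
        # Switch to CDP Mode and open the page.
        sb.activate_cdp_mode(url)
        sb.sleep(2)  # Assumed wait time; tune it per site.
        return sb.cdp.get_text(selector)


# Example usage (placeholder URL and selector, not from the video):
# print(scrape_text("https://example.com", "h1"))
```

Real sites will need their own selectors and often extra steps, such as handling a CAPTCHA, so treat this as a starting point rather than a drop-in scraper.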
For the first SeleniumBase tutorial on the Chrome DevTools Protocol, see https://seleniumbase.com/new-video-undetectable-automation-4/.