Perl for Web Automation

Perl can be used to automate web sites: a Perl-based user agent that speaks HTTP and HTTPS performs operations on them in place of a human at a browser.

This page points to popular, high-quality Perl resources for web automation.

Modules for Web Automation

WWW-Mechanize

WWW::Mechanize is the de facto standard for web automation in Perl, although other modules may offer a more convenient interface for many tasks.
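
As a rough illustration of the WWW::Mechanize style, the sketch below loads a page, reads its title, and looks up a link by its text. To keep it runnable without network access, it writes a small made-up HTML page to a temporary file and loads it through a file: URL; the page content and link names are assumptions for the example, not anything from a real site.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file;
use WWW::Mechanize;

# Write a tiny, hypothetical HTML page to a temporary file
# so that the sketch does not depend on a live web site.
my ($fh, $path) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><head><title>Demo page</title></head>
<body>
<a href="/docs">Documentation</a>
<a href="/download">Download</a>
</body></html>
HTML
close $fh;

# autocheck makes failed requests die instead of failing silently.
my $mech = WWW::Mechanize->new( autocheck => 1 );
$mech->get( URI::file->new_abs($path) );

my $title      = $mech->title;
my @link_texts = map { $_->text } $mech->links;
my $docs_link  = $mech->find_link( text_regex => qr/doc/i );

print "Title: $title\n";
print "Links: @link_texts\n";
print "Docs URL: ", $docs_link->url, "\n";
```

Against a real site, the same object would typically be pointed at an http or https URL, and methods such as follow_link and submit_form would carry the session forward.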

HTML::TreeBuilder and HTML-TreeBuilder-LibXML

These are alternative HTML parsers that offer more flexibility when you need to inspect or extract parts of a fetched page. For more information about HTML parsing, see our text parsing page.
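
A minimal sketch of what parsing with HTML::TreeBuilder can look like: it builds a tree from an HTML string and uses look_down to pick out elements by tag and attribute. The markup here is invented for the example.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical markup standing in for a page fetched elsewhere.
my $html = '<ul>'
         . '<li class="item"><a href="/a">First</a></li>'
         . '<li class="item"><a href="/b">Second</a></li>'
         . '</ul>';

my $tree = HTML::TreeBuilder->new_from_content($html);

# look_down walks the tree and returns every element matching the criteria.
my @hrefs = map { $_->attr('href') } $tree->look_down( _tag => 'a' );
my @texts = map { $_->as_text }
            $tree->look_down( _tag => 'li', class => 'item' );

print "hrefs: @hrefs\n";
print "texts: @texts\n";

# Free the tree explicitly once it is no longer needed.
$tree->delete;
```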

Browser Wrappers

WWW::Selenium, Mozilla::Mechanize and IE::Mechanize perform web automation by driving existing, full-fledged, JavaScript-enabled browsers, so that the script sees everything from the browser's point of view.

High-Level Web Automation Frameworks

Web-Scraper is a web scraping toolkit that provides a high-level, declarative layer above lower-level fetching and parsing modules. Scrappy is an even higher-level abstraction that integrates several other modules.
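
To give a feel for the declarative style of Web::Scraper, here is a small sketch that describes which parts of a page to extract, using CSS selectors, and then applies that description to an HTML string. The markup, class names and prices are made up for the example.

```perl
use strict;
use warnings;
use Web::Scraper;

# Hypothetical markup; a real script would usually scrape a URI instead.
my $html = <<'HTML';
<ul>
  <li class="book"><span class="title">Learning Perl</span> <span class="price">30</span></li>
  <li class="book"><span class="title">Intermediate Perl</span> <span class="price">35</span></li>
</ul>
HTML

# Each 'li.book' becomes one hash in the 'books' array,
# filled in by the nested scraper.
my $books = scraper {
    process 'li.book', 'books[]' => scraper {
        process 'span.title', title => 'TEXT';
        process 'span.price', price => 'TEXT';
    };
};

my $result = $books->scrape( \$html );

for my $book ( @{ $result->{books} } ) {
    print "$book->{title}: $book->{price}\n";
}
```

The extraction rules live in one place, separate from the code that consumes the resulting data structure, which is the main appeal of this approach over walking the tree by hand.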

Lower Level Web Modules

libwww-perl, WWW-Curl and LWP-Curl provide lower-level access to HTTP, HTTPS and related protocols. libwww-perl (LWP) is the most popular API and is pure Perl; WWW-Curl is a wrapper around libcurl, which is written in C and is fast; and LWP-Curl provides a libwww-perl-like interface on top of WWW-Curl.
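
At this lower level you assemble requests and inspect responses yourself. The sketch below shows one way this might look with libwww-perl: it builds a request object and defines a helper that would send it. The URL, the agent string and the helper name fetch_body are assumptions for illustration, and the actual network call is left commented out so the example runs offline.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::UserAgent;

# A user agent with a hypothetical agent string and a 10-second timeout.
my $ua = LWP::UserAgent->new( timeout => 10, agent => 'example-bot/0.1' );

# Build the request explicitly: method, URL and headers.
my $req = HTTP::Request->new( GET => 'https://example.com/' );
$req->header( Accept => 'text/html' );

# Hypothetical helper: sends the request and returns the decoded body,
# or dies with the HTTP status line on failure.
sub fetch_body {
    my ( $agent, $request ) = @_;
    my $res = $agent->request($request);
    die 'Request failed: ' . $res->status_line . "\n"
        unless $res->is_success;
    return $res->decoded_content;
}

print $req->method, ' ', $req->uri, "\n";

# Uncomment to perform the request over the network:
# my $body = fetch_body( $ua, $req );
```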
