Web::Scraper 0.32

Company: Tatsuhiko Miyagawa
Date Added: August 12, 2013  |  Visits: 532

Web::Scraper is a web scraper toolkit, inspired by Ruby's equivalent Scrapi. It provides a DSL-ish interface for traversing HTML documents and returning a neatly arranged Perl data structure.

The scraper and process blocks define which segments of a document to extract. It understands HTML and CSS selectors as well as XPath expressions.

SYNOPSIS

    use URI;
    use Web::Scraper;

    # First, create your scraper block
    my $tweets = scraper {
        # Parse all LIs with the class "status", store them into a resulting
        # array 'tweets'. We embed another scraper for each tweet.
        process "li.status", "tweets[]" => scraper {
            # And, in that array, pull in the elements with the class
            # "entry-content", "entry-date" and the link
            process ".entry-content", body => 'TEXT';
            process ".entry-date", when => 'TEXT';
            process 'a[rel="bookmark"]', link => '@href';
        };
    };

    my $res = $tweets->scrape( URI->new("http://twitter.com/miyagawa") );

    # The result has the populated tweets array
    for my $tweet (@{$res->{tweets}}) {
        print "$tweet->{body} $tweet->{when} (link: $tweet->{link})\n";
    }
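
The description mentions both CSS selectors and XPath expressions. As a rough sketch of how the two can be mixed, the example below scrapes an in-memory HTML string instead of a live URL. The sample markup, the variable names ($html, $timeline), the "title" key and the use of a title attribute for the date are all assumptions made up for illustration; they are not part of the module or of the original synopsis.

```perl
use strict;
use warnings;
use Web::Scraper;

# Hypothetical sample markup, invented for this sketch.
my $html = <<'HTML';
<html>
<head><title>Sample timeline</title></head>
<body>
<ul>
  <li class="status">
    <span class="entry-content">First post</span>
    <span class="entry-date" title="2013-08-10">two days ago</span>
  </li>
  <li class="status">
    <span class="entry-content">Second post</span>
    <span class="entry-date" title="2013-08-11">yesterday</span>
  </li>
</ul>
</body>
</html>
HTML

my $timeline = scraper {
    # An XPath expression at the top level: the text of the <title> element
    process '//title', title => 'TEXT';

    # CSS selectors, with a nested scraper run once per matching <li>
    process 'li.status', 'tweets[]' => scraper {
        process '.entry-content', body => 'TEXT';
        # '@title' reads the title attribute instead of the element text
        process '.entry-date', when => '@title';
    };
};

# scrape() is given the HTML content as a string here, so no network
# access is involved
my $res = $timeline->scrape($html);

print "Page: $res->{title}\n";
for my $tweet (@{ $res->{tweets} }) {
    print "$tweet->{body} ($tweet->{when})\n";
}
```

Scraping a string keeps the example repeatable, which can be convenient when trying out selectors before pointing the scraper at a real page.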

Requirements: No special requirements
Platforms: *nix, Linux
Users rating: 0/10

License: Freeware  |  Size: 61.44 KB

WEB::SCRAPER RELATED
Libraries  -  Class::Generate 1.09
Class::Generate is a Perl module that can generate Perl class hierarchies. SYNOPSIS use Class::Generate qw(class subclass delete_class); # Declare class Class_Name, with the following types of members: class Class_Name => [ s => $, #...
53.25 KB  
Miscellaneous  -  Content Parser 1.0
There is an abstract base class that can check the strings and call concrete methods that do the actual processing of the texts in the strings. Arrays can be traversed to process string entry values. It can also traverse array entries recursively...
 
Networking  -  mYLastRSS for Scripts r20090704
mYLastRSS is a PHP class that parses several RSS/RDF feeds and returns an ordered result. It includes an extension class for RSS rewriting (changing values or filtering results). It supports modules for Media RSS, iTunes and Dublin Core.
184.32 KB  
Libraries  -  Object::Relation::Meta::Class::Schema 0.1.0
Object::Relation::Meta::Class::Schema is a Perl module for the Object::Relation database store builder. This module provides metadata for all Object::Relation classes while building a storage schema. Loading Object::Relation::Schema causes it to...
24.58 KB  
Programming  -  LibWebta 1.0.0
PHP5 class library incorporating classes for caching, multi-process execution, diskless ImageMagick routines, network protocols, cryptography, API and application bindings, and various Web services.
1.8 MB  
Networking  -  mYLastRSS 1.0
mYLastRSS is a PHP class that parses several RSS/RDF feeds and returns an ordered result. It includes an extension class for RSS rewriting (changing values or filtering results). mYLastRSS supports these modules: Media RSS, iTunes and Dublin Core.
 
Miscellaneous  -  Sax Filters 1.1
This is a set of classes implementing SAX filters. The classes include a SAX class that parses XML documents using Expat and defines a way to create SAX filters to perform SAX-based queries, updates and transformations of documents. Simple filters...
 
Database Tools  -  Simple MySQL wrapper replicator 1.0
This PHP script is mainly used to access multiple MySQL servers. It connects to multiple MySQL database servers, parsing a configuration list and executing a given SQL query. If it fails the class repeats the process trying to access another...
10 KB  
Database Tools  -  NemoDB 0.0.5
NemoDB is a PHP class that provides a simple, quasi-relational database. Features: tables can store text, images, etc.; automatic column value compression and decompression; sequences, DBA account, stored procedures.
 
Database Tools  -  NemoDB for Scripts 0.0.5
NemoDB is a PHP class that provides a simple, quasi-relational database. Features: tables can store text, images, etc.; automatic column value compression and decompression; sequences, DBA account, stored procedures.
10 KB  
NEW DOWNLOADS IN LINUX SOFTWARE, PROGRAMMING
Linux Software  -  EasyEDA PCB Designer for Linux 2.0.0
EasyEDA is a web-based EDA (Electronics Design Automation) tool and online PCB design software for electronics engineers, educators, students, makers and enthusiasts. There's no need to install any software. Just open EasyEDA in any...
34.4 MB  
Linux Software  -  wpCache® WordPress HTTP Cache 1.9
wpCache® is a high-performance, distributed object caching system, generic in nature but intended for speeding up dynamic web applications by decreasing database load time. wpCache® dramatically decreases the page...
3.51 MB  
Linux Software  -  Polling Autodialer Software 3.4
ICTBroadcast Auto Dialer software has a survey campaign for telephone surveys and polls. This auto dialer software automatically dials a list of numbers and asks them a set of questions that they can respond to, by using their telephone keypad....
488 B  
Linux Software  -  Total Video Converter Mac Free 3.5.5
Total Video Converter Mac Free developed by EffectMatrix Ltd is the official legal version of Total Video Converter which was a globally recognized brand since 2006. Total Video Converter Mac Free is a free but powerful all-in-one video...
17.7 MB  
Linux Software  -  Skeith mod_log_sql Analyzer 2.10beta2
Skeith is a PHP-based front end for analyzing Apache logs using mod_log_sql.
47.5 KB  
Programming  -  Cedalion for Linux 0.2.6
Cedalion is a programming language that allows its users to add new abstractions and define (and use) internal DSLs. Its innovation is in the fact that it uses projectional editing to allow the new abstractions to have no syntactic limitations.
471.04 KB  
Programming  -  Math::GMPf 0.29
Math::GMPf - Perl interface to the GMP library's floating point (mpf) functions.
30.72 KB  
Programming  -  Net::Wire10 1.08
Net::Wire10 is a Pure Perl connector that talks to Sphinx, MySQL and Drizzle servers. Net::Wire10 implements the low-level network protocol, alias the MySQL wire protocol version 10, necessary for talking to one of the aforementioned...
30.72 KB  
Programming  -  logilab-common 0.56.2
A bunch of modules providing low-level functionalities shared among some Python projects. Please note that some of the modules have some extra dependencies. For instance, logilab.common.db will require a db-api 2.0 compliant...
174.08 KB  
Programming  -  OpenSSL for linux 1.0.0a
The OpenSSL Project is a collaborative effort to develop a robust, commercial-grade, full-featured, and Open Source toolkit implementing the Secure Sockets Layer (SSL v2/v3) and Transport Layer Security (TLS v1) protocols as well as a...
3.83 MB