Parse Html
Java Mozilla Html Parser is a Java package that enables you to parse HTML pages into a Java Document object. The parser is a wrapper around Mozilla's HTML parser, thus giving the user a browser-quality HTML parser. Limitations and known issues: the biggest limitation is performance...
Platforms: *nix
License: Freeware | Size: 1.5 MB | Download (111): Java Mozilla Html Parser Download |
This class can be used to parse HTML lists and extract their structure. It takes a string with well-formed HTML UL list tags and extracts the structure of the contained list elements. It supports nested lists. The class returns an array of item elements.
Platforms: PHP
License: Freeware | Size: 10 KB | Download (54): UL to PHP array Download |
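The idea behind the UL to PHP array class above (turning nested UL markup into a nested array) can be sketched in Python with the standard library. This is only an illustration under stated assumptions, not the PHP class's actual implementation: the names `ULParser` and `parse_ul` and the `text`/`children` keys are hypothetical, and the input is assumed to be well-formed.

```python
from html.parser import HTMLParser


class ULParser(HTMLParser):
    """Collect <ul>/<li> nesting into plain lists and dicts (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.root = []     # items of the top-level list(s)
        self._lists = []   # stack: one item list per open <ul>
        self._items = []   # stack: currently open <li> items

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            # A nested <ul> belongs to the innermost open <li>, if there is one.
            target = self._items[-1]["children"] if self._items else self.root
            self._lists.append(target)
        elif tag == "li" and self._lists:
            item = {"text": "", "children": []}
            self._lists[-1].append(item)
            self._items.append(item)

    def handle_endtag(self, tag):
        if tag == "ul" and self._lists:
            self._lists.pop()
        elif tag == "li" and self._items:
            self._items.pop()

    def handle_data(self, data):
        # Text is credited to the innermost open <li>.
        if self._items:
            self._items[-1]["text"] += data.strip()


def parse_ul(html):
    """Return the list structure of well-formed UL markup as nested lists."""
    parser = ULParser()
    parser.feed(html)
    parser.close()
    return parser.root


items = parse_ul("<ul><li>a<ul><li>b</li></ul></li><li>c</li></ul>")
print(items)
```

Each item carries its own `children` list, so nested lists need no special handling by the caller.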
PHP HTML parser allows you to parse HTML from PHP scripts.
Platforms: Windows, Mac, *nix, PHP, BSD Solaris
License: Freeware | Download (61): PHP HTML parser Download |
HTML::FromText is a Perl module that can convert plain text to HTML. SYNOPSIS use HTML::FromText; text2html( $text, %options ); # or use HTML::FromText (); my $t2h = HTML::FromText->new( %options ); my $html = $t2h->parse( $html ); HTML::FromText converts plain text to HTML. There...
Platforms: *nix
License: Freeware | Size: 13.31 KB | Download (113): HTML::FromText Download |
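As a rough illustration of what a plain-text-to-HTML conversion involves, here is a minimal Python sketch. It is not HTML::FromText's algorithm and does not mirror its options; the function name `text_to_html` and the rules used (blank lines separate paragraphs, single newlines become `<br>`) are assumptions made for this example.

```python
import html


def text_to_html(text):
    """Escape plain text and wrap blank-line-separated blocks in <p> tags."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    rendered = []
    for paragraph in paragraphs:
        # Escape &, < and > first so the text cannot inject markup.
        escaped = html.escape(paragraph)
        # Keep single line breaks visible inside a paragraph.
        rendered.append("<p>%s</p>" % escaped.replace("\n", "<br>\n"))
    return "\n".join(rendered)


print(text_to_html("a < b\n\nline1\nline2"))
```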
Syntax::Highlight::HTML is a Perl module for highlighting HTML syntax. SYNOPSIS use Syntax::Highlight::HTML; my $highlighter = new Syntax::Highlight::HTML; $output = $highlighter->parse($html); If $html contains the following HTML fragment: <!-- a description list --> <dl...
Platforms: *nix
License: Freeware | Size: 16.38 KB | Download (97): Syntax::Highlight::HTML Download |
HTML-to-XML is a .NET component that can help you transform an HTML file into well-formed XML for parsing. In effect, it is designed to be an HTML parser / scraper.
Once HTML is converted to XHTML (i.e. well-formed XML), the plethora of existing XML parsing components and libraries can be...
Platforms: Windows, XP, 2003, Windows Vista
License: Freeware | Download (63): Chilkat .NET HTML-to-XML Download |
p3pmail provides a tool that removes dangerous HTML tags from email. p3pmail will remove dangerous HTML tags from email messages to make them safer for viewing. It does this by skipping the header of the email message before parsing it for dangerous HTML tags. It will only parse HTML email....
Platforms: *nix
License: Freeware | Download (95): p3pmail Download |
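The approach described for p3pmail (skip the message header, then strip dangerous tags from the body) can be sketched in Python as follows. This is not p3pmail's code. The listing does not say which tags count as dangerous, so the `DANGEROUS` tuple is an assumption, as are the function name `strip_dangerous` and the use of a blank LF-only line as the header/body separator. Regex-based stripping like this is a simplification and is not a substitute for a real sanitizer.

```python
import re

# Assumed tag list; the listing does not say which tags p3pmail removes.
DANGEROUS = ("script", "iframe", "object", "embed", "applet")
_NAMES = "|".join(DANGEROUS)

# A dangerous element together with everything up to its closing tag.
_PAIRED = re.compile(r"<(%s)\b[^>]*>.*?</\1\s*>" % _NAMES,
                     re.IGNORECASE | re.DOTALL)
# Any leftover opening or closing tag with no partner.
_STRAY = re.compile(r"</?(%s)\b[^>]*>" % _NAMES, re.IGNORECASE)


def strip_dangerous(message):
    """Leave the header untouched and remove dangerous tags from the body."""
    # Assumption: header and body are separated by the first blank line.
    header, separator, body = message.partition("\n\n")
    body = _PAIRED.sub("", body)
    body = _STRAY.sub("", body)
    return header + separator + body


mail = "Subject: hi\n\n<p>ok</p><script>alert(1)</script><IFRAME src=x>"
print(strip_dangerous(mail))
```

Splitting before filtering means header lines are never modified, even if they happen to contain text that looks like a tag.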
ShaniXmlParser is an XML/HTML DOM/SAX parser that can be validating. It can parse badly formed XML files. ShaniXmlParser can parse files with inverted tags and badly escaped &, < and >. ShaniXmlParser expands all HTML entities. ShaniXmlParser is well suited to parse HTML files. It is up to 3...
Platforms: *nix
License: Freeware | Size: 2 MB | Download (88): ShaniXmlParser Download |
It will scan HTML tags for certain keywords in the class name and apply effects to them. The framework is formatted as jQuery plugins and can be implemented without writing a single line of code. Installation: Download the classbehaviours.zip archive. Extract the ZIP file into a folder. (An empty...
Platforms: Windows, Mac, *nix, JavaScript, BSD Solaris
License: Freeware | Download (52): classBehaviours Download |
jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods. * parse HTML from a URL, file, or string * find and extract data, using DOM traversal or CSS selectors *...
Platforms: Mac
License: Freeware | Size: 61.44 KB | Download (49): jsoup Download |
The parser can scan HTML files and "fix up" many common mistakes that human (and computer) authors make in writing HTML documents. NekoHTML adds missing parent elements; automatically closes elements with optional end tags; and can handle mismatched inline element tags. NekoHTML is written using...
Platforms: Mac
License: Freeware | Size: 8.38 MB | Download (35): NekoHTML Download |
WWW::Dict::Zdic is a Zdic Chinese Dictionary interface. SYNOPSIS use WWW::Dict::Zdic; my $dic = WWW::Dict::Zdic->new(); my $def = $dic->define("劉"); print YAML::Dump($def); This module provides a simple interface to the zdic.net Chinese character dictionary website. INTERFACE new()...
Platforms: *nix
License: Freeware | Size: 22.53 KB | Download (116): WWW::Dict::Zdic Download |
PyKHTML is a Python module for writing website scrapers/spiders. Whereas traditional methods focus on writing the code to parse HTML/forms themselves, PyKHTML uses the excellent KHTML engine to do all the trudge work. It therefore handles webpages very well (even the severely crufty ones) and...
Platforms: *nix
License: Freeware | Size: 26.62 KB | Download (95): PyKHTML Download |
Claros Mini is a multi-protocol (POP3/IMAP) Web mail client with a user interface that is specially designed for devices with small screens. Claros Mini can parse HTML and extract text to reduce the size of messages. The installation process takes less than two minutes, and no database setup is...
Platforms: *nix
License: Freeware | Size: 4.4 MB | Download (127): Claros Mini Download |
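Extracting text from HTML to shrink a message, as Claros Mini is described as doing, might look roughly like the Python sketch below. Claros Mini's own implementation is not shown in this listing; the names `TextExtractor` and `html_to_text` are hypothetical, and skipping `script` and `style` content is an assumption of this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keep only the visible text of an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join chunks with spaces, then collapse runs of whitespace.
        # Trade-off: a word split by inline tags gains a space.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = ("<html><style>p{}</style><body><p>Hello <b>world</b></p>"
        "<script>x()</script></body></html>")
print(html_to_text(page))  # prints "Hello world"
```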
G-Cows is a software project consisting of: - a definition of a scripting language designed for the creation of web sites; - an interpreter for the scripting language (Cows); - a makefile generator (Cows-mkgen). Cows is the interpreter for the Cows scripting language, used to parse HTML files...
Platforms: Windows, Mac, *nix, C/C++, BSD
License: Freeware | Download (58): G-Cows Download |
django-webtest is an almost trivial application for instant integration of Ian Bicking's WebTest (http://pythonpaste.org/webtest/) with django's testing framework. Installation pip install webtest pip install django-webtest or easy_install webtest easy_install django-webtest or grab latest...
Platforms: *nix
License: Freeware | Size: 10.24 KB | Download (43): django-webtest Download |
SYNOPSIS use HTML::TreeBuilder; use HTML::WikiConverter::Normalizer; my $tree = new HTML::TreeBuilder(); $tree->parse( text ); my $norm = new HTML::WikiConverter::Normalizer(); $norm->normalize($tree); # Roughly gives " text " print $tree->as_HTML(); HTML::WikiConverter dialects...
Platforms: *nix
License: Freeware | Size: 34.82 KB | Download (102): HTML::WikiConverter::Normalizer Download |
HTML::TreeBuilder is a parser that builds an HTML syntax tree. SYNOPSIS foreach my $file_name (@ARGV) { my $tree = HTML::TreeBuilder->new; # empty tree $tree->parse_file($file_name); print "Hey, here's a dump of the parse tree of $file_name:\n"; $tree->dump; # a method we inherit from...
Platforms: *nix
License: Freeware | Size: 122.88 KB | Download (114): HTML::TreeBuilder Download |
HTML::FormHighlight is a Perl module that can help you highlight fields in an HTML form. SYNOPSIS use HTML::FormHighlight; my $h = new HTML::FormHighlight; print $h->highlight( scalarref => $form, fields => [ A, B, C ], ); print $h->highlight( scalarref => $form, fields => [ A, B, C ],...
Platforms: *nix
License: Freeware | Size: 5.12 KB | Download (94): HTML::FormHighlight Download |
Blatte::HTML is a Perl module that contains tools for generating HTML with Blatte. SYNOPSIS use Blatte; use Blatte::Builtins; use Blatte::HTML; $perl = &Blatte::Parse(...string of Blatte code...); $val = eval $perl; &Blatte::HTML::render($val, \&emit); sub emit { print shift; }
Platforms: *nix
License: Freeware | Size: 14.34 KB | Download (101): Blatte::HTML Download |