Download Shareware and Freeware Software for Windows, Linux, Macintosh, PDA



Tag Soup 1.0.5

  Date Added: October 23, 2010  |  Visits: 815







TagSoup is a SAX2 parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish, though quite often far from short. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML. It is a parser, not a whole application; it isn't intended to permanently clean up bad HTML, as HTML Tidy does, only to parse it on the fly.

The following options are understood:

--files  Output into individual files, with html extensions changed to xhtml. Otherwise, all output is sent to the standard output.
--html  Output is in clean HTML: the XML declaration is suppressed, as are end-tags for the known empty elements.
--omit-xml-declaration  The XML declaration is suppressed.
--method=html  End-tags for the known empty HTML elements are suppressed.
--pyx  Output is in PYX format.
--pyxin  Input is in PYXoid format (need not be well-formed).
--nons  Namespaces are suppressed. Normally, all elements are in the XHTML 1.x namespace, and all attributes are in no namespace.
--nobogons  Bogons (unknown elements) are suppressed. Normally, they are treated as empty.
--nodefaults  Suppress default attribute values.
--nocolons  Change explicit colons in element and attribute names to underscores.
--norestart  Don't restart any normally restartable elements.
--any  Bogons are given a content model of ANY rather than EMPTY.
--lexical  Pass through HTML comments. Has no effect when output is in PYX format.
--reuse  Reuse a single instance of the TagSoup parser throughout. Normally, a new one is instantiated for each input file.
--nocdata  Change the content models of the script and style elements to treat them as ordinary #PCDATA (text-only) elements, as in XHTML, rather than with the special CDATA content model.
--encoding=encoding  Specify the input encoding. The default is the Java platform default.
--help  Print help.
--version  Print the version number.
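As a rough illustration of the SAX interface described above, the sketch below lists the element names that a SAX2 reader reports for a piece of markup. The class name ElementLister and its helper methods are hypothetical, made up for this example. To keep the sketch runnable with the JDK alone, it uses the JDK's built-in XML parser as a stand-in, so the sample input has to be well-formed. With the TagSoup jar on the classpath, an instance of org.ccil.cowan.tagsoup.Parser (which implements XMLReader) could presumably be passed in instead, and the same handler would then receive events for messy HTML as well.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of TagSoup.
public class ElementLister {

    /** Collects the name of every start-tag event the given SAX2 reader reports. */
    static List<String> elementNames(XMLReader reader, String markup) throws Exception {
        List<String> names = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Fall back to the qualified name if namespaces are switched off.
                names.add(localName.isEmpty() ? qName : localName);
            }
        });
        reader.parse(new InputSource(new StringReader(markup)));
        return names;
    }

    /** Stand-in reader from the JDK; it accepts well-formed XML only. */
    static XMLReader jdkReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: with TagSoup available, jdkReader() could be swapped
        // for "new org.ccil.cowan.tagsoup.Parser()" to accept tag soup.
        String markup = "<html><body><p>Hello</p><br/></body></html>";
        System.out.println(elementNames(jdkReader(), markup));
        // prints [html, body, p, br]
    }
}
```

Because the handler depends only on the standard org.xml.sax interfaces, the choice of parser stays a one-line decision, which is the point of exposing HTML through SAX.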

Requirements: No special requirements
Platforms: Linux
Keyword: Elements Html Markup Output Parser Soup Suppressed Tag Tag Soup Tagsoup Text Editing Processing Written In Xml
Users rating: 0/10

License: Freeware
Size: 51.2 KB


TAG SOUP RELATED
Network & Internet  -  media4moin 0.1.9
The media4moin project is a parser plugin for the MoinMoin Wiki software that parses pages written in the MediaWiki syntax.
20.48 KB  
Utilities  -  ShaniXmlParser 1.4.15
ShaniXmlParser is an XML/HTML DOM/SAX parser that can be validating. It can parse badly formed XML files. ShaniXmlParser can parse files with inverted tags and bad escaped &, . ShaniXmlParser expands all HTML entities. ShaniXmlParser is...
2 MB  
Utilities  -  Arabica January 2007
Arabica is a C++ XML parser toolkit that has a full SAX2 implementation (the Simple API for XML), including the optional interfaces and helper classes. It also implements the W3C DOM (Document Object Model) Level 2.0 Core, together with XPath 1.0....
256 KB  
Utilities  -  Papyrus 1.7.0
Papyrus project is an XML reporting engine for Linux. Papyrus enables you to generate reports from a variety of different SQL databases. Your reports can be generated as PDF, PostScript, XML, HTML, DVI, Latex or straight ANSI text. Papyrus will...
788.48 KB  
Utilities  -  RXP 1.4.4
RXP is a validating XML parser written in C. RXP project is used by the LT XML toolkit, and the Festival speech synthesis system. The current version of RXP supports XML 1.1, Namespaces 1.1, xml:id, and XML Catalogs. To use an XML Catalog, set...
153.6 KB  
Utilities  -  libsgml 1.1.4
libsgml is a fast, lightweight state machine SGML parser capable of parsing HTML, XML, and most other markup languages in their most elementary forms. libsgml library natively supports parsing HTML and XML documents into a tree format (DOM). All...
102.4 KB  
Libraries  -  SafeHTML 1.3.7
SafeHTML is an anti-XSS HTML parser, written in PHP. This parser strips down all potentially dangerous content within HTML: - opening tag without its closing tag - closing tag without its opening tag - any of these tags: “base”,...
15.36 KB  
Utilities  -  Grutatxt 2.0.13
Grutatxt is a plain text to HTML (and other formats) converter. The Grutatxt project successfully converts subtle text markup to lists, bold, italics, tables and headings to their corresponding HTML, troff, man page or LaTeX markup without having to...
29.7 KB  
Utilities  -  AsmXml 0.3
AsmXml is a very fast XML parser and decoder for x86 platforms. The project achieves high speed by using the following features: - Written in pure assembler - Optimized memory accesses - Parsing and decoding at the same time To give an idea...
95.23 KB  
Utilities  -  lMaker 1.11
lMaker is a php class designed for web masters and programmers who want a simple way to generate complex, dynamic web sites from easily maintainable text files. lMaker project is designed to help automate some of the most repetitive features of...
6.14 KB  
NEW DOWNLOADS IN LINUX SOFTWARE, UTILITIES
Linux Software  -  Polling Autodialer Software 3.4
ICTBroadcast Auto Dialer software has a survey campaign for telephone surveys and polls. This auto dialer software automatically dials a list of numbers and asks them a set of questions that they can respond to, by using their telephone keypad....
488 B  
Linux Software  -  Total Video Converter Mac Free 3.5.5
Total Video Converter Mac Free developed by EffectMatrix Ltd is the official legal version of Total Video Converter which was a globally recognized brand since 2006. Total Video Converter Mac Free is a free but powerful all-in-one video...
17.7 MB  
Linux Software  -  Skeith mod_log_sql Analyzer 2.10beta2
Skeith is a php based front end for analyzing logs for Apache using mod_log_sql.
47.5 KB  
Linux Software  -  SLAX 6.0+
Slax is a modern, portable, small and fast Linux operating system with a modular approach and outstanding design. Despite its small size, Slax provides a wide collection of pre-installed software for daily use, including a well organized graphical...
190 KB  
Linux Software  -  GTK+ 2.5
GTK+, which stands for the GIMP Toolkit, is a library for creating graphical user interfaces for the X Window System. It is designed to be small, efficient, and flexible. GTK+ is written in C with a very object-oriented approach. Language bindings...
60 MB  
Utilities  -  LPAR2RRD 4.95-4
LPAR2RRD collects performance data and generates actual, historical and future trends utilization graphs of your virtual environment. It is agentless (it receives everything from the management stations like vCenter or HMC). The product supports...
2.25 MB  
Utilities  -  Nessconnect 1.0.2
Nessconnect is a GUI, CLI and API client for Nessus and Nessus compatible servers. With an improved user interface, it provides local session management, scan templates, report generation through XSLT, charts and graphs, and vulnerability trending.
819.2 KB  
Utilities  -  Dynamic Power Management 2.6.16
The Dynamic Power Management (DPM) project explores technologies to improve power conservation capabilities of platforms based on open source software. Of particular interest are techniques applicable to running systems, adjusting power parameters...
30.72 KB  
Utilities  -  Ethernet bridge tables 2.4.37.9
Ethernet bridge tables - Linux Ethernet filter for the Linux bridge. The 2.4-ebtables-brnf package contains the ebtables+bridge-nf patch. Be sure to check out the ebtables hp. This site also contains the arptables userspace tool.
40.96 KB  
Utilities  -  SaraB 1.0.0
SaraB works with DAR (Disk ARchive) to schedule and rotate backups on random-access media (i.e. hard drives, CDs, DVDs, Zip, etc. Basically anything except magnetic tapes.) This reduces hassle for the administrator by providing an automatic backup...
20.48 KB