Indexing Documents
Minnow is a Web-based PDF discussion engine for discussing and annotating documents in an online, multi-user environment. Minnow was developed by the Mathematical Biology group at the Northwest Fisheries Science Center in Seattle, WA with support from NOAA/NMFS. What's New in This Release: -...
Platforms: *nix
License: Freeware | Size: 440.32 KB | Download (91): Minnow Download |
JODConverter, the Java OpenDocument Converter, converts documents between different office formats. The project leverages OpenOffice.org, which provides arguably the best import/export filters for OpenDocument and Microsoft Office formats available today. JODConverter automates all...
Platforms: *nix
License: Freeware | Download (206): JODConverter Download |
JOOReports (Java/OpenOffice Reports) is an open source solution for creating office documents and reports in Java, using OpenOffice.org. Its primary goal is making template composition easy. Templates are regular word processor documents, created in OpenOffice.org Writer, with just a few...
Platforms: *nix
License: Freeware | Size: 5 MB | Download (115): JooReports Download |
XMLParser is a library that assists in parsing XML documents into generic PHP arrays. It also comes with RSSParser, an extension of XMLParser that creates simple RSS-specific array structures from RSS feeds.
Platforms: *nix
License: Freeware | Size: 11.26 KB | Download (100): XMLParser for PHP Download |
PdfRipImage is a program to automatically extract images from PDF documents and convert them to a format of your choice (such as JPEG or TIFF). It runs on UNIX-like platforms and requires utilities from netpbm and xpdf.
Platforms: *nix
License: Freeware | Size: 10.24 KB | Download (111): PdfRipImage Download |
mod_xhtml_neg performs content negotiation for XHTML documents conforming to Appendix C of the XHTML 1.0 specification. This module for the Apache HTTP server gives it the ability to correctly negotiate content types for XHTML documents. Without negotiation, these would be sent as...
Platforms: *nix
License: Freeware | Size: 32.77 KB | Download (88): mod_xhtml_neg Download |
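The kind of negotiation described for mod_xhtml_neg can be sketched as follows. This is a simplified Python illustration, not the module's actual algorithm or configuration: the function names `parse_accept` and `choose_content_type` are hypothetical, and the rule used here (serve `application/xhtml+xml` only when the client lists it explicitly with a quality at least as high as `text/html`) is an assumption made for the example.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into {media_type: q}; q defaults to 1.0."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_content_type(accept_header):
    """Pick the media type to send for an Appendix C-style XHTML document."""
    prefs = parse_accept(accept_header or "")
    # Only an explicit mention counts for XHTML; wildcards are ignored here.
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    # For HTML, fall back to the wildcard entries when text/html is absent.
    html_q = prefs.get("text/html", prefs.get("text/*", prefs.get("*/*", 0.0)))
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"
```

A client that only sends `*/*` receives `text/html` under this rule, which is the conservative choice for browsers that do not advertise XHTML support.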
js-search is a JavaScript indexing and searching library: a client-side library for building a simple inverted index and searching it. You can download the source code from SVN with the following command: svn checkout http://js-search.googlecode.com/svn/trunk/ js-search.
Platforms: *nix
License: Freeware | Download (117): js-search Download |
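For readers unfamiliar with the term, a simple inverted index maps each word to the set of documents that contain it. The sketch below illustrates the idea in Python rather than JavaScript, and it does not reproduce the js-search API: the class `InvertedIndex` and its `add` and `search` methods are hypothetical names chosen for this example, and the tokenizer (lowercase alphanumeric runs) is an assumption.

```python
import re
from collections import defaultdict


class InvertedIndex:
    """Minimal inverted index: term -> set of document ids."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokenize(text):
        # Lowercase the text and split it into alphanumeric runs.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text):
        """Index a document under every term it contains."""
        for term in self._tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every term in the query."""
        terms = self._tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            docs = self._postings.get(term, set())
            result = set(docs) if result is None else result & docs
        return result
```

Multi-word queries are treated as AND queries here; ranking, stemming and stop-word removal are left out to keep the example short.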
PDFKreator is an easy-to-use KDE tool for creating PDF documents out of a bunch of image files. It makes heavy use of ImageMagick's convert tool, tiff2ps and ps2pdf.
Platforms: *nix
License: Freeware | Size: 60.42 KB | Download (113): PDFKreator Download |
The Rextile project allows you to build XHTML documents and entire Web sites with ease. You write text using Textile (a format much more concise than XHTML), automate document parts with Ruby scripting, and generate the site offline (the server gets static XHTML). Rextile was inspired by Xilize....
Platforms: *nix
License: Freeware | Size: 43.01 KB | Download (103): Rextile Download |
mairix is a tool for indexing and searching email messages stored in Maildir, MH, or mbox folders. The index contains a map of which words occur in which parts of which messages. Searches on this index are fast and generate symlinks to the matching messages in a new Maildir or MH folder, or...
Platforms: *nix, C/C++, BSD
License: Freeware | Download (113): mairix Download |
mod_auth_useragent2 is an Apache module that can be used to limit access to documents by means of the User-Agent. As an authentication method, this is really unsafe because it is easy to change the User-Agent in most browsers, but it could be used to prevent stupid bots and spiders from...
Platforms: *nix
License: Freeware | Size: 11.26 KB | Download (85): mod_auth_useragent2 Download |
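The User-Agent check described for mod_auth_useragent2 amounts to matching a request header against a list of patterns. The Python sketch below shows only that idea; it does not use the module's configuration directives, and the names `BLOCKED_PATTERNS` and `is_allowed`, as well as the example patterns, are hypothetical. As the description notes, the header is trivially spoofed, so a check like this filters careless bots rather than providing real access control.

```python
import re

# Hypothetical deny list: a named bot and requests with an empty User-Agent.
BLOCKED_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (r"badbot", r"^$")]


def is_allowed(user_agent, blocked=BLOCKED_PATTERNS):
    """Return True unless the User-Agent matches one of the blocked patterns."""
    ua = user_agent or ""  # a missing header is treated like an empty one
    return not any(pattern.search(ua) for pattern in blocked)
```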
eCromedos is a document preparation system that allows concurrent publication of documents in print and on the web. Documents are written in an XML-conforming markup language and converted to HTML or printable document formats by means of special software. eCromedos defines document formats for a...
Platforms: *nix
License: Freeware | Size: 368.64 KB | Download (92): eCromedos Download |
The PlainDoc (pd2tex) document production system allows you to write documents as normal text files. The pd2tex tool converts the plain text files to: - TeX, which then gets converted to PDF (you need the pdflatex tool installed) - DocBook (dbx), which can be fed to various tool chains (not supplied) to...
Platforms: *nix
License: Freeware | Size: 102.4 KB | Download (99): PlainDoc Download |
docbook2X is a software package that converts DocBook documents into the GNU Texinfo format and the traditional Unix man page format. Notable features include table support for man pages, internationalization support, and easy customization of the output using XSLT. (Easy, because unlike other...
Platforms: *nix
License: Freeware | Size: 450.56 KB | Download (95): docbook2X Download |
Pavuk is a UNIX program used to mirror the contents of WWW documents or files. It transfers documents from HTTP, FTP, Gopher and optionally from HTTPS (HTTP over SSL) servers. The project has an optional GUI based on the GTK2 widget set.
Platforms: *nix
License: Freeware | Size: 563.2 KB | Download (98): Pavuk Download |
Search::FreeText is a free text indexing module for medium-to-large text corpora. SYNOPSIS my $text = new Search::FreeText(-db => ['DB_File', "stories.db"]); $text->open_index(); $text->clear_index(); $text->index_document(1, "Hello world"); $text->index_document(2, "World in motion");...
Platforms: *nix
License: Freeware | Size: 10.24 KB | Download (95): Search::FreeText Download |
Pod::WSDL is a Perl module that creates WSDL documents from (extended) pod. SYNOPSIS use Pod::WSDL; my $pod = new Pod::WSDL(source => 'My::Server', location => 'http://localhost/My/Server', pretty => 1, withDocumentation => 1); print $pod->WSDL; Parsing the pod How does Pod::WSDL work?...
Platforms: *nix
License: Freeware | Size: 27.65 KB | Download (91): Pod::WSDL Download |
Although mod_tidy gives the impression of being a validator, it isn't one; mod_tidy is just a handy and convenient tool to help make web documents valid. mod_tidy is a TidyLib-based DSO module for the Apache HTTP Server Version 2 to parse, clean up and pretty-print the webserver's (X)HTML...
Platforms: *nix
License: Freeware | Size: 30.72 KB | Download (94): mod_tidy Download |
The LaTeX Symbols Selector project is a symbol browser that helps in creating LaTeX documents with many math symbols. All symbols are grouped into categories, and the user can copy a symbol name to the system-wide clipboard (or insert it directly into the first running copy of gVIM) by selecting the symbol's icon from a list....
Platforms: *nix
License: Freeware | Size: 368.64 KB | Download (253): LaTeX Symbols Selector Download |
Zoho QuickRead allows you to open documents, spreadsheets and more. Zoho Office Suite is a suite of web applications which let you create documents (http://www.zohowriter.com), spreadsheets (http://www.zohosheet.com) & presentations (http://www.zohoshow.com) using just your browser & internet...
Platforms: *nix
License: Freeware | Size: 14.34 KB | Download (100): Zoho QuickRead Download |