Download Shareware and Freeware Software for Windows, Linux, Macintosh, PDA

WAIT 1.800

  Date Added: August 14, 2010  |  Visits: 613


WAIT Perl module is a rewrite of the freeWAIS-sf engine in Perl and XS.

The central idea of the system is to provide a framework and the building blocks for any indexing and search system the users might want to build. Obviously the framework limits the class of systems which can be built.

        +------+     +-----+     +------+
    ==> |Access| ==> |Parse| ==> |      |
        +------+     +-----+     |      |
                       ||        |      |     +-----+
                       ||        |Filter| ==> |Index|
                       \/        |      |     +-----+
       +-------+     +-----+     |      |
    <= |Display| <== |Query| <-> |      |
       +-------+     +-----+     +------+

A collection (aka table) is defined by the instances of the access and parse modules together with the filter definitions. At query time, a query module and a display module must be chosen in addition.

Access

The access module defines which documents are members of a database. Usually an access module is a tied hash whose keys are the IDs of the documents (did = document id) and whose values are the documents themselves. The indexing process loops over the keys using FIRSTKEY and NEXTKEY. Documents are retrieved with FETCH.

By convention, access modules should be members of the WAIT::Document hierarchy. Have a look at the WAIT::Document::Split module to get the idea.

Parse

The task of the parse module is to split the documents into logical parts via the split method. For example, WAIT::Parse::Nroff splits manuals piped through nroff(1) into the sections name, synopsis, options, description, author, example, bugs, text, see, and environment.

Here is the implementation of WAIT::Parse::Base, which handles documents with a pretty simple tagged format:

    AU: Pfeifer, U.; Fuhr, N.; Huynh, T.
    TI: Searching Structured Documents with the Enhanced Retrieval Functionality of freeWAIS-sf and SFgate
    ER: D. Kroemker
    BT: Computer Networks and ISDN Systems; Proceedings of the third International World-Wide Web Conference
    PN: Elsevier
    PA: Amsterdam - Lausanne - New York - Oxford - Shannon - Tokyo
    PP: 1027-1036
    PY: 1995

    sub split {                     # called as method
      my %result;
      my $fld;

      for (split /\n/, $_[1]) {
        if (s/^(\S+):\s*//) {       # a "XX:" prefix starts a new field
          $fld = lc $1;
        }
        $result{$fld} .= $_ if defined $fld;
      }
      return %result;
    }

Since the original document cannot be reconstructed from its attributes, we need a second method (tag) which marks the regions of the document with tags for the different attributes. This tagged form is used by the display module to highlight search terms in the documents. Besides the tags for the attributes, the method might assign the special tags _b and _i to indicate bold and italic regions.

    sub tag {
      my @result;
      my $tag;

      for (split /\n/, $_[1]) {
        next if /^\w\w:\s*$/;       # skip fields without content
        if (s/^(\S+)://) {
          push @result, {_b => 1}, "$1:";
          $tag = lc $1;
        }
        if (defined $tag) {
          push @result, {$tag => 1}, "$_\n";
        } else {
          push @result, {}, "$_\n";
        }
      }
      return @result;               # we don't go for speed
    }

Obviously one could implement split via tag. The reason for having two functions is speed. We need to call split for each document when indexing a collection, so its speed is essential. On the other hand, tag is called in order to display a single document and may be a little slower. It may take care of tagging bold and italic regions. See WAIT::Parse::Nroff for how this might decrease performance.

Filter definition

From the Information Retrieval perspective, the hardest part of the system is the filter module. The database administrator defines for each attribute how its contents should be processed before they are stored in the index. Usually the processing contains steps to restrict the character set, transform case, split into words, and reduce words to their stems. In WAIT these steps are defined naturally as a pipeline of processing steps.
The pipelines are made up of functions in the package WAIT::Filter, which is pre-populated with the most common functions but may be extended at any time.

The equivalent of a typical freeWAIS-sf processing would be this pipeline:

    [ isotr, isolc, split2, stop, Stem]

The function isotr replaces unknown characters with blanks. isolc transforms to lower case. split2 splits into words and removes words shorter than two characters. stop removes the freeWAIS-sf stopwords, and Stem applies the Porter algorithm for computing the stem of the words.

The filter definition for a collection defines a set of pipelines for the attributes and modifies the pipelines which should be used for prefix and interval searches.

Several complete working examples come with WAIT in the script directory. It is recommended to follow the pattern of the scripts smakewhatis and sman.
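
The pipeline idea described above can be sketched in plain Perl. This is only an illustration under stated assumptions: the functions to_blank, lower, split2_sketch, stop_sketch and run_pipeline, as well as the tiny stopword list, are hypothetical stand-ins written for this example and are not the WAIT::Filter implementations. Stemming is left out to keep the sketch short.

```perl
use strict;
use warnings;

# Toy stopword list (assumption; not the freeWAIS-sf stopword list).
my %STOP = map { $_ => 1 } qw(the of and a in with);

# Each step takes a list of strings and returns a list of strings,
# so the steps can be chained in any order.

# Replace every character outside a-z, A-Z, 0-9 with a blank.
sub to_blank { return map { my $s = $_; $s =~ tr/a-zA-Z0-9/ /c; $s } @_ }

# Transform to lower case.
sub lower { return map { lc $_ } @_ }

# Split into words and drop words shorter than two characters.
sub split2_sketch { return grep { length($_) >= 2 } map { split ' ', $_ } @_ }

# Drop stopwords.
sub stop_sketch { return grep { !$STOP{$_} } @_ }

# Apply the steps of a pipeline (an array ref of code refs) in order.
sub run_pipeline {
    my ($pipeline, @terms) = @_;
    @terms = $_->(@terms) for @$pipeline;
    return @terms;
}

my @pipeline = (\&to_blank, \&lower, \&split2_sketch, \&stop_sketch);
my @terms = run_pipeline(
    \@pipeline,
    'Searching Structured Documents with the Enhanced Retrieval Functionality of freeWAIS-sf',
);
print join(' ', @terms), "\n";
# prints: searching structured documents enhanced retrieval functionality freewais sf
```

Keeping every step to the same list-in, list-out shape is what lets a pipeline be written as a simple ordered list, which matches the bracketed notation used in the description.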

Requirements: No special requirements
Platforms: Linux
Keyword: Engine Filter Libraries Module Perl Perl Module Programming System Wait Wait Perl Words Xs
Users rating: 0/10

License: Freeware
Size: 98.3 KB

WAIT RELATED
Libraries  -  OpenGeoDB Perl module 0.4
OpenGeoDB Perl module is a module to access the OpenGeoDB database and calculate all ZIP codes within a certain radius.
3.07 KB  
Libraries  -  Opcode 5.8.8
Opcode is a Perl module created to disable named opcodes when compiling perl code. SYNOPSIS use Opcode; Perl code is always compiled into an internal format before execution. Evaluating perl code (e.g. via "eval" or "do file") causes the...
12.29 KB  
Scripts  -  WebAPP 1.0 SE
WebAPP is a popular, open source Content Management System (cms) written in the Perl programming language. The name WebAPP is an abbreviation of Web Automated Perl Portal. Available under the GNU General Public License, WebAPP is free software....
927.73 KB  
Backup Utilities  -  NFS Backup System 0.0.1
NFS Backup System is a Perl script that provides a backup system for NFS. Its basic function, nfsbu, is an automated Perl-script backup system between an NFS server and an NFS client. Either by crontab or manual execution, directories listed in the...
5.12 KB  
Modules  -  Flickr API 5.x-1.x-dev 1.0
You don't need this module unless another module requires it or you want to develop a new Flickr-based module. To use this module, a Flickr API key is required. Installation: Unpack in your modules folder (usually /sites/all/modules/) and enable under...
 
Dictionaries  -  General Intensional Programming System 1.0
The General Intensional Programming System (GIPSY) consists of three modular sub-systems: the General Intensional Programming Language Compiler (GIPC); the General Eduction Engine (GEE); and the Intensional Run-time Programming Environment (RIPE).
96.45 KB  
Code Management Tools  -  GIPSpin 0.1.4
GIPSpin is a graphical interface programming system which allows code to be visualized and which can generate threaded code. The user constructs code segments using visual boxes. The program flow is represented as links between the boxes....
3.2 MB  
Development Tools  -  PfP Studio 2.1b 1.0
PfP Studio is a visual programming system for rapid application development (RAD) of Web based forms using PHP and Javascript. The frontend runs in a browser. It is intended to complement the skills of the developer rather than masking out the...
 
Libraries  -  Module::Reload::Selective 1.02
Module::Reload::Selective can reload Perl modules during development. SYNOPSIS Instead of: use Foobar::MyModule; Do this: use Module::Reload::Selective; &Module::Reload::Selective->reload(qw(Foobar::MyModule)); Or, if you need the...
10.24 KB  
Libraries  -  Module::Build::JSAN 0.01
Module::Build::JSAN is a Perl module to build JavaScript modules for JSAN. SYNOPSIS use Module::Build::JSAN; my $build = Module::Build::JSAN->new( module_name => Foo-Bar, license => perl, dist_author => Joe Developer , dist_abstract =>...
5.12 KB  
NEW DOWNLOADS IN PROGRAMMING, LIBRARIES
Programming  -  FLEX-db Digital Asset Manager 3.0.9
FLEX-db - an enterprise Digital Asset Manager (DAM). It ingests and links metadata with files, creates thumbnails, and processes files using business rules. FLEX-db has a JSP client, Java app server for file input and output and an EJB metadata...
21.57 MB  
Programming  -  Libicom 0.9.0
The libicom library is a character-based, dynamically linked library for Linux. It is used to remotely control the Icom IC-R8500 wide band receiver via an RS232 link. All call and return parameters to the control functions are character string based....
20.48 KB  
Programming  -  dotdesktop 0.3
Dotdesktop library provides ability to parse desktop entry file and access the information in a convenient way. Desktop entry file format is defined by freedesktop.org, it is used to describe information about an application such as the name and...
327.68 KB  
Programming  -  Cedalion for Linux 0.2.6
Cedalion is a programming language that allows its users to add new abstractions and define (and use) internal DSLs. Its innovation is in the fact that it uses projectional editing to allow the new abstractions to have no syntactic limitations.
471.04 KB  
Programming  -  libyasl 0.2
Libyasl is a C++ class library to easily realize TCP/UDP/Multicast clients and servers in IPv4 and IPv6 environments under GNU/Linux systems.
143.36 KB  
Libraries  -  wolfSSL 3.11.0
The wolfSSL embedded SSL/TLS library is a lightweight SSL library written in ANSI standard C and targeted for embedded and RTOS environments - primarily because of its small size, speed, and feature set. It is commonly used in standard operating...
2.73 MB  
Libraries  -  EuGTK 4.8.9
Makes it easy to develop good-looking, fast, cross-platform programs that run on Linux, OS X, and Windows. Euphoria is a very fast interpreted/compiled language with straightforward syntax. EuGTK allows programming in a clean, object-oriented...
10.68 MB  
Libraries  -  Linux User Group Library Manager 1.0
The LUG Library Manager is a project to help Linux User Groups start their own library. A LUG library is helpful to the community at large because it increases access to information, and gives everyone the opportunity to become more knowledgeable.
5.35 KB  
Libraries  -  Module::MakefilePL::Parse 0.12
Module::MakefilePL::Parse is a Perl module to parse required modules from Makefile.PL. SYNOPSIS use Module::MakefilePL::Parse; open $fh, 'Makefile.PL'; $parser = Module::MakefilePL::Parse->new( join("", <$fh>) ); $info = $parser->required;...
8.19 KB  
Libraries  -  sqlpp 0.06
sqlpp Perl package is a SQL preprocessor. sqlpp is a conventional cpp-like preprocessor taught to understand the specifics of SQL syntax (PgSQL in particular). In addition to the standard #define/#ifdef/#else/#endif cohort, it also provides...
10.24 KB