Download Shareware and Freeware Software for Windows, Linux, Macintosh, PDA

Search::FreeText 0.05

  Date Added: March 04, 2010  |  Visits: 559

Search::FreeText is a free text indexing module for medium-to-large text corpora.

SYNOPSIS

    use Search::FreeText;

    my $text = new Search::FreeText(-db => ['DB_File', "stories.db"]);

    # Build the index: each document is a string stored under a key of your choosing.
    $text->open_index();
    $text->clear_index();
    $text->index_document(1, "Hello world");
    $text->index_document(2, "World in motion");
    $text->index_document(3, "Cruel crazy beautiful world");
    $text->index_document(4, "Hey crazy");
    $text->close_index();

    # Search the index: each match is an array of [document key, relevance].
    $text->open_index();
    foreach ($text->search("Crazy", 10)) {
        print "$_->[0], $_->[1]\n";
    }
    $text->close_index();

This module provides free text searching in a relatively open manner. It allows a persistent inverted file index to be constructed and managed (within limits), and then to be searched fairly efficiently. The module depends on a DBM module of some kind to manage the inverted file (DB_File is usually the best choice, as it is quite fast, quite scalable, and accepts the long values that are needed for performance).

The free text searching algorithm used is the BM25 weighting scheme described in Robertson, S. E., Walker, S., Beaulieu, M. M., Gatford, M., and Payne, A. (1995). Okapi at TREC-4, in NIST Special Publication 500-236, the Fourth Text Retrieval Conference (TREC-4), pages 73-96.

Much of the module depends on an open lexical analysis system, which is implemented by Search::FreeText::LexicalAnalysis. This is where all the word splitting and stemming is handled (Lingua::Stem is used for the stemming).

Using the module is quite simple: you can open an index and close it, and while it is open you add documents as strings, each with a key of your own choosing. You can search the corpus using a string, and you get back a list of matches, each an array of your own document key and a relevance measure. The keys might be database table keys, URLs, file names, or anything similar. This makes Search::FreeText a useful package for implementing fairly efficient, high-quality search systems.
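To make the idea of an inverted index ranked with BM25 more concrete, here is a minimal in-memory sketch. It is written in Python rather than Perl and is not the module's code: the class name TinyIndex, the whitespace tokenizer (no stemming), the parameter values k1 = 1.2 and b = 0.75, and the particular IDF variant are all assumptions chosen for illustration. Only the method names index_document and search are borrowed from the synopsis.

```python
import math
from collections import Counter, defaultdict


class TinyIndex:
    """Hypothetical in-memory inverted index with BM25 ranking."""

    def __init__(self, k1=1.2, b=0.75):
        # Assumed, commonly used BM25 parameter values.
        self.k1 = k1
        self.b = b
        self.postings = defaultdict(dict)  # term -> {doc key: term frequency}
        self.doc_len = {}                  # doc key -> number of tokens

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase and split on whitespace; no stemming.
        return text.lower().split()

    def index_document(self, key, text):
        tokens = self.tokenize(text)
        self.doc_len[key] = len(tokens)
        for term, tf in Counter(tokens).items():
            self.postings[term][key] = tf

    def search(self, query, limit=10):
        n_docs = len(self.doc_len)
        if n_docs == 0:
            return []
        avg_len = sum(self.doc_len.values()) / n_docs
        scores = defaultdict(float)
        for term in set(self.tokenize(query)):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            # Non-negative IDF variant (an assumption, not taken from the module).
            idf = math.log(1 + (n_docs - len(docs) + 0.5) / (len(docs) + 0.5))
            for key, tf in docs.items():
                # Length normalization: longer documents are penalized.
                norm = 1 - self.b + self.b * self.doc_len[key] / avg_len
                scores[key] += idf * tf * (self.k1 + 1) / (tf + self.k1 * norm)
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:limit]


index = TinyIndex()
index.index_document(1, "Hello world")
index.index_document(2, "World in motion")
index.index_document(3, "Cruel crazy beautiful world")
index.index_document(4, "Hey crazy")

# Each result is a (document key, relevance) pair, best match first.
for key, score in index.search("Crazy", 10):
    print(key, round(score, 3))
```

With these four documents, a search for "Crazy" matches documents 3 and 4, and document 4 ranks first because it is shorter, so the same single occurrence of the term counts for more after length normalization. A persistent index would keep the postings in a DBM file instead of a dictionary.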

Requirements: No special requirements
Platforms: Linux
Keyword: Document File Freetext Index Libraries Module Open Programming Searchfreetext Text
Users rating: 0/10

License: Freeware
Size: 10.24 KB

SEARCH::FREETEXT RELATED
Libraries  -  MP3::Tag::File 0.9708
MP3::Tag::File is a Perl module for reading / writing files. SYNOPSIS my $mp3 = MP3::Tag->new($filename); ($title, $artist, $no, $album, $year) = $mp3->parse_filename(); see MP3::Tag MP3::Tag::File is designed to be called from the...
174.08 KB  
Libraries  -  Config::File 1.4
Config::File is a Perl module to parse a simple configuration file. SYNOPSIS use Config::File; my $config_hash = Config::File::read_config_file($configuration_file); read_config_file parses a simple configuration file and stores its values...
4.1 KB  
Libraries  -  App::Conf::File 0.965
App::Conf::File is a Perl module to load and access configuration data. SYNOPSIS use App::Conf; $config = App::Conf->new(); $config = App::Conf->new(configFile => $file); print $config->dump(), "\n"; # use Data::Dumper to spit out the Perl...
122.88 KB  
Site Search Tools  -  Fts7 1.2.1
Full-text search engine library written in Java. Builds a search index on Java objects, having a text content, and performs a quick search via this index. Main features: - you can index any Java objects having a text content. That may be...
27.93 MB  
Libraries  -  File::Format::RIFF 1.0.1
File::Format::RIFF is a Perl module for Resource Interchange File Format (RIFF) files. SYNOPSIS use File::Format::RIFF; open( IN, 'file' ) or die "Could not open file: $!"; my ( $riff1 ) = File::Format::RIFF->read( \*IN ); close( IN );...
9.22 KB  
Libraries  -  File::Where 0.05
File::Where is a Perl module to find the absolute file for a program module; absolute dir for a repository. SYNOPSIS ####### # Subroutine interface # use File::Where qw(pm2require where where_dir where_file where_pm where_repository);...
83.97 KB  
Libraries  -  perlfaq3 5.8.8
perlfaq3 Perl module contains programming tools. How do I do (anything)? Have you looked at CPAN (see perlfaq2)? The chances are that someone has already written a module that can solve your problem. Have you read the appropriate manpages?...
12.2 MB  
Modules  -  Module Dependency Browser 6.x-1.0
Compiles a browsable index of module dependencies, allowing users to see which modules depend on a given module. Requires a local checkout of the contributions/modules directory. See the README.txt for more details.
10 KB  
Programming  -  PyOFC2 0.1.2
PyOFC2 - Python libraries for Open Flash Chart. Installation: using the Python Package Index: $ easy_install PyOFC2. From the source: $ git clone git://github.com/btbytes/pyofc2.git
10.24 KB  
Text Management  -  WP2PDF 0.4.2
“WordPress to PDF” (short: WP2PDF) is a script (or rather a collection of scripts) that converts the output of WordPress to PDF (Portable Document Format), a very popular document format created by Adobe which has the advantage that...
 
NEW DOWNLOADS IN PROGRAMMING, LIBRARIES
Programming  -  FLEX-db Digital Asset Manager 3.0.9
FLEX-db - an enterprise Digital Asset Manager (DAM). It ingests and links metadata with files, creates thumbnails, and processes files using business rules. FLEX-db has a JSP client, Java app server for file input and output and an EJB metadata...
21.57 MB  
Programming  -  Libicom 0.9.0
The libicom library is a character-based, dynamically linked library for Linux. It is used to remotely control the Icom IC-R8500 wide band receiver via an RS232 link. All call and return parameters to the control functions are character string based....
20.48 KB  
Programming  -  dotdesktop 0.3
Dotdesktop library provides ability to parse desktop entry file and access the information in a convenient way. Desktop entry file format is defined by freedesktop.org, it is used to describe information about an application such as the name and...
327.68 KB  
Programming  -  Cedalion for Linux 0.2.6
Cedalion is a programming language that allows its users to add new abstractions and define (and use) internal DSLs. Its innovation is in the fact that it uses projectional editing to allow the new abstractions to have no syntactic limitations.
471.04 KB  
Programming  -  libyasl 0.2
Libyasl is a C++ class library for easily implementing TCP/UDP/Multicast clients and servers in IPv4 and IPv6 environments under GNU/Linux systems.
143.36 KB  
Libraries  -  EuGTK 4.8.9
Makes it easy to develop good-looking, fast, cross-platform programs that run on Linux, OS X, and Windows. Euphoria is a very fast interpreted/compiled language with straightforward syntax. EuGTK allows programming in a clean, object-oriented...
10.68 MB  
Libraries  -  Linux User Group Library Manager 1.0
The LUG Library Manager is a project to help Linux User Groups start their own library. A LUG library is helpful to the community at large because it increases access to information, and gives everyone the opportunity to become more knowledgeable.
5.35 KB  
Libraries  -  Module::MakefilePL::Parse 0.12
Module::MakefilePL::Parse is a Perl module to parse required modules from Makefile.PL. SYNOPSIS use Module::MakefilePL::Parse; open $fh, 'Makefile.PL'; $parser = Module::MakefilePL::Parse->new( join("", <$fh>) ); $info = $parser->required;...
8.19 KB  
Libraries  -  sqlpp 0.06
sqlpp Perl package is a SQL preprocessor. sqlpp is a conventional cpp-alike preprocessor taught to understand SQL ( PgSQL, in particular) syntax specificities. In addition to the standard #define/#ifdef/#else/#endif cohort, provides also...
10.24 KB  
Libraries  -  App::SimpleScan::Substitution::Line 2.02
App::SimpleScan::Substitution::Line is a line with optional fixed variable values. SYNOPSIS my $line = App::SimpleScan::Substitution::Line->new(" this "); # Use only this value when substituting " ". $line->fix(substituite =>...
54.27 KB