Tokenizer
Devel::Tokenizer::C is a Perl module that generates C source for a fast keyword tokenizer. SYNOPSIS: use Devel::Tokenizer::C; $t = new Devel::Tokenizer::C TokenFunc => sub { "return \U$_[0];\n" }; $t->add_tokens(qw( bar baz ))->add_tokens(['for']); $t->add_tokens([qw( foo )], defined...
Platforms: *nix
License: Freeware | Size: 11.26 KB | Download (91): Devel::Tokenizer::C Download |
String::Tokenizer is a simple string tokenizer. SYNOPSIS: use String::Tokenizer; # create the tokenizer and tokenize input my $tokenizer = String::Tokenizer->new("((5+5) * 10)", '+*()'); # create tokenizer my $tokenizer = String::Tokenizer->new(); # ... then tokenize the string...
Platforms: *nix
License: Freeware | Size: 8.19 KB | Download (95): String::Tokenizer Download |
The text-sentence library is a text tokenizer and sentence splitter. The main function takes a text and a list of known names and abbreviations, and returns a list of tokens. Each token has a type and other attributes, i.e. is word, is number, is roman number, is sentence end, is name, is end of chapter etc....
Platforms: *nix
License: Freeware | Size: 20.48 KB | Download (39): text-sentence Download |
A run-time configurable character stream tokenizer that allows the user to define token classes via regular expressions. The developer is not limited to predefined notions of whitespace, commenting, or word modalities.
Platforms: Windows, Mac, Linux
License: Freeware | Size: 11.61 KB | Download (47): Lexer Download |
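The idea behind the Lexer entry above, token classes defined by regular expressions at run time, can be illustrated with a minimal Python sketch. This is not the Lexer package's API: the names make_tokenizer, Token, and the token class names (NUMBER, IDENT, OP, SPACE) are hypothetical and chosen only for illustration.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str  # name of the token class that matched
    text: str  # matched characters


def make_tokenizer(token_classes):
    """Build a tokenizer from a {class name: regex} mapping supplied at run time.

    Hypothetical helper for illustration only. Class names must be valid
    regex group names; when several classes match, the earliest one listed wins.
    """
    master = re.compile(
        "|".join(f"(?P<{name}>{pattern})" for name, pattern in token_classes.items())
    )

    def tokenize(text):
        pos = 0
        while pos < len(text):
            match = master.match(text, pos)
            # Reject unmatched input and empty matches, which would loop forever.
            if match is None or match.end() == pos:
                raise ValueError(f"no token class matches at offset {pos}")
            yield Token(match.lastgroup, match.group())
            pos = match.end()

    return tokenize


# The caller decides what counts as whitespace, an operator, or a word.
tokenize = make_tokenizer({
    "NUMBER": r"\d+",
    "IDENT": r"[A-Za-z_]\w*",
    "OP": r"[+*()=-]",
    "SPACE": r"\s+",
})
tokens = [t for t in tokenize("x = (5+5) * 10") if t.kind != "SPACE"]
```

Because whitespace is just another user-defined class, the caller chooses whether to keep or discard it, which is the kind of flexibility the description refers to.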
Nstag implements a tokenizer-driven template engine with XSL-like syntax. It is built on the idea of using special tags in a separate namespace to apply view-related logic or simple assignments. Nstag uses a tokenizer to parse the dynamic contents, which is a good choice regarding to...
Platforms: Windows, Mac, *nix, PHP, BSD, Solaris
License: Freeware | Download (51): Nstag 2.0.0RC4 Download |
PHP Formatter reformats PHP source code. It uses the PHP tokenizer functions to parse PHP source and rewrites the same code using consistent indentation.
Platforms: PHP
License: Freeware | Size: 10 KB | Download (41): PHP Formatter Download |
DParser is a simple but powerful tool for parsing. You can specify the form of the text to be parsed using a combination of regular expressions and grammar productions. Because of the parsing technique (technically a scannerless GLR parser based on the Tomita algorithm) there are no...
Platforms: *nix
License: Freeware | Size: 184.32 KB | Download (57): DParser for Linux Download |
libUTL++ is a cross-platform C++ class library that provides a set of commonly useful functionality and abstractions to expedite C++ application development. Here are some of the highlights in terms of functionality: utl::Object provides a common interface for basic object functionality such...
Platforms: *nix
License: Freeware | Size: 5.67 MB | Download (101): libUTL++ Download |
Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora. Several tools have been integrated in Uplug. Pre-processing tools include a sentence splitter, tokenizer, and external part-of-speech tagger and shallow parsers. The...
Platforms: *nix
License: Freeware | Size: 21.9 MB | Download (108): Uplug Download |
The DParser project is a simple but powerful tool for parsing. You can specify the form of the text to be parsed using a combination of regular expressions and grammar productions. Because of the parsing technique (technically a scannerless GLR parser based on the Tomita algorithm) there are no...
Platforms: *nix
License: Freeware | Size: 266.24 KB | Download (98): DParser Download |
The ccovinstrument package instruments C/C++ code for test coverage analysis. SYNOPSIS: ccovinstrument code.c > covcode.c ccovinstrument code.c [-f] -o covcode.c [-e errs] -f instrument fatal code as well as normal code Scans C/C++ source (before cpp) and inserts trip-wires in each...
Platforms: *nix
License: Freeware | Size: 15.36 KB | Download (95): ccovinstrument Download |
The MillScript-XML project is an alternative Java XML parsing library with its own custom API. The underlying tokenizer can be configured to permit non-well-formed XML. This library's API provides both an event model and a more conventional token stream model. The authors believe that the token...
Platforms: *nix
License: Freeware | Size: 67.58 KB | Download (100): MillScript-XML Download |
Plucene::Analysis::PorterStemFilter - Porter stemming on the token stream. SYNOPSIS: # isa Plucene::Analysis::TokenFilter my $token = $porter_stem_filter->next; This class transforms the token stream as per the Porter stemming algorithm. Note: the input to the stemming filter must...
Platforms: *nix
License: Freeware | Size: 327.68 KB | Download (88): Plucene::Analysis::PorterStemFilter Download |
Acme::OneHundredNotOut is a raise of the bat, a tip of the hat. I have just released my 100th module to CPAN, the first time that anyone has reached that target. As some of you may know, I am getting ready to go back to college and reinvent myself from being a programmer into being a...
Platforms: *nix
License: Freeware | Size: 14.34 KB | Download (95): Acme::OneHundredNotOut Download |
Lucene is a Perl API to the C++ port of the Lucene search engine. SYNOPSIS Initialize/Empty Lucene index my $analyzer = new Lucene::Analysis::Standard::StandardAnalyzer(); my $store = Lucene::Store::FSDirectory->getDirectory("/home/lucene", 1); my $tmp_writer = new...
Platforms: *nix
License: Freeware | Size: 18.43 KB | Download (93): Lucene Download |
SpamSieve gives you back your inbox by bringing powerful Bayesian spam filtering to popular email clients. It learns what your spam looks like, so it can block nearly all of it. It looks at your address book and learns what your good messages look like, so it won't confuse them with spam. Other spam...
Platforms: Mac
License: Freeware | Size: 7.5 MB | Download (53): SpamSieve for Mac OS Download |
pygold is a pure Python implementation of the GOLD Parser Engine. The GOLD Parser Engine is a LALR(1) parser with a DFA tokenizer. It uses a compiled grammar table generated by GOLD Parser Builder (not included; available at http://www.devincook.com/goldparser).
Platforms: Windows, Mac, Linux
License: Freeware | Size: 15.64 KB | Download (46): pygold Download |
PHPDoctor is an attempt to create a simpler and faster PHPDoc (Javadoc style comment parser for PHP) that produces standards compliant HTML. It is designed with an emphasis on speed and simplicity, meaning it is not as fully featured as the PEAR PHPDoc program, but is simple to configure, use,...
Platforms: Windows, Mac, *nix, PHP, BSD, Solaris
License: Freeware | Download (51): PHPDoctor 2.0.0RC3 Download |
cssutils is a Python package to parse and build Cascading Style Sheets (CSS). It provides a DOM only, with no rendering facilities. It is based upon and partly implements the following specifications: CSS 2.1 (general CSS rules and properties are defined here), CSS 2.1 Errata (a few errata, mainly the definition...
Platforms: Python
License: Freeware | Size: 542.72 KB | Download (42): cssutils Download |