Download Shareware and Freeware Software for Windows, Linux, Macintosh, PDA


Serving Software Downloads in 956 Categories, Downloaded 50.347.000 Times

Corpus freeware

Added: October 07, 2013 | Visits: 306

Arabic Corpus The Arabic Corpus is composed of Arabic texts for text categorization. The Khaleej-2004 corpus contains 5690 documents divided into 4 topics (categories). The Watan-2004 corpus contains 20291 documents organized in 6 topics (categories).



Platforms: *nix

License: Freeware Size: 13.76 MB Download (32): Arabic Corpus Download

Added: July 05, 2013 | Visits: 467

Bitextor Bitextor is an application created to generate translation memories using multilingual websites as a corpus source. It downloads an entire website and applies a set of heuristics (based mainly on HTML tag structure and text block length) to find bitexts.





Platforms: *nix

License: Freeware Size: 204.8 KB Download (35): Bitextor Download

Added: October 03, 2013 | Visits: 337

CorpusSearch for Linux CorpusSearch is a tool that finds syntactic structures in a corpus of annotated sentence trees. It can be used as a research tool on a corpus, or as a development tool for building the corpus. CorpusSearch 2 is a Java program that supports research in corpus linguistics. It is useful both for...


Platforms: *nix

License: Freeware Size: 2.92 MB Download (36): CorpusSearch for Linux Download

Added: November 14, 2010 | Visits: 899

Uplug Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora. Several tools have been integrated in Uplug. Pre-processing tools include a sentence splitter, tokenizer, and external part-of-speech tagger and shallow parsers. The...


Platforms: *nix

License: Freeware Size: 21.9 MB Download (108): Uplug Download

Released: June 22, 2012  |  Added: June 22, 2012 | Visits: 322

Corsis (formerly Tenka Text) An open-source corpus analysis class library written in C#. The GUI of Tenka Text 0.1.3 comes with Wordlister, an advanced, extremely fast graphical wordlist tool, and a simple regex concordance tool. Tenka Text - the open-source answer to WordSmith Tools.


Platforms: Windows, Mac, BSD, Solaris, Linux

License: Freeware Size: 707.74 KB Download (51): Corsis (formerly Tenka Text) Download

Released: December 17, 2012  |  Added: December 17, 2012 | Visits: 371

Emdros Emdros is a corpus query system for storage and retrieval of linguistic analyses of text. It is especially applicable in corpus linguistics dealing with syntax, morphology, phonology, and/or discourse. It is also a generally useful text database engine.


Platforms: Windows, Mac, Solaris, Linux

License: Freeware Size: 8.33 MB Download (48): Emdros Download

Released: August 26, 2012  |  Added: August 26, 2012 | Visits: 233

PyAnnotation PyAnnotation is a Python Library to access and manipulate linguistically annotated corpus files. Supported file formats are Kura XML, Elan XML and Toolbox files. A Corpus Reader API is provided to support statistical analysis within the NLTK.


Platforms: Windows, Mac, Linux

License: Freeware Size: 45.38 KB Download (46): PyAnnotation Download

Released: August 25, 2012  |  Added: August 25, 2012 | Visits: 478

TXM TXM is a free and open-source cross-platform Unicode & XML based text/corpus analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. It offers a comprehensive...


Platforms: Windows, Mac, Linux

License: Freeware Size: 2.46 MB Download (46): TXM Download

Added: March 31, 2013 | Visits: 446

Solace Node Reference This module extends nodereference fields with a filter-based search engine that fills them automatically, using the Solace API filter features as a backend. This means you can attach a SolR filter instance to node reference fields. Then, any node owner can enable and configure a...


Platforms: PHP

License: Freeware Size: 30.72 KB Download (46): Solace Node Reference Download

Added: November 15, 2013 | Visits: 407

Cunei Machine Translation Platform Cunei is a data-driven machine translation system that builds dynamic, statistical models based on instances of known translations found in a corpus.


Platforms: *nix

License: Freeware Size: 174.08 KB Download (38): Cunei Machine Translation Platform Download

Added: November 07, 2013 | Visits: 301

minitage.core A meta package-manager to deploy projects on UNIX systems, sponsored by Makina Corpus. FEATURES: * Auto-update system. When minimerge is upgraded (easy_install -U), there is now the infrastructure to run update callbacks. * Minibuilds now have revisions, which can facilitate their reinstallation as...


Platforms: *nix

License: Freeware Size: 133.12 KB Download (32): minitage.core Download

Added: June 15, 2013 | Visits: 336

minitage.paste PasteScripts to facilitate the use of minitage and the creation of minitage-based projects, sponsored by Makina Corpus. Project templates: * minitage.zope3: A sample layout for a zope 3 application * minitage.plone25: A sample layout for a plone 25 application * minitage.plone3: A sample layout for a...


Platforms: *nix

License: Freeware Size: 634.88 KB Download (40): minitage.paste Download

Added: July 05, 2010 | Visits: 743

Statistics::MaxEntropy MaxEntropy is a Perl5 module for Maximum Entropy Modeling and Feature Induction. SYNOPSIS use Statistics::MaxEntropy; # debugging messages; default 0 $Statistics::MaxEntropy::debug = 0; # maximum number of iterations for IIS; default 100 $Statistics::MaxEntropy::NEWTON_max_it = 100; #...


Platforms: *nix

License: Freeware Size: 41.98 KB Download (100): Statistics::MaxEntropy Download

Added: April 11, 2010 | Visits: 957

DadaDodo DadaDodo is a program that generates random sentences based on input files. Sometimes these sentences are nonsense, but sometimes they cut right through to the heart of the matter and reveal hidden meanings. DadaDodo works rather differently than Dissociated Press; whereas...


Platforms: *nix

License: Freeware Size: 22.53 KB Download (106): DadaDodo Download

Added: November 06, 2010 | Visits: 695

Knorpora Knorpora is a modified version of the Knoppix 3.3 Live CD for students of corpus-based computational linguistics. Like Knoppix, the Knorpora CD allows you to run a fully operational Debian/Linux operating system from the CD-ROM drive, without installing anything on the computer. The Knorpora...


Platforms: *nix

License: Freeware Size: 676.4 MB Download (91): Knorpora Download

Added: January 26, 2010 | Visits: 836

Netkit 4 Understanding computer networks without performing practical experiments is really difficult, not to say it is almost impossible. Unfortunately, setting up a networking lab can be very expensive. Netkit has been conceived as an environment for setting up and performing networking experiments at...


Platforms: *nix

License: Freeware Size: 778.24 KB Download (137): Netkit 4 Download

Added: March 07, 2010 | Visits: 621

Search::Lemur Search::Lemur is a Perl class to query a Lemur server and parse the results. SYNOPSIS use Search::Lemur; my $lem = Search::Lemur->new("http://url/to/lemur.cgi"); # run some queries, and get back an array of results # a query with a single term: my @results1 = $lem->query("encryption");...


Platforms: *nix

License: Freeware Size: 8.19 KB Download (89): Search::Lemur Download

Added: March 04, 2010 | Visits: 780

Search::FreeText Search::FreeText is a free text indexing module for medium-to-large text corpora. SYNOPSIS my $text = new Search::FreeText(-db => [DB_File, "stories.db"]); $text->open_index(); $text->clear_index(); $text->index_document(1, "Hello world"); $text->index_document(2, "World in motion");...


Platforms: *nix

License: Freeware Size: 10.24 KB Download (95): Search::FreeText Download

Added: June 12, 2010 | Visits: 759

TextSearch TextSearch is a program that helps you search through a set of text files which are in a hierarchical structure, i.e. a directory structure. Each document is searched using a regular expression and an overview of the results is shown as a tree structure. By clicking on a file, it can be viewed,...


Platforms: *nix

License: Freeware Size: 15.36 KB Download (96): TextSearch Download
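
The search approach described for TextSearch (walking a directory tree and matching each file against a regular expression) can be illustrated with a minimal Python sketch. This is not TextSearch's own code; the names search_tree and print_tree are hypothetical, and the sketch assumes plain UTF-8 text files and line-by-line matching.

```python
import os
import re


def search_tree(root, pattern):
    """Search every file under root line by line with a regular expression.

    Returns a dict mapping each matching file's path (relative to root)
    to a list of (line_number, line_text) tuples.
    Hypothetical helper for illustration, not part of TextSearch.
    """
    regex = re.compile(pattern)
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # Assumption: files are text; undecodable bytes are replaced.
                with open(path, encoding="utf-8", errors="replace") as handle:
                    hits = [
                        (number, line.rstrip("\n"))
                        for number, line in enumerate(handle, 1)
                        if regex.search(line)
                    ]
            except OSError:
                continue  # skip unreadable files
            if hits:
                results[os.path.relpath(path, root)] = hits
    return results


def print_tree(results):
    """Print the matches grouped by file, as a simple indented overview."""
    for rel_path in sorted(results):
        print(rel_path)
        for number, text in results[rel_path]:
            print(f"    {number}: {text}")
```

Calling print_tree(search_tree("some/directory", r"corp(us|ora)")) would list each matching file followed by its matching lines; the directory name and pattern here are placeholders.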

Added: April 04, 2010 | Visits: 890

mime4j The mime4j project provides a parser, MimeStreamParser, for e-mail message streams in plain rfc822 and MIME format. The parser uses a callback mechanism to report parsing events such as the start of an entity header, the start of a body, etc. If you are familiar with the SAX XML parser interface you...


Platforms: *nix

License: Freeware Download (96): mime4j Download
