Download Shareware and Freeware Software for Windows, Linux, Macintosh, PDA


Serving Software Downloads in 956 Categories, Downloaded 50,479,605 Times

Corpora software
 


Released: July 20, 2012  |  Added: July 20, 2012 | Visits: 331

Poliqarp for Linux Poliqarp is designed to be a universal suite of utilities for processing large corpora. You can use this accessible tool to create corpora of texts written in almost any language in its native script - be it English, Polish, Japanese or Thai - as long as they are encoded in the UTF-8... Platforms: Linux

License: Freeware Size: 1.6 MB Download (45): Poliqarp for Linux Download

Added: May 12, 2010 | Visits: 1,043

JBootCat JBootCat is a Java implementation of the BootCat scripts written by Marco Baroni et al. for generating corpora from the Internet. JBootCat's main goal is to encapsulate the BootCat functionality within a user-friendly desktop application. The advantage of using the Java platform is that JBootCat... Platforms: *nix

License: Freeware Size: 1013.76 KB Download (93): JBootCat Download

Added: March 02, 2010 | Visits: 569

Poliqarp Poliqarp is a utility for searching large corpora. Platforms: *nix

License: Freeware Size: 798.72 KB Download (88): Poliqarp Download

Added: November 14, 2010 | Visits: 902

Uplug Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora. Several tools have been integrated in Uplug. Pre-processing tools include a sentence splitter, tokenizer, and external part-of-speech tagger and shallow parsers. The... Platforms: *nix

License: Freeware Size: 21.9 MB Download (108): Uplug Download

Released: July 23, 2012  |  Added: July 23, 2012 | Visits: 268

CorpusFiltergraph CorpusFiltergraph is a framework installed on every edition of DoMY that empowers users with "Graphs" of "Plug-ins". CorpusFiltergraph allows you to extract, filter, align and transform text data from multilingual documents into parallel training corpora. The application has already transformed... Platforms: Windows

License: Freeware Download (44): CorpusFiltergraph Download

Released: August 21, 2012  |  Added: August 21, 2012 | Visits: 413

The NITE XML Toolkit The NITE XML Toolkit supports the creation, analysis, and browsing of annotated multimodal, text, or spoken language corpora, and represents both timing and rich linguistic structure. It contains libraries for developers and some end user tools. Platforms: Windows, Mac, Linux

License: Freeware Size: 38.79 MB Download (47): The NITE XML Toolkit Download

Added: September 04, 2013 | Visits: 309

ABNER ABNER is a software tool for molecular biology text analysis. It began as a user-friendly interface for a system developed as part of the NLPBA/BioNLP 2004 Shared Task challenge. The details of that system are described in the paper below (Settles, 2004). At ABNER's core is a statistical machine... Platforms: Mac

License: Shareware Cost: $0.00 USD Size: 9.5 MB Download (37): ABNER Download

Added: October 10, 2013 | Visits: 353

TTC Term Suite This is the open-source, UIMA-based application developed in the European project TTC (Terminology Extraction, Translation Tools and Comparable Corpora). This project aims at leveraging machine translation, computer-assisted translation and multilingual content management tools by... Platforms: Mac

License: Freeware Size: 4.68 MB Download (38): TTC Term Suite Download

Added: June 24, 2013 | Visits: 388

Stanford Named Entity Recognizer Stanford NER (also known as CRFClassifier) is a Java implementation of a Named Entity Recognizer. Named Entity Recognition (NER) labels sequences of words in a text which are the names of things, such as person and company names, or gene and protein names. The software provides a general... Platforms: Mac

License: Shareware Cost: $0.00 USD Size: 59.59 MB Download (35): Stanford Named Entity Recognizer Download

Added: July 10, 2013 | Visits: 180

DeSR DeSR is a multilingual statistical dependency parser. It produces dependency parse trees for natural language sentences using a parsing model learned from annotated corpora. Platforms: *nix

License: Freeware Size: 3.26 MB Download (36): DeSR Download

Added: August 12, 2013 | Visits: 379

iracema Iracema is a Named Entity Recognition and Classification (NERC) library that aims to provide algorithms and commonly used functionality for both implementing and evaluating NERC systems. It is implemented in Java. Iracema features: * A Flexible architecture for implementing and evaluating NERC... Platforms: Mac

License: Freeware Size: 41.91 MB Download (41): iracema Download

Added: October 28, 2013 | Visits: 311

Knowtator Knowtator is a general-purpose text annotation tool that is integrated with the Protégé knowledge representation system. Knowtator facilitates the manual creation of training and evaluation corpora for a variety of biomedical language processing tasks. Building on the strengths of the... Platforms: Mac

License: Freeware Size: 1.45 MB Download (37): Knowtator Download

Added: October 25, 2013 | Visits: 486

Emdros for linux Emdros is an Open-Source text database engine for storage and retrieval of analyzed or annotated text. Emdros has a powerful query-language for asking relevant questions of the data. Emdros has wide applicability in fields that deal with analyzed or annotated text. Application domains include... Platforms: *nix

License: Freeware Size: 8.33 MB Download (47): Emdros for linux Download

Added: November 12, 2013 | Visits: 451

libleipzig libleipzig-python provides a wrapper to the web services provided by the Deutscher Wortschatz project of the University of Leipzig. Deutscher Wortschatz is a German database of text corpora and can be utilized to analyze and contextualize words in the thesaurus. libleipzig currently supports all... Platforms: *nix

License: Freeware Size: 10.24 KB Download (38): libleipzig Download

Added: October 03, 2013 | Visits: 339

CorpusSearch for Linux CorpusSearch is a tool that finds syntactic structures in a corpus of annotated sentence trees. It can be used as a research tool on a corpus, or as a development tool for building the corpus. CorpusSearch 2 is a Java program that supports research in corpus linguistics. It is useful both for... Platforms: *nix

License: Freeware Size: 2.92 MB Download (36): CorpusSearch for Linux Download