Download Shareware and Freeware Software for Windows, Linux, Macintosh, PDA

Mguesser 0.4

Company: Alexander Barkov
Date Added: August 04, 2013  |  Visits: 328



WHAT'S THIS?

mguesser is a standalone part of libmnogosearch (the core of the mnogo search engine) that guesses the character set and language of a text file.

mguesser is implemented using the "N-Gram-Based Text Categorization" technique, which is also used by TextCat, a language guesser written in Perl. mguesser is significantly faster than TextCat, especially on large texts.

This package consists of N-gram based algorithms written in C, together with a number of maps for texts in various languages and character sets. Look in the "maps" directory of this package to see the currently supported languages and character sets.

INSTALLATION

 * Download the source package
 * Unpack the distribution
 * Change directory to the unpacked distribution, then type "make".

By default, mguesser looks for language maps in the "maps" subdirectory of the current directory. You can change the default language map location in the Makefile by redefining the "-DLMDIR" value.

USAGE

mguesser reads plain text data from STDIN. Note that "almost text" formats such as HTML will give bad results. In later releases I'll possibly add a command line switch to tell mguesser that the input data is HTML. mguesser works well on texts of 500 bytes or more. Shorter texts are guessed less reliably.

To guess the language and character set of a text file, use:

 mguesser < text_file

mguesser prints how closely your file matches each language map, best match first. The reported values are between 0 and 1.

You can also display only a given number of the best results using the -n command line switch. For example, this command displays the 3 best results:

 mguesser -n3 < text_file

To make mguesser load language maps from a non-default directory, use:

 mguesser -d/path/to/maps/

To load language maps from multiple directories, use a colon separated list:

 mguesser -d/path/to/maps1/:/path/to/maps2/:/path/to/maps3/

To create a new language map, use:

 mguesser -p -c charset -l language < text_file

When executed with the -p command line parameter, mguesser creates a new language map built from text_file and prints it to STDOUT. Please note that to create a high quality language map, the source text file should be large enough. A 500 KB text is usually enough to produce a high quality map.

You can also include these files in your own applications. Look at the main() function in guesser.c to see the order in which the guesser functions are called.

TODO

 * Make it possible to guess formats other than plain text: HTML, XML
 * Implement various command line switches to choose the output format

Alexander Barkov
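
For readers unfamiliar with the "N-Gram-Based Text Categorization" technique named in the description above, here is a minimal Python sketch of the general idea: build a ranked list of the most frequent character n-grams for each language, then compare a document's ranked list against each one. This is not mguesser's code. The function names (ngram_profile, out_of_place, guess), the "_" word padding, the n-gram lengths 1 to 3 and the cutoff of 300 are all assumptions made for illustration. The distance here is an "out-of-place" rank distance where lower is better, which is different from the 0 to 1 values mguesser reports.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top=300):
    """Return the `top` most frequent character n-grams (lengths 1..max_n),
    most frequent first. Words are padded with '_' to mark their boundaries."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences between two profiles. An n-gram missing from
    lang_profile costs the maximum penalty, len(lang_profile)."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def guess(text, profiles):
    """Return (label, distance) pairs sorted by distance, best match first."""
    doc = ngram_profile(text)
    scored = ((label, out_of_place(doc, prof)) for label, prof in profiles.items())
    return sorted(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy training texts; real profiles need far more data.
    profiles = {
        "en": ngram_profile("the quick brown fox jumps over the lazy dog and the cat"),
        "de": ngram_profile("der schnelle braune fuchs springt über den faulen hund und die katze"),
    }
    for label, distance in guess("the dog and the fox", profiles):
        print(label, distance)
```

The profiles dictionary plays a role loosely similar to the language maps in the "maps" directory: one precomputed profile per language and character set. As the usage notes say, short inputs give unreliable guesses, and the toy training sentences above are far too small for real use.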

Requirements: No special requirements
Platforms: *nix, Linux
Keyword: Character Command Directory Display Guess Language Languages Large Maps Mguesser Number Package Quality Results Switch Texts
Users rating: 0/10

License: Freeware Size: 143.36 KB

Libraries  -  LCDML 1.2
The LCDML project (Liquid Crystal Display Markup Language) is a description language based on XML and used to describe the text that should be displayed on an LCD. It supports both static and dynamic text messages and bar charts and allows to...
6.14 KB  
Text Management  -  Guess language of text using ZIP 1.2
This script is a small tool for language and author identification for text files.
Utilities  -  LargeFileViewer 0.61
Software to display the content of large text files. LargeFileViewer is free, easy-to-use software that can display the content of very large text files.
10.24 KB  
Desktop Toys  -  unicode-screensaver 0.2
unicode-screensaver is a simple screensaver application that repeatedly picks a Unicode character and displays it in a very large font size together with its Unicode code point and the character name.
491.52 KB  
Games  -  Quiz for 007 : Detective Agent Bond Character Name Movie Edition Guess Game 1.2
Bond Guess Game is the most fun game for those who love James Bond. Lots of pictures in more than 300 levels are waiting for you to guess. Let's see if you can prove it! - Compete with players from all over the world!...
17.3 MB  
Networking  -  LJ Longtail SEO 1.8
LJ Longtail SEO is a tool that detects search engine visits and uses this information to display a list of links based on second page search results. The results in the database are aged off based on customizable settings so that once your longtail...
10 KB  
Programming  -  Yazoo 1.3.1
"Yazoo" is a command-line, interpreted scripting language that provides a ready-made environment for C or C++ functions. A user embeds his own routines into the language by referencing them in one of Yazoo's own source files, then recompiling...
81.92 KB  
Games  -  Guess The Game - quiz 1.25
Guess The Game: guess games from screenshots! Do you think you know more games than anyone else? Do you argue with your friends about who's the best games expert? Guess games from screenshots, win achievements and share your results with friends...
11.5 MB  
Utilities  -  Label&Mark 1.0
A search tool and viewer for G-BookMarks. * Filter labels and titles by a search string and display them. * Choose a bookmark from the search results and save it to a file. * Choose BookMark to use with a special screen and...
102.4 KB  
Input Device Utilities  -  QuickiHash 1.00
QuickiHash is a lightweight application that has been designed to load quickly and display hashes for files in a number of different formats. Files can be dropped into the main window, on the application icon, specified via the commandline,...
45.73 KB  
Shell & Desktop  -  Glunarclock 0.32.4
GNOME Lunar Clock Applet displays the current phase of the Moon as an applet for the gnome panel. In the properties box you can choose between a real image Features Pointing with the mouse at the applet...
522.24 KB  
Shell & Desktop  -  Fekete 5
Icon theme for Linux, for all desktops and Linux distros. Special additions: SUSE's YaST icons, Xfce system icons, archaic mimetype icons, Mandriva "special placed" status icons, and LibreOffice icons.
71.59 MB  
Shell & Desktop  -  XFast 0.9
XFast is a slim, lightweight desktop environment that incorporates X and a window manager within the same project.
1.15 MB  
Shell & Desktop  -  print selection konqueror service menu 0.1
This service menu gives you a *silly* way to quickly print your selection in Konqueror. Usage: select the text, copy the text, right-click on the web page, select "print selection"; a kdialog will appear; paste the text.
10.24 KB  
Shell & Desktop  -  Faenza 1.2
Faenza icon theme is available to install for Ubuntu users via a PPA repository. View the README file for instructions and a list of known issues.
23.49 MB  
Text Editors  -  DocBook Doclet 6.0.3
DocBook Doclet (dbdoclet) creates DocBook XML and class diagrams from Javadoc comments, converts HTML to DocBook, and transforms DocBook XML into various output formats. It consists of a complete DocBook distribution containing schemas and the...
57.64 MB  
Text Editors  -  text-hr 0.17
text-hr is a morphological/inflection engine for the Croatian language, written in Python. It includes stopwords and a part-of-speech (POS) tagging engine based on an inverse inflection algorithm for detection. Since API...
112.64 KB  
Text Editors  -  SeaScope 0.4
A pyQt GUI front-end for cscope. Written in python using pyQt, QScintilla libraries. Features: * Search features o cscope search features o Call tree for functions o Call tree for symbols ...
10.24 KB  
Text Editors  -  Val(a)IDE 0.7.1
Val(a)IDE is an IDE (Integrated Development Environment) application for the Vala programming language. Here are some key features of "Val(a)IDE": * Syntax highlighting for Vala * Project compilation
1.52 MB  
Text Editors  -  greyd 1.0
greyd is a transparent greylist proxy for rejecting spam sent by spambot armies. The first generation of code, which has been running in production for about 3 months, has greatly reduced the amount of spam that needs to be processed...
10.24 KB