# Text::NSP::Measures::2D::MI::ps 1.03

Date Added: August 14, 2010

Text::NSP::Measures::2D::MI::ps is a Perl module that implements the Poisson-Stirling measure of association for bigrams.

## SYNOPSIS

### Basic Usage

    use Text::NSP::Measures::2D::MI::ps;

    my $npp = 60; my $n1p = 20; my $np1 = 20; my $n11 = 10;

    my $ps_value = calculateStatistic( n11 => $n11,
                                       n1p => $n1p,
                                       np1 => $np1,
                                       npp => $npp );

    if ( (my $errorCode = getErrorCode()) )
    {
        print STDERR $errorCode . " - " . getErrorMessage() . "\n";
    }
    else
    {
        print getStatisticName() . " value for bigram is " . $ps_value . "\n";
    }

The log-likelihood ratio measures the deviation between the observed data and what would be expected if `<word1>` and `<word2>` were independent. The higher the score, the less evidence there is in favor of concluding that the words are independent.

Assume that the frequency count data associated with a bigram `<word1><word2>` is given by a 2x2 contingency table:

              word2   ~word2
    word1      n11      n12  | n1p
    ~word1     n21      n22  | n2p
              ----------------
               np1      np2    npp

where n11 is the number of times `<word1><word2>` occur together, n12 is the number of times `<word1>` occurs with some word other than `<word2>`, and n1p is the total number of times `<word1>` occurs as the first word in a bigram.

The expected values for the internal cells are calculated by taking the product of their associated marginals and dividing by the sample size, for example:

    m11 = (np1 * n1p) / npp

The Poisson-Stirling measure is a negative logarithmic approximation of the Poisson-likelihood measure. It uses Stirling's formula to approximate the factorial in the Poisson-likelihood measure:

    Poisson-Stirling = n11 * ( log(n11) - log(m11) - 1 )

which is the same as

    Poisson-Stirling = n11 * ( log(n11/m11) - 1 )

## Methods

calculateStatistic() - This method calculates the Poisson-Stirling value.

INPUT PARAMS : `$count_values` .. Reference to a hash containing the count values computed by the count.pl program.

RETURN VALUES : `$poissonStirling` .. Poisson-Stirling value for this bigram.
getStatisticName() - Returns the name of this statistic.

INPUT PARAMS : none

RETURN VALUES : `$name` .. Name of the measure.
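
As a quick sanity check on the Poisson-Stirling formula, the following minimal sketch recomputes the score by hand for the counts used in the synopsis (n11 = 10, n1p = 20, np1 = 20, npp = 60). The helper `poisson_stirling` is a hypothetical name introduced only for illustration and is not part of Text::NSP. The sketch assumes natural logarithms (Perl's built-in `log`) and treats the score as 0 when n11 is 0, which may differ from how the module handles edge cases.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::NSP): evaluates
#   n11 * ( log(n11 / m11) - 1 ),  with  m11 = (np1 * n1p) / npp
# using Perl's built-in natural logarithm.
sub poisson_stirling {
    my ($n11, $n1p, $np1, $npp) = @_;

    die "marginals and sample size must be positive\n"
        if $n1p <= 0 || $np1 <= 0 || $npp <= 0;

    # Expected count of the bigram under independence.
    my $m11 = ($np1 * $n1p) / $npp;

    # Assumption: n11 * log(n11) is taken as 0 when n11 is 0.
    return 0 if $n11 == 0;

    return $n11 * ( log($n11 / $m11) - 1 );
}

# Counts from the synopsis: m11 = 20 * 20 / 60, roughly 6.6667.
my $score = poisson_stirling(10, 20, 20, 60);
printf "Poisson-Stirling = %.4f\n", $score;   # 10 * (log(1.5) - 1), about -5.9453
```

Here the observed count (10) exceeds the expected count (about 6.67) only modestly, so log(n11/m11) is less than 1 and the score comes out negative.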

