edu.stanford.nlp.util
Class Counters

java.lang.Object
  extended by edu.stanford.nlp.util.Counters

public class Counters
extends java.lang.Object

Static methods for operating on Counters.


Constructor Summary
Counters()
           
 
Method Summary
static Counter average(GenericCounter c1, GenericCounter c2)
          Returns a new Counter with counts averaged from the two given Counters.
static double cosine(GenericCounter c1, GenericCounter c2)
           
static Counter createCounterFromList(java.util.List l)
           
static double crossEntropy(GenericCounter from, Distribution to, Counter justification)
          Note that this implementation doesn't normalize the "from" Counter.
static double crossEntropy(GenericCounter from, GenericCounter to)
          Note that this implementation doesn't normalize the "from" Counter.
static Counter division(GenericCounter c1, GenericCounter c2)
          Returns c1 divided by c2.
static double entropy(GenericCounter c)
          Calculates the entropy of the given counter (in bits).
static Counter getCountCounts(GenericCounter c)
           
static void incrementNonzero(Counter c1, Counter c2)
          Increments counts on all those keys in c1 for which c2 has a nonzero count (i.e., keys that are in c2's key set).
static double informationRadius(GenericCounter c1, GenericCounter c2)
          Calculates the information radius (aka the Jensen-Shannon divergence) between the two Counters.
static double jensenShannonDivergence(GenericCounter c1, GenericCounter c2)
          Calculates the Jensen-Shannon divergence between the two counters.
static double klDivergence(GenericCounter from, GenericCounter to)
          Calculates the KL divergence between the two counters.
static Counter linearCombination(GenericCounter c1, double w1, GenericCounter c2, double w2)
          Returns a Counter which is a weighted average of c1 and c2.
static Counter loadCounter(java.lang.String filename, java.lang.Class c)
          Loads a Counter from a text file.
static IntCounter loadIntCounter(java.lang.String filename, java.lang.Class c)
          Loads an IntCounter from a text file.
static Counter perturbCounts(GenericCounter c, java.util.Random random, double stdev, boolean allowNegative)
           
static void printCounterComparison(GenericCounter a, GenericCounter b)
          Prints a comparison of the two counters. Useful for debugging.
static void printCounterComparison(GenericCounter a, GenericCounter b, java.io.PrintStream out)
          Prints a comparison of the two counters to the given PrintStream. Useful for debugging.
static void printCounterSortedByKeys(GenericCounter c)
           
static Counter product(GenericCounter c1, GenericCounter c2)
          Returns the product of c1 and c2.
static void saveCounter(GenericCounter c, java.lang.String filename)
          Saves a Counter to a text file.
static Counter scale(GenericCounter c, double s)
          Scales each element in the Counter by the given scale factor.
static double skewDivergence(GenericCounter c1, GenericCounter c2, double skew)
          Calculates the skew divergence between the two counters.
static PriorityQueue toPriorityQueue(GenericCounter c)
           
static java.util.List toSortedList(GenericCounter c)
           
static Counter union(GenericCounter c1, GenericCounter c2)
          Returns a Counter that is the union of the two Counters passed in (counts are added).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Counters

public Counters()

Method Detail

union

public static Counter union(GenericCounter c1,
                            GenericCounter c2)
Returns a Counter that is the union of the two Counters passed in (counts are added).

Parameters:
c1 -
c2 -
Returns:
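
The "counts are added" behavior can be illustrated with a minimal sketch. It is not the library's implementation: a plain Map<String, Double> stands in for GenericCounter and Counter, and the class name UnionSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class UnionSketch {

    // Returns a new map holding every key from either input; counts for
    // keys present in both inputs are added. The inputs are not modified.
    static Map<String, Double> union(Map<String, Double> c1, Map<String, Double> c2) {
        Map<String, Double> result = new HashMap<>(c1);
        for (Map.Entry<String, Double> e : c2.entrySet()) {
            result.merge(e.getKey(), e.getValue(), Double::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> a = new HashMap<>();
        a.put("cat", 2.0);
        a.put("dog", 1.0);
        Map<String, Double> b = new HashMap<>();
        b.put("dog", 3.0);
        b.put("fish", 1.0);
        // "dog" appears in both inputs, so its counts are summed.
        System.out.println(union(a, b));
    }
}
```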

product

public static Counter product(GenericCounter c1,
                              GenericCounter c2)
Returns the product of c1 and c2.

Parameters:
c1 -
c2 -
Returns:

division

public static Counter division(GenericCounter c1,
                               GenericCounter c2)
Returns c1 divided by c2. Note that this can produce NaN if c1 has nonzero counts for keys for which c2 has a zero count.

Parameters:
c1 -
c2 -
Returns:

entropy

public static double entropy(GenericCounter c)
Calculates the entropy of the given counter (in bits). This method internally uses normalized counts (so they sum to one), but the value returned is meaningless if some of the counts are negative.

Returns:
The entropy of the given counter (in bits)
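
As an illustration of "normalized counts" and "in bits", here is a minimal sketch of the computation. It assumes a Map<String, Double> in place of GenericCounter and nonnegative counts; the class name EntropySketch is hypothetical and this is not the library's code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class EntropySketch {

    // Entropy in bits of the distribution obtained by dividing each count
    // by the total. Assumes all counts are nonnegative and the total is > 0.
    static double entropy(Map<String, Double> counts) {
        double total = 0.0;
        for (double v : counts.values()) {
            total += v;
        }
        double bits = 0.0;
        for (double v : counts.values()) {
            if (v == 0.0) {
                continue; // the term 0 * log 0 is taken to be 0
            }
            double p = v / total;
            bits -= p * (Math.log(p) / Math.log(2.0)); // log base 2
        }
        return bits;
    }

    public static void main(String[] args) {
        Map<String, Double> coin = new HashMap<>();
        coin.put("heads", 5.0);
        coin.put("tails", 5.0);
        // A fair coin carries about one bit of entropy, whatever the raw counts.
        System.out.println(entropy(coin));
    }
}
```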

crossEntropy

public static double crossEntropy(GenericCounter from,
                                  GenericCounter to)
Note that this implementation doesn't normalize the "from" Counter. It does, however, normalize the "to" Counter. Result is meaningless if any of the counts are negative.

Returns:

crossEntropy

public static double crossEntropy(GenericCounter from,
                                  Distribution to,
                                  Counter justification)
Note that this implementation doesn't normalize the "from" Counter. Result is meaningless if any of the counts are negative.

Returns:

klDivergence

public static double klDivergence(GenericCounter from,
                                  GenericCounter to)
Calculates the KL divergence between the two counters. That is, it calculates KL(from || to). This method internally uses normalized counts (so they sum to one), but the value returned is meaningless if any of the counts are negative. In other words, it measures how well from can be represented by to. If some key in from has zero probability in to, the method returns positive infinity.

Parameters:
from -
to -
Returns:
The KL divergence between the distributions
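
The following sketch illustrates KL(from || to), including the positive-infinity case. It makes several assumptions: a Map<String, Double> stands in for GenericCounter, counts are nonnegative, and logarithms are base 2 (the documentation above does not state the base). The class name KlDivergenceSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class KlDivergenceSketch {

    private static double sum(Map<String, Double> c) {
        double total = 0.0;
        for (double v : c.values()) {
            total += v;
        }
        return total;
    }

    // KL(from || to) on normalized counts. Base-2 logs are an assumption.
    static double klDivergence(Map<String, Double> from, Map<String, Double> to) {
        double fromTotal = sum(from);
        double toTotal = sum(to);
        double result = 0.0;
        for (Map.Entry<String, Double> e : from.entrySet()) {
            double p = e.getValue() / fromTotal;
            if (p == 0.0) {
                continue; // a zero-probability key contributes nothing
            }
            double q = to.getOrDefault(e.getKey(), 0.0) / toTotal;
            if (q == 0.0) {
                // "from" puts mass on a key that "to" rules out
                return Double.POSITIVE_INFINITY;
            }
            result += p * (Math.log(p / q) / Math.log(2.0));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> narrow = new HashMap<>();
        narrow.put("a", 1.0);
        Map<String, Double> wide = new HashMap<>();
        wide.put("a", 1.0);
        wide.put("b", 1.0);
        System.out.println(klDivergence(narrow, wide)); // finite
        System.out.println(klDivergence(wide, narrow)); // infinite: "b" has zero mass in narrow
    }
}
```

The two calls in main show that the measure is not symmetric: swapping the arguments changes the result.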

jensenShannonDivergence

public static double jensenShannonDivergence(GenericCounter c1,
                                             GenericCounter c2)
Calculates the Jensen-Shannon divergence between the two counters. That is, it calculates 1/2 [KL(c1 || avg(c1,c2)) + KL(c2 || avg(c1,c2))] .

Parameters:
c1 -
c2 -
Returns:
The Jensen-Shannon divergence between the distributions
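
A minimal sketch of the formula 1/2 [KL(c1 || avg) + KL(c2 || avg)] follows. To keep it short it assumes both inputs are already normalized distributions (values sum to one), uses a Map<String, Double> in place of GenericCounter, and takes logarithms base 2. These are all assumptions, and the class name JensenShannonSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: maps hold probabilities that already sum to one.
final class JensenShannonSketch {

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // 1/2 [KL(p || m) + KL(q || m)] with m = (p + q) / 2, computed per key.
    static double jensenShannon(Map<String, Double> p, Map<String, Double> q) {
        Set<String> keys = new HashSet<>(p.keySet());
        keys.addAll(q.keySet());
        double result = 0.0;
        for (String key : keys) {
            double pk = p.getOrDefault(key, 0.0);
            double qk = q.getOrDefault(key, 0.0);
            double mk = 0.5 * (pk + qk); // positive whenever pk or qk is positive
            if (pk > 0.0) {
                result += 0.5 * pk * log2(pk / mk);
            }
            if (qk > 0.0) {
                result += 0.5 * qk * log2(qk / mk);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> p = new HashMap<>();
        p.put("a", 1.0);
        Map<String, Double> q = new HashMap<>();
        q.put("b", 1.0);
        // Disjoint distributions: both KL terms stay finite because the
        // average assigns positive mass to every key.
        System.out.println(jensenShannon(p, q));
    }
}
```

Unlike the plain KL divergence, this quantity is symmetric in its arguments and is finite even when the two inputs share no keys.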

skewDivergence

public static double skewDivergence(GenericCounter c1,
                                    GenericCounter c2,
                                    double skew)
Calculates the skew divergence between the two counters. That is, it calculates KL(c1 || (c2*skew + c1*(1-skew))). In other words, it measures how well c1 can be represented by a "smoothed" c2.

Parameters:
c1 -
c2 -
skew -
Returns:
The skew divergence between the distributions
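
The effect of the skew parameter can be seen in a small sketch of KL(c1 || (c2*skew + c1*(1-skew))). It assumes both inputs are already normalized distributions, uses a Map<String, Double> in place of GenericCounter, and takes logarithms base 2. These are assumptions, and the class name SkewDivergenceSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: maps hold probabilities that already sum to one.
final class SkewDivergenceSketch {

    // KL(p || skew * q + (1 - skew) * p), accumulated key by key.
    static double skewDivergence(Map<String, Double> p, Map<String, Double> q, double skew) {
        double result = 0.0;
        for (Map.Entry<String, Double> e : p.entrySet()) {
            double pk = e.getValue();
            if (pk == 0.0) {
                continue; // zero-probability keys contribute nothing
            }
            double qk = q.getOrDefault(e.getKey(), 0.0);
            double mixed = skew * qk + (1.0 - skew) * pk;
            if (mixed == 0.0) {
                return Double.POSITIVE_INFINITY; // only possible when skew == 1
            }
            result += pk * (Math.log(pk / mixed) / Math.log(2.0));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> p = new HashMap<>();
        p.put("a", 1.0);
        Map<String, Double> q = new HashMap<>();
        q.put("b", 1.0);
        System.out.println(skewDivergence(p, q, 0.5)); // finite, thanks to the mixing
        System.out.println(skewDivergence(p, q, 1.0)); // plain KL(p || q): infinite here
    }
}
```

With skew below 1 the mixture always gives positive mass to every key of c1, so the result stays finite even where the plain KL divergence would be infinite.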

informationRadius

public static double informationRadius(GenericCounter c1,
                                       GenericCounter c2)
Calculates the information radius (aka the Jensen-Shannon divergence) between the two Counters. This measure is defined as:
iRad(p,q) = D(p||(p+q)/2) + D(q||(p+q)/2)
where p is one Counter, q is the other Counter, and D(p||q) is the KL divergence between p and q. Note that iRad(p,q) = iRad(q,p).

Returns:
The information radius between the distributions

cosine

public static double cosine(GenericCounter c1,
                            GenericCounter c2)

average

public static Counter average(GenericCounter c1,
                              GenericCounter c2)
Returns a new Counter with counts averaged from the two given Counters. The average Counter will contain the union of keys in both source Counters, and each count will be the average of the two source counts for that key, where as usual a missing count in one Counter is treated as count 0.

Returns:
A new Counter whose counts are the mean of the respective counts in the given Counters.

linearCombination

public static Counter linearCombination(GenericCounter c1,
                                        double w1,
                                        GenericCounter c2,
                                        double w2)
Returns a Counter which is a weighted average of c1 and c2. Counts from c1 are weighted with weight w1 and counts from c2 are weighted with w2.
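
The weighting can be sketched as follows, with a Map<String, Double> standing in for GenericCounter and Counter; the class name LinearCombinationSketch is hypothetical and this is not the library's implementation. Under the description of average given earlier (missing counts treated as 0), averaging is the special case w1 = w2 = 0.5, which the sketch also shows.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class LinearCombinationSketch {

    // result[key] = w1 * c1[key] + w2 * c2[key], with missing counts taken as 0.
    static Map<String, Double> linearCombination(Map<String, Double> c1, double w1,
                                                 Map<String, Double> c2, double w2) {
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Double> e : c1.entrySet()) {
            result.merge(e.getKey(), w1 * e.getValue(), Double::sum);
        }
        for (Map.Entry<String, Double> e : c2.entrySet()) {
            result.merge(e.getKey(), w2 * e.getValue(), Double::sum);
        }
        return result;
    }

    // The per-key mean is the linear combination with both weights at 0.5.
    static Map<String, Double> average(Map<String, Double> c1, Map<String, Double> c2) {
        return linearCombination(c1, 0.5, c2, 0.5);
    }

    public static void main(String[] args) {
        Map<String, Double> c1 = new HashMap<>();
        c1.put("a", 2.0);
        c1.put("b", 4.0);
        Map<String, Double> c2 = new HashMap<>();
        c2.put("b", 2.0);
        c2.put("c", 6.0);
        System.out.println(average(c1, c2));
        System.out.println(linearCombination(c1, 2.0, c2, 3.0));
    }
}
```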


perturbCounts

public static Counter perturbCounts(GenericCounter c,
                                    java.util.Random random,
                                    double stdev,
                                    boolean allowNegative)

createCounterFromList

public static Counter createCounterFromList(java.util.List l)

toSortedList

public static java.util.List toSortedList(GenericCounter c)

toPriorityQueue

public static PriorityQueue toPriorityQueue(GenericCounter c)

printCounterComparison

public static void printCounterComparison(GenericCounter a,
                                          GenericCounter b)
Prints a comparison of the two counters. Useful for debugging.

Parameters:
a -
b -

printCounterComparison

public static void printCounterComparison(GenericCounter a,
                                          GenericCounter b,
                                          java.io.PrintStream out)
Prints a comparison of the two counters to the given PrintStream. Useful for debugging.

Parameters:
a -
b -

getCountCounts

public static Counter getCountCounts(GenericCounter c)

scale

public static Counter scale(GenericCounter c,
                            double s)
Scales each element in the Counter by the given scale factor.


printCounterSortedByKeys

public static void printCounterSortedByKeys(GenericCounter c)

loadCounter

public static Counter loadCounter(java.lang.String filename,
                                  java.lang.Class c)
                           throws java.lang.Exception
Loads a Counter from a text file. File must have the format of one key/count pair per line, separated by whitespace.

Parameters:
filename - the path to the file to load the Counter from
c - the Class used to instantiate each key. Must have a constructor that takes a single String.
Returns:
Throws:
java.lang.Exception
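
The file format and the String-constructor requirement can be illustrated with a small sketch. It reads from a Reader rather than a filename so that it runs without touching the file system, returns a Map<K, Double> instead of a Counter, and parses counts with Double.parseDouble. All of these are assumptions made for illustration, and the class name LoadCounterSketch is hypothetical.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.lang.reflect.Constructor;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for loading a Counter: one "key count" pair per line.
final class LoadCounterSketch {

    // Builds each key by calling keyClass's single-String constructor via reflection.
    static <K> Map<K, Double> load(Reader in, Class<K> keyClass) throws Exception {
        Constructor<K> ctor = keyClass.getConstructor(String.class);
        Map<K, Double> counts = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines (a choice made in this sketch)
                }
                String[] parts = line.split("\\s+"); // key and count, whitespace-separated
                counts.put(ctor.newInstance(parts[0]), Double.parseDouble(parts[1]));
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String text = "cat 2\ndog\t3.5\n";
        System.out.println(load(new StringReader(text), String.class));
    }
}
```

Because the key is split on whitespace, this format cannot represent keys whose string form itself contains whitespace.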

loadIntCounter

public static IntCounter loadIntCounter(java.lang.String filename,
                                        java.lang.Class c)
                                 throws java.lang.Exception
Loads an IntCounter from a text file. File must have the format of one key/count pair per line, separated by whitespace.

Parameters:
filename - the path to the file to load the Counter from
c - the Class used to instantiate each key. Must have a constructor that takes a single String.
Returns:
Throws:
java.lang.Exception

saveCounter

public static void saveCounter(GenericCounter c,
                               java.lang.String filename)
                        throws java.io.IOException
Saves a Counter to a text file. Counter written as one key/count pair per line, separated by whitespace.

Parameters:
c -
filename -
Throws:
java.io.IOException

incrementNonzero

public static void incrementNonzero(Counter c1,
                                    Counter c2)
Increments counts on all those keys in c1 for which c2 has a nonzero count (i.e., keys that are in c2's key set).