edu.stanford.nlp.util
Class StringUtils

java.lang.Object
  extended by edu.stanford.nlp.util.StringUtils

public class StringUtils
extends java.lang.Object

StringUtils is a collection of static utility methods for working with Strings.


Method Summary
static java.util.Map<java.lang.String,java.lang.String[]> argsToMap(java.lang.String[] args)
          Parses command line arguments into a Map.
static java.util.Map<java.lang.String,java.lang.String[]> argsToMap(java.lang.String[] args, java.util.Map<java.lang.String,java.lang.Integer> flagsToNumArgs)
          Parses command line arguments into a Map.
static java.util.Properties argsToProperties(java.lang.String[] args)
           
static java.util.Properties argsToProperties(java.lang.String[] args, java.util.Map flagsToNumArgs)
          Analogous to argsToMap(java.lang.String[]).
static java.lang.String escapeString(java.lang.String s, char[] charsToEscape, char escapeChar)
           
static java.lang.String escapeStringForXML(java.lang.String s)
           
static java.lang.String escapeTextAroundXMLTags(java.lang.String s)
           
static java.lang.String exactlyN(java.lang.Object obj, int totalChars)
          Pad or trim the toString value of the given Object.
static java.lang.String exactlyN(java.lang.String inStr, int num)
          Pad or trim so as to produce a string of exactly a certain length.
static java.lang.String fileNameClean(java.lang.String s)
          Returns a "clean" version of the given filename in which spaces have been converted to dashes and all non-alphanumeric chars are underscores.
static boolean find(java.lang.String str, java.lang.String regex)
          Say whether this regular expression can be found inside this String.
static java.lang.String join(java.lang.Iterable l, java.lang.String glue)
          Joins each elem in the Collection with the given glue.
static java.lang.String join(java.util.List l)
          Joins elems with a space.
static java.lang.String join(java.util.List l, java.lang.String glue)
          Joins each elem in the List with the given glue.
static java.lang.String join(java.lang.Object[] elements)
          Joins elems with a space.
static java.lang.String join(java.lang.Object[] elements, java.lang.String glue)
          Joins each elem in the array with the given glue.
static java.lang.String leftPad(double d, int totalChars)
           
static java.lang.String leftPad(int i, int totalChars)
           
static java.lang.String leftPad(java.lang.Object obj, int totalChars)
           
static java.lang.String leftPad(java.lang.String str, int totalChars)
          Pads the given String to the left with spaces to ensure that it's at least totalChars long.
static boolean lookingAt(java.lang.String str, java.lang.String regex)
          Say whether this regular expression can be found at the beginning of this String.
static void main(java.lang.String[] args)
           
static boolean matches(java.lang.String str, java.lang.String regex)
          Say whether this regular expression matches this String.
static int nthIndex(java.lang.String s, char ch, int n)
          Returns the index of the nth occurrence of ch in s, or -1 if there are fewer than n occurrences of ch.
static java.lang.String pad(java.lang.Object obj, int totalChars)
          Pads the toString value of the given Object.
static java.lang.String pad(java.lang.String str, int totalChars)
          Returns a String at least totalChars characters long by padding the input String str with spaces.
static java.util.Map parseCommandLineArguments(java.lang.String[] args)
          A simpler form of command line argument parsing.
static void printStringOneCharPerLine(java.lang.String s)
           
static void printToFile(java.io.File file, java.lang.String message)
          Prints to a file.
static void printToFile(java.io.File file, java.lang.String message, boolean append)
          Prints to a file.
static void printToFile(java.lang.String filename, java.lang.String message)
          Prints to a file.
static void printToFile(java.lang.String filename, java.lang.String message, boolean append)
          Prints to a file.
static java.lang.String slurpFile(java.io.File file)
          Returns all the text in the given File.
static java.lang.String slurpFile(java.lang.String filename)
          Returns all the text in the given File.
static java.lang.String slurpFileNoExceptions(java.io.File file)
          Returns all the text in the given File.
static java.lang.String slurpFileNoExceptions(java.lang.String filename)
          Returns all the text in the given File.
static java.lang.String slurpURL(java.net.URL u)
          Returns all the text at the given URL.
static java.lang.String slurpURLNoExceptions(java.net.URL u)
          Returns all the text at the given URL.
static java.util.List split(java.lang.String s)
          Splits on whitespace (\s+).
static java.util.List split(java.lang.String str, java.lang.String regex)
          Splits the given string using the given regex as delimiters.
static java.lang.String[] splitOnCharWithQuoting(java.lang.String s, char splitChar, char quoteChar, char escapeChar)
          This function splits the String s into multiple Strings using the splitChar.
static java.util.Properties stringToProperties(java.lang.String str)
          This method converts a comma-separated String (with whitespace optionally allowed after the comma) representing properties to a Properties object.
static java.lang.String trim(java.lang.Object obj, int maxWidth)
           
static java.lang.String trim(java.lang.String s, int maxWidth)
          Returns s if it's at most maxWidth chars, otherwise chops right side to fit.
static java.lang.String truncate(int n, int smallestDigit, int biggestDigit)
          This returns a string from decimal digit smallestDigit to decimal digit biggestDigit.
static java.lang.String unescapeStringForXML(java.lang.String s)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

find

public static boolean find(java.lang.String str,
                           java.lang.String regex)
Say whether this regular expression can be found inside this String. This method provides one of the two "missing" convenience methods for regular expressions in the String class in JDK 1.4. It is the one to use if you are used to Perl-style matching, where a pattern may match anywhere in the string.

Parameters:
str - String to search for match in
regex - String to compile as the regular expression
Returns:
Whether the regex can be found in str

lookingAt

public static boolean lookingAt(java.lang.String str,
                                java.lang.String regex)
Say whether this regular expression can be found at the beginning of this String. This method provides one of the two "missing" convenience methods for regular expressions in the String class in JDK1.4.

Parameters:
str - String to search for match at start of
regex - String to compile as the regular expression
Returns:
Whether the regex can be found at the start of str

matches

public static boolean matches(java.lang.String str,
                              java.lang.String regex)
Say whether this regular expression matches this String. This method is the same as the String.matches() method, and is included just to give a call that is parallel to the other static regex methods in this class.

Parameters:
str - String to match against the regular expression
regex - String to compile as the regular expression
Returns:
Whether the regex matches the whole of this str
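The difference between find, lookingAt, and matches is easiest to see side by side. The following is a minimal sketch written directly against java.util.regex; the class name RegexModesSketch and its method bodies are illustrative assumptions, not the StringUtils source.

```java
import java.util.regex.Pattern;

// Hypothetical stand-ins that mirror the three documented regex methods.
class RegexModesSketch {
    // True if the regex matches anywhere inside str.
    static boolean find(String str, String regex) {
        return Pattern.compile(regex).matcher(str).find();
    }

    // True if the regex matches a prefix of str.
    static boolean lookingAt(String str, String regex) {
        return Pattern.compile(regex).matcher(str).lookingAt();
    }

    // True only if the regex matches the whole of str.
    static boolean matches(String str, String regex) {
        return Pattern.compile(regex).matcher(str).matches();
    }

    public static void main(String[] args) {
        String s = "StringUtils";
        System.out.println(find(s, "Util"));      // true: found in the middle
        System.out.println(lookingAt(s, "Util")); // false: not at the start
        System.out.println(lookingAt(s, "Str"));  // true: matches a prefix
        System.out.println(matches(s, "Str"));    // false: does not cover the whole string
        System.out.println(matches(s, "Str.*"));  // true: covers the whole string
    }
}
```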

slurpFile

public static java.lang.String slurpFile(java.io.File file)
                                  throws java.io.IOException
Returns all the text in the given File.

Throws:
java.io.IOException

slurpFile

public static java.lang.String slurpFile(java.lang.String filename)
                                  throws java.io.IOException
Returns all the text in the given File.

Returns:
The text in the file. May be an empty string if the file is empty. If the file cannot be read (non-existent, etc.), then and only then the method returns null.
Throws:
java.io.IOException

slurpFileNoExceptions

public static java.lang.String slurpFileNoExceptions(java.io.File file)
Returns all the text in the given File.


slurpFileNoExceptions

public static java.lang.String slurpFileNoExceptions(java.lang.String filename)
Returns all the text in the given File.

Returns:
The text in the file. May be an empty string if the file is empty. If the file cannot be read (non-existent, etc.), then and only then the method returns null.

slurpURL

public static java.lang.String slurpURL(java.net.URL u)
                                 throws java.io.IOException
Returns all the text at the given URL.

Throws:
java.io.IOException

slurpURLNoExceptions

public static java.lang.String slurpURLNoExceptions(java.net.URL u)
Returns all the text at the given URL.


join

public static java.lang.String join(java.lang.Iterable l,
                                    java.lang.String glue)
Joins each elem in the Collection with the given glue. For example, given a list of Integers, you can create a comma-separated list by calling join(numbers, ", ").


join

public static java.lang.String join(java.util.List l,
                                    java.lang.String glue)
Joins each elem in the List with the given glue. For example, given a list of Integers, you can create a comma-separated list by calling join(numbers, ", ").


join

public static java.lang.String join(java.lang.Object[] elements,
                                    java.lang.String glue)
Joins each elem in the array with the given glue. For example, given an array of Integers, you can create a comma-separated list by calling join(numbers, ", ").


join

public static java.lang.String join(java.util.List l)
Joins elems with a space.


join

public static java.lang.String join(java.lang.Object[] elements)
Joins elems with a space.
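As a rough illustration of the join family, the sketch below places the glue between elements (not after the last one) and uses a single space for the one-argument form. JoinSketch is a hypothetical class, and calling toString on each element is an assumption about how elements are rendered.

```java
import java.util.Arrays;

// Hypothetical sketch of the documented join behavior.
class JoinSketch {
    // Joins the string form of each element, inserting glue between neighbors.
    static String join(Iterable<?> items, String glue) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Object item : items) {
            if (!first) {
                sb.append(glue);
            }
            sb.append(item);
            first = false;
        }
        return sb.toString();
    }

    // One-argument form: joins array elements with a single space.
    static String join(Object[] elements) {
        return join(Arrays.asList(elements), " ");
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList(1, 2, 3), ", ")); // 1, 2, 3
        System.out.println(join(new Object[] {"a", "b", "c"})); // a b c
    }
}
```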


split

public static java.util.List split(java.lang.String s)
Splits on whitespace (\s+).


split

public static java.util.List split(java.lang.String str,
                                   java.lang.String regex)
Splits the given string using the given regex as delimiters. This method is the same as the String.split() method (except that it returns the results in a List), and is included just to give a call that is parallel to the other static regex methods in this class.

Parameters:
str - String to split up
regex - String to compile as the regular expression
Returns:
List of Strings resulting from splitting on the regex
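Since split is described as String.split() with a List result, a minimal sketch is a thin wrapper. SplitSketch is a hypothetical class; note that the regex \s+ has to be written as "\\s+" in Java source.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: String.split() with the result wrapped in a List.
class SplitSketch {
    static List<String> split(String str, String regex) {
        return Arrays.asList(str.split(regex));
    }

    // One-argument form: split on runs of whitespace.
    static List<String> split(String s) {
        return split(s, "\\s+");
    }

    public static void main(String[] args) {
        System.out.println(split("a  b\tc"));      // [a, b, c]
        System.out.println(split("1,2,3", ","));   // [1, 2, 3]
    }
}
```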

pad

public static java.lang.String pad(java.lang.String str,
                                   int totalChars)
Returns a String at least totalChars characters long by padding the input String str with spaces. If str is already longer than totalChars, it is returned unchanged.


pad

public static java.lang.String pad(java.lang.Object obj,
                                   int totalChars)
Pads the toString value of the given Object.


exactlyN

public static java.lang.String exactlyN(java.lang.String inStr,
                                        int num)
Pad or trim so as to produce a string of exactly a certain length.

Parameters:
inStr - The String to be padded or truncated
num - The desired length

exactlyN

public static java.lang.String exactlyN(java.lang.Object obj,
                                        int totalChars)
Pad or trim the toString value of the given Object.
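The contrast between pad and exactlyN can be sketched as follows. The documentation does not say which side is padded or trimmed, so this sketch assumes spaces are appended on the right and excess characters are cut from the right; PadSketch is a hypothetical class.

```java
// Hypothetical sketch of pad and exactlyN; the right-hand side is an assumption.
class PadSketch {
    // Appends spaces until the result is at least totalChars long.
    static String pad(String str, int totalChars) {
        StringBuilder sb = new StringBuilder(str);
        while (sb.length() < totalChars) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // Pads short input and truncates long input so the result has length num.
    static String exactlyN(String inStr, int num) {
        if (inStr.length() >= num) {
            return inStr.substring(0, num);
        }
        return pad(inStr, num);
    }

    public static void main(String[] args) {
        System.out.println("[" + pad("ab", 5) + "]");          // [ab   ]
        System.out.println("[" + pad("abcdef", 3) + "]");      // [abcdef]
        System.out.println("[" + exactlyN("abcdef", 3) + "]"); // [abc]
        System.out.println("[" + exactlyN("ab", 4) + "]");     // [ab  ]
    }
}
```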


leftPad

public static java.lang.String leftPad(java.lang.String str,
                                       int totalChars)
Pads the given String to the left with spaces to ensure that it's at least totalChars long.
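Left padding is typically used to right-align values in a column. The sketch below is an illustration only: LeftPadSketch is a hypothetical class, and the int overload is assumed to convert its argument with String.valueOf before padding, which the documentation does not state.

```java
// Hypothetical sketch of leftPad: spaces go in front of the value.
class LeftPadSketch {
    static String leftPad(String str, int totalChars) {
        StringBuilder sb = new StringBuilder();
        for (int i = str.length(); i < totalChars; i++) {
            sb.append(' ');
        }
        return sb.append(str).toString();
    }

    // Assumed behavior for the numeric overloads: convert, then pad.
    static String leftPad(int i, int totalChars) {
        return leftPad(String.valueOf(i), totalChars);
    }

    public static void main(String[] args) {
        System.out.println("[" + leftPad(42, 5) + "]");        // [   42]
        System.out.println("[" + leftPad("abcdef", 3) + "]");  // [abcdef]
    }
}
```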


leftPad

public static java.lang.String leftPad(java.lang.Object obj,
                                       int totalChars)

leftPad

public static java.lang.String leftPad(int i,
                                       int totalChars)

leftPad

public static java.lang.String leftPad(double d,
                                       int totalChars)

trim

public static java.lang.String trim(java.lang.String s,
                                    int maxWidth)
Returns s if it's at most maxWidth chars, otherwise chops right side to fit.


trim

public static java.lang.String trim(java.lang.Object obj,
                                    int maxWidth)

fileNameClean

public static java.lang.String fileNameClean(java.lang.String s)
Returns a "clean" version of the given filename in which spaces have been converted to dashes and all non-alphanumeric chars are underscores.


nthIndex

public static int nthIndex(java.lang.String s,
                           char ch,
                           int n)
Returns the index of the nth occurrence of ch in s, or -1 if there are fewer than n occurrences of ch.
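One way to read this contract is sketched below. It assumes n is 1-based (n = 1 means the first occurrence) and that n = 0 yields -1; neither detail is stated in the documentation, and NthIndexSketch is a hypothetical class.

```java
// Hypothetical sketch of nthIndex, assuming a 1-based occurrence count.
class NthIndexSketch {
    static int nthIndex(String s, char ch, int n) {
        int index = -1;
        for (int i = 0; i < n; i++) {
            // Resume the search just past the previous hit.
            index = s.indexOf(ch, index + 1);
            if (index == -1) {
                return -1; // fewer than n occurrences
            }
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(nthIndex("a.b.c", '.', 1)); // 1
        System.out.println(nthIndex("a.b.c", '.', 2)); // 3
        System.out.println(nthIndex("a.b.c", '.', 3)); // -1
    }
}
```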


truncate

public static java.lang.String truncate(int n,
                                        int smallestDigit,
                                        int biggestDigit)
This returns a string from decimal digit smallestDigit to decimal digit biggestDigit. Smallest digit is labeled 1, and the limits are inclusive.
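The digit-window idea can be sketched as below. This reading assumes digit 1 is the least significant (ones) digit, that n is non-negative, and that positions beyond the most significant digit come out as zeros; TruncateSketch is a hypothetical class, not the library code.

```java
// Hypothetical sketch of truncate; digit 1 is assumed to be the ones digit.
class TruncateSketch {
    static String truncate(int n, int smallestDigit, int biggestDigit) {
        int numDigits = biggestDigit - smallestDigit + 1;
        char[] result = new char[numDigits];
        // Drop the digits below the requested window.
        for (int j = 1; j < smallestDigit; j++) {
            n /= 10;
        }
        // Fill the window from its least significant position upward.
        for (int j = numDigits - 1; j >= 0; j--) {
            result[j] = Character.forDigit(n % 10, 10);
            n /= 10;
        }
        return new String(result);
    }

    public static void main(String[] args) {
        System.out.println(truncate(12345, 2, 4)); // 234
        System.out.println(truncate(12345, 1, 2)); // 45
        System.out.println(truncate(7, 1, 3));     // 007
    }
}
```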


argsToMap

public static java.util.Map<java.lang.String,java.lang.String[]> argsToMap(java.lang.String[] args)
Parses command line arguments into a Map. Arguments of the form

-flag1 arg1a arg1b ... arg1m -flag2 -flag3 arg3a ... arg3n

will be parsed so that each flag (including the hyphen) is a key in the Map and its value is a String[] containing its optional arguments (if present). Non-flag values that are not captured as flag arguments are collected into a String[] and stored in the Map under the null key. In this invocation, flags cannot take arguments, so every String[] value other than the one for the null key is zero-length.

Parameters:
args - the argument array to be parsed
Returns:
a Map of flag names to flag argument String[] arrays.

argsToMap

public static java.util.Map<java.lang.String,java.lang.String[]> argsToMap(java.lang.String[] args,
                                                                           java.util.Map<java.lang.String,java.lang.Integer> flagsToNumArgs)
Parses command line arguments into a Map. Arguments of the form

-flag1 arg1a arg1b ... arg1m -flag2 -flag3 arg3a ... arg3n

will be parsed so that each flag (including the hyphen) is a key in the Map and its value is a String[] containing its optional arguments (if present). Non-flag values that are not captured as flag arguments are collected into a String[] and stored in the Map under the null key. In this invocation, the maximum number of arguments for each flag can be specified as an Integer value under the appropriate flag key in the flagsToNumArgs Map argument. (By default, flags cannot take arguments.)

Example of usage:

Map<String,Integer> flagsToNumArgs = new HashMap<String,Integer>();
flagsToNumArgs.put("-x", Integer.valueOf(2));
flagsToNumArgs.put("-d", Integer.valueOf(1));
Map<String,String[]> result = argsToMap(args, flagsToNumArgs);

Parameters:
args - the argument array to be parsed
flagsToNumArgs - a Map of flag names to Integer values specifying the maximum number of allowed arguments for that flag (default 0).
Returns:
a Map of flag names to flag argument String[] arrays.
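The parsing rules described above can be sketched roughly as follows. ArgsToMapSketch is a hypothetical class written from the description alone; in particular, treating any token that starts with a hyphen as a flag (so a negative number would also count as a flag) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the documented flag-parsing rules.
class ArgsToMapSketch {
    static Map<String, String[]> argsToMap(String[] args, Map<String, Integer> flagsToNumArgs) {
        Map<String, String[]> result = new HashMap<>();
        List<String> remaining = new ArrayList<>();
        int i = 0;
        while (i < args.length) {
            String token = args[i++];
            if (!token.startsWith("-")) {
                remaining.add(token); // not captured by any flag
                continue;
            }
            // Flags that are not listed take no arguments (default 0).
            int max = flagsToNumArgs.getOrDefault(token, 0);
            List<String> values = new ArrayList<>();
            while (values.size() < max && i < args.length && !args[i].startsWith("-")) {
                values.add(args[i++]);
            }
            result.put(token, values.toArray(new String[0]));
        }
        // Leftover non-flag values are stored under the null key.
        result.put(null, remaining.toArray(new String[0]));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> flagsToNumArgs = new HashMap<>();
        flagsToNumArgs.put("-x", 2);
        flagsToNumArgs.put("-d", 1);
        String[] argv = {"-x", "1", "2", "file.txt", "-d", "out", "-v"};
        Map<String, String[]> parsed = argsToMap(argv, flagsToNumArgs);
        System.out.println(Arrays.toString(parsed.get("-x")));  // [1, 2]
        System.out.println(Arrays.toString(parsed.get("-v")));  // []
        System.out.println(Arrays.toString(parsed.get(null)));  // [file.txt]
    }
}
```

Because -x accepts at most two arguments, file.txt is not consumed by it and ends up under the null key.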

argsToProperties

public static java.util.Properties argsToProperties(java.lang.String[] args)

argsToProperties

public static java.util.Properties argsToProperties(java.lang.String[] args,
                                                    java.util.Map flagsToNumArgs)
Analogous to argsToMap(java.lang.String[]). However, there are several key differences between this method and argsToMap(java.lang.String[]).

stringToProperties

public static java.util.Properties stringToProperties(java.lang.String str)
This method converts a comma-separated String (with whitespace optionally allowed after the comma) representing properties to a Properties object. Each property is "property=value". The value for properties without an explicitly given value is set to "true".
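A minimal sketch of this conversion, assuming the input is split on a comma followed by optional whitespace and that the first '=' separates key from value. StringToPropertiesSketch is a hypothetical class, and the property names in the example are made up for illustration.

```java
import java.util.Properties;

// Hypothetical sketch of the documented "property=value" parsing.
class StringToPropertiesSketch {
    static Properties stringToProperties(String str) {
        Properties result = new Properties();
        for (String term : str.trim().split(",\\s*")) {
            if (term.isEmpty()) {
                continue; // skip empty entries, e.g. from an empty input
            }
            int eq = term.indexOf('=');
            if (eq < 0) {
                // No explicit value: the property is set to "true".
                result.setProperty(term, "true");
            } else {
                result.setProperty(term.substring(0, eq), term.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties p = stringToProperties("model=ner.gz, verbose,maxLen=40");
        System.out.println(p.getProperty("model"));   // ner.gz
        System.out.println(p.getProperty("verbose")); // true
        System.out.println(p.getProperty("maxLen"));  // 40
    }
}
```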


printToFile

public static void printToFile(java.io.File file,
                               java.lang.String message,
                               boolean append)
Prints to a file. If the file already exists, appends if append=true, and overwrites if append=false.


printToFile

public static void printToFile(java.io.File file,
                               java.lang.String message)
Prints to a file. If the file already exists, overwrites it; does not append.


printToFile

public static void printToFile(java.lang.String filename,
                               java.lang.String message,
                               boolean append)
Prints to a file. If the file already exists, appends if append=true, and overwrites if append=false.


printToFile

public static void printToFile(java.lang.String filename,
                               java.lang.String message)
Prints to a file. If the file already exists, overwrites it; does not append.


parseCommandLineArguments

public static java.util.Map parseCommandLineArguments(java.lang.String[] args)
A simpler form of command line argument parsing than the argsToMap methods above. Parses command line arguments into a Map. Arguments of the form -flag1 arg1 -flag2 -flag3 arg3 will be parsed so that the flag is a key in the Map (including the hyphen) and the optional argument will be its value (if present).

Parameters:
args - the argument array to be parsed
Returns:
A Map from keys to possible values (String or null)
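The simpler scheme, where each flag takes at most one argument, can be sketched as below. SimpleArgsSketch is a hypothetical class; silently ignoring stray tokens that follow no flag is an assumption, since the documentation does not say what happens to them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: each flag maps to one optional argument or to null.
class SimpleArgsSketch {
    static Map<String, String> parseCommandLineArguments(String[] args) {
        Map<String, String> result = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            String key = args[i];
            if (!key.startsWith("-")) {
                continue; // assumption: tokens that follow no flag are skipped
            }
            String value = null;
            // The next token is this flag's value unless it is itself a flag.
            if (i + 1 < args.length && !args[i + 1].startsWith("-")) {
                value = args[++i];
            }
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] argv = {"-flag1", "arg1", "-flag2", "-flag3", "arg3"};
        Map<String, String> parsed = parseCommandLineArguments(argv);
        System.out.println(parsed.get("-flag1")); // arg1
        System.out.println(parsed.get("-flag2")); // null
        System.out.println(parsed.get("-flag3")); // arg3
    }
}
```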

printStringOneCharPerLine

public static void printStringOneCharPerLine(java.lang.String s)

unescapeStringForXML

public static java.lang.String unescapeStringForXML(java.lang.String s)

escapeStringForXML

public static java.lang.String escapeStringForXML(java.lang.String s)

escapeTextAroundXMLTags

public static java.lang.String escapeTextAroundXMLTags(java.lang.String s)

escapeString

public static java.lang.String escapeString(java.lang.String s,
                                            char[] charsToEscape,
                                            char escapeChar)

splitOnCharWithQuoting

public static java.lang.String[] splitOnCharWithQuoting(java.lang.String s,
                                                        char splitChar,
                                                        char quoteChar,
                                                        char escapeChar)
This function splits the String s into multiple Strings using the splitChar. However, it provides a quoting facility: it is possible to quote strings with the quoteChar. If the quoteChar occurs within the quoted expression, it must be prefaced by the escapeChar.

Parameters:
s - The String to split
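splitChar - The character to split on
quoteChar - The character used to quote strings
escapeChar - The character that must preface a quoteChar inside a quoted expression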
Returns:
An array of Strings that s is split into
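The quoting rules can be sketched as a small state machine. This is an illustration under several assumptions that the documentation leaves open: quote characters are removed from the output, the escape character is only special in front of a quoteChar inside quotes, and empty fields are kept. QuotedSplitSketch is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of splitting with a quoting facility.
class QuotedSplitSketch {
    static String[] splitOnCharWithQuoting(String s, char splitChar, char quoteChar, char escapeChar) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inQuotes) {
                if (c == escapeChar && i + 1 < s.length() && s.charAt(i + 1) == quoteChar) {
                    current.append(quoteChar); // escaped quote: keep the quote, drop the escape
                    i++;
                } else if (c == quoteChar) {
                    inQuotes = false;          // closing quote
                } else {
                    current.append(c);         // split chars are literal inside quotes
                }
            } else if (c == quoteChar) {
                inQuotes = true;               // opening quote
            } else if (c == splitChar) {
                result.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        result.add(current.toString());
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] parts = splitOnCharWithQuoting("a,\"b,c\",d", ',', '"', '\\');
        System.out.println(Arrays.toString(parts)); // [a, b,c, d]
    }
}
```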

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Throws:
java.io.IOException