edu.stanford.nlp.parser.lexparser
Class EnglishTreebankParserParams

java.lang.Object
  extended by edu.stanford.nlp.parser.lexparser.AbstractTreebankParserParams
      extended by edu.stanford.nlp.parser.lexparser.EnglishTreebankParserParams
All Implemented Interfaces:
TreebankLangParserParams, java.io.Serializable

public class EnglishTreebankParserParams
extends AbstractTreebankParserParams

Parser parameters for the Penn English Treebank (WSJ, Brown, Switchboard).

See Also:
Serialized Form

Nested Class Summary
static class EnglishTreebankParserParams.EnglishTest
           
static class EnglishTreebankParserParams.EnglishTrain
           
protected  class EnglishTreebankParserParams.SubcategoryStripper
           
 
Nested classes/interfaces inherited from class edu.stanford.nlp.parser.lexparser.AbstractTreebankParserParams
AbstractTreebankParserParams.DependencyTyper<T>
 
Field Summary
 
Fields inherited from class edu.stanford.nlp.parser.lexparser.AbstractTreebankParserParams
inputEncoding, outputEncoding, tlp
 
Constructor Summary
EnglishTreebankParserParams()
           
 
Method Summary
 TreeTransformer collinizer()
          The tree transformer used to produce trees for evaluation.
 TreeTransformer collinizerEvalb()
          The tree transformer used to produce trees for evaluation (evalb version).
 java.util.List defaultTestSentence()
          Returns a default sentence for the language (for testing).
 DiskTreebank diskTreebank()
          Allows you to read in trees from the source you want.
 void display()
          Displays language-specific settings.
 HeadFinder headFinder()
          The HeadFinder to use for your treebank.
static void main(java.lang.String[] args)
           
 MemoryTreebank memoryTreebank()
          Allows you to read in trees from the source you want.
 java.io.PrintWriter pw(java.io.OutputStream o)
          The PrintWriter used to print output to OutputStream o.
 int setOptionFlag(java.lang.String[] args, int i)
          Set language-specific options according to flags.
 java.lang.String[] sisterSplitters()
          Returns the splitting strings used for selective splits.
 TreeTransformer subcategoryStripper()
          Returns a TreeTransformer appropriate to the Treebank which can be used to remove functional tags (such as "-TMP") from categories.
 MemoryTreebank testMemoryTreebank()
          Returns a MemoryTreebank appropriate to the testing treebank source.
 edu.stanford.nlp.parser.lexparser.TreeHeadPair transformTree(Tree t, Tree root, edu.stanford.nlp.parser.lexparser.TreeHeadPair thp)
          transformTree does language-specific tree transformations such as splicing.
 TreebankLanguagePack treebankLanguagePack()
          Contains treebank-specific (but not parser-specific) information, such as what counts as punctuation and how labels are structured.
 TreeReaderFactory treeReaderFactory()
          Makes an appropriate TreeReaderFactory with all options specified.
 
Methods inherited from class edu.stanford.nlp.parser.lexparser.AbstractTreebankParserParams
dependencyObjectify, getOutputEncoding, lex, lex, parsevalObjectify, pw, setInputEncoding, setOutputEncoding, treeTokenizerFactory, typedDependencyClasser, typedDependencyObjectify, untypedDependencyObjectify
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

EnglishTreebankParserParams

public EnglishTreebankParserParams()
Method Detail

headFinder

public HeadFinder headFinder()
Description copied from class: AbstractTreebankParserParams
The HeadFinder to use for your treebank.

Specified by:
headFinder in interface TreebankLangParserParams
Specified by:
headFinder in class AbstractTreebankParserParams

diskTreebank

public DiskTreebank diskTreebank()
Allows you to read in trees from the source you want. It is the responsibility of treeReaderFactory() to deal properly with the character-set encoding of the input. It is also the responsibility of the tree reader (tr) to normalize the trees properly.


memoryTreebank

public MemoryTreebank memoryTreebank()
Allows you to read in trees from the source you want. It is the responsibility of treeReaderFactory() to deal properly with the character-set encoding of the input. It is also the responsibility of the tree reader (tr) to normalize the trees properly.

Specified by:
memoryTreebank in interface TreebankLangParserParams
Specified by:
memoryTreebank in class AbstractTreebankParserParams

treeReaderFactory

public TreeReaderFactory treeReaderFactory()
Makes an appropriate TreeReaderFactory with all options specified.


testMemoryTreebank

public MemoryTreebank testMemoryTreebank()
Returns a MemoryTreebank appropriate to the testing treebank source.

Specified by:
testMemoryTreebank in interface TreebankLangParserParams
Overrides:
testMemoryTreebank in class AbstractTreebankParserParams

collinizer

public TreeTransformer collinizer()
The tree transformer used to produce trees for evaluation. It is applied both to the parse output tree and to the gold tree.

Specified by:
collinizer in interface TreebankLangParserParams
Specified by:
collinizer in class AbstractTreebankParserParams

collinizerEvalb

public TreeTransformer collinizerEvalb()
Description copied from class: AbstractTreebankParserParams
The tree transformer used to produce trees for evaluation. It is applied both to the parse output tree and to the gold tree. It should strip punctuation and may perform other normalizations. The evalb version should strip off somewhat more.

Specified by:
collinizerEvalb in interface TreebankLangParserParams
Specified by:
collinizerEvalb in class AbstractTreebankParserParams

treebankLanguagePack

public TreebankLanguagePack treebankLanguagePack()
Contains treebank-specific (but not parser-specific) information, such as what counts as punctuation and how labels are structured.

Specified by:
treebankLanguagePack in interface TreebankLangParserParams
Overrides:
treebankLanguagePack in class AbstractTreebankParserParams

pw

public java.io.PrintWriter pw(java.io.OutputStream o)
The PrintWriter used to print output to OutputStream o. It's the responsibility of pw to deal properly with character encodings for the relevant treebank.

Specified by:
pw in interface TreebankLangParserParams
Overrides:
pw in class AbstractTreebankParserParams
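
One common way to meet this responsibility is to wrap the stream in an OutputStreamWriter with an explicit charset. The sketch below is an illustration only: the class EncodedPrintWriterSketch and its two-argument pw helper are hypothetical names, not part of this API, and the actual implementation may differ.

```java
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;

/** Hypothetical helper, not part of the parser API. */
class EncodedPrintWriterSketch {

    /**
     * Wraps the stream so that characters are encoded with the named charset.
     * The outputEncoding parameter is an assumption standing in for the
     * inherited outputEncoding field.
     */
    static PrintWriter pw(OutputStream o, String outputEncoding) {
        // Charset.forName throws an unchecked exception for unknown names.
        Charset charset = Charset.forName(outputEncoding);
        // The second argument enables auto-flush on println calls.
        return new PrintWriter(new OutputStreamWriter(o, charset), true);
    }
}
```

A caller would then write, for example, EncodedPrintWriterSketch.pw(System.out, "UTF-8").println(tree) rather than relying on the platform default encoding.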

sisterSplitters

public java.lang.String[] sisterSplitters()
Description copied from class: AbstractTreebankParserParams
Returns the splitting strings used for selective splits.

Specified by:
sisterSplitters in interface TreebankLangParserParams
Specified by:
sisterSplitters in class AbstractTreebankParserParams
Returns:
An array containing ancestor-annotated Strings: categories should be split according to these ancestor annotations.

subcategoryStripper

public TreeTransformer subcategoryStripper()
Returns a TreeTransformer appropriate to the Treebank which can be used to remove functional tags (such as "-TMP") from categories.

Specified by:
subcategoryStripper in interface TreebankLangParserParams
Overrides:
subcategoryStripper in class AbstractTreebankParserParams
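
To make the idea of removing functional tags concrete, here is a minimal sketch that works on plain label strings rather than on Tree objects. The class FunctionalTagSketch and the method basicCategory are hypothetical names, and the rule used (cut at the first '-' or '=' unless the label itself starts with '-') is an assumption about Penn Treebank label conventions, not the behavior of the returned TreeTransformer.

```java
/** Hypothetical illustration; operates on label strings, not on trees. */
class FunctionalTagSketch {

    /** Returns the label without functional tags or indices, e.g. "NP-TMP" gives "NP". */
    static String basicCategory(String label) {
        // Labels such as "-NONE-" or "-LRB-" begin with '-' and are kept whole.
        if (label.isEmpty() || label.charAt(0) == '-') {
            return label;
        }
        for (int i = 1; i < label.length(); i++) {
            char c = label.charAt(i);
            // Assumption: '-' and '=' introduce functional tags and indices.
            if (c == '-' || c == '=') {
                return label.substring(0, i);
            }
        }
        return label;
    }
}
```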

transformTree

public edu.stanford.nlp.parser.lexparser.TreeHeadPair transformTree(Tree t,
                                                                    Tree root,
                                                                    edu.stanford.nlp.parser.lexparser.TreeHeadPair thp)
transformTree does language-specific tree transformations such as splicing. Any parameterizations should be inside the specific TreebankLangParserParams class.

Specified by:
transformTree in interface TreebankLangParserParams
Specified by:
transformTree in class AbstractTreebankParserParams

display

public void display()
Description copied from class: AbstractTreebankParserParams
Displays language-specific settings.

Specified by:
display in interface TreebankLangParserParams
Specified by:
display in class AbstractTreebankParserParams

setOptionFlag

public int setOptionFlag(java.lang.String[] args,
                         int i)
Set language-specific options according to flags. This routine should process the option starting at args[i], which may span several elements of args if the option takes arguments. It should return the index after the last index it consumed in processing. In particular, if it cannot process the current option, the return value should be i.

Specified by:
setOptionFlag in interface TreebankLangParserParams
Specified by:
setOptionFlag in class AbstractTreebankParserParams
Parameters:
args - Array of command line arguments
i - Index in command line arguments to try to process as an option
Returns:
The index of the item after arguments processed as part of this command line option.
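
The return-value contract above can be sketched with a small stand-alone class. Everything here is hypothetical: the class OptionFlagSketch, the flags -verbose and -encoding, and the processAll caller loop are invented for illustration and are not options of this class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in that follows the setOptionFlag(args, i) contract. */
class OptionFlagSketch {
    // Invented settings, used only for this illustration.
    boolean verbose = false;
    String encoding = "UTF-8";

    /** Returns the index after the last argument consumed, or i if args[i] is not recognized. */
    int setOptionFlag(String[] args, int i) {
        if (args[i].equalsIgnoreCase("-verbose")) {
            verbose = true;
            return i + 1; // consumed the flag only
        }
        if (args[i].equalsIgnoreCase("-encoding") && i + 1 < args.length) {
            encoding = args[i + 1];
            return i + 2; // consumed the flag and its argument
        }
        return i; // nothing consumed
    }

    /** Caller loop: an unchanged index signals an unrecognized option. */
    List<String> processAll(String[] args) {
        List<String> unknown = new ArrayList<>();
        int i = 0;
        while (i < args.length) {
            int next = setOptionFlag(args, i);
            if (next == i) {
                unknown.add(args[i]);
                i++; // skip it so the loop always advances
            } else {
                i = next;
            }
        }
        return unknown;
    }
}
```

Returning i for an unrecognized option lets the caller decide what to do with it, for example passing it to another handler or reporting an error.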

defaultTestSentence

public java.util.List defaultTestSentence()
Returns a default sentence for the language (for testing).


main

public static void main(java.lang.String[] args)