Interface Summary | |
---|---|
Function<T1,T2> | An interface for classes that act as a function transforming one object to another. |
Tokenizer<T> | Tokenizers break up text into individual Objects. |

Class Summary | |
---|---|
AbstractTokenizer<T> | An abstract tokenizer. |
Americanize | Takes a HasWord or String and returns an Americanized version of it. |
Morphology | Morphology computes the base form of English words, by removing just inflections (not derivational morphology). |
PTBTokenizer | Tokenizer implementation that conforms to the Penn Treebank tokenization conventions. |
PTBTokenizer.PTBTokenizerFactory | |
TokenizerAdapter | This class adapts between a java.io.StreamTokenizer and an edu.stanford.nlp.process.Tokenizer. |
WhitespaceTokenizer | Simple Tokenizer implementation that tokenizes on whitespace. |
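
The summaries above describe a tokenizer as something that breaks text into individual objects, and a function as something that transforms one object into another. The following is a minimal, self-contained sketch of those two ideas, not the package's actual code: the names Fn, SimpleTokenizer, WhitespaceSplitter and mapTokens are hypothetical stand-ins invented for illustration, and the real Function<T1,T2> and Tokenizer<T> interfaces may declare different methods.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class TokenizerSketch {

    /** Hypothetical stand-in for a one-argument function interface. */
    interface Fn<T1, T2> {
        T2 apply(T1 in);
    }

    /** Hypothetical stand-in for a tokenizer: an iterator that can drain itself into a list. */
    interface SimpleTokenizer<T> extends Iterator<T> {
        default List<T> tokenize() {
            List<T> out = new ArrayList<>();
            while (hasNext()) {
                out.add(next());
            }
            return out;
        }
    }

    /** Hypothetical whitespace tokenizer: each maximal run of non-whitespace is one token. */
    static class WhitespaceSplitter implements SimpleTokenizer<String> {
        private final String text;
        private int pos = 0;

        WhitespaceSplitter(String text) {
            this.text = text;
            skipWhitespace();
        }

        private void skipWhitespace() {
            while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
        }

        @Override
        public boolean hasNext() {
            return pos < text.length();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            int start = pos;
            while (pos < text.length() && !Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
            String token = text.substring(start, pos);
            skipWhitespace(); // keep hasNext() accurate for trailing whitespace
            return token;
        }
    }

    /** Applies fn to every remaining token and collects the results. */
    static <A, B> List<B> mapTokens(Iterator<A> tokens, Fn<A, B> fn) {
        List<B> out = new ArrayList<>();
        while (tokens.hasNext()) {
            out.add(fn.apply(tokens.next()));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "  The quick\tbrown fox ";
        System.out.println(new WhitespaceSplitter(text).tokenize());
        // prints [The, quick, brown, fox]
        System.out.println(mapTokens(new WhitespaceSplitter(text), String::length));
        // prints [3, 5, 5, 3]
    }
}
```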

Contains classes for processing documents. The key interface is Processor, whose sole method, Document process(Document), takes a document and returns another document, which may be parsed, stoplisted, stemmed, etc.
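
To make the document-in, document-out contract concrete, here is a minimal sketch under stated assumptions. It does not use the package's own types: Doc, DocProcessor, Lowercaser, Stoplister and runAll are hypothetical names invented for this example, and Doc is simply an immutable list of words. The sketch only illustrates why a single process method makes processing steps easy to chain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class ProcessorSketch {

    /** Hypothetical stand-in for a document: an immutable list of words. */
    static final class Doc {
        final List<String> words;

        Doc(List<String> words) {
            this.words = Collections.unmodifiableList(new ArrayList<>(words));
        }
    }

    /** Hypothetical stand-in for the processor contract: one document in, one document out. */
    interface DocProcessor {
        Doc process(Doc in);
    }

    /** Returns a new document with every word lowercased. */
    static class Lowercaser implements DocProcessor {
        @Override
        public Doc process(Doc in) {
            List<String> out = new ArrayList<>();
            for (String w : in.words) {
                out.add(w.toLowerCase(Locale.ROOT));
            }
            return new Doc(out);
        }
    }

    /** Returns a new document without the words found in the stoplist. */
    static class Stoplister implements DocProcessor {
        private final Set<String> stopWords;

        Stoplister(Set<String> stopWords) {
            this.stopWords = new HashSet<>(stopWords);
        }

        @Override
        public Doc process(Doc in) {
            List<String> out = new ArrayList<>();
            for (String w : in.words) {
                if (!stopWords.contains(w)) {
                    out.add(w);
                }
            }
            return new Doc(out);
        }
    }

    /** Feeds the output of each step into the next one, in order. */
    static Doc runAll(Doc in, List<DocProcessor> steps) {
        Doc current = in;
        for (DocProcessor step : steps) {
            current = step.process(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Doc input = new Doc(Arrays.asList("The", "Cat", "sat", "on", "the", "mat"));
        Set<String> stop = new HashSet<>(Arrays.asList("the", "on"));
        List<DocProcessor> steps =
                Arrays.<DocProcessor>asList(new Lowercaser(), new Stoplister(stop));
        System.out.println(runAll(input, steps).words);
        // prints [cat, sat, mat]
    }
}
```

Because each step returns a new document rather than modifying its input, the order of steps is explicit: in this sketch, lowercasing must run before the stoplist so that "The" matches the stop word "the".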