Update: Minh-Thang Luong has extended and evolved my prefix probability parser and released it as the EarleyX parser. It has more functionality than my implementation, so I recommend you check it out. It's on GitHub here.


I have a basic Java implementation of Andreas Stolcke's probabilistic Earley parser:

Andreas Stolcke. 1995. An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities. Computational Linguistics 21(2), 165-201.

The distribution also includes library code for interfacing with treebanks and training probabilistic context-free grammars for use with the parser. You will need a Java Virtual Machine (JVM), version 1.4 or later, to run it.
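
As a rough illustration of what training a probabilistic context-free grammar from a treebank can involve, the sketch below estimates rule probabilities by relative frequency: P(A -> beta) = count(A -> beta) / count(A). This is an assumption about the general technique, not a description of how the library code distributed with the parser works. The class name PcfgEstimator and its methods are hypothetical and are not part of the parser's API. The sketch also uses Java 8 language features, so it needs a newer JVM than the 1.4 minimum stated above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the parser's library code. */
class PcfgEstimator {

    /**
     * Estimates P(lhs -> rhs) as count(lhs -> rhs) / count(lhs).
     * Each observed rule is a two-element array: {lhs, rhs}.
     * The result maps "lhs -> rhs" to its estimated probability.
     */
    static Map<String, Double> estimate(List<String[]> observedRules) {
        Map<String, Integer> ruleCounts = new HashMap<>();
        Map<String, Integer> lhsCounts = new HashMap<>();
        for (String[] rule : observedRules) {
            String lhs = rule[0];
            String key = lhs + " -> " + rule[1];
            ruleCounts.merge(key, 1, Integer::sum);
            lhsCounts.merge(lhs, 1, Integer::sum);
        }

        Map<String, Double> probabilities = new HashMap<>();
        for (Map.Entry<String, Integer> entry : ruleCounts.entrySet()) {
            String key = entry.getKey();
            String lhs = key.substring(0, key.indexOf(" -> "));
            // Normalize by the total number of expansions of this left-hand side.
            probabilities.put(key, entry.getValue() / (double) lhsCounts.get(lhs));
        }
        return probabilities;
    }

    public static void main(String[] args) {
        // Toy rule observations, as if read off the local trees of a tiny treebank.
        List<String[]> rules = new ArrayList<>();
        rules.add(new String[] {"S", "NP VP"});
        rules.add(new String[] {"S", "NP VP"});
        rules.add(new String[] {"NP", "Det N"});
        rules.add(new String[] {"NP", "Det N"});
        rules.add(new String[] {"NP", "Det N"});
        rules.add(new String[] {"NP", "N"});
        rules.add(new String[] {"VP", "V NP"});
        rules.add(new String[] {"VP", "V NP"});

        for (Map.Entry<String, Double> entry : estimate(rules).entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

With the toy counts in main, the NP rules receive probabilities 0.75 and 0.25, and the probabilities of all rules sharing a left-hand side sum to 1, which is the property a proper PCFG requires.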

Download it here.

If you use the parser in your research and need to provide a reference for it, you can use the following paper:

Roger Levy. 2008. Expectation-based syntactic comprehension. Cognition 106(3):1126-1177.

Please also cite Stolcke (1995), listed above.

Contact Roger Levy, rlevy[at]ling.ucsd.edu, for questions regarding the parser.


Last modified: Fri Oct 27 15:26:41 EDT 2017