A short discussion and evaluation of different parsers.
Please keep wiki links as wiki links, use external links only if there is no existing page for the tool.
Name |
Grammar |
Module |
Python |
Comment |
|
C |
|
included in the main Python distribution |
|
Python |
2.7+, 3.3+, PyPy |
Tool that takes grammars in a variant of EBNF as input and outputs memoizing (Packrat) PEG parsers in Python. Grako differs from other PEG parser generators in that the generated parsers use Python's very efficient exception-handling system to backtrack. |
||
|
Python/Regex |
2.x, 3.x |
Combines Regular Expressions |
|
|
C |
|
lexical analysis module for Python, foundation for Pyrex and Cython. Plex 2.0.0 is Python 2 only, the version embedded in Cython works in Python 3.x. There is also an experimental port to Python 3 (tested on Python 3.3) |
|
Earley Parser |
Python |
2.3+, 3.2+, PyPy |
PyPI package; GitHub project. This parser is notably used in decompilers like uncompyle6, where using an ambiguous grammar is desirable. |
|
LL(1) |
Python |
1-any, 2-1.5+ |
|
|
LR(1) LALR(1) |
C |
|
|
|
|
|
|
|
|
|
C |
|
bison grammar with python code actions |
|
LR |
|
1.5.1+ |
|
|
SLR LALR(1) |
Python |
|
Python Lex-Yacc |
|
|
|
2.2+ |
|
|
GLR |
C |
2.2+ |
grammar in doc strings |
|
GLR |
Python |
2.2.1 |
|
|
PEG |
Python |
2.5+ |
|
|
PEG |
Python |
2.5+ |
|
|
- |
|
2.0+ |
requires mxTextTools |
|
|
Python |
2.0+ |
requires mxTextTools |
|
- |
C |
|
is not exactly a parser like we're used to, but it is a fast text-processing engine |
|
|
Python |
2.2+ |
|
|
|
Python |
2.6+ |
Parser combinator library, similar to pyparsing |
|
LL1+ |
Python |
|
stand-alone tool in Java. Latest version can produce Python code |
|
LR(0) LR(1) SLR LALR(1) |
Python |
2.2+ |
|
|
|
Python |
|
Object-oriented, Pythonic parsing |
|
LR(1) |
Python |
2.5+ |
|
|
LL(1) |
Python |
|
uses separate grammar files |
|
|
Python |
|
inspired by pyparsing and boost::spirit |
|
LR(1) |
Python |
2.4+ |
has separate parser input file, parser output is a parse tree |
|
na |
Python |
2.6+ |
Simple parser using rules defined in BNF format |
|
Any |
Python |
2.6+,3+ |
Recursive descent with full backtracking and optional memoisation (which can handle left-recursive grammars). Equivalent in power to GLR, but based on an LL(k) core. |
|
GLR |
Python |
3.1+ |
Recursive descent parser with full backtracking. Grammar elements and results are defined as Python classes, so are fully customizable. Supports ambiguous grammars. |
|
LL(*) |
Python |
2.4+ |
Recursive descent parsing library for Python based on functional combinators |
|
- |
Python |
2.7+ 3+ |
|
|
LR(1) |
Python |
2.6+ |
A fast parser, lexer combination with a concise Pythonic interface. Lots of documentation, including example parsers for SQL and Lua. |
|
PEG |
Python |
2.7+, 3.2+ |
Packrat parser. Works as interpreter. Multiple syntaxes for grammar definition. Lots of docs, examples and tutorials. |
|
|
Python |
2.7+, 3.2+ |
A high-level meta-language/parser for Domain-Specific Language implementation. Built on top of Arpeggio parser. Inspired by XText. Documentation, examples and tutorials available. |
|
LR |
Python |
3.2+ |
A fast, stand-alone parser which can export a grammar to JavaScript (jsleri), Go (goleri), C (libcleri) or Java (jleri). |
|
LR/GLR |
Python |
2.7+, 3.3+ |
A pure Python LR/GLR parser with integrated scanner (scannerless). Grammar in BNF format. Automata/GLR trace visualization. Full documentation and examples available. |
|
LALR(1), CFG |
Python |
2.7, 3.4+ |
LALR(1) for speed or Earley parser for any context-free grammar. |
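Several comments in the table above mention memoizing (Packrat) PEG parsing and backtracking via exceptions. The following is a minimal hand-written sketch of that idea, not code taken from or generated by any of the listed tools. The grammar (`expr <- term ('+' term)*`, `term <- NUMBER / '(' expr ')'`) and the names `PackratParser` and `ParseFailure` are assumptions made up for illustration.

```python
class ParseFailure(Exception):
    """Raised when a rule does not match at the given position."""


class PackratParser:
    """Toy PEG parser: expr <- term ('+' term)* ; term <- NUMBER / '(' expr ')'."""

    def __init__(self, text):
        self.text = text
        # (rule name, position) -> (value, new position), or the ParseFailure raised
        self.memo = {}

    def apply(self, rule, pos):
        """Run a rule at pos at most once; later calls reuse the cached outcome."""
        key = (rule.__name__, pos)
        if key not in self.memo:
            try:
                self.memo[key] = rule(pos)
            except ParseFailure as failure:
                self.memo[key] = failure
        result = self.memo[key]
        if isinstance(result, ParseFailure):
            raise result
        return result

    def expect(self, char, pos):
        if self.text[pos:pos + 1] != char:
            raise ParseFailure(f"expected {char!r} at {pos}")
        return pos + 1

    def number(self, pos):
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        if end == pos:
            raise ParseFailure(f"expected digit at {pos}")
        return int(self.text[pos:end]), end

    def parens(self, pos):
        pos = self.expect("(", pos)
        value, pos = self.apply(self.expr, pos)
        pos = self.expect(")", pos)
        return value, pos

    def term(self, pos):
        # Ordered choice: a failed alternative raises, and we backtrack to pos.
        for alternative in (self.number, self.parens):
            try:
                return self.apply(alternative, pos)
            except ParseFailure:
                continue
        raise ParseFailure(f"expected term at {pos}")

    def expr(self, pos):
        total, pos = self.apply(self.term, pos)
        while True:
            try:
                next_pos = self.expect("+", pos)
                value, pos = self.apply(self.term, next_pos)
            except ParseFailure:
                return total, pos  # pos is unchanged if the repetition failed
            total += value

    def parse(self):
        value, pos = self.apply(self.expr, 0)
        if pos != len(self.text):
            raise ParseFailure(f"unexpected input at {pos}")
        return value


print(PackratParser("1+(2+3)+40").parse())
```

The memo table is what makes this "packrat": each rule is evaluated at most once per input position, so backtracking never repeats work. A real generator would emit comparable methods from a grammar file rather than requiring them to be written by hand.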
For faster performance, one may use other parser generator systems and plug them in as modules.
For example:
Spirit (http://spirit.sourceforge.net/) framework for writing EBNF as C++ code
FlexBisonModule (http://www.crsr.net/Software/FBModule.html)
cocktail compiler tools approach
An example of such usage is SeeGramWrap, available from Edward C. Jones's Python page, which is a heavily revised and upgraded version of the ANTLR C parser that is in cgram (broken link). The latest version has been refactored to move some of the complexity from ANTLR to Python.
Martin von Loewis presented a paper at Python10, titled "Towards a Standard Parser Generator", that surveyed the available parser generators for Python.
Additional information on these and other parsers at Python Parsing Tools.
Books
Complete online textbook, titled "Parsing: A Practical Guide".