Size: 1080
Comment:
|
Size: 7463
Comment: added lark
|
Deletions are marked like this. | Additions are marked like this. |
Line 3: | Line 3: |
* ["shlex"] (http://www.python.org/doc/current/lib/module-shlex.html) * ["Plex"] (http://www.cosc.canterbury.ac.nz/~greg/python/Plex) * SimpleParse (http://members.rogers.com/mcfletch/programming/simpleparse/simpleparse.html) * ["Spark"] (http://pages.cpsc.ucalgary.ca/~aycock/spark/) * ["Yapps"] (http://theory.stanford.edu/~amitp/Yapps/) * ["PyLR"] (http://starship.python.net/crew/scott/PyLR.html) * ["kwParsing"] (http://www.pythonpros.com/arw/kwParsing/kwParsing.html) * PyBison (http://www-db.stanford.edu/~hassan/Python/pybison.tar.gz) * ["mcf.parse"] (http://starship.python.net/crew/mcfletch/) * ["Trap"] (http://www.first.gmd.de/smile/trap/) Also see Python ParserSig @ http://www.python.org/sigs/parser-sig/ to discuss and select a standard parser generator for Python. |
(!) Please keep wiki links as wiki links, use external links only if there is no existing page for the tool. ||Name ||Grammar ||Module ||Python ||Comment || ||[[http://docs.python.org/lib/module-shlex.html|shlex]] || ||C || ||included in the main Python distribution || ||[[Grako]] ||[[http://en.wikipedia.org/wiki/Parsing_expression_grammar|PEG]] ||Python ||2.7+, 3.3+, PyPy ||Tool that takes grammars in EBNF variant & and outputs memoizing (Packrat) PEG parsers in Python. Grako is different from other PEG parser generators in that the generated parsers use Python's very efficient exception-handling system to backtrack. || ||[[http://reparse.readthedocs.org/en/latest/|reparse]] || ||Python/Regex ||2.x, 3.x ||Combines Regular Expressions || ||[[http://packages.python.org/plex/|Plex]] || ||C || ||lexical analysis module for Python, foundation for [[Pyrex]] and [[http://cython.org|Cython]]. Plex 2.0.0 is Python 2 only, the version embedded in Cython works in Python 3.x. There is also an [[https://github.com/uogbuji/plex3|experimental port to Python 3 (tested on Python 3.3)]] || ||[[http://pages.cpsc.ucalgary.ca/~aycock/spark/|Spark]] ||Earley Parser ||Python ||2.3+, 3.2+, PyPy ||[[https://pypi.org/project/spark-parser/|PyPI package]]; [[https://github.com/rocky/python-spark|github project]]. This parser is notably used in decompilers like [[https://pypi.org/project/uncompyle6/|uncompyle6]] where using an ''ambigous'' grammar is desirable. || ||[[Yapps]] ||LL(1) ||Python ||1-any, 2-1.5+ || || ||[[http://starship.python.net/crew/scott/PyLR.html|PyLR - (broken link)]] ||LR(1) LALR(1) ||C || || || ||[[http://gadfly.sourceforge.net/kwParsing.html|kwParsing]] || || || || || ||PyBison || ||C || ||bison grammar with python code actions || ||[[http://www.ercim.org/publication/Ercim_News/enw36/ernst.html|Trap]] ||LR || ||1.5.1+ || || ||[[http://www.dabeaz.com/ply/|PLY]] ||SLR LALR(1) ||Python || ||Python Lex-Yacc || ||[[http://christophe.delord.free.fr/tpg/index.html|ToyParserGenerator]] || || ||2.2+ || || ||[[http://dparser.sourceforge.net/|DParser]] ||GLR ||C ||2.2+ ||grammar in doc strings || ||[[http://www.lava.net/~newsham/pyggy/|PyGgy - (broken link)]] ||GLR ||Python ||2.2.1 || || ||[[http://fdik.org/pyPEG|pyPEG]] ||PEG ||Python ||2.5+ || || ||[[http://github.com/erikrose/parsimonious|parsimonious]] ||PEG ||Python ||2.5+ || || ||[[http://simpleparse.sourceforge.net/|SimpleParse]] ||- || ||2.0+ ||requires mxTextTools || ||[[http://www.biopython.org/DIST/docs/api/public/Martel-module.html|Martel]] || ||Python ||2.0+ ||requires mxTextTools || ||[[http://www.lemburg.com/files/python/mxTextTools.html|mxTextTools]] ||- ||C || ||is not exactly a parser like we're used to, but it is a fast text-processing engine || ||[[https://pyparsing-docs.readthedocs.io/en/latest/|pyparsing]] || ||Python ||2.2+ || || ||[[http://me.opengroove.org/2011/06/parcon-new-parser-combinator-library.html|parcon]] || ||Python ||2.6+ ||Parser combinator library, similar to pyparsing || ||[[http://www.antlr.org/|ANTLR]] ||LL1+ ||Python || ||stand-alone tool in Java. Latest version can produce Python code || ||[[http://www.ncc.up.pt/FAdo/Yappy|Yappy]] ||LR(0) LR(1) SLR LALR(1) ||Python ||2.2+ || || ||[[http://pypi.python.org/pypi/ZestyParser|ZestyParser]] || ||Python || ||Object-oriented, Pythonic parsing || ||[[http://www.canonware.com/Parsing/|Parsing]] ||LR(1) ||Python ||2.5+ || || ||[[http://msdl.cs.mcgill.ca/people/eposse/projects/aperiot|aperiot]] ||LL(1) ||Python || ||uses separate grammar files || ||[[http://bitbucket.org/namenlos/yeanpypa|yeanpypa]] || ||Python || ||inspired by pyparsing and boost::spirit || ||[[http://seehuhn.de/pages/wisent|Wisent]] ||LR(1) ||Python ||2.4+ ||has separate parser input file, parser output is a parse tree || ||[[http://lparis45.free.fr/rp.html|RP]] ||na ||Python ||2.6+ ||Simple parser using rule defined in BNF format || ||[[http://www.acooke.org/lepl/|LEPL]] ||Any ||Python ||2.6+,3+ ||Recursive descent with full backtracking and optional memoisation (which can handle left recursive grammars). So equivalent to GLR, but based on LL(k) core. || ||[[http://pypi.python.org/pypi/modgrammar|modgrammar]] ||GLR ||Python ||3.1+ ||Recursive descent parser with full backtracking. Grammar elements and results are defined as Python classes, so are fully customizable. Supports ambiguous grammars. || ||[[http://code.google.com/p/funcparserlib/|funcparserlib]] ||LL(*) ||Python ||2.4+ ||Recursive descent parsing library for Python based on functional combinators || ||[[http://pydsl.org|pydsl]] ||- ||Python ||2.7+ 3+ || || ||[[http://pypi.python.org/pypi/lrparsing|lrparsing]] ||LR(1) ||Python ||2.6+ ||A fast parser, lexer combination with a concise Pythonic interface. Lots of documentation, include example parsers for SQL and Lua. || ||[[https://github.com/textX/Arpeggio|Arpeggio]] ||PEG ||Python ||2.7+, 3.2+ ||Packrat parser. Works as interpreter. Multiple syntaxes for grammar definition. Lots of docs, examples and tutorials. || ||[[https://github.com/textX/textX|textX]] || ||Python ||2.7+, 3.2+ ||A high-level meta-language/parser for Domain-Specific Language implementation. Built on top of Arpeggio parser. Inspired by XText. Documentation, examples and tutorials available. || ||[[https://github.com/transceptor-technology/pyleri|pyleri]] ||LR ||Python ||3.2+ ||A fast, stand-alone parser which can export a grammar to JavaScript ([[https://github.com/transceptor-technology/jsleri|jsleri]]), Go ([[https://github.com/transceptor-technology/goleri|goleri]]), C ([[https://github.com/transceptor-technology/libcleri|libcleri]]) or Java ([[https://github.com/transceptor-technology/jleri|jleri]]). || ||[[https://github.com/igordejanovic/parglare|parglare]] ||LR/GLR ||Python ||2.7+, 3.3+ ||A pure Python LR/GLR parser with integrated scanner (scannerless). Grammar in BNF format. Automata/GLR trace visualization. Full documentation and examples available. || ||[[https://github.com/lark-parser/lark|lark]] ||LALR, CFG ||Python ||2.7, 3.4+ ||LALR for speed or Earley parser for any context-free grammar.|| |
Line 17: | Line 46: |
For some speed up one may use other parser generator systems and plug them in as modules. | For faster performance, one may use other parser generator systems and plug them in as modules. |
Line 20: | Line 51: |
* ["ANTLR"] (http://www.antlr.org/) C++ output (an older version with C output also available) | * [[Spirit]] (http://spirit.sourceforge.net/) framework for writing EBNF as C++ code * FlexBisonModule (http://www.crsr.net/Software/FBModule.html) * [[http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&selm=61g0ff$39g$1@vishnu.jussieu.fr|cocktail compiler tools]] approach Example of such usage is SeeGramWrap available from Edward C. Jones [[http://members.tripod.com/~edcjones/pycode.html|Python page]], which is a heavily revised and upgraded version of the ANTLR C parser that is in [[http://members.tripod.com/~edcjones/cgram.tar.gz|cgram]] (broken link). The lastest verson has been refactored to move some of the complexity from ANTLR to Python. Martin von Loewis presented a paper at Python10, titled [[http://www.python.org/sigs/parser-sig/towards-standard.html|"Towards a Standard Parser Generator"]] that surveyed the available parser generators for Python. Additional information on these and other parsers at [[https://github.com/webmaven/python-parsing-tools|Python Parsing Tools]]. == Books == * Complete online textbook, titled [[http://dickgrune.com/Books/PTAPG_1st_Edition/|"Parsing: A Practical Guide"]]. |
Small discussion and evaluation of different parsers.
Please keep wiki links as wiki links, use external links only if there is no existing page for the tool.
Name |
Grammar |
Module |
Python |
Comment |
|
C |
|
included in the main Python distribution |
|
Python |
2.7+, 3.3+, PyPy |
Tool that takes grammars in EBNF variant & and outputs memoizing (Packrat) PEG parsers in Python. Grako is different from other PEG parser generators in that the generated parsers use Python's very efficient exception-handling system to backtrack. |
||
|
Python/Regex |
2.x, 3.x |
Combines Regular Expressions |
|
|
C |
|
lexical analysis module for Python, foundation for Pyrex and Cython. Plex 2.0.0 is Python 2 only, the version embedded in Cython works in Python 3.x. There is also an experimental port to Python 3 (tested on Python 3.3) |
|
Earley Parser |
Python |
2.3+, 3.2+, PyPy |
PyPI package; github project. This parser is notably used in decompilers like uncompyle6 where using an ambigous grammar is desirable. |
|
LL(1) |
Python |
1-any, 2-1.5+ |
|
|
LR(1) LALR(1) |
C |
|
|
|
|
|
|
|
|
|
C |
|
bison grammar with python code actions |
|
LR |
|
1.5.1+ |
|
|
SLR LALR(1) |
Python |
|
Python Lex-Yacc |
|
|
|
2.2+ |
|
|
GLR |
C |
2.2+ |
grammar in doc strings |
|
GLR |
Python |
2.2.1 |
|
|
PEG |
Python |
2.5+ |
|
|
PEG |
Python |
2.5+ |
|
|
- |
|
2.0+ |
requires mxTextTools |
|
|
Python |
2.0+ |
requires mxTextTools |
|
- |
C |
|
is not exactly a parser like we're used to, but it is a fast text-processing engine |
|
|
Python |
2.2+ |
|
|
|
Python |
2.6+ |
Parser combinator library, similar to pyparsing |
|
LL1+ |
Python |
|
stand-alone tool in Java. Latest version can produce Python code |
|
LR(0) LR(1) SLR LALR(1) |
Python |
2.2+ |
|
|
|
Python |
|
Object-oriented, Pythonic parsing |
|
LR(1) |
Python |
2.5+ |
|
|
LL(1) |
Python |
|
uses separate grammar files |
|
|
Python |
|
inspired by pyparsing and boost::spirit |
|
LR(1) |
Python |
2.4+ |
has separate parser input file, parser output is a parse tree |
|
na |
Python |
2.6+ |
Simple parser using rule defined in BNF format |
|
Any |
Python |
2.6+,3+ |
Recursive descent with full backtracking and optional memoisation (which can handle left recursive grammars). So equivalent to GLR, but based on LL(k) core. |
|
GLR |
Python |
3.1+ |
Recursive descent parser with full backtracking. Grammar elements and results are defined as Python classes, so are fully customizable. Supports ambiguous grammars. |
|
LL(*) |
Python |
2.4+ |
Recursive descent parsing library for Python based on functional combinators |
|
- |
Python |
2.7+ 3+ |
|
|
LR(1) |
Python |
2.6+ |
A fast parser, lexer combination with a concise Pythonic interface. Lots of documentation, include example parsers for SQL and Lua. |
|
PEG |
Python |
2.7+, 3.2+ |
Packrat parser. Works as interpreter. Multiple syntaxes for grammar definition. Lots of docs, examples and tutorials. |
|
|
Python |
2.7+, 3.2+ |
A high-level meta-language/parser for Domain-Specific Language implementation. Built on top of Arpeggio parser. Inspired by XText. Documentation, examples and tutorials available. |
|
LR |
Python |
3.2+ |
A fast, stand-alone parser which can export a grammar to JavaScript (jsleri), Go (goleri), C (libcleri) or Java (jleri). |
|
LR/GLR |
Python |
2.7+, 3.3+ |
A pure Python LR/GLR parser with integrated scanner (scannerless). Grammar in BNF format. Automata/GLR trace visualization. Full documentation and examples available. |
|
LALR, CFG |
Python |
2.7, 3.4+ |
LALR for speed or Earley parser for any context-free grammar. |
For faster performance, one may use other parser generator systems and plug them in as modules.
For example:
Spirit (http://spirit.sourceforge.net/) framework for writing EBNF as C++ code
FlexBisonModule (http://www.crsr.net/Software/FBModule.html)
cocktail compiler tools approach
Example of such usage is SeeGramWrap available from Edward C. Jones Python page, which is a heavily revised and upgraded version of the ANTLR C parser that is in cgram (broken link). The lastest verson has been refactored to move some of the complexity from ANTLR to Python.
Martin von Loewis presented a paper at Python10, titled "Towards a Standard Parser Generator" that surveyed the available parser generators for Python.
Additional information on these and other parsers at Python Parsing Tools.
Books
Complete online textbook, titled "Parsing: A Practical Guide".