Differences between revisions 1 and 100 (spanning 99 versions)
Revision 1 as of 2002-08-23 18:48:13
Size: 1080
Editor: adsl-67-113-129-186
Comment:
Revision 100 as of 2020-06-08 22:16:24
Size: 7316
Editor: rockyb
Comment: Ned's site refers to webmaven now.
Line 3: Line 3:
  * ["shlex"] (http://www.python.org/doc/current/lib/module-shlex.html)
  * ["Plex"] (http://www.cosc.canterbury.ac.nz/~greg/python/Plex)
  * SimpleParse (http://members.rogers.com/mcfletch/programming/simpleparse/simpleparse.html)
  * ["Spark"] (http://pages.cpsc.ucalgary.ca/~aycock/spark/)
  * ["Yapps"] (http://theory.stanford.edu/~amitp/Yapps/)
  * ["PyLR"] (http://starship.python.net/crew/scott/PyLR.html)
  * ["kwParsing"] (http://www.pythonpros.com/arw/kwParsing/kwParsing.html)
  * PyBison (http://www-db.stanford.edu/~hassan/Python/pybison.tar.gz)
  * ["mcf.parse"] (http://starship.python.net/crew/mcfletch/)
  * ["Trap"] (http://www.first.gmd.de/smile/trap/)

Also see Python ParserSig @ http://www.python.org/sigs/parser-sig/ to discuss and select a standard parser generator for Python.
(!) Please keep wiki links as wiki links, use external links only if there is no existing page for the tool.
||Name ||Grammar ||Module ||Python ||Comment ||
||[[http://docs.python.org/lib/module-shlex.html|shlex]] || ||C || ||included in the main Python distribution ||
||[[Grako]] ||[[http://en.wikipedia.org/wiki/Parsing_expression_grammar|PEG]] ||Python ||2.7+, 3.3+, PyPy ||Tool that takes grammars in a variant of EBNF as input and outputs memoizing (Packrat) PEG parsers in Python. Grako differs from other PEG parser generators in that the generated parsers use Python's very efficient exception-handling system to backtrack. ||
||[[http://reparse.readthedocs.org/en/latest/|reparse]] || ||Python/Regex ||2.x, 3.x ||Combines Regular Expressions ||
||[[http://packages.python.org/plex/|Plex]] || ||C || ||lexical analysis module for Python; foundation for [[Pyrex]] and [[http://cython.org|Cython]]. Plex 2.0.0 is Python 2 only, but the version embedded in Cython works in Python 3.x. There is also an [[https://github.com/uogbuji/plex3|experimental port to Python 3 (tested on Python 3.3)]]. ||
||[[http://pages.cpsc.ucalgary.ca/~aycock/spark/|Spark]] ||Earley Parser ||Python ||2.3+, 3.2+, PyPy ||[[https://pypi.org/project/spark-parser/|PyPI package]]; [[https://github.com/rocky/python-spark|github project]]. This parser is notably used in decompilers like [[https://pypi.org/project/uncompyle6/|uncompyle6]], where using an ''ambiguous'' grammar is desirable. ||
||[[Yapps]] ||LL(1) ||Python ||Yapps 1: any; Yapps 2: 1.5+ || ||
||[[http://starship.python.net/crew/scott/PyLR.html|PyLR - (broken link)]] ||LR(1) LALR(1) ||C || || ||
||[[http://gadfly.sourceforge.net/kwParsing.html|kwParsing]] || || || || ||
||PyBison || ||C || ||bison grammar with python code actions ||
||[[http://www.ercim.org/publication/Ercim_News/enw36/ernst.html|Trap]] ||LR || ||1.5.1+ || ||
||[[http://www.dabeaz.com/ply/|PLY]] ||SLR LALR(1) ||Python || ||Python Lex-Yacc ||
||[[http://christophe.delord.free.fr/tpg/index.html|ToyParserGenerator]] || || ||2.2+ || ||
||[[http://dparser.sourceforge.net/|DParser]] ||GLR ||C ||2.2+ ||grammar in doc strings ||
||[[http://www.lava.net/~newsham/pyggy/|PyGgy - (broken link)]] ||GLR ||Python ||2.2.1 || ||
||[[http://fdik.org/pyPEG|pyPEG]] ||PEG ||Python ||2.5+ || ||
||[[http://github.com/erikrose/parsimonious|parsimonious]] ||PEG ||Python ||2.5+ || ||
||[[http://simpleparse.sourceforge.net/|SimpleParse]] ||- || ||2.0+ ||requires mxTextTools ||
||[[http://www.biopython.org/DIST/docs/api/public/Martel-module.html|Martel]] || ||Python ||2.0+ ||requires mxTextTools ||
||[[http://www.lemburg.com/files/python/mxTextTools.html|mxTextTools]] ||- ||C || ||is not exactly a parser like we're used to, but it is a fast text-processing engine ||
||[[https://pyparsing-docs.readthedocs.io/en/latest/|pyparsing]] || ||Python ||2.2+ || ||
||[[http://me.opengroove.org/2011/06/parcon-new-parser-combinator-library.html|parcon]] || ||Python ||2.6+ ||Parser combinator library, similar to pyparsing ||
||[[http://www.antlr.org/|ANTLR]] ||LL1+ ||Python || ||stand-alone tool in Java. Latest version can produce Python code ||
||[[http://www.ncc.up.pt/FAdo/Yappy|Yappy]] ||LR(0) LR(1) SLR LALR(1) ||Python ||2.2+ || ||
||[[http://pypi.python.org/pypi/ZestyParser|ZestyParser]] || ||Python || ||Object-oriented, Pythonic parsing ||
||[[http://www.canonware.com/Parsing/|Parsing]] ||LR(1) ||Python ||2.5+ || ||
||[[http://msdl.cs.mcgill.ca/people/eposse/projects/aperiot|aperiot]] ||LL(1) ||Python || ||uses separate grammar files ||
||[[http://bitbucket.org/namenlos/yeanpypa|yeanpypa]] || ||Python || ||inspired by pyparsing and boost::spirit ||
||[[http://seehuhn.de/pages/wisent|Wisent]] ||LR(1) ||Python ||2.4+ ||has separate parser input file, parser output is a parse tree ||
||[[http://lparis45.free.fr/rp.html|RP]] ||na ||Python ||2.6+ ||Simple parser using rules defined in BNF format ||
||[[http://www.acooke.org/lepl/|LEPL]] ||Any ||Python ||2.6+, 3+ ||Recursive descent with full backtracking and optional memoisation (which lets it handle left-recursive grammars). This makes it equivalent to GLR, but based on an LL(k) core. ||
||[[http://pypi.python.org/pypi/modgrammar|modgrammar]] ||GLR ||Python ||3.1+ ||Recursive descent parser with full backtracking. Grammar elements and results are defined as Python classes, so are fully customizable. Supports ambiguous grammars. ||
||[[http://code.google.com/p/funcparserlib/|funcparserlib]] ||LL(*) ||Python ||2.4+ ||Recursive descent parsing library for Python based on functional combinators ||
||[[http://pydsl.org|pydsl]] ||- ||Python ||2.7+ 3+ || ||
||[[http://pypi.python.org/pypi/lrparsing|lrparsing]] ||LR(1) ||Python ||2.6+ ||A fast parser/lexer combination with a concise Pythonic interface. Lots of documentation, including example parsers for SQL and Lua. ||
||[[https://github.com/textX/Arpeggio|Arpeggio]] ||PEG ||Python ||2.7+, 3.2+ ||Packrat parser. Works as interpreter. Multiple syntaxes for grammar definition. Lots of docs, examples and tutorials. ||
||[[https://github.com/textX/textX|textX]] || ||Python ||2.7+, 3.2+ ||A high-level meta-language/parser for Domain-Specific Language implementation. Built on top of Arpeggio parser. Inspired by XText. Documentation, examples and tutorials available. ||
||[[https://github.com/transceptor-technology/pyleri|pyleri]] ||LR ||Python ||3.2+ ||A fast, stand-alone parser which can export a grammar to JavaScript ([[https://github.com/transceptor-technology/jsleri|jsleri]]), Go ([[https://github.com/transceptor-technology/goleri|goleri]]), C ([[https://github.com/transceptor-technology/libcleri|libcleri]]) or Java ([[https://github.com/transceptor-technology/jleri|jleri]]). ||
||[[https://github.com/igordejanovic/parglare|parglare]] ||LR/GLR ||Python ||2.7+, 3.3+ ||A pure Python LR/GLR parser with integrated scanner (scannerless). Grammar in BNF format. Automata/GLR trace visualization. Full documentation and examples available. ||
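
Several entries in the table above (Grako, Arpeggio, LEPL) mention memoizing (Packrat) parsing and backtracking. The following is only a minimal hand-written sketch of that idea using the standard library alone; it is not the code Grako or any other listed tool generates. The grammar and the names `ArithmeticParser`, `ParseFailure`, and `apply` are hypothetical and chosen for illustration. Whitespace is not handled.

```python
class ParseFailure(Exception):
    """Raised when a rule does not match at the given position."""


class ArithmeticParser:
    """Hypothetical PEG-style parser for sums and products of integers.

    Grammar (PEG notation):
        expr   <- term ('+' term)*
        term   <- factor ('*' factor)*
        factor <- number / '(' expr ')'
        number <- [0-9]+
    """

    def __init__(self, text):
        self.text = text
        # (rule name, position) -> (value, next position) or a ParseFailure
        self.memo = {}

    def apply(self, rule, pos):
        """Run a rule at pos, caching the outcome (the Packrat part)."""
        key = (rule.__name__, pos)
        if key not in self.memo:
            try:
                self.memo[key] = rule(pos)
            except ParseFailure as failure:
                self.memo[key] = failure
        outcome = self.memo[key]
        if isinstance(outcome, ParseFailure):
            raise outcome
        return outcome

    def literal(self, char, pos):
        if self.text[pos:pos + 1] != char:
            raise ParseFailure(f"expected {char!r} at {pos}")
        return pos + 1

    def number(self, pos):
        end = pos
        while end < len(self.text) and self.text[end] in "0123456789":
            end += 1
        if end == pos:
            raise ParseFailure(f"expected digit at {pos}")
        return int(self.text[pos:end]), end

    def factor(self, pos):
        # Ordered choice: try the first alternative, backtrack on failure.
        try:
            return self.apply(self.number, pos)
        except ParseFailure:
            pass
        pos = self.literal("(", pos)
        value, pos = self.apply(self.expr, pos)
        return value, self.literal(")", pos)

    def term(self, pos):
        value, pos = self.apply(self.factor, pos)
        while True:
            try:
                rhs, next_pos = self.apply(self.factor, self.literal("*", pos))
            except ParseFailure:
                return value, pos  # repetition ends; pos is left unchanged
            value, pos = value * rhs, next_pos

    def expr(self, pos):
        value, pos = self.apply(self.term, pos)
        while True:
            try:
                rhs, next_pos = self.apply(self.term, self.literal("+", pos))
            except ParseFailure:
                return value, pos
            value, pos = value + rhs, next_pos

    def parse(self):
        value, pos = self.apply(self.expr, 0)
        if pos != len(self.text):
            raise ParseFailure(f"unexpected input at {pos}")
        return value


print(ArithmeticParser("2+3*(4+1)").parse())
```

Because positions are passed explicitly and never mutated, backtracking amounts to catching `ParseFailure` and continuing from the old position, while the `memo` table ensures each rule is evaluated at most once per input position.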
Line 17: Line 45:
For some speed up one may use other parser generator systems and plug them in as modules.

For faster performance, one may use other parser generator systems and plug them in as modules.
Line 20: Line 50:
  * ["ANTLR"] (http://www.antlr.org/) C++ output (an older version with C output also available)
 * [[Spirit]] (http://spirit.sourceforge.net/) framework for writing EBNF as C++ code
 * FlexBisonModule (http://www.crsr.net/Software/FBModule.html)
 * [[http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&selm=61g0ff$39g$1@vishnu.jussieu.fr|cocktail compiler tools]] approach

An example of such usage is SeeGramWrap, available from Edward C. Jones's [[http://members.tripod.com/~edcjones/pycode.html|Python page]], which is a heavily revised and upgraded version of the ANTLR C parser that is in [[http://members.tripod.com/~edcjones/cgram.tar.gz|cgram]] (broken link). The latest version has been refactored to move some of the complexity from ANTLR to Python.

Martin von Loewis presented a paper at Python10, titled [[http://www.python.org/sigs/parser-sig/towards-standard.html|"Towards a Standard Parser Generator"]], that surveyed the available parser generators for Python.

Additional information on these and other parsers is available at [[https://github.com/webmaven/python-parsing-tools|Python Parsing Tools]].

== Books ==
 * Complete online textbook, titled [[http://dickgrune.com/Books/PTAPG_1st_Edition/|"Parsing Techniques - A Practical Guide"]].

Small discussion and evaluation of different parsers.


LanguageParsing (last edited 2022-01-27 20:15:14 by joente)
