Revision 6 as of 2008-11-15 14:01:15
Intended Audience
Beginning to intermediate programmers. A basic working knowledge of Python is assumed.
Summary
This tutorial introduces beginning to intermediate programmers to useful Python tools & techniques for text and data processing. Topics include regular expressions, filtering data with generators, and parsing.
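As a rough illustration of how these topics fit together, the sketch below filters log lines with a generator and a regular expression. The log format, the `ERROR_RE` pattern, and the `error_lines` helper are assumptions invented for this example; they are not taken from the tutorial materials.

```python
import io
import re

# Hypothetical log format (an assumption): "<date> <time> <LEVEL> <message>"
SAMPLE_LOG = """\
2008-11-15 14:01:15 INFO server started
2008-11-15 14:02:03 ERROR disk full
2008-11-15 14:02:40 INFO request served
2008-11-15 14:03:11 ERROR connection lost
"""

# Assumed pattern: capture the message of lines whose level is ERROR.
ERROR_RE = re.compile(r"^\S+ \S+ ERROR (?P<message>.*)$")


def error_lines(lines):
    """Yield the message of each ERROR line, one at a time."""
    for line in lines:
        match = ERROR_RE.match(line.rstrip("\n"))
        if match:
            yield match.group("message")


# io.StringIO stands in for a real file; iterating over an open file
# ("for line in file") behaves the same way.
log_file = io.StringIO(SAMPLE_LOG)
errors = list(error_lines(log_file))
print(errors)
```

Because `error_lines` is a generator, it reads one line at a time, so the same approach should also work on log files too large to hold in memory.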
Outline
- Common data sources needing processing:
  - log files
  - CSV
  - tabular data
  - XML
- Tools & techniques:
  - lists & dictionaries
  - s.join(list) instead of accumulating
  - for line in file
  - filters, large data sources: generators
  - decorate-sort-undecorate
  - StringIO
- Regular expressions:
  - pattern matching
  - filtering
  - substitution
  - splitting
- Parsing:
  - text.split()
  - text.find()
  - regular expressions
  - "real" parsers (including XML)
  - state machines
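
A few of the outline items can be sketched in a handful of lines. This is a minimal illustration only: the sample data and the variable names (`words`, `names`, `line`) are made up for the example and do not come from the tutorial itself.

```python
# Build a string with str.join() instead of accumulating with "+=".
words = ["text", "and", "data", "processing"]
sentence = " ".join(words)

# Decorate-sort-undecorate: sort names by length, ties broken by name.
names = ["generators", "csv", "xml", "parsing"]
decorated = [(len(name), name) for name in names]  # decorate
decorated.sort()                                   # sort
by_length = [name for _, name in decorated]        # undecorate

# Simple parsing of a hypothetical "key = value" line with split() and find().
line = "timeout = 30"
key, value = [part.strip() for part in line.split("=", 1)]
position = line.find("=")  # index of the separator, or -1 if absent

print(sentence)
print(by_length)
print(key, value, position)
```

In current Python versions, `sorted(names, key=lambda n: (len(n), n))` gives the same ordering as the decorate-sort-undecorate steps; the explicit form is shown because the outline names the idiom.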
Please send feedback & ideas for further specific topics to the trainer, David Goodger (email, home page).