Intended Audience
Beginning to intermediate programmers. A basic working knowledge of Python is assumed.
Summary
This tutorial introduces beginning to intermediate programmers to useful Python tools and techniques for text and data processing. Topics include regular expressions, filtering data with generators, and parsing.
Outline
- Common data sources needing processing:
  - log files
  - CSV
  - tabular data
  - XML
- Tools & techniques:
  - lists & dicts
  - s.join(list) instead of accumulating
  - for line in file
  - filters, large data sources: generators
  - decorate-sort-undecorate
  - StringIO
- Regular expressions:
  - pattern matching
  - filtering
  - substitution
  - splitting
- Parsing:
  - s.split()
  - s.find()
  - regular expressions
  - "real" parsers
  - state machines
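
To give a flavor of how several of the outlined techniques fit together, here is a minimal sketch. It is not taken from the tutorial materials. The sample log lines, the line format, and all function names are assumptions made up for illustration. It combines StringIO, "for line in file", a regular expression, generator filters, decorate-sort-undecorate, and s.join(list).

```python
import io
import re

# Hypothetical sample data standing in for a real log file.
SAMPLE_LOG = """\
2024-01-05 ERROR disk full
2024-01-03 INFO started
2024-01-04 ERROR timeout after 30s
"""

# Assumed line format: date, level, free-text message.
LINE_RE = re.compile(r"(\d{4}-\d{2}-\d{2}) (\w+) (.*)")


def parse(lines):
    """Generator: yield (date, level, message) for each matching line."""
    for line in lines:  # works for any iterable of lines, including files
        match = LINE_RE.match(line)
        if match:
            yield match.groups()


def errors_only(records):
    """Generator filter: pass through only ERROR records, lazily."""
    return (rec for rec in records if rec[1] == "ERROR")


def sort_by_message_length(records):
    """Decorate-sort-undecorate: sort records by message length."""
    # Decorate with the sort key; the index keeps ties in input order.
    decorated = [(len(rec[2]), i, rec) for i, rec in enumerate(records)]
    decorated.sort()
    # Undecorate: drop the key and index, keep the record.
    return [rec for _, _, rec in decorated]


def report(records):
    """Build the output with one join instead of repeated concatenation."""
    return "\n".join("%s: %s" % (date, msg) for date, _, msg in records)


log_file = io.StringIO(SAMPLE_LOG)  # file-like object backed by a string
errors = sort_by_message_length(errors_only(parse(log_file)))
summary = report(errors)
print(summary)
```

With the sample data above, this prints the "disk full" record first, then the "timeout after 30s" record. In current Python, sorted(records, key=...) does the same job as the explicit decorate-sort-undecorate steps; the longhand form is shown only because the outline names the idiom.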
Trainer: David Goodger