PyConFrancescAlted

Processing And Analyzing Extremely Large Amounts Of Data In Python

Abstract

Scientific applications often need to save and read extremely large amounts of data, frequently gathered from experimental devices. Analyzing the data requires re-reading it many times in order to select the subset that best reflects the scenario under study. In general, the gathered data does not need to be modified (except perhaps to enlarge the dataset); it only needs to be accessed many times from multiple points of entry.

The goal of [http://pytables.sourceforge.net PyTables] is to address these requirements by enabling the end user to easily manipulate scientific data tables, numarray objects and Numerical Python objects in a persistent, hierarchical structure.

Capabilities

In my talk, I will describe the capabilities of the forthcoming PyTables 0.3 release, including:

  • Appendable tables: records can be added to tables that already exist.
  • Unlimited data size: tables can hold a large number of records, including more than fit in memory.
  • Support for Numeric and numarray Python arrays.
  • Hierarchical data model: PyTables builds an object tree in memory that replicates the hierarchical structure existing on disk.
  • Data compression: supported out of the box through the zlib library.
  • Support for files bigger than 2 GB.
  • Ability to read generic HDF5 files and work natively with them.
  • Architecture-independent: PyTables has been carefully coded (like HDF5 itself) with little-endian/big-endian byte-ordering issues in mind.
  • Optimized I/O: PyTables has been designed from the ground up with performance in mind. We will see some benchmarks comparing PyTables' speed with that of other databases.
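
To make a few of these ideas concrete (appending records, reading data that may not fit in memory, a fixed byte order, and zlib compression), here is a minimal sketch that uses only the Python standard library. It is not the PyTables API and does not use HDF5. The record layout, the file name and the helper names (append_records, iter_records, select, demo) are assumptions made up for illustration.

```python
import os
import struct
import tempfile
import zlib

# Hypothetical record layout: a 4-byte integer id and an 8-byte float reading.
# "<" forces little-endian with no padding, so the bytes on disk do not
# depend on the machine that wrote them.
RECORD = struct.Struct("<id")


def append_records(path, records):
    """Append (id, value) records to the end of the file without rewriting it."""
    with open(path, "ab") as f:
        for rec_id, value in records:
            f.write(RECORD.pack(rec_id, value))


def iter_records(path, chunk_records=1024):
    """Yield records one by one, holding at most one chunk in memory."""
    chunk_bytes = RECORD.size * chunk_records
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            yield from RECORD.iter_unpack(chunk)


def select(path, predicate):
    """Re-read the whole file and keep only the records accepted by predicate."""
    return [rec for rec in iter_records(path) if predicate(*rec)]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "readings.bin")
        append_records(path, [(i, i * 0.5) for i in range(5)])
        # Enlarge the dataset later; the existing records are left untouched.
        append_records(path, [(i, i * 0.5) for i in range(5, 10)])
        count = os.path.getsize(path) // RECORD.size
        selected = select(path, lambda rec_id, value: value >= 4.0)
        # zlib round trip over the raw bytes: compression is lossless.
        with open(path, "rb") as f:
            raw = f.read()
        lossless = zlib.decompress(zlib.compress(raw)) == raw
    return count, selected, lossless


if __name__ == "__main__":
    print(demo())
```

The sketch stores a single flat table per file. A hierarchical layout, typed columns and transparent per-chunk compression are what a format such as HDF5 adds on top of this basic idea.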


PyTables (last edited 2008-11-15 14:01:27 by localhost)