:mod:`recsql.rest_table` --- Parse a simple reST table
======================================================

Turn a `restructured text simple table`_ into a numpy array. See the Example_
below for what the table must look like. The module allows parameters and data
to be included in the documentation itself in a natural way. The parameters are
thus documented automatically and exist in only one place. The idea is inspired
by `literate programming`_ and follows the DRY_ ("Don't repeat yourself")
principle.

.. _restructured text simple table:
    http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#simple-tables
.. _literate programming:
    http://en.wikipedia.org/wiki/Literate_programming
.. _DRY:
    http://c2.com/cgi/wiki?DontRepeatYourself

Limitations
-----------

Note that the full specification of the original `restructured
text simple table`_ is not supported. To keep the parser simple,
the following additional restrictions apply:

* All row data must be on a single line.
* Column spans are not supported.
* Each heading must be a single word that is legal in both SQL and
  Python, because headings are used as column names.
* The delimiters are used to extract the fields. Only data within the
  range of the '=====' markers is used. Thus, each column marker
  *must* span the full width of the data in its column. Otherwise,
  data will be lost.
* The keyword 'Table' must precede the first marker line and the table
  name must be provided in square brackets; the table name should be a
  valid SQL identifier.
* Currently, only a *single* table can be present in the string.
* Autoconversion of list fields might not always work.
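
The marker-range restriction above can be illustrated with a minimal sketch.
It assumes that fields are cut out by slicing each data line at the character
ranges of the '=====' runs; the helper names ``column_spans`` and ``split_row``
are hypothetical and are not part of :mod:`recsql.rest_table`.

```python
import re

def column_spans(rule):
    """Return the (start, end) character range of each '====' run in a rule line."""
    return [m.span() for m in re.finditer(r"=+", rule)]

def split_row(line, spans):
    """Cut one data line into fields using the column ranges of the rule line."""
    return [line[start:end].strip() for start, end in spans]

rule    = "=============  ==========  ========="
row     = "A. Einstein    42          1921"
# The last value is wider than its 9-character marker.
overrun = "P. Dirac       31          1933-12-10"

spans = column_spans(rule)
fields = split_row(row, spans)
# Anything to the right of a marker's end is silently dropped.
clipped = split_row(overrun, spans)
```

Here ``fields`` holds the three values of the first row, whereas the last entry
of ``clipped`` loses its final character because the value extends beyond the
marker. This is the kind of data loss the restriction warns about.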


Example
-------

The following table is converted::

  Table[laureates]: Physics Nobel prize statistics.
  =============  ==========  =========
  name           age         year
  =============  ==========  =========
  A. Einstein    42          1921
  P. Dirac       31          1933
  R. P. Feynman  47          1965
  =============  ==========  =========

with

  >>> import recsql.rest_table as T
  >>> P = T.Table2array(T.__doc__)
  >>> P.recarray()
  rec.array([(u'A. Einstein', 42, 1921), (u'P. Dirac', 31, 1933),
       (u'R. P. Feynman', 47, 1965)], 
      dtype=[('name', '<U52'), ('age', '<i4'), ('year', '<i4')])


Module content
--------------

The only class that the user really needs to know anything about is
:class:`recsql.rest_table.Table2array`.

.. autoclass:: Table2array
   :members: __init__, recarray

.. autoexception:: ParseError

Classes
  ParseError
    Signifies a failure to parse.
  Table2array
    Primitive parser that converts a simple reST table into ``numpy.recarray``.
Variables
  TABLE = re.compile(...
    Python regular expression that finds a *single* table in a multi-line string.
  EMPTY_ROW = re.compile(...
    Python regular expression that detects an empty (non-data) line in a reST table.
Variables Details

TABLE

Python regular expression that finds a *single* table in a multi-line string.

Value:
re.compile("""
                   ^[ \t]*Table(\[(?P<name>\w*)\])?:\s*(?P<title>[^\n]*)[ \t]*$     # 'Table[name]:' is required
                   [\n]+
                   ^(?P<toprule>[ \t]*==+[ \t=]+)[ \t]*$  # top rule
                   [\n]+
                   ^(?P<fields>[\w\t ]+?)$                # field names (columns), must only contain A-z0-9_
...
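
As a rough illustration of how the header line is recognized, the sketch below
compiles only the first line of the pattern shown above. The name ``HEADER``
is hypothetical, and the use of ``re.MULTILINE`` is an assumption, because the
flags are not visible in the truncated value.

```python
import re

# First line of the TABLE pattern only; the rest of the value is truncated above.
# re.MULTILINE is assumed so that ^ and $ match at line boundaries.
HEADER = re.compile(
    r"^[ \t]*Table(\[(?P<name>\w*)\])?:\s*(?P<title>[^\n]*)[ \t]*$",
    re.MULTILINE,
)

text = """Some leading prose.

Table[laureates]: Physics Nobel prize statistics.
=============  ==========  =========
"""

m = HEADER.search(text)
name = m.group("name")
title = m.group("title")

# In this fragment the bracketed name is syntactically optional.
unnamed = HEADER.search("Table: no name given").group("name")
```

With this fragment, ``name`` is the table name from the square brackets and
``title`` is the remainder of the line. A header without brackets still
matches, but its ``name`` group is ``None``.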

EMPTY_ROW

Python regular expression that detects an empty (non-data) line in a reST table. It acts on a single input line and not on a multi-line string.

Value:
re.compile("""
                   ^[-\s]*$       # white-space lines or '----' dividers are ignored (or '-- - ---')
                   """, re.VERBOSE)
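
The effect of this pattern can be checked line by line. The sketch below
recompiles the expression from the value shown above (written as a raw string,
which is an assumption about the source) and uses it to filter sample lines.
The filtering loop is only an illustration, not the module's actual parsing
code.

```python
import re

EMPTY_ROW = re.compile(r"""
    ^[-\s]*$       # white-space lines or '----' dividers are ignored (or '-- - ---')
    """, re.VERBOSE)

lines = [
    "A. Einstein    42          1921",
    "",
    "   ",
    "-------------  ----------  ---------",
    "-- - ---",
    "P. Dirac       31          1933",
]

# Keep only the lines that carry data; the pattern is applied to one line at a time.
data = [line for line in lines if not EMPTY_ROW.match(line)]
```

Only the two lines with actual values remain in ``data``; blank lines and
dash dividers are skipped.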