Package recsql :: Module sqlarray :: Class SQLarray
[hide private]
[frames] | no frames]

Class SQLarray

source code

object --+
         |
        SQLarray

A SQL table that returns (mostly) rec arrays.

.. method:: SQLarray([name[,records[,columns[,cachesize=5,connection=None]]]])

:Arguments:
   name        
      table name (can be referred to as '__self__' in SQL queries)
   records    
      numpy record array that describes the layout and initializes the
      table OR any iterable (and then columns must be set, too) OR a string
      that contains a single, *simple reStructured text table* (and the table name is
      set from the table name in the reST table.)
      If ``None`` then simply associate with existing table name.
   filename
      Alternatively to *records*, read a reStructured table from *filename*.
   columns
      sequence of column names (only used if records does not have 
      attribute dtype.names) [``None``]
   cachesize   
      number of (query, result) pairs that are cached [5]
   connection  
      If not ``None``, reuse this connection; this adds a new table to the same 
      database, which allows more complicated queries with cross-joins. The 
      table's connection is available as the attribute T.connection. [``None``]
   is_tmp
      ``True``: create a tmp table; ``False``: regular table in db [``False``] 

:Bugs:        
   * :exc:`InterfaceError`: *Error binding parameter 0 - probably unsupported type*

     In this case the recarray contained types such as ``numpy.int64`` that are not
     understood by sqlite. Either convert the data manually (by setting the numpy 
     dtypes yourself on the recarray, or better: feed a simple list of tuples ("records")
     to this class in *records*. Make sure that these tuples only contain standard python types.
     Together with *records* you will also have to supply the names of the data columns
     in the keyword argument *columns*.

     If you are reading from a file then it might be simpler to
     use :func:`recsql.sqlarray.SQLarray_fromfile`.

Instance Methods [hide private]
 
__init__(self, name=None, records=None, filename=None, columns=None, cachesize=5, connection=None, is_tmp=False, **kwargs)
Build the SQL table from a numpy record array.
source code
 
merge(self, recarray, columns=None)
Merge another recarray with the same columns into this table.
source code
 
merge_table(self, name)
Merge an existing table in the database with the __self__ table.
source code
 
sql_index(self, index_name, column_names, unique=True)
Add a named index on given columns to improve performance.
source code
 
sql_select(self, fields, *args, **kwargs)
Execute a simple SQL ``SELECT`` statement and returns values as new numpy rec array.
source code
 
SELECT(self, fields, *args, **kwargs)
Execute a simple SQL ``SELECT`` statement and returns values as new numpy rec array.
source code
 
sql(self, SQL, parameters=None, asrecarray=True, cache=True)
Execute sql statement.
source code
 
limits(self, variable)
Return minimum and maximum of variable across all rows of data.
source code
 
selection(self, SQL, parameters=None, **kwargs)
Return a new SQLarray from a SELECT selection.
source code
 
_init_sqlite_functions(self)
additional SQL functions to the database
source code
 
__len__(self)
Number of rows in the table.
source code
 
__del__(self)
Delete the underlying SQL table from the database.
source code

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Variables [hide private]
  tmp_table_name = '__tmp_merge_table'
  recarray = property(** recarray())
Properties [hide private]

Inherited from object: __class__

Method Details [hide private]

__init__(self, name=None, records=None, filename=None, columns=None, cachesize=5, connection=None, is_tmp=False, **kwargs)
(Constructor)

source code 

Build the SQL table from a numpy record array.

Overrides: object.__init__

merge(self, recarray, columns=None)

source code 
Merge another recarray with the same columns into this table.

:Arguments:
   recarray    
      numpy record array that describes the layout and initializes the
      table

:Returns:
   n           number of inserted rows
   
:Raises:
   Raises an exception if duplicate and incompatible data exist
   in the main table and the new one.

merge_table(self, name)

source code 
Merge an existing table in the database with the __self__ table.

Executes as ``'INSERT INTO __self__ SELECT * FROM <name>'``.
However, this method is probably used less often than the simpler :meth:`merge`.

:Arguments:
   name         name of the table in the database (must be compatible with __self__)

:Returns:
   n            number of inserted rows

sql_select(self, fields, *args, **kwargs)

source code 
Execute a simple SQL ``SELECT`` statement and returns values as new numpy rec array.

The arguments *fields* and the additional optional arguments
are simply concatenated with additional SQL statements
according to the template::

   SELECT <fields> FROM __self__ [args]

The simplest fields argument is ``"*"``.

Example:
   Create a recarray in which students with average grade less than
   3 are listed::

     result = T.SELECT("surname, subject, year, avg(grade) AS avg_grade",
                    "WHERE avg_grade < 3", "GROUP BY surname,subject",
                    "ORDER BY avg_grade,surname")

   The resulting SQL would be::

     SELECT surname, subject, year, avg(grade) AS avg_grade FROM __self__
          WHERE avg_grade < 3
          GROUP BY surname,subject
          ORDER BY avg_grade,surname

   Note how one can use aggregate functions such avg().

   The string *'__self__'* is automatically replaced with the table
   name (``T.name``); this can be used for cartesian products such as ::

      LEFT JOIN __self__ WHERE ...

.. Note:: See the documentation for :meth:`~SQLarray.sql` for more details on
          the available keyword arguments and the use of ``?`` parameter
          interpolation.

SELECT(self, fields, *args, **kwargs)

source code 
Execute a simple SQL ``SELECT`` statement and returns values as new numpy rec array.

The arguments *fields* and the additional optional arguments
are simply concatenated with additional SQL statements
according to the template::

   SELECT <fields> FROM __self__ [args]

The simplest fields argument is ``"*"``.

Example:
   Create a recarray in which students with average grade less than
   3 are listed::

     result = T.SELECT("surname, subject, year, avg(grade) AS avg_grade",
                    "WHERE avg_grade < 3", "GROUP BY surname,subject",
                    "ORDER BY avg_grade,surname")

   The resulting SQL would be::

     SELECT surname, subject, year, avg(grade) AS avg_grade FROM __self__
          WHERE avg_grade < 3
          GROUP BY surname,subject
          ORDER BY avg_grade,surname

   Note how one can use aggregate functions such avg().

   The string *'__self__'* is automatically replaced with the table
   name (``T.name``); this can be used for cartesian products such as ::

      LEFT JOIN __self__ WHERE ...

.. Note:: See the documentation for :meth:`~SQLarray.sql` for more details on
          the available keyword arguments and the use of ``?`` parameter
          interpolation.

sql(self, SQL, parameters=None, asrecarray=True, cache=True)

source code 
Execute sql statement. 

:Arguments:
   SQL : string
      Full SQL command; can contain the ``?`` place holder so that values
      supplied with the ``parameters`` keyword can be interpolated using
      the ``pysqlite`` interface.
   parameters : tuple
      Parameters for ``?`` interpolation.
   asrecarray : boolean
      ``True``: return a ``numpy.recarray`` if possible;
      ``False``: return records as a list of tuples. [``True``]
   cache : boolean
      Should the results be cached? Set to ``False`` for large queries to
      avoid memory issues. Queries with ``?`` place holders are never cached.
      [``True``]
      
.. warning::
   There are **no sanity checks** applied to the SQL. 

If  possible, the  returned list  of tuples  is turned  into a
numpy record  array, otherwise the original list  of tuples is
returned.

.. warning::
   Potential BUG: if there are memory issues then it can
   happen that we just silently fall back to a tuple even
   though calling code expects a recarray; because we
   swallowed ANY exception the caller will never know

The last cachesize queries are cached (for cache=True) and are
returned directly unless the table has been modified.

.. Note:: '__self__' is substituted with the table name. See the doc
          string of the :meth:`SELECT` method for more details.

selection(self, SQL, parameters=None, **kwargs)

source code 

Return a new SQLarray from a SELECT selection.

This is a very useful method because it allows one to build complicated selections and essentially new tables from existing data.

Examples:

       s = selection('a > 3')
       s = selection('a > ?', (3,))
       s = selection('SELECT * FROM __self__ WHERE a > ? AND b < ?', (3, 10))