Credit: Luther Blissett
PostgreSQL 7.2 supports large objects, and the
psycopg
module supplies a
Binary
escaping function:
import psycopg, cPickle # Connect to a DB, e.g., the test DB on your localhost, and get a cursor connection = psycopg.connect("dbname=test") cursor = connection.cursor( ) # Make a new table for experimentation cursor.execute("CREATE TABLE justatest (name TEXT, ablob BYTEA)") try: # Prepare some BLOBs to insert in the table names = 'aramis', 'athos', 'porthos' data = {} for name in names: datum = list(name) datum.sort( ) data[name] = cPickle.dumps(datum, 1) # Perform the insertions sql = "INSERT INTO justatest VALUES(%s, %s)" for name in names: cursor.execute(sql, (name, psycopg.Binary(data[name])) ) # Recover the data so you can check back sql = "SELECT name, ablob FROM justatest ORDER BY name" cursor.execute(sql) for name, blob in cursor.fetchall( ): print name, cPickle.loads(blob), cPickle.loads(data[name]) finally: # Done. Remove the table and close the connection. cursor.execute("DROP TABLE justatest") connection.close( )
PostgreSQL supports binary data (BYTEA and variations thereof), but
you need to be careful when communicating such data via SQL.
Specifically, when you use a normal
INSERT
SQL statement and need to have
binary strings among the VALUES
you’re inserting, you need to escape some characters
in the binary string according to PostgreSQL’s own
rules. Fortunately, you don’t have to figure out
those rules for yourself: PostgreSQL supplies functions that do all
the needed escaping, and psycopg
exposes such a
function to your Python programs as the Binary
function. This recipe shows a typical case: the BYTEAs
you’re inserting come from
cPickle.dumps
, so they may represent almost
arbitrary Python objects (although, in this case,
we’re just using them for a few lists of
characters). The recipe is purely demonstrative and works by creating
a table and dropping it at the end (using a
try/finally
statement to ensure finalization is
performed even if the program terminates because of an uncaught
exception).
Earlier PostgreSQL releases put limits of a few KB on the amount of
data you could store in a normal field of the database. To store
really large objects, you needed to use roundabout techniques to load
the data into the database (such as PostgreSQL’s
nonstandard SQL function LO_IMPORT
to load a
datafile as an object, which requires superuser privileges and
datafiles that reside on the machine running the PostgreSQL server)
and store a field of type OID
in the table to be
used later for indirect recovery of the data. Fortunately, none of
these techniques are necessary anymore: since Release 7.1 (the
current release at the time of writing is 7.2.1), PostgreSQL embodies
the results of project TOAST, which removes the limitations on
field-storage size and therefore the need for peculiar indirection.
psycopg
supplies the handy
Binary
function to let you escape any binary
string of bytes into a form acceptable for placeholder substitution
in INSERT
and UPDATE
SQL
statements.
Recipe 8.7 for a MySQL-oriented solution to the same problem; PostgresSQL’s home page (http://www.postgresql.org/); the Python/PostgreSQL module (http://initd.org/software/psycopg).