Credit: John E. Barham
Thread-specific storage is a useful pattern, and Python versions before 2.4 (which introduced threading.local) do not support it directly. A simple dictionary, protected by a lock, makes it pretty easy to program. For once, it’s slightly more general, and not significantly harder, to program to the lower-level thread module, rather than to the more common, higher-level threading module that Python also offers on top of it:
try:
    import thread    # low-level threads module (renamed _thread in Python 3)
except ImportError:
    # We're running on a single-threaded OS (or the Python interpreter has
    # not been compiled to support threads), so return a standard dictionary.
    _tss = {}
    def get_thread_storage():
        return _tss
else:
    # We do have threads; so, to work:
    _tss = {}
    _tss_lock = thread.allocate_lock()
    def get_thread_storage():
        """ Return a thread-specific storage dictionary. """
        thread_id = thread.get_ident()    # Identify the calling thread
        tss = _tss.get(thread_id)
        if tss is None:                   # First time being called by this thread
            _tss_lock.acquire()           # Entering critical section
            try:
                _tss[thread_id] = tss = {}    # Create thread-specific dictionary
            finally:
                _tss_lock.release()
        return tss
The get_thread_storage function in this recipe returns a thread-specific storage dictionary. It is a generalization of the get_transaction function from ZODB, the object database underlying Zope. The returned dictionary can be used to store data that is private to the thread.
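As a rough illustration of that privacy, the sketch below starts two threads that each write to the dictionary they get back. It assumes Python 3, where the low-level module is named _thread, and it restates the recipe's function so the example runs on its own. The names worker, seen, and barrier are made up for this example.

```python
import _thread      # Python 3 name of the low-level thread module
import threading

_tss = {}
_tss_lock = _thread.allocate_lock()

def get_thread_storage():
    """Return a dictionary private to the calling thread."""
    thread_id = _thread.get_ident()
    tss = _tss.get(thread_id)
    if tss is None:                  # first call from this thread
        with _tss_lock:
            _tss[thread_id] = tss = {}
    return tss

# Hypothetical demo: the barrier keeps both threads alive at the same time,
# so their identifiers (and therefore their dictionaries) must differ.
barrier = threading.Barrier(2)
seen = {}

def worker(name):
    storage = get_thread_storage()
    storage["owner"] = name          # visible only through this thread's dictionary
    barrier.wait()
    seen[name] = storage

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen["a"] is seen["b"])        # the two threads got different dictionaries
print(seen["a"], seen["b"])
```

Because the dictionaries are keyed by thread identifier, a thread started after another one has finished may be handed the finished thread's identifier, and with it the old dictionary. The barrier in the demo exists only to rule that case out.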
One benefit of multithreaded programs is that all of the threads can share global objects. Often, however, each thread needs some storage of its own, for example, to hold a network or database connection unique to itself. Indeed, such externally oriented objects are best kept under the control of a single thread, which avoids race conditions and other highly peculiar behavior.
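One way to apply this to connections is a small accessor that opens a connection the first time a thread asks for one and then keeps handing back the same object. The sketch below assumes Python 3 (module _thread) and repeats the recipe's function so that it is self-contained. FakeConnection and get_connection are hypothetical names standing in for whatever real connection type a program uses.

```python
import _thread      # Python 3 name of the low-level thread module
import threading

_tss = {}
_tss_lock = _thread.allocate_lock()

def get_thread_storage():
    """Return a dictionary private to the calling thread."""
    thread_id = _thread.get_ident()
    tss = _tss.get(thread_id)
    if tss is None:
        with _tss_lock:
            _tss[thread_id] = tss = {}
    return tss

class FakeConnection:
    """Hypothetical stand-in for a real network or database connection."""
    def __init__(self):
        self.owner = threading.current_thread().name
        self.queries = []

    def execute(self, query):
        self.queries.append(query)

def get_connection():
    """Return the calling thread's connection, opening it on first use."""
    storage = get_thread_storage()
    if "connection" not in storage:
        storage["connection"] = FakeConnection()
    return storage["connection"]

barrier = threading.Barrier(2)       # keep both workers alive together
connections = []

def worker():
    conn = get_connection()
    conn.execute("SELECT 1")
    get_connection().execute("SELECT 2")   # same object on the second call
    barrier.wait()
    connections.append(conn)

threads = [threading.Thread(target=worker, name="worker-%d" % i) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker ends up with its own FakeConnection, and no connection object is ever touched by more than one thread.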
The get_thread_storage function returns a dictionary object that is unique to each thread. For an exhaustive treatment of thread-specific storage (albeit aimed at C++ programmers), see http://www.cs.wustl.edu/~schmidt/PDF/TSS-pattern.pdf.
A useful extension would be to add a delete_thread_storage function, particularly if a way could be found to automate its being called upon thread termination. Python’s threading architecture does not make this task particularly easy. You could spawn a watcher thread to do the deletion after a join with the calling thread, but that might be rather heavyweight. The recipe as presented, without deletion, is quite appropriate for the common architecture in which you have a pool of (typically daemonic) worker threads that are spawned at the start and do not go away until the end of the whole process.
“Thread-specific Storage: an Object Behavioral Pattern for Efficiently Accessing per-Thread State”, by Douglas Schmidt, Timothy Harrison, and Nat Pryce (http://www.cs.wustl.edu/~schmidt/PDF/TSS-pattern.pdf).