For the benefit of object persistence, the pickle module supports the notion of a reference to an object outside the pickled data stream. Such objects are referenced by a ``persistent id'', which is just an arbitrary string of printable ASCII characters. The resolution of such names is not defined by the pickle module; it will delegate this resolution to user defined functions on the pickler and unpickler3.9.
To define external persistent id resolution, you need to set the persistent_id attribute of the pickler object and the persistent_load attribute of the unpickler object.
To pickle objects that have an external persistent id, the pickler
must have a custom persistent_id() method that takes an
object as an argument and returns either None
or the persistent
id for that object. When None
is returned, the pickler simply
pickles the object as normal. When a persistent id string is
returned, the pickler will pickle that string, along with a marker
so that the unpickler will recognize the string as a persistent id.
To unpickle external objects, the unpickler must have a custom persistent_load() function that takes a persistent id string and returns the referenced object.
Here's a silly example that might shed more light:
import pickle from cStringIO import StringIO src = StringIO() p = pickle.Pickler(src) def persistent_id(obj): if hasattr(obj, 'x'): return 'the value %d' % obj.x else: return None p.persistent_id = persistent_id class Integer: def __init__(self, x): self.x = x def __str__(self): return 'My name is integer %d' % self.x i = Integer(7) print i p.dump(i) datastream = src.getvalue() print repr(datastream) dst = StringIO(datastream) up = pickle.Unpickler(dst) class FancyInteger(Integer): def __str__(self): return 'I am the integer %d' % self.x def persistent_load(persid): if persid.startswith('the value '): value = int(persid.split()[2]) return FancyInteger(value) else: raise pickle.UnpicklingError, 'Invalid persistent id' up.persistent_load = persistent_load j = up.load() print j
In the cPickle module, the unpickler's persistent_load attribute can also be set to a Python list, in which case, when the unpickler reaches a persistent id, the persistent id string will simply be appended to this list. This functionality exists so that a pickle data stream can be ``sniffed'' for object references without actually instantiating all the objects in a pickle3.10. Setting persistent_load to a list is usually used in conjunction with the noload() method on the Unpickler.