Wednesday, December 5, 2012

Persistent but not Persistent

When is a persistent object not necessarily persistent?

You might think that persistent means that the object can be stored in some durable data store. But there is another meaning [1].

This other meaning describes an optimal way of dealing with data structures. Instead of mutating these data structures, they are decorated with an object that represents the changed structure.

This is easier for some structures than others. For instance, in a linked list each element only knows about the element in front of it. So, if you wanted to add a new chain to the end of the list, you'd just create a new object that represented that chain and whose head pointed at the tail of the original list.

The important thing to remember is that the original list has not changed. So, there is no need to worry about mutexes or race conditions.

There are downsides. If you want to insert in the middle of a single linked list, you'd have to copy the elements in the old list that followed your insertion point into your new chain.

Also, this technique does not work with some data structures, eg double linked lists.

That said, the Java collection classes employ this technique to good effect. If you look at TreeMap.subSet(...) for instance you'll see a new set is returned that contains a subset of the original by wrapping it.

Note: not only do changes in one set affect the other, the original data structure stays in memory even if all other references to it disappear. We found this the hard way when we thought we'd discard elements of a large list but found we were still consuming large amounts of memory.

[1] Persistent Data Structures.

No comments:

Post a Comment