May 21, 2013

Support your local caching

Suppose you had an expensive function xyz() whose output does not change often. Caching sounds appropriate. You may be inclined to memoize the function at the source, maybe by adding a decorator.

@memoized
def xyz(a, b):
    ...

result = xyz('foo', 'bar')
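The `memoized` decorator is not defined in this post. As a rough idea of what such a decorator might look like, here is a minimal sketch that assumes positional, hashable arguments and results that never expire. The `calls` list and the body of `xyz` are purely illustrative stand-ins.

```python
import functools

def memoized(func):
    """Cache results forever, keyed on the positional arguments."""
    results = {}

    @functools.wraps(func)
    def wrapper(*args):
        # Only call through the first time a given argument tuple is seen
        if args not in results:
            results[args] = func(*args)
        return results[args]

    return wrapper

calls = []  # illustrative: records each real invocation

@memoized
def xyz(a, b):
    calls.append((a, b))
    return a + b  # stand-in for the expensive work

first = xyz('foo', 'bar')
second = xyz('foo', 'bar')  # served from the cache; the body of xyz does not run again
```

Every caller of `xyz` gets this policy, whether or not it suits them.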

You have just added behavior that downstream users are likely to end up depending on, and you have imposed a design decision on them. This is fine in most cases.

But perhaps not all users need xyz() cached. Perhaps the results stay valid only for a while, and only the user knows exactly how long. Perhaps the caching behavior depends on factors you cannot know ahead of time. Now you have a problem: making one simple decision that works for everyone is no longer trivial.

To avoid contriving a configuration scheme for all of this, consider caching at the point of use instead, at the function call itself, and only where you need it.

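import time
from collections import namedtuple

Stamped = namedtuple('Stamped', 'stamp obj')
cache = {}

def cached(ttl, func, *args, **kw):
    'Simple inline cache for function calls'

    # Key on the function name and all arguments, keyword ones included
    key = (func.__name__,) + args + tuple(sorted(kw.items()))
    entry = cache.get(key)
    now = time.time()

    # Call through on a miss, or when the entry is older than ttl seconds
    if entry is None or (now - entry.stamp) > ttl:
        # Overwrite any expired entry with a fresh one
        entry = cache[key] = Stamped(now, func(*args, **kw))
    return entry.obj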

Then the calling code might become:

...
# ttl is the number of seconds until a cached value expires
ttl = 60 * some_number_of_minutes_only_i_know

result = cached(ttl, xyz, 'foo', 'bar')
...

Now your caching can be as dynamic as needed, other users need not know the details, and the library function can stay simple. The caching behavior stays with the rest of the user logic.
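To make the "as dynamic as needed" point concrete, here is a small sketch in which two callers apply different lifetimes to the same call. It restates a compact version of the helper so that it runs on its own. The module-level `clock` hook, the `fake_now` list, and the `calls` list are assumptions added only so the example can simulate time passing without sleeping.

```python
import time
from collections import namedtuple

Stamped = namedtuple('Stamped', 'stamp obj')
cache = {}
clock = time.time  # hypothetical hook: lets the demo substitute a fake clock

def cached(ttl, func, *args, **kw):
    key = (func.__name__,) + args + tuple(sorted(kw.items()))
    entry = cache.get(key)
    now = clock()
    if entry is None or (now - entry.stamp) > ttl:
        entry = cache[key] = Stamped(now, func(*args, **kw))
    return entry.obj

calls = []  # illustrative: records each real invocation

def xyz(a, b):
    calls.append((a, b))
    return a + b  # stand-in for the expensive work

fake_now = [1000.0]
clock = lambda: fake_now[0]  # swap in a clock the demo controls

# A caller that tolerates five-minute-old results
report = cached(300, xyz, 'foo', 'bar')        # miss: xyz runs
fake_now[0] += 120                             # two minutes pass
report_again = cached(300, xyz, 'foo', 'bar')  # hit: 120s old, within 300s

# A caller that wants results no older than one minute
ticker = cached(60, xyz, 'foo', 'bar')         # 120s old exceeds 60s: xyz runs again
```

Both callers share one cache entry, yet each decides for itself how stale is too stale, and `xyz` knows nothing about either policy.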

We use the function name and argument tuple as the key for illustration purposes. Function names are not unique across modules, so you may need to index your functions differently. Again, with minimal local inline caching you can be maximally lazy about it.
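As one possible alternative key, the sketch below qualifies the function by module and qualified name and folds keyword arguments in. `make_key` and `load` are hypothetical names used for illustration, and the helper assumes every argument is hashable.

```python
def make_key(func, args, kw):
    """Build a hashable cache key for one call (illustrative helper)."""
    # __module__ and __qualname__ together distinguish same-named functions
    # defined in different modules or classes
    ident = (func.__module__, func.__qualname__)
    # Sort keyword arguments so that their order does not affect the key
    return ident + args + tuple(sorted(kw.items()))

def load(path, mode='r', encoding=None):
    return path, mode, encoding  # stand-in; never called in this example

k1 = make_key(load, ('a.txt',), {'mode': 'r', 'encoding': 'utf-8'})
k2 = make_key(load, ('a.txt',), {'encoding': 'utf-8', 'mode': 'r'})
k3 = make_key(load, ('a.txt',), {})
```

Here `k1` and `k2` are equal while `k3` differs. Note that the same value passed positionally in one call and by keyword in another still produces two distinct keys; whether that matters depends on how your call sites are written.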