Jan 13, 2014

Adventurous metaclassing: automatic class splitting

td;dr: I had a problem that could only be solved by splitting a class into two separate superclasses and inheriting from both. This article describes a metaclass that achieves the same effect. Contrive, it's fun.

Let's write a metaclass that splits a superclass of a class in two, while demonstrating a rare legitimate use case for mixins.

Why would you ever want to? My use case arose from an uncooperative dynamic class.

Atom (the problem)

I was building a UI using Enaml, which internally relies on Atom (formerly Traits). Among other things, Atom is an observer framework that provides bindings for Enaml, allowing attribute value changes to automatically update the UI.

To use, you subclass Atom and declare class attributes initialized to field instances, just like an ORM.

from atom.api import Atom, Int

class Model(Atom):
    counter = Int()

The problem is that Atom expects all attributes to be declared fields. If the attribute is not a known field type, Atom will turn it into a read-only field. This means you cannot have a "normal" attribute alongside an "atomic" field, and vice versa.

class MixedModel(Atom):
    counter = Int()
    x = 0  # x is read-only

# subclassing won't help
class Extended(Model):
    x = 0  # ditto

    def __init__(self):
        self.x = 1  # error: object attribute 'x' is read-only
        self.y = 1  # error: object has no attribute 'y'

Well, that's annoying. But what if an atom was a mixin, would that work?

class ModelAtom(Atom):
    counter = Int()

class ModelBase(object):
    x = 0

class Model(ModelBase, ModelAtom):
    pass

# fingers crossed...
m = Model()
m.x = 5  # works!

Solution: mixins

So, a pure atom mixin can contain our atomic fields and we can use them in our impure object. We still need a "normal" class to co-inherit from (where the other attributes will live), and it cannot be object.

This frees us from having to make the whole class atomic, but writing new mixins to contain atom fields can get tedious. Assuming that modifying Atom itself is out of the question, what can we do to automate the generation of the necessary classes?

The Metaclass

Naturally, anytime we need to mess with the class creation it's time for metaclasses. In this case, however, we are not just creating a class. We need to create up to two new classes and make our new class inherit from them instead. In other words, we need to rewrite the inheritance tree.

We do this by dynamically creating new classes and injecting them into the class bases. For consistency and reference, we will also inject them into the module as if they were defined there. One of the new classes will get all the atom fields, and subclass Atom. The other class will be optional, in case our target class is a direct subclass of object. The target class will keep all the other attributes.

def split_dict(d, test):
    a, b = {}, {}
    for k, v in d.iteritems():
        if test(k, v):
            a[k] = v
        else:
            b[k] = v
    return a, b


from atom.catom import Member
from atom.api import AtomMeta, Atom
import inspect
class AtomizerMeta(type):
    def __new__(meta, class_name, bases, attrs):
        module = inspect.getmodule(inspect.stack()[1][0])

This will the metaclass of our new mixin base, Atomizer. The metaclass will run on Atomizer itself but we only need the magic to happen in its subclasses, so skip it:

        if module.__name__ == meta.__module__ and class_name == 'Atomizer':
            return super(AtomizerMeta, meta).__new__(meta, class_name, bases, attrs)

        # pluck out atom fields
        atom_attrs, basic_attrs = split_dict(attrs, lambda k,v: isinstance(v, Member))

atom_attrs contains only the atomic fields (subclasses of Member) and basic_attrs all the rest.

One of the bases will be Atomizer itself. Because we are replacing it with two new classes, let's remove it from bases:

        new_bases = tuple(b for b in bases if b != Atomizer)
        new_classes = ()

Create the two new classes:

        # allow use as both mixin and a sole superclass
        if not new_bases:
            basic_name = class_name + '_deatomized'
            basic_class = type(basic_name, new_bases, {})
            new_classes += ((basic_name, basic_class), )
            new_bases += (basic_class, )

        if atom_attrs:
            atom_name = class_name + '_atom'
            atom_class = type(atom_name, (Atom,), atom_attrs)
            new_classes += ((atom_name, atom_class), )
            new_bases += (atom_class, )

Later new_bases will become the bases for our target class. Note that we are covering two different cases: inheriting directly from Atomizer or having it as a mixin. In the latter case we already have a "normal" class to contain our "normal" attributes.

Finally, let's populate the module space and finish creating the target class.

        # inject into the containg module
        for n, c in new_classes:
            setattr(module, n, c)
            setattr(c, '__module__', module.__name__)  # just in case

        return type(class_name, new_bases, basic_attrs)


class Atomizer(object):
    '''Smart atomizer'''
    __metaclass__ = AtomizerMeta

we haz a win?

from atom.api import Atom, Int

class A(Atomizer):
    a = Int()
    b = 0

Yep. No need to bother with separate mixin classes.