F-ing Modules: How Do They Work?
If you're a Rubyist, you likely love modules. You use them for namespaces, you use them to mix in behavior across classes, and you might even use them as placeholders for global methods. This post is about how mixing in modules works in Ruby 1.9. I mean how it really works: how it's implemented. I'm a forward-thinking guy so I haven't bothered to research 1.8.x, but I imagine it's going to be much the same.
(Per usual, this post comes from my work on Laser. Since my work there requires statically inferring class hierarchies, I needed to know precisely how inclusion was implemented.)
Full Multiple Inheritance's Drawbacks
Multiple Inheritance has a pretty bad rep. It's featured prominently in C++ and Python (and CLOS, and Eiffel, and...), but common wisdom is that it can be dangerous. Google's C++ style guide lists this conclusion about multiple inheritance:
Only very rarely is multiple implementation inheritance actually useful. We allow multiple inheritance only when at most one of the base classes has an implementation; all other base classes must be pure interface classes tagged with the Interface suffix.
In other words, they only allow the "extra" superclasses if they have no code in them. This is exactly like Java: one superclass, many interfaces.
Why is multiple inheritance in C++ considered dangerous or confusing? A big part of it is the question of overriding precedence. Inheritance trees become inheritance graphs, and inheritance graphs like this one:

scare people silly in the presence of subclasses overriding superclass methods. In C++,
where instances of classes are constructed by essentially concatenating the superclass
fields/vtables, there are even multiple instances of Top in an instance of Confused,
unless you use virtual base classes. In other words, you need an additional language
feature to manage the complications introduced by multiple inheritance.
A little closer to home for a Ruby programmer: if both superclasses implement void foo(),
then when the Confused class calls foo(), which will it get?
In Python, it'll get SubOne's implementation. Why? Because it's listed first.
Not a very satisfactory answer, and a pretty fragile one, but at least it's well-defined.
C++ won't even guess: the unqualified call is ambiguous and fails to compile until you
name the superclass you mean.
As the graphs get larger, so does the mental overhead of managing multiple inheritance.
What if SubTwo implements bar(), Top implements bar(), but SubOne does not? Will
calling bar() on a Confused object do a depth-first search up the tree and reach Top's
implementation first, or will it first check SubTwo after not finding it in SubOne?
The C3 linearization is one solution
to this problem that has been adopted by Perl, Python and others.
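To make the bar() question concrete, here's a rough sketch of the C3 merge written in Ruby and run over the diamond above. The helper names (c3_merge, c3_linearize) and the DIAMOND table are mine, made up for illustration; this is not how any particular language implements it, just the textbook algorithm as I understand it.

```ruby
# C3 merge: repeatedly pick the first head that does not appear
# in the tail of any remaining sequence.
def c3_merge(sequences)
  seqs = sequences.map(&:dup).reject(&:empty?)
  result = []
  until seqs.empty?
    candidate = seqs.map(&:first).find do |head|
      seqs.none? { |seq| seq.drop(1).include?(head) }
    end
    raise ArgumentError, "no consistent linearization" if candidate.nil?
    result << candidate
    seqs.each { |seq| seq.shift if seq.first == candidate }
    seqs.reject!(&:empty?)
  end
  result
end

# Linearization of a class: itself, then the merge of its parents'
# linearizations together with the parent list itself.
def c3_linearize(name, parents_of)
  parents = parents_of.fetch(name)
  [name] + c3_merge(parents.map { |p| c3_linearize(p, parents_of) } + [parents])
end

# Hypothetical table describing the diamond: class name => direct parents.
DIAMOND = {
  "Top"      => [],
  "SubOne"   => ["Top"],
  "SubTwo"   => ["Top"],
  "Confused" => ["SubOne", "SubTwo"]
}

p c3_linearize("Confused", DIAMOND)
#=> ["Confused", "SubOne", "SubTwo", "Top"]
```

Under this ordering, bar() would be found in SubTwo before the search ever reaches Top.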
This is a Ruby post, so I won't weigh in on either side of the C++ debate; I don't have to, because as Rubyists, we get to have our multiple inheritance cake and eat it without diamond inheritance too.
Mixing in Modules is Just Like Multiple Inheritance
Ruby has single inheritance. I promise that this will remain true no matter how deep we go into the depths of how Ruby works. But we can mix in modules too. It's important to recognize that modules exist to do what multiple inheritance is for: pulling in implementation details from multiple sources. When we write in Ruby:
class Awesome < NotSoAwesome
  include Comparable
  include Enumerable
end
what we mean is: "Awesome should inherit everything from NotSoAwesome, but also the traits of a comparable object and an enumerable object." Naïvely, that's this:

Which seems just fine, at first. But there's the question of overrides: if NotSoAwesome defines
#max, will it get overridden by Enumerable's implementation? Or will NotSoAwesome win? Even
disregarding that, if we continue with this naïve approach, diamond inheritance is just around
the corner:
module M
end

module N
  include M
end

class C
  include M
end

class D < C
  include N
end

Damn. I don't want trouble, and that looks a lot like the multiple inheritance that is now avoided like the plague. We know Ruby isn't like this, though, luckily. Precisely how it works is where it gets fun.
Ancestors
The truth is, Ruby does in fact only have single inheritance, even though modules are part of the inheritance tree. We can see the ancestor chains respect this (using the previous example):
p C.ancestors #=> [C, M, Object, Kernel, BasicObject]
p D.ancestors #=> [D, N, C, M, Object, Kernel, BasicObject]
p N.ancestors #=> [N, M]
This seems counterintuitive. N's ancestors are [N, M], yet in D's ancestors, we see that
C follows N in its inheritance chain. I've already promised that there's only single inheritance
in Ruby, but it sure looks like something funny is afoot. One hears that including a module
"inserts the module in the inheritance hierarchy" but exactly how does that happen in the presence
of single inheritance? First of all, I thought modules didn't have superclasses. How can a module
be between two classes in the hierarchy?
Invisible Classes (Proxy Classes)
That's right, invisible classes. I say invisible because many of the built-in methods, such as
.superclass, skip on past them. Modules themselves actually have superclasses, but they're
of the invisible variety, so there is no Module#superclass method. But it's true: modules
including other modules is implemented as inheritance!
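You can poke at the "skipping" from plain Ruby. This is a small sketch reusing the M, N, C, D example from earlier; I only look at the front of the ancestor list, since what sits above Object can vary with what's loaded.

```ruby
module M; end
module N; include M; end
class C; include M; end
class D < C; include N; end

p D.ancestors.take(4)         #=> [D, N, C, M]
p D.superclass                #=> C  (whatever sits between D and C is skipped)
p N.ancestors                 #=> [N, M]
p N.respond_to?(:superclass)  #=> false
```

So the ancestor chain shows N sitting between D and C, yet D.superclass jumps straight to C, and N itself won't tell you about any superclass at all.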
Anyway, just calling these classes invisible ignores their point. They're proxies for the modules we include. When you include a module, you aren't adding a pointer to that module: you're creating a whole new class! That class's method table, ivar table, constant table, and other important parts just happen to be pointing to the module's tables. While we couldn't change the module's superclass (that would break the module), we can change our new class's superclass! So this:
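One observable consequence of the proxy pointing at the module's method table (rather than copying it) is that methods added to the module after inclusion still show up. A minimal sketch; Greeter, Person and hello are names I made up for the illustration:

```ruby
module Greeter; end

class Person
  include Greeter
end

# Defined only after Person has already included Greeter.
module Greeter
  def hello
    "hi from Greeter"
  end
end

p Person.new.hello                      #=> "hi from Greeter"
p Person.instance_method(:hello).owner  #=> Greeter
```

If inclusion copied methods at include time, the late definition of hello would never reach Person.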

becomes this: (dashes indicate from where the class gets its methods/etc)

Whoa. That looks a lot messier. But if we ignore the dashed lines showing where methods come from, and treat the "Proxy" classes as black boxes, then we get this:

As a meta-note, that image is just the messy one with the dashed edges removed and re-run through GraphViz. No manual touch-up: if you ignore the proxy edges, you can very easily see that we have an inheritance tree again, and not a graph.
The funny thing about these proxy classes is that they speak to Ruby's duck typing quite a bit:
the second entry in D's inheritance chain is a different object from N, but there's no way to
tell from Ruby code. D.ancestors[1] reports it as N, and any method lookup that reaches it
goes straight to N's own tables: its guts have been swapped for N's so carefully the two are
indistinguishable.
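Here's what that indistinguishability looks like from the Ruby side, as far as I can tell. The shout method is something I added purely for the demonstration:

```ruby
module M; end
module N
  include M
  def shout; "N#shout"; end
end
class C; include M; end
class D < C; include N; end

second = D.ancestors[1]
p second.equal?(N)                 #=> true
p second.instance_method(:shout).owner  #=> N
p D.new.shout                      #=> "N#shout"
```

Whatever object actually sits in that slot internally, everything Ruby lets you ask about it answers as N.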
Done, right? Not even close.
Harder Examples
What we've seen so far is intuitive. It's a straightforward flattening of the inheritance graph. We haven't seen any downsides to what Ruby has chosen. Yet. A more illustrative example will reveal them. We'll need some more classes and modules, though.
module M
end

module N
  include M
end

module O
  include N
end

class C
  include M
end

class D < C
end

class E < D
  include O
end

class F < E
  include M
end

class G < D
  include N
end

class H < G
  include O
end
With traditional multiple inheritance, this would look like this:

But that's not close to what we get in Ruby. So let's use Proxy classes instead:

Without showing the actual proxy relationships:

That's better. A few unintuitive or surprising behaviors are visible in this inheritance graph. They all come down to the precise rule for including a module in Ruby:
Module Inclusion in a Sentence
When a class X includes a module A, all modules in A's inheritance hierarchy not in X's inheritance hierarchy are proxied; those proxies are inserted into X's inheritance hierarchy between X and X's superclass.
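That sentence is simple enough to model in a few lines. The helper below (ancestors_after_include, a name I invented) is only a model of the rule as stated, working on ancestor lists; it is not MRI's actual code, and I've only checked it against cases like the ones in this post.

```ruby
# Hypothetical model of the one-sentence rule: anything in the module's
# hierarchy that the class doesn't already have gets slotted in right
# after the class itself, ahead of the rest of its chain.
def ancestors_after_include(klass_ancestors, module_ancestors)
  new_proxies = module_ancestors - klass_ancestors
  [klass_ancestors.first] + new_proxies + klass_ancestors.drop(1)
end

module M; end
module N; include M; end
module O; include N; end
class C; include M; end
class D < C; end
class E < D; end

predicted = ancestors_after_include(E.ancestors, O.ancestors)
E.send(:include, O)            # now let Ruby do it for real

p predicted.take(6)            #=> [E, O, N, D, C, M]
p predicted == E.ancestors     #=> true
```

O and N get proxied because E didn't have them yet; M doesn't, because C already brought it in.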
The first form this takes in the new graph, one that wasn't present in the original diamond example, is the two proxies between E and D. Since E included O, and O included N, E had to inherit from both an O Proxy and an N Proxy. This is necessary because when we proxy O, we change its superclass from N to D. This loses the methods in N, so we must make a proxy for N and insert it between our O proxy and D.
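You can see both entries by subtracting D's chain from E's. A quick sketch using the same definitions as the listing above (trimmed to the classes involved):

```ruby
module M; end
module N; include M; end
module O; include N; end
class C; include M; end
class D < C; end
class E < D; include O; end

p E.ancestors - D.ancestors  #=> [E, O, N]
p E.superclass               #=> D
```

Two modules show up between E and D in the chain, even though E only mentions O.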
The second form, and one with significant implications, is that there are no proxies between F and E. F, despite including M, is no different from its superclass, E. The reason, as a seasoned Rubyist would note, is that M is already present in F's inheritance hierarchy. As Chad Fowler noted in a post on is_paranoid, this can trip us up easily. If E overrides one of the methods in any of M, N, or O, then F cannot recover those methods, even by including one of those modules. It sees what E sees.
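Here's that trap in miniature. The describe method is hypothetical; I've added it to M and overridden it in E just to have something to dispatch on.

```ruby
module M
  def describe; "M's describe"; end
end
module N; include M; end
module O; include N; end
class C; include M; end
class D < C; end

class E < D
  include O
  def describe; "E's override"; end
end

class F < E
  include M   # M is already an ancestor, so nothing gets inserted
end

p F.ancestors.take(3)  #=> [F, E, O]
p F.new.describe       #=> "E's override"
```

The include in F is silently a no-op: F still gets E's version.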
The third form is the same effect manifested in a different way: despite the fact that both H and E include O, they can easily have different methods, and it's not hard to imagine breakage in O's methods. H inherits from G, and G includes N. If G is self-contained, it could override one of N's methods to suit its own purposes and work just fine. H inherits from G and then includes O, yet O likely relies on N's methods. Why else would O include N? Yet it is not guaranteed to have a view of N's methods. Worse, just like the above case, if G overrides one of N's methods in a way incompatible with O, no subclass of G can ever use O!
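A sketch of how E and H can end up disagreeing about O. The separator and join_words methods are made up for the example: O's method leans on one of N's, and G overrides that one for its own reasons.

```ruby
module M; end

module N
  include M
  def separator; ", "; end
end

module O
  include N
  # Relies on N#separator being what N says it is.
  def join_words(words); words.join(separator); end
end

class C; include M; end
class D < C; end
class E < D; include O; end

class G < D
  include N
  def separator; " | "; end   # fine for G, but O never agreed to this
end

class H < G; include O; end

p E.new.join_words(%w[a b])  #=> "a, b"
p H.ancestors.take(4)        #=> [H, O, G, N]
p H.new.join_words(%w[a b])  #=> "a | b"
```

Because G sits between O and N in H's chain, O's call to separator lands on G's override rather than N's original.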
Is This Good?
What I've described up until now is what Ruby is. And to bring it back, this was all because we wanted multiple inheritance without diamond inheritance or other questions about override domination. And it does a fantastic job! Yet there are clearly some notable limitations.
- You can include a module to no effect. Worse, Ruby does not tell you when you do so. As my thesis advisor noted upon learning this: "There's nothing worse than a compiler that sees you do something useless and doesn't let you know about it." As a result, Laser already warns you when it sees you include a module to no effect.
- By overriding a module method in a class, that class and all subclasses could be prevented from using dependent modules. I don't think this is as common a problem, as module inheritance is less frequent than class inheritance.
- In both of these cases, the class suffering from what I consider negative behavior has (almost) no recourse to recover the desired module's methods precisely as they appear in the module. An explicit re-inclusion of the module in the affected subclass - a clear sign of intent to use that module's methods verbatim - has no effect. Short of manually extracting and rebinding all of the desired module's methods (see delegator.rb for an example of this process), there's nothing for the subclass to do.
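Both the missing warning and the rebinding escape hatch can be sketched in plain Ruby. To be clear about what's invented: include_loudly is a hypothetical helper of mine (it is not how Laser does its analysis, which is static), and describe is a made-up method.

```ruby
module M
  def describe; "M's describe"; end
end

class E
  include M
  def describe; "E's override"; end
end

class F < E; end

# Hypothetical helper: complain instead of silently doing nothing.
def include_loudly(klass, mod)
  if klass.include?(mod)
    warn "#{klass} already has #{mod} in its hierarchy; include has no effect"
    return false
  end
  klass.send(:include, mod)
  true
end

p include_loudly(F, M)   #=> false (after a warning on stderr)
p F.new.describe         #=> "E's override"

# The manual way out: pull the method off the module and rebind it.
p M.instance_method(:describe).bind(F.new).call  #=> "M's describe"
```

The rebinding works, but you'd have to do it method by method, which is exactly the tedium the third bullet complains about.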
One might cite a viewpoint that superclasses should dominate module inclusions, as modules are "mixed in",
and are not supposed to be substitutes for the subclass-superclass relationship. However, successfully including
a module overrides superclass methods, and you can even use Object#is_a? and Module#=== to check if
a given object has a module in its hierarchy, just like a class. In principle, then, class inheritance has no
primacy over module inclusion. Yet in the not-uncommon cases above, that primacy shows up anyway, and it is non-negotiable.
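A small sketch of a module behaving just like a class in the hierarchy; Base, Polite and Child are names I made up:

```ruby
class Base
  def greet; "Base"; end
end

module Polite
  def greet; "Polite, then " + super; end
end

class Child < Base
  include Polite
end

child = Child.new
p child.greet              #=> "Polite, then Base"
p child.is_a?(Polite)      #=> true
p Polite === child         #=> true
p Child.ancestors.take(3)  #=> [Child, Polite, Base]
```

The included module's greet wins over the superclass's, and super from inside the module continues up to Base as if Polite were an ordinary class in between.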
Are there alternative implementations that allow include to function even if the requested module
already exists in the inheritance hierarchy? We could simply insert proxies for an included module's entire
hierarchy, which would at least allow subclasses to "get back" lost module methods. However, this would
require a bit more care:
module M
  def foo; 'hi'; end
end

module N
  include M
  def foo; 'there'; end
end

class A
  include N
end

class B < A
  include M
end

p B.new.foo #=> 'hi' under this alternative (current Ruby gives 'there')
B now dispatches to M before N, something not possible currently in Ruby. This seems unintuitive, but isn't
the programmer defining B making it clear they want M's implementation of #foo?
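For contrast, here's what Ruby actually does with that listing today, next to the chain the alternative would produce. The proposed_ancestors helper is hypothetical; it just re-inserts the module's whole hierarchy above the including class, which is my reading of the idea, not an existing feature.

```ruby
module M
  def foo; 'hi'; end
end

module N
  include M
  def foo; 'there'; end
end

class A; include N; end
class B < A; include M; end

# Current behavior: the include in B changes nothing.
p B.ancestors.take(4)  #=> [B, A, N, M]
p B.new.foo            #=> "there"

# Hypothetical chain under the alternative scheme.
def proposed_ancestors(klass, mod)
  [klass] + mod.ancestors + klass.ancestors.drop(1)
end

p proposed_ancestors(B, M).take(5)  #=> [B, M, A, N, M]
```

In the proposed chain M appears twice, and the first appearance is what would let B reach M's foo before N's.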
The inheritance graph that goes along with this example:

The obvious downside, of course, is an unavoidable increase in dispatch time. Under this idea the full inheritance tree has at least as many nodes as it does today, and each inclusion inserts as many nodes as that module has in its hierarchy.
Anyway – at this point I'm rambling. If you'd like more ramblings, follow us on Twitter @carbonica.
Update: Special thanks to Wim Looman (@Nemo157) for correcting the final code block; module N originally did not include module M in the code listing.
