[personal profile] jducoeur
[This one is strictly for the programmers.]

As mentioned this morning, I spent the day reimplementing the Ecology Pattern, my preferred way to manage dependency injection in my programs. It's a tried and true pattern that I've been using for a solid 15 years now. I first learned it from Tom Leonard, my boss at both Looking Glass and Buzzpad -- he had evolved it to keep Dependency Hell at bay in C++/COM applications, but I quickly found that it's almost always appropriate for programs that are of more-than-modest size. It isn't necessarily the One True Answer to dependency injection, but I find that it consistently works well, and frankly, it's easy enough that I find it usually better to just reroll it for each application, instead of using a canned library.

Most of the concepts will be familiar to folks who are used to dependency injection, but there's an additional focus here: managing initialization and termination of the system in an organized and sufficiently-predictable way. Most programs pay too little attention to initialization. In the better cases, they simply have the top level of the program choose the order in which to initialize the components, which is difficult to maintain and produces ugly coupling. Most often, folks just use the Singleton Pattern in some fashion, initializing a system when it is first invoked -- which is great until you hit a dependency cycle, and it abruptly crashes in a hard-to-debug way. By contrast, Ecology treats initialization and termination as first-class problems, to be dealt with properly.

NOTE: in the following, I'm not going to deal with truly complex initialization problems, such as multi-threaded initialization (e.g., when you have to initialize a subsystem that *must* live on a single master GUI thread), asynchronous initialization (e.g., when a component needs to work with a remote dependency before it can be considered fully initialized), changing the Ecology while the system is running, or encapsulating subsystems in child Ecologies. I've dealt with all of these in previous projects, and none are *terribly* hard to solve, but I'm not going to muddy the waters with them here. Feel free to ask about them in the comments.

The key concepts of Ecology go as follows:

First, there is the Ecology itself. (The name is questionable -- people on previous projects have argued for "Ecosystem" as being more correct, but I am set in my ways.) This is the master wrapper for the whole world -- a single point of reference from which you can access all the major system components. It provides access to all of the Interfaces that have been registered in it. It keeps track of what has been initialized, and throws an exception if you try to access anything before it has been initialized. (There is also usually a side-interface called EcologyManager, which is used during setup and shutdown.)

The Ecology is composed of Ecots. Yes, this is a horrible piece of made-up jargon, but it's less ambiguous than a wishy-washy term like "Module". An Ecot is a self-contained system singleton -- anything from Logging to Configuration to Database Access. In the case of Querki, Ecots are required to be stateless (to keep threading clean), but that's not inherent in the concept -- many of my previous projects have involved stateful Ecots.

An Ecot may implement any number of Interfaces. The Ecot is private, not visible to the rest of the world; the Interfaces are public, and can be queried from anywhere once they are initialized. In particular, Ecots refer to each other via Interfaces.
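
To make the Ecot/Interface split concrete, here is a minimal sketch. It is not the Querki code: `InterfaceRegistry`, `publish`, `markInitialized`, `Logging` and `LoggingEcot` are all names invented for illustration, and it uses `ClassTag` rather than `TypeTag` only so that it runs without the scala-reflect library. It shows callers asking for an Interface by type, never naming the Ecot behind it, and getting an exception if they ask too early.

```scala
import scala.collection.mutable
import scala.reflect.ClassTag

// Marker for anything that may be exposed through the registry.
trait EcologyInterface

// Hypothetical public Interface: this is all the rest of the world sees.
trait Logging extends EcologyInterface {
  def log(msg: String): Unit
}

// Hypothetical stand-in for the lookup side of an Ecology.
class InterfaceRegistry {
  private val impls = mutable.Map.empty[Class[_], EcologyInterface]
  private val ready = mutable.Set.empty[Class[_]]

  // Record which object implements Interface T.
  def publish[T <: EcologyInterface](impl: T)(implicit tag: ClassTag[T]): Unit =
    impls(tag.runtimeClass) = impl

  // Flag Interface T as safe to hand out.
  def markInitialized[T <: EcologyInterface](implicit tag: ClassTag[T]): Unit =
    ready += tag.runtimeClass

  // Fetch Interface T, refusing if it has not been initialized yet.
  def api[T <: EcologyInterface](implicit tag: ClassTag[T]): T = {
    val cls = tag.runtimeClass
    if (!ready(cls))
      throw new IllegalStateException(s"${cls.getSimpleName} is not initialized yet")
    impls(cls).asInstanceOf[T]
  }
}

// The Ecot itself. In real code this would be hidden from other packages;
// everyone else refers only to Logging.
class LoggingEcot extends Logging {
  val lines = mutable.Buffer.empty[String]
  def log(msg: String): Unit = lines += msg
}
```

With this in place, `registry.api[Logging]` throws until `markInitialized[Logging]` has been called, and afterwards returns the `LoggingEcot` typed as `Logging`.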

Each Ecot declares the Interfaces that it depends upon in order to initialize. These dependencies are how you get clean initialization. System startup works like this:
  • First, the top level of the system creates all of the Ecots, passing the Ecology into each one. Each Ecot registers itself in the Ecology. During construction, Ecots are *absolutely forbidden* to refer to anything else -- they only do their own internal construction.

  • After all Ecots are registered, the top level calls Ecology.init(). This (effectively) does a topological sort of the Ecots, by their dependsUpon declarations. If it finds any dependency loops in those declarations, it immediately fails and reports the loop. Otherwise, it initializes in sorted order, starting with the Ecots that depend on nothing, and gradually working its way outward as the required dependencies are available.

  • Once that is finished, the system is up and running. Anybody can use any other system from this point forward, by fetching the needed interfaces from the Ecology.

  • At shutdown time, you terminate each Ecot, in the reverse order of how you initialized them. (This isn't strictly correct, but in my experience generally works as desired.)

Note that initialization order is *not* strictly deterministic, and doesn't try to be. Instead, it focuses on the important part: making sure that the world is ready before each element is initialized.
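
The ordering step described above can be sketched as a small pure function. This is an illustration under assumptions, not the Querki implementation: `InitOrder.sort` is a hypothetical name, Ecots are represented by plain strings, and each round's candidates are sorted alphabetically only so that the example gives a reproducible result. Each round picks off every Ecot whose dependencies are already initialized; if a round finds nothing to pick off, whatever remains is stuck in (or behind) a dependency loop, or depends on something that was never registered.

```scala
import scala.annotation.tailrec

object InitOrder {
  // deps maps each Ecot's name to the names it depends upon.
  // Returns Right(initialization order), or Left(the Ecots that could not be
  // initialized because of a loop or an unregistered dependency).
  def sort(deps: Map[String, Set[String]]): Either[Set[String], List[String]] = {
    @tailrec
    def loop(pending: Map[String, Set[String]],
             done: List[String]): Either[Set[String], List[String]] =
      if (pending.isEmpty) Right(done.reverse)
      else {
        val initialized = done.toSet
        // Every pending Ecot whose dependencies are all initialized already.
        val readyNow = pending.collect {
          case (name, needs) if needs.subsetOf(initialized) => name
        }.toList.sorted
        if (readyNow.isEmpty) Left(pending.keySet)
        else loop(pending -- readyNow, readyNow.reverse ::: done)
      }

    loop(deps, Nil)
  }
}
```

For example, with `Logging` depending on nothing, `Config` depending on `Logging`, and `Database` depending on both, `sort` returns `Right(List("Logging", "Config", "Database"))`. Reversing that list gives the shutdown order.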

That's pretty much it. It is *not* rocket science -- I implemented the whole system, including unit tests, today. But a surprisingly large number of projects don't even go to this much effort -- they simply leave initialization and the inter-relation of subsystems up to the Singleton pattern, and eventually find themselves in all sorts of hell as a result, only after the code has gotten truly complex. By *starting* with Ecology, you can avoid those hells from the beginning, and have an architecture that is solidly scalable from a code POV.

Here are some simplified versions of the main traits (what most languages call "interfaces") from the Querki version of Ecology, to give you an idea. Questions welcomed...

import scala.reflect.runtime.universe.TypeTag

trait Ecology {
  // Get the Manager for setting up and shutting down this Ecology
  def manager:EcologyManager

  // Fetches the implementation of Interface T. Throws if T has not been initialized yet.
  def api[T <: EcologyInterface : TypeTag]:T
}

trait EcologyManager {
  // Gets the Ecology that this is managing.
  def ecology:Ecology

  // Adds the specified Ecot to this Ecology.
  def register(ecot:Ecot):Unit

  // Initializes the world, in dependency order.
  def init():Unit

  // Terminates the world, in the reverse of initialization order.
  def term():Unit
}

/**
 * This is a pure marker trait. All "interfaces" exposed through the Ecology *must* have this as their
 * first trait linearly. (Usually, it will be the only thing that an exposed interface extends, but
 * that is not required.)
 */
trait EcologyInterface

// A deferred reference to an Interface: the lookup does not happen until get is first used.
case class InterfaceWrapper[T <: EcologyInterface](ecology:Ecology)(implicit tag:TypeTag[T]) {
  lazy val get:T = ecology.api[T]
}

trait Ecot {
  // The Ecology this Ecot lives in; it is passed in when the Ecot is created.
  def ecology:Ecology

  // The Interfaces that must be initialized before this Ecot.
  def dependsUpon:Set[Class[_]]

  // This is messy, but is the method you actually call inside the Ecot, to get an init-time reference
  // to an external Interface.  This populates dependsUpon().
  def initRequires[T <: EcologyInterface](implicit tag:TypeTag[T]):InterfaceWrapper[T]

  // Override these to do the Ecot's own setup and shutdown.
  def init:Unit = {}
  def term:Unit = {}

  /**
   * Note that registration takes place during construction.
   */
  ecology.manager.register(this)

  // This is the set of all EcologyInterfaces that this Ecot implements.
  def implements:Set[Class[_]]
}

There's a bunch of implementation, but honestly, it's not hard -- like I said, I wrote pretty much the whole thing today. (Yay for Scala.) I strongly recommend going to the effort of setting up something like this at the beginning of any major project: it'll save you lots of hassle down the road...

(no subject)

Date: 2014-01-04 03:55 am (UTC)
From: [identity profile] vortexofchaos.livejournal.com
Interesting, but a challenge to read because you have some embedded < (and maybe >) that are messing with the markup.

Are there other references to this pattern in other sources, because I don't think I've ever seen it before? I'm on the edge of wrestling with a big initialization/shutdown mess, and this sounds like the solution I'm going to need.
Edited Date: 2014-01-04 03:56 am (UTC)

(no subject)

Date: 2014-01-04 12:06 pm (UTC)
From: [identity profile] metahacker.livejournal.com
Agreed... I think it's the <: inheritance-looking statements in the code that are messing things up.

(no subject)

Date: 2014-01-04 12:09 pm (UTC)
From: [identity profile] metahacker.livejournal.com
Interesting way to lay things out. What do you use for tree-stitching? (to create the partial ordering)

Have you ever written visualization tools for ecology? There's a similar pattern at play at work, where the challenge is getting an enormous legacy codebase to comply with basic things like "no loops" and "every bit of code goes in only one ecot" (which are approximately like what we call Components), and I spend a lot of time working on tools to help people understand the rather complex resulting structure--especially when trying to untangle loops, or figure out where new code goes.

(no subject)

Date: 2014-01-05 05:54 pm (UTC)
From: [identity profile] metahacker.livejournal.com
"Each time, I knock off an Ecot that has no unresolved dependencies."

Cute, and obvious in retrospect!

(no subject)

Date: 2014-01-09 02:50 pm (UTC)
From: [identity profile] goldsquare.livejournal.com
Interesting, and basically a variant of self-inclusion-like behavior.

In practice, I suspect the termination process is, in fact, the inverse of the initialization process - but it doesn't really have to be. It might be interesting (and more generalizable) to permit an ECOT to specify its wind-down dependencies separately, and determine if one "can" both initialize and terminate before starting the initialization. (Because, not all initialization is stateless, so trying and failing may not be safe, and why initialize if you can't shut down?)

I'm trying and failing to determine how this would work in a lazy-initialization system. I've worked on some marvelously complex systems where something ECOT-like is defined and prepared, but it does not actually instantiate until first used. Those systems simply demanded that these lazy-ECOTs were self-contained. But: what if they were not?

Could your pattern be extended to handle that case, easily? I'm stuck.

(no subject)

Date: 2014-01-09 04:05 pm (UTC)
From: [identity profile] goldsquare.livejournal.com
If you bear in mind that my major introduction to such systems was Jini - a dynamically Federated system with mobile code (the great-grandfather of the upcoming Internet of Things), you can see where such lazy instantiation might be useful.

Consider a service provider which offers a variety of services. When a federated service consumer comes along, it will contact the provider and ask for that service; mobile code will be downloaded to the consumer to use that service. At that time it makes sense for the service provider to instantiate that service. Not before.

I suppose one could consider that to be almost the equivalent of "starting a new server", but it wasn't, exactly.

I'm asking because I'm stretching the model. I know that I'm stretching the model (and I was also stretching my brain a bit) just to see if it COULD be done. I'm not attempting to detract from the essential prettiness of it, nor its utility.
