Beware the Hack!

Hacks are dangerous little creatures. They live in the darkest, dustiest corners of your application, forgotten about, waiting… Waiting for the chance to rear their ugly little heads, open their disease-infested mouths, and sink their jagged teeth into customer confidence and developer productivity.

We’ve all been there. We have a product that works great. It solves a certain problem incredibly well. Then, a well-meaning customer comes along and says, “This is fantastic. It almost solves my problem perfectly. Is there any way you can modify it slightly to do X instead of Y?”

Sometimes this is no problem at all. Sometimes the design of the product is flexible in ways that make it a breeze to add this functionality. But, sometimes the request comes out of left field, and takes your product in a direction you never anticipated. While we strive to build software that is extensible and adaptable (it is software after all, isn’t it?), none of us can see the future, nor anticipate every possible customer request.

About this time you start to hear a little voice inside your head. “Well, I suppose I can hard code this, or add an if statement here, or write a one-off script to do X”. After all, you don’t want to tell your customer “No, sorry, we can’t do that”. And, they certainly don’t want to hear “Sure, we can make that modification, but it will require a significant amount of refactoring in order to ‘do it right’”.

Acting on these thoughts births a tiny baby Hack. The Hack is little when it’s born, but it certainly doesn’t stay that way. Once the Hack is born, it is much easier to add to the Hack, or feed it. With every modification to the Hack, it gets bigger and bigger. Pretty soon you have a large, ugly Hack with a nasty attitude on your hands. And, despite you being its “mommy”, it doesn’t like you, at all. Not one bit.

Hacks are dangerous for several reasons.

First, they almost always live outside the main execution path of the code. This means they’re not executed nearly as often as the other code. Even if you have a series of tests for the Hack, nothing exercises code like constant execution by your customers. Also, because they’re not really “part of the application”, Hacks are often forgotten about when updating or fixing code.

Second, they’re usually created to quickly get around some issue. And by “quickly”, I mean “didn’t totally think this through, but I’m fairly certain that if I tweak X, alter Y, and drive it with a custom script, it should work just fine”. And, usually it does work just fine…at least in the beginning. But, this is when the Hack is still young, and under your control. Adult Hacks are not nearly as cooperative.

Third, they’re usually only known about (at least in detail) by the members of the team that created them. A Hack is like a big, pus-filled pimple on your ass. You don’t go around showing those to your friends and co-workers, do you? Hacks, by definition, are quick and dirty solutions to problems. They’re not elegant, or sexy. So, developers tend to keep their Hacks to themselves. At most, a developer will mention that they hacked around a problem, but rarely do they go into details. The other team members are largely left in the dark. Not knowing where a Hack lives or how it behaves is a surefire way to get bitten by it down the road.

Always remember, that little baby Hack…it will grow up. It will get nasty. It will bite. It’s just a matter of when and where. Those who have been programming for long enough know this to be fact. And, being nasty little creatures, Hacks usually wait until the worst possible time to bite.

So, beware the Hack! They are big, ugly, mean, have teeth, and will most certainly bite.

Strive to Limit Integration Points

Last week, I was working on a new feature of TextMe that required a call to one of our external service providers for some data. The call in particular was to look up the carrier for a given mobile number. Sounds simple enough. However, we already had code that integrated with this provider in one component of our architecture, and I needed to make this call from another component.

A couple of options jumped out at me. I could pull the code I needed to use into a library that could be shared between the components, or implement some form of inter-process communication that would enable me to invoke the service from the one component, and have it processed by the component that already integrated with the service provider.

Pulling the code into a library would be the easier of the two to implement for sure. Like any project of reasonable size, we were already doing this for several other shared pieces of code. Adding one more to the list would be a piece of cake. The second option would require a bit more work. The component that integrates with the service provider runs as a daemon process, so using something straightforward like HTTP to handle the interprocess communication was out of the question. Instead, I’d likely have to utilize the queuing framework that we already had in place. What makes it more difficult is that the queuing library we use only handles asynchronous calls, and this would need to be a synchronous call. Not the end of the world by any means, but without a doubt more complicated than simply sucking the code into a library.

Even though option one was easier to implement, having two components in the architecture integrate with a 3rd party seemed like a bad idea. Sprinkling integration points throughout your application is usually a recipe for failure, largely because it is only a matter of time before an integration point fails.

If we went with option one, we could have the library handle the failures. However, even if handled properly, failures like this usually have other consequences. For example, if the service never responded, it could cause requests to back up in the given component. Even if we implemented a timeout, it is likely that the timeout would be greater than the average response time, which means our system would take longer to process each request. If you had to deal with a lot of incoming requests at the time of the failure, you could be in for a world of hurt, especially if you had multiple components suffering from this issue.

With option two, we have a bit more control over the situation. First off, we would know there was one, and only one, spot in our architecture that integrated with that particular service. This would allow us to better understand the potential impact of the failure, and the steps that needed to be taken to address it. Second, it would allow us to more easily implement a circuit breaker to prevent the failure from rippling across the system. If the circuit breaker was tripped, we could return an error, some sort of filler data, or queue the request up for processing at a later time. Third, we could potentially add resources to account for the situation. Since the work was being done in a completely different component, if it was simply a matter of increased latency on the part of our service provider, we could always spin up a few more instances of that component to account for the fact that some of the requests may be starting to back up.
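To make the circuit breaker idea a little more concrete, here is a minimal Java sketch. It is not the TextMe code and not Michael Nygard's implementation; the class name, the failure threshold, and the cool-down period are all assumptions made up for illustration. After a number of consecutive failures the breaker "opens" and returns filler data without touching the provider, then lets a trial call through once the cool-down has passed.

```java
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from TextMe or from Release It!
class CircuitBreakerSketch {

    static class CircuitBreaker {
        private final int failureThreshold;   // assumed: failures in a row before tripping
        private final long retryAfterMillis;  // assumed: how long to stay open
        private int consecutiveFailures = 0;
        private long openedAt = 0;

        CircuitBreaker(int failureThreshold, long retryAfterMillis) {
            this.failureThreshold = failureThreshold;
            this.retryAfterMillis = retryAfterMillis;
        }

        // Runs the remote call unless the breaker is open; otherwise returns the fallback.
        // Simplification: the lock is held during the remote call, so calls are serialized.
        synchronized <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
            if (isOpen()) {
                return fallback.get();
            }
            try {
                T result = remoteCall.get();
                consecutiveFailures = 0;      // a success closes the breaker again
                return result;
            } catch (RuntimeException e) {
                consecutiveFailures++;
                if (consecutiveFailures >= failureThreshold) {
                    openedAt = System.currentTimeMillis();  // trip, or re-trip after a failed trial
                }
                return fallback.get();
            }
        }

        // Open means: tripped, and the cool-down period has not yet elapsed.
        synchronized boolean isOpen() {
            boolean tripped = consecutiveFailures >= failureThreshold;
            boolean coolingDown = System.currentTimeMillis() - openedAt < retryAfterMillis;
            return tripped && coolingDown;
        }
    }

    public static void main(String[] args) {
        CircuitBreaker breaker = new CircuitBreaker(2, 60_000);
        Supplier<String> failingLookup = () -> {
            throw new RuntimeException("provider timed out");
        };
        for (int i = 0; i < 3; i++) {
            // The third iteration never reaches the provider; the breaker is already open.
            System.out.println(breaker.call(failingLookup, () -> "UNKNOWN_CARRIER"));
        }
        System.out.println("open: " + breaker.isOpen());
    }
}
```

The fallback here is filler data, but it could just as easily throw an error or push the request onto a queue for later. Because only one component talks to the provider, only one breaker like this would be needed.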

In his fantastic book, Release It!, Michael Nygard talks about integration points, along with a host of other topics regarding the deployment and support of production software. Any developer who writes code that will eventually be running in a production environment (which I hope is EVERY developer) should read this book. Regarding integration points, Michael says the following:

  • Integration points are the number-one killer of systems.
  • Every integration point will eventually fail in some way, and you need to be prepared for that failure.
  • Integration point failures take several forms, ranging from various network errors to semantic errors.
  • Failure in a remote system quickly becomes your problem, usually as a cascading failure when your code isn’t defensive enough.

However, even though integration points can be tough to work with, systems without any integration points are usually not that useful. So, integration points are a necessary evil. Our best tools to keep them in line are defensive coding, being smart about where you place the integration points in your system, and limiting the integration points in the system.

With the help of my colleague Doug Barth, we (mostly Doug) whipped up a synchronous client for the Ruby AMQP library. I then used this code to implement the synchronous queuing behavior I needed to keep the integration point where it belonged. Those interested can find the code on GitHub, at http://github.com/dougbarth/amqp/tree/bg_em.
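For readers who want a feel for how a synchronous call can be layered on top of an asynchronous queue, here is a rough Java sketch of the general request/reply idea. It is not a port of the Ruby code linked above, and it uses in-memory queues instead of AMQP. The correlation id scheme, the names, and the placeholder carrier value are all assumptions for illustration.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: in-memory stand-ins for a real message broker.
class SyncOverAsyncSketch {

    // A request tagged with a correlation id so the reply can be matched to its caller.
    static class Request {
        final String correlationId;
        final String mobileNumber;

        Request(String correlationId, String mobileNumber) {
            this.correlationId = correlationId;
            this.mobileNumber = mobileNumber;
        }
    }

    static final BlockingQueue<Request> requestQueue = new LinkedBlockingQueue<>();
    static final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Caller side: publish the request, then block until the reply arrives or we time out.
    static String lookupCarrier(String mobileNumber, long timeoutMillis) {
        String id = UUID.randomUUID().toString();
        CompletableFuture<String> reply = new CompletableFuture<>();
        pending.put(id, reply);
        try {
            requestQueue.put(new Request(id, mobileNumber));
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the carrier lookup", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("Carrier lookup failed or timed out", e);
        } finally {
            pending.remove(id);   // late replies for this id are simply ignored
        }
    }

    // Reply side: invoked whenever a reply message arrives.
    static void onReply(String correlationId, String carrier) {
        CompletableFuture<String> waiting = pending.get(correlationId);
        if (waiting != null) {
            waiting.complete(carrier);
        }
    }

    // Stand-in for the daemon component that owns the integration point.
    static Thread startWorker() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    Request request = requestQueue.take();
                    // Placeholder for the real call to the service provider.
                    onReply(request.correlationId, "ExampleCarrier");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    public static void main(String[] args) {
        startWorker();
        System.out.println(lookupCarrier("3125550100", 1000));
    }
}
```

The timeout matters here for the same reason it matters on any integration point: without it, a silent worker would leave callers blocked forever.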

Increase Design Flexibility by Separating Object Creation From Use

I just finished reading Emergent Design, by Scott Bain. Overall, I thought it was a pretty good book that touched on some important concepts in software design. I’ve read about one particular concept covered in the book a few times before, but the value of it didn’t sink in until I read Emergent Design. This concept states that code that creates an object should be separate from code that uses the object.

Separating code that creates an object from the code that uses the object results in a much more flexible design, which is easier to change. Creating this separation is also very easy to do. By simply avoiding a call to the new operator in the “client” code for the particular object you wish to instantiate, you are able to evolve your code to adjust to a variety of changes, most of which require no changes in the code that uses the object. Let’s walk through an example.

Let’s say we have a logging class, named Logger, that we use to log messages from our application. The class is pretty simple, and looks something like this.

import java.io.FileWriter;
import java.io.IOException;

public class Logger {
    private static final String logFileName = "application.log";
    private FileWriter fileWriter;
    private Class from;
    
    public Logger(Class from) {
        this.from = from;

        try {
            fileWriter = new FileWriter(logFileName, true);
        } catch (IOException e) {
            throw new RuntimeException("Log file '" + logFileName + 
                    "' could not be opened for writing.", e);
        }
    }

    public void log(String message) {
        try {
            fileWriter.write(
                from.getCanonicalName() + ": " + message + "\n");
            fileWriter.flush();
        } catch (IOException e) {
            System.err.println("Writing to the log file failed");
            e.printStackTrace();
        }
    }
}

In our application, we would typically use the Logger class like this:

Logger logger = new Logger(MyClass.class);
logger.log("Some message");

I think this is pretty typical, and seems to be the default pattern. Create the object that you need, and then use it. Simple and straightforward. However, the simplicity comes at the price of limited flexibility. For example, what if I wanted to limit the Logger class to only having one instance? Or, what if I wanted to start logging some messages to the database, and some to the file system? By combining the code that creates the object with the code that uses the object, we’ve greatly limited the ways in which we can evolve our design without affecting existing “client” code. Sure, we can work our way out of it, but since the Logger is a very popular class used by almost every other class in the system, it will require a lot of work to change.

So, how can we avoid this? How can we effectively separate the creation of the object from the code that uses it? The very first item in Effective Java, by Joshua Bloch, is to consider static factory methods instead of constructors (I’ll call them static builder methods here). Joshua suggests this for the same reasons Scott suggests separating code that creates the object from code that uses the object in Emergent Design. Instead of making your clients use the new operator to create instances of your object, provide them with a static builder method to do so.

    public static Logger getInstance(Class from) {
        return new Logger(from);
    }
    
    protected Logger(Class from) {
        this.from = from;

        try {
            fileWriter = new FileWriter(logFileName, true);
        } catch (IOException e) {
            throw new RuntimeException("Log file '" + logFileName + 
                    "' could not be opened for writing.", e);
        }
    }

Note that I changed the scope of Logger’s constructor from public to protected. This will discourage other classes outside of the logging package from using it, while leaving the Logger class open for subclassing. With this new method in place, users of this class can now create an instance by doing the following.

Logger logger = Logger.getInstance(MyClass.class);
logger.log("Some message");

It seems silly to provide a method that simply calls new. But, doing so adds so much flexibility to the design that Scott considers it a “practice”, or something he does every time without even thinking about it. Abandoning the constructor also opens a few doors. You are no longer required to return an instance of that specific class, giving you the freedom to return any object of that type. You don’t always have to return a new instance, allowing you to implement a cache, or a singleton. You can use this flexibility to your advantage when evolving your design. Let’s see how.

Let’s say we get a request from our accounting department to log messages from code that deals with financial transactions (conveniently located in the net.johnpwood.financial package) to the database. This sounds like the birth of a new type of logger. Because clients are not using the new operator to create new instances of the Logger class, we can easily evolve Logger into an abstract class, keeping the static getInstance() method as the factory method for the Logger class hierarchy. After we have the abstract class, we can create two new subclasses to implement the individual behavior. All of this with no change to how the client uses the logging functionality.

Because the filesystem logger and the database logger don’t have too much in common, the Logger class has been slimmed down quite a bit. What remains is the interface for the Logger subtypes, defined via the abstract log() method, and a factory method to create the proper logger, which is implemented in getInstance().

public abstract class Logger {
    
    public static Logger getInstance(Class from) {
        if (from.getCanonicalName().startsWith(
                "net.johnpwood.financial")) {
            return DatabaseLogger.getInstance(from);
        } else {
            return FilesystemLogger.getInstance(from);
        }
    }
    
    protected Logger() {}
    public abstract void log(String message);
}

We now have two distinct classes that handle logging: FilesystemLogger, which contains most of the old Logger code, and DatabaseLogger. FilesystemLogger should look pretty familiar.

import java.io.FileWriter;
import java.io.IOException;

public class FilesystemLogger extends Logger {
    private static final String logFileName = "application.log";
    private FileWriter fileWriter;
    private Class from;

    public static FilesystemLogger getInstance(Class from) {
        return new FilesystemLogger(from);
    }
    
    protected FilesystemLogger(Class from) {
        this.from = from;
        
        try {
            fileWriter = new FileWriter(logFileName, true);
        } catch (IOException e) {
            throw new RuntimeException("Log file '" + logFileName + 
                    "' could not be opened for writing.", e);
        }
    }

    @Override
    public void log(String message) {
        try {
            fileWriter.write(
                from.getCanonicalName() + ": " + message + "\n");
            fileWriter.flush();
        } catch (IOException e) {
            System.err.println("Writing to the log file failed");
            e.printStackTrace();
        }
    }
}

DatabaseLogger is also pretty simple, since I didn’t bother to implement any of the hairy database code (doesn’t help to illustrate the point…and I’m lazy).

public class DatabaseLogger extends Logger {
    private Class from;
    
    public static DatabaseLogger getInstance(Class from) {
        return new DatabaseLogger(from);
    }
    
    protected DatabaseLogger(Class from) {
        this.from = from;
        establishDatabaseConnection();
    }

    @Override
    public void log(String message) {
        LoggerDataObject dataObject = 
            new LoggerDataObject(from, message);
        dataObject.save();
    }
    
    private void establishDatabaseConnection() {
        // Connect to the database
    }
}

We’ve significantly changed how the Logger works, and the client is totally oblivious to the changes. The client code continues to use the Logger as it did before, and everything just works. Pretty sweet, eh?

As you can imagine, there are many other ways you can evolve your design if you have this separation of creation and use. If we need to create a MockLogger for testing purposes, it can be created in Logger.getInstance() along with the other Logger implementations. The client would never know that it is using a mock. If we ended up creating 10 different loggers, it would be trivial to have Logger.getInstance() delegate the creation of the proper Logger instance to a factory, moving the creation logic out of the Logger class. Again, no changes to the client.
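Here is one way the mock and factory ideas might look in Java. This is only a sketch: the useFactory() hook, ConsoleLogger, and the recording MockLogger are hypothetical names invented for this example, and the console logger stands in for the file and database loggers to keep the code self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: getInstance() delegates creation to a replaceable factory.
class LoggerFactorySketch {

    static abstract class Logger {
        // Assumed hook: the creation logic lives outside getInstance().
        private static Function<Class<?>, Logger> factory = ConsoleLogger::new;

        // Clients keep calling this exactly as before.
        public static Logger getInstance(Class<?> from) {
            return factory.apply(from);
        }

        // Test code (or application start-up code) can swap in another factory.
        static void useFactory(Function<Class<?>, Logger> newFactory) {
            factory = newFactory;
        }

        public abstract void log(String message);
    }

    // Stand-in for the real implementations; writes to stdout.
    static class ConsoleLogger extends Logger {
        private final Class<?> from;

        ConsoleLogger(Class<?> from) {
            this.from = from;
        }

        @Override
        public void log(String message) {
            System.out.println(from.getCanonicalName() + ": " + message);
        }
    }

    // Records messages so a test can inspect what was logged.
    static class MockLogger extends Logger {
        final List<String> messages = new ArrayList<>();

        @Override
        public void log(String message) {
            messages.add(message);
        }
    }

    public static void main(String[] args) {
        MockLogger mock = new MockLogger();
        Logger.useFactory(from -> mock);

        // Client code is unchanged and has no idea it received a mock.
        Logger logger = Logger.getInstance(String.class);
        logger.log("Some message");

        System.out.println(mock.messages);
    }
}
```

A mutable static factory is a blunt instrument, and shared global state like this can make tests interfere with each other. It is shown here only because it keeps the client code identical to the earlier examples.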

Separating creation from use also allows you to easily evolve your class into a singleton (or any other pattern that controls the number of instances created). This doesn’t make much sense for Logger, since each unique Logger instance contains state. However, it does make sense for some classes. Evolving your class into a singleton simply requires a static instance variable on the class containing the instance of the singleton object, and an implementation of getInstance() that returns the singleton instance. If clients have already been using the getInstance() method to get an instance of the class, then no change would be required on their end. Here’s an example:

public class SomeOtherClass {
    private static final SomeOtherClass instance = new SomeOtherClass();
    
    public static SomeOtherClass getInstance() {
        return instance;
    }
    
    private SomeOtherClass() {}
}
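A cache is another way to control the number of instances behind getInstance(). The sketch below keeps one logger per class, which sidesteps the state problem mentioned above. It is an illustration only: the cache field is my own addition, and the logger writes to stdout rather than a file to keep the example self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a static factory method backed by a per-class cache.
class CachedLoggerSketch {

    static class Logger {
        private static final Map<Class<?>, Logger> cache = new ConcurrentHashMap<>();
        private final Class<?> from;

        // Same call the client already makes; it now returns one Logger per class.
        public static Logger getInstance(Class<?> from) {
            return cache.computeIfAbsent(from, Logger::new);
        }

        protected Logger(Class<?> from) {
            this.from = from;
        }

        public void log(String message) {
            System.out.println(from.getCanonicalName() + ": " + message);
        }
    }

    public static void main(String[] args) {
        Logger first = Logger.getInstance(String.class);
        Logger second = Logger.getInstance(String.class);
        first.log("Some message");
        // Both variables refer to the same cached instance.
        System.out.println(first == second);
    }
}
```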

It is worth pointing out that static builder methods are not the only way to achieve this separation. Dependency injection frameworks like Spring and Guice do all of this for you. They take on the responsibility of creating the objects, and getting the instances to the code that uses them. If you are a disciplined developer, and never “cheat” by instantiating the objects directly, then all of the same benefits outlined above apply when using a dependency injection framework.
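Setting the frameworks aside for a moment, the underlying idea can be shown with plain constructor injection. The sketch below does not use Spring or Guice, and PaymentProcessor is a hypothetical class invented for the example. The point is simply that the class using the logger never creates it.

```java
// Hypothetical sketch: hand-rolled constructor injection, no framework involved.
class ConstructorInjectionSketch {

    interface Logger {
        void log(String message);
    }

    // The class that uses a Logger never instantiates one; it is handed in.
    static class PaymentProcessor {
        private final Logger logger;

        PaymentProcessor(Logger logger) {
            this.logger = logger;
        }

        void process(String account) {
            logger.log("Processed payment for " + account);
        }
    }

    public static void main(String[] args) {
        // This wiring step is the part a dependency injection framework would take over.
        Logger consoleLogger = message -> System.out.println(message);
        new PaymentProcessor(consoleLogger).process("acct-42");
    }
}
```

Swapping in a database logger or a mock means changing the wiring in one place, with no change to PaymentProcessor.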

Like everything in life, there are cons that go along with the pros. Separating the code that creates an object from the code that uses the object is not the default pattern. It is not the norm. It will take time for you and your co-workers to get comfortable with this pattern. API documentation tools don’t “call out” static builder methods like they do constructors. This could have an effect on anybody using your library. Dependency injection frameworks take the creation of objects completely out of your code, moving it to some magical, mysterious land where things just happen, somehow. This also can take some time to get used to, especially for those new to the concept.

However, I feel that the benefits of separating creation from use far outweigh the drawbacks.

In our field, change is a constant. As a profession, we’re gradually learning to stop fighting change, and to start accepting it. This means designing for change. Doing so makes everybody’s life easier, from the customer to the developer. Separating creation from use is one quick way we can increase the flexibility of our design, with very little up-front cost.

Thanks to Mahesh Murthy for reviewing this post.

Diners Club Webapp Has Been Released

I finally got around to cleaning up and publishing the code for the Diners Club web application. Like everything else I release, I’m not sure who will actually use it. But, I find that the whole exercise of releasing your code is beneficial to you and your code. So, it’s worth the effort. It’s available via the MIT license, so have at it.

Kind of a funny side note about this project. I recently found out that a couple of co-workers of mine spend some of their spare time working on Planypus, which is a site for making plans with your friends. Sounds familiar! I was talking to them, and it turns out that Planypus started the exact same way that the Diners Club webapp started, as just a way to organize dinner outings with friends. They, however, took it to the next level and threw a business plan around it. I’m just not that business savvy, I guess :)