Merlin Mann visits Orbitz

Merlin Mann, creator of 43 Folders, was in the office today to give his talk on Time and Attention. It was a very good talk, and Merlin offered plenty of practical advice on how to manage distractions at work, and really focus on what you are getting paid to do. If you constantly feel like you can never get anything done at work due to meetings, the constant flow of emails, or some other distraction, I’d highly recommend checking out the presentation. None of what Merlin points out is rocket science. It is all simple, practical, and sound advice. I’m really going to make an effort to try some of his suggestions, and see if they can’t help me spend a little more time doing what I’m good at while at work…solving problems.

JavaOne sessions now online

Sun has recently posted MP3s, PDFs, and combo slideshows (slides in sequence with the audio) from this year’s JavaOne conference. I wanted to post links to some of the sessions that I enjoyed the most.

You need a Sun Developer Network account (free) in order to access the presentations.

There were a number of sessions that I was not able to attend at the conference that I look forward to watching now. I’d highly recommend taking a stroll through to see what else might pique your interest.

Interview practice

I’ve given quite a few interviews over the past year. I enjoy interviewing candidates to determine if they would be a good fit at Orbitz. I also enjoy being the face of Orbitz to those who come in for interviews, trying my best to make the candidate want to work at Orbitz (which is at least one-half of the interview).

But, after every interview I give, I always come out with a feeling that I could have done better. My questions could have been clearer, my topics should have been more diverse, I could have represented Orbitz a little better, etc. After a recent interview, I proposed something to a couple of my teammates…interview practice.

I proposed that those who interview candidates practice their interviewing skills on other Orbitz engineers. I think this could be valuable for many reasons.

It gives you great feedback on your questions. If an engineer you work with and respect isn’t giving you the answer you are looking for on a particular question, perhaps the question is unclear, or too difficult for most of the interviews we are giving. Sure, it could also mean that my colleague may not have detailed knowledge of that topic. But, if I think my colleague is a solid engineer, and that I’d like to hire more people like this person, then I want to make sure that I don’t unintentionally weed out a candidate of this person’s caliber. This doesn’t mean that you can’t ask this question, but instead maybe weigh the question a little bit lighter, or ask it as a “harder” question while pushing the candidate a bit.

Practice interviews help you hone your “interview agenda”. After a few practice sessions, you should have a fairly good idea how long to spend on each topic you want to cover. And, you should develop more than enough questions to challenge the candidates who easily plow through your warm-up questions.

Practice interviews give you a great forum to try out and tune new questions. If you think a coding exercise may be a little too complex for a 5-minute trip to the whiteboard, then ask it and find out. If the mock candidate asks you questions to clarify the exercise that you cannot answer, then your question needs a bit more work. Whatever it may be, the practice interview is a much better place to try out new questions and exercises than a live interview.

If you really get into the role, and act as you would during a real interview, hopefully your trusted friend and colleague will have no trouble telling you if you are acting like an ass. In my opinion, one of the worst things you can do is talk down to, degrade, or make a candidate feel stupid if they are unable to answer a question. It’s unprofessional, mean, and will discourage somebody who may have turned out to be a great employee from joining the company.

These are just a few of the benefits that I see right away. I have a feeling that if we start holding these mock interviews, say once a week for about an hour, that we will quickly find other benefits as well. Should be an interesting experiment. If we kick it off, I’ll let you know how it works out.

A theme at JavaOne – Beyond Java

I noticed a few themes at JavaOne this year. One of the big ones was JavaFX. It had sessions galore, and plenty of stage time at the general sessions. But, another theme I picked up on was the number of sessions dedicated to non-Java programming languages; perhaps a bit odd considering this was JavaOne.

JRuby and Groovy were all over the place. There was also a session on Scala. These other languages bring new ways of solving problems to the table. In addition to their expressiveness, JRuby and Groovy bring the power of metaprogramming to the Java world. Scala also brings a more expressive syntax, and the power and flexibility of two programming paradigms: object-oriented and functional.

These were the only non-Java programming languages that had sessions associated with them. At the CommunityOne session I attended, which was presented by Charlie Nutter, Charlie said that virtually every language out there is capable of running on the JVM, usually via a sub-project. I think this is a very profound statement. It shows that the language implementers, or those strongly associated with the language, realize the benefits of running on the JVM, and the ability to integrate with the billions of lines of existing Java code out there. Granted, each one of these projects varies in how well it can interact with existing Java code. However, Sun appears to be making a strong effort to work with the implementers of these languages to make integration with Java easier.

I think this is great. I’m sure that Sun realizes how powerful and mature the Java platform has become. And, as a Java programmer, I love the idea of being able to pick a language that best fits the problem I am trying to solve, while still maintaining interoperability with my existing Java code.

The Java language itself appears to be stalling. No widely adopted programming language lives forever. In order for a language to be successful, it must maintain some sort of backward compatibility. Maintaining backward compatibility, while necessary, slows the evolution of the language, and prevents the language from adopting vastly different programming models that may be a better fit for new problems that developers may be facing. Newer languages, or less widely adopted languages, do not face this dilemma and can change more rapidly. I’m also unconvinced of the effectiveness of the whole JCP process. In some situations I feel it is best to have a small group of intelligent individuals at the helm of a project, making all of the decisions. Usually, the more people involved, the longer it takes to get things done.

I think that making the JVM a more attractive environment for non-Java programming languages will only benefit the platform. I think it will also prevent some developers from jumping platforms to use a different programming language that better suits the problem they are trying to solve. I think this is a win for the Java community.

JavaOne 2008 – Day 4

Sun General Session Extreme Innovation

The last general session of JavaOne 2008 consisted of James Gosling inviting several people on stage to showcase what they have been using Java to create. There were several presentations, but I’m going to only talk about a few that interested me.

First up was VisualVM. VisualVM is a free JVM monitoring tool that can look under the covers of applications running Java 1.4.2 (I think) or greater. The stats gathered include memory usage, thread usage, CPU usage, and more. It also has a series of nice features, like getting a thread dump on your application by simply clicking a button. Best of all is that it does this with almost no overhead on the application. The tool looked very nice, and worth checking out in further detail.
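
To give a feel for the kind of numbers involved, here is a minimal sketch that reads heap usage, a live thread count, and a thread dump from inside a running JVM through the standard `java.lang.management` API. It is not taken from the monitoring tool described above, and it makes no claim about how that tool works internally; the class name `JvmSnapshot` and its methods are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper (an assumption for this example): reads a few JVM stats in-process.
final class JvmSnapshot {
    // Bytes of heap currently in use.
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    // Number of live threads, daemon and non-daemon.
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    // A plain-text thread dump: one header line per thread, then its stack frames.
    static String threadDump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip lock details (monitors and synchronizers) to keep the dump small.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }
}
```

Calling `JvmSnapshot.threadDump()` from a debugging hook would print roughly what the "thread dump" button gives you, just without the nice UI.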

Second was by far THE coolest thing that I have seen here this week; a pen. Yep, a pen…but a VERY smart pen. This pen can record your voice (or somebody else’s) as you write, and when you tap on an item that you wrote with the tip of the pen, it will play back just the portion of that audio that was recorded when you wrote that particular item. The pen also has several tools, like a translator that will take a word written in English and translate it (verbally) to a number of different languages, by simply tapping the word you wrote. It also stores images of everything you write. You can basically toss the paper you wrote on in the trash, because you can easily transfer the images to your computer through the USB port on the pen. The text in the notes is searchable via the pen software, and audio captured can also be accessed via that same software.

JMars is an open source application that contains loads of detailed images of Mars, collected via the several NASA missions to that planet. It is fully interactive, lets you see several types of maps of a terrain (and combine particular maps), and basically navigate the planet as you wish.

Complex Event Processing at Orbitz

Matt O’Keefe and Doug Barth did a great job presenting our event processing framework here at Orbitz. They took the audience step by step through our API and library that collect the data (ERMA), the commercial third-party tool that we use to aggregate and route that data (Streambase), and our graphing tool that visually represents that data (Graphite). The event processing framework has very high throughput, has little overhead on the running application, and requires very little code in the application.

At the end of the presentation, Doug announced that we would be open sourcing the two pieces of the framework that we own, ERMA and Graphite, and we were looking into bundling an open source data aggregation tool (since we can’t open source Streambase) to provide a complete event processing solution. The audience applauded this announcement, which took Doug a bit by surprise based on the look on his face :)

Good job guys!

Improving the Engineering Process Through Automation by Hudson

Hudson is a continuous integration (CI) tool that can be used to automate building, testing, and deploying your code. PCs are cheap and getting cheaper by the day. Developers on the other hand are not. Hudson is advertised as a cheap “team member” that can take on some of the easier to automate activities.

One of Hudson’s obvious goals is ease of use, and it is a goal that they have reached in all areas. Even something as complex as distributed builds is simple in Hudson (more on that later). It installs in a snap, and can run either by itself, or within another web container. It was designed to be extensible, allowing the community to continue development of the tool through plugins. These plugins provide integration with several popular version control systems and bug tracking systems.

Several best practices were suggested:

  • Componentize builds to reduce the time needed to get feedback. Replace that one, monolithic build with several smaller builds. And, only build what changed.
  • Run tests in parallel, or run groups of related tests in parallel.
  • Hudson can build, test, promote, do some QA, deploy, integrate, and more. Take advantage of its power and flexibility to automate whatever can be automated. Set it up to do as much as possible.
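
The second suggestion, running groups of related tests in parallel, can be sketched in plain Java. This is not Hudson configuration or Hudson API; it is only an illustration of the idea using `java.util.concurrent`, and the class name `ParallelTestGroups` and the convention that each group is a `Callable<Boolean>` are assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical runner: each Callable stands for one group of related tests
// and returns true when the whole group passes.
final class ParallelTestGroups {
    static boolean runAll(List<Callable<Boolean>> groups, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            boolean allPassed = true;
            // invokeAll blocks until every group has finished.
            for (Future<Boolean> result : pool.invokeAll(groups)) {
                if (!result.get()) {
                    allPassed = false;
                }
            }
            return allPassed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            // A group that threw an exception counts as a failed group.
            return false;
        } finally {
            pool.shutdown();
        }
    }
}
```

The groups need to be independent of each other for this to be safe, which is one more reason to componentize the build first.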

Hudson also makes it very easy to do distributed builds. The master machine serves HTTP build requests to the slave machines, and stores all of the information collected from the builds. The slave machines are the ones that do the build. The nice thing about distributed builds in Hudson is that slave boxes can come and go as they please. At the beginning of a build, the master checks to see how many slaves are available, and delegates the build to a slave machine. And, slave configuration is easy. A client needs to run on the slave, and some basic configuration is needed on the server for each slave. After that, Hudson takes care of the rest. This feature is great for building and testing on multiple environments and operating systems.

Oh, did I mention that Hudson was free? We use it on the Transaction Services team at Orbitz, even though the company has standardized on Quickbuild. Putting up with maintaining two CI tools tells you 1) how much we like Hudson and 2) how easy it must be to get running, and keep running.

Automated Heap Dump Analysis for Developers, Testers, and Support Employees

This session focused on the use of an open source tool, Memory Analyzer, to help track down memory leaks in a Java application. Finding memory leaks in Java has always been difficult. Who in their right mind wants to wade through a heap dump? Memory Analyzer is a sweet little tool that analyzes the heap dump for you, and provides you with easy to decipher diagnostic information that it pulled from the dump.

Since Memory Analyzer doesn’t understand your application, it can’t really tell you where a leak is. However, it can tell you which classes are holding the majority of the heap, how many instances for a given class have been instantiated and how much memory those instances occupy, and more. This is usually enough to point you in the right direction. Memory Analyzer can also give you the stack trace to a specific memory allocation, helping you further track down the leak.
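
To make the "which classes are holding the majority of the heap" idea concrete, here is a small sketch of a per-class histogram: instance counts and total bytes, largest first. This is not Memory Analyzer's API or its algorithm. `HeapObject`, `ClassUsage`, and `ClassHistogram` are hypothetical types invented for the example, and the input is assumed to be a flat list of objects that some parser has already pulled out of a dump.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical data model: one record per object found in a heap dump.
final class HeapObject {
    final String className;
    final long sizeBytes;

    HeapObject(String className, long sizeBytes) {
        this.className = className;
        this.sizeBytes = sizeBytes;
    }
}

// One row of the histogram.
final class ClassUsage {
    final String className;
    long instances;
    long totalBytes;

    ClassUsage(String className) {
        this.className = className;
    }
}

final class ClassHistogram {
    // Groups objects by class and returns the classes ordered by total bytes, largest first.
    static List<ClassUsage> summarize(List<HeapObject> objects) {
        Map<String, ClassUsage> byClass = new HashMap<>();
        for (HeapObject object : objects) {
            ClassUsage usage = byClass.computeIfAbsent(object.className, ClassUsage::new);
            usage.instances++;
            usage.totalBytes += object.sizeBytes;
        }
        List<ClassUsage> rows = new ArrayList<>(byClass.values());
        rows.sort((a, b) -> Long.compare(b.totalBytes, a.totalBytes));
        return rows;
    }
}
```

The first few rows of a report like this are usually where you start looking.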

The reports generated by Memory Analyzer are very comprehensive, and provide tons of useful information about your application’s memory usage. These reports can be used by all areas. They can help support employees track down production issues. They can help developers fix and test memory related issues. And, they can help testers verify that the memory usage for a given application doesn’t dramatically vary from release to release. The reports have several useful features to track down leaks, like the ability to group the results by classloader, in an effort to further isolate the problem.

Memory Analyzer also does a static analysis of your code to look for common memory related anti-patterns. This can help find a bug before it is introduced into the codebase. This tool has a lot of promise. I hope I never have to use it, but it’s comforting to know that it’s out there to use, just in case.

Top 10 Patterns for Scaling Out Java Technology-Based Applications

Scalability seems to be one of the industry’s biggest buzz words these days. And, I don’t think that many people really know what it means. It was pointed out to me during a Q and A session after a talk how many people were asking scalability questions regarding topics that had nothing to do with scalability. “Does JAXB scale?” was one of these. Scalability != Performance.

Scalability is the ability to handle an ever increasing number of requests gracefully. This could include adding servers to your server farm, or upgrading some key components. If you can add capacity to your system or tweak your system to handle increasing numbers of requests, you can scale. If there is a bottleneck in your system that maxes out your capacity, and you can’t easily fix that bottleneck, you can’t scale. Linear scalability, the ability to handle increased traffic with increased hardware, keeping the latency at its normal rate, is the goal. Any sort of up-trend in latency indicates that there will come a point in time where your latency will hit unacceptable rates, and cause a scalability bottleneck.

Scalability is not limited by a technology, a programming language, or an operating system. People have built scalable systems on every possible combination of these. The design and architecture of your system is what will determine if your system will scale or not.

Availability and reliability must be baked into the design of your system. They cannot be an afterthought. Refactoring your code to deal with scalability issues after launch is very difficult, as major design changes are often necessary.

The speaker then went on to discuss some things to consider when thinking about availability.

Latency is not always predictable. Network IO and other tasks performed outside of the application can be unpredictable. Reduce or eliminate these where possible. Remote messaging brings its own set of challenges. If order of the messages is important, how do you control it? How do you make sure the message will get there? How can you make sure you get a response quickly? How can you make sure that subsequent executions (a retry) won’t cause repercussions? Managing these complexities can be difficult, and if done improperly, can limit scalability.
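
One common way to keep a retry from causing repercussions is to make the receiver idempotent: tag each message with an ID and only do the work the first time that ID shows up. The sketch below is my own illustration of that idea, not something shown in the talk. `IdempotentReceiver` is a hypothetical class, and it assumes the sender reuses the same message ID when it retries and that remembering results in memory is good enough (a real system would need durable storage and some way to expire old IDs).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical receiver-side guard: the work for a given message ID runs at most once,
// and a retry with the same ID gets the stored result back.
final class IdempotentReceiver<R> {
    private final ConcurrentMap<String, R> resultsById = new ConcurrentHashMap<>();

    R handle(String messageId, Supplier<R> work) {
        // computeIfAbsent only runs the work if no result is recorded for this ID.
        // Note: if the work returns null, nothing is recorded and a retry would run it again.
        return resultsById.computeIfAbsent(messageId, id -> work.get());
    }
}
```

With this in place, a sender that times out and resends "msg-1" gets the original answer instead of triggering the work a second time.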

Durability, the ability to survive a failure, is also a major challenge. Writing data to disk or to a DB takes time, and the coordination of writing and reading data necessary to perform a failover can be difficult to manage.

The speaker went on to identify some key areas to focus on when trying to build a system that can scale.

  • Routing – Reliable routing is essential for scalability. You must be able to reliably send requests to components that can process those requests in a timely manner.
  • Partitioning – Spreading out the responsibilities of your system into different components enables scalability. If you notice that one area of the system is becoming a bottleneck, you can always add more capacity to that area, without touching the other areas.
  • Replication – The replication of data is necessary for surviving a failure in the middle of a transaction. The routing of a system must also be able to recognize failure, and route the request to another component that can handle the request.
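
Here is a rough sketch of how these three pieces can fit together: a key is hashed to a partition, each partition has a list of replicas, and the router skips replicas that are reported as down. It is only my illustration of the concepts, not code from the presentation. `PartitionRouter`, the node names, and the `isHealthy` check are all hypothetical, and hashing by `String.hashCode()` is just the simplest partitioning scheme I could pick.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical router: picks a partition by key, then the first healthy replica in it.
final class PartitionRouter {
    private final List<List<String>> replicasByPartition;
    private final Predicate<String> isHealthy;

    PartitionRouter(List<List<String>> replicasByPartition, Predicate<String> isHealthy) {
        if (replicasByPartition.isEmpty()) {
            throw new IllegalArgumentException("at least one partition is required");
        }
        this.replicasByPartition = replicasByPartition;
        this.isHealthy = isHealthy;
    }

    // Returns the node that should handle the key, or empty if every replica is down.
    Optional<String> route(String key) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int partition = Math.floorMod(key.hashCode(), replicasByPartition.size());
        for (String node : replicasByPartition.get(partition)) {
            if (isHealthy.test(node)) {
                return Optional.of(node);
            }
        }
        return Optional.empty();
    }
}
```

Adding capacity to a hot partition then means adding nodes to that partition's list, without touching the others.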

The presentation also covered how to handle load on a system. Some common ways to deal with load:

  • Load balancing – Send the requests to a component that has the capacity to process them.
  • Partitioning – Partition your system so that you can add capacity to stressed areas.
  • Queue – Queue requests for processing when the system has available capacity.
  • Parallelization – Execute requests, or parts of requests in parallel where possible.

There are also strategies for when your system is overloaded:

  • Turn away requests
  • Queue requests to be processed when the system has capacity
  • Add capacity to the system (usually in the form of hardware)
  • Relax any kind of read/write consistency that you are enforcing
  • Increase the size of any batch jobs that you run
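
The first two strategies, turning requests away and queueing them, are easy to combine with the standard `ThreadPoolExecutor`: give it a bounded queue, and anything that does not fit is rejected. The sketch below is my own example rather than anything from the talk; `BoundedRequestQueue` is a hypothetical wrapper, and the worker count and queue capacity are parameters you would have to tune for your own system.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper: a fixed set of workers, a bounded queue in front of them,
// and requests are turned away once both are full.
final class BoundedRequestQueue {
    private final ThreadPoolExecutor workers;

    BoundedRequestQueue(int workerCount, int queueCapacity) {
        workers = new ThreadPoolExecutor(
                workerCount, workerCount,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy()); // reject instead of blocking the caller
    }

    // Returns true if the request was accepted (running or queued), false if it was turned away.
    boolean submit(Runnable request) {
        try {
            workers.execute(request);
            return true;
        } catch (RejectedExecutionException overloaded) {
            return false;
        }
    }

    // Stops accepting new requests; already accepted ones still run.
    void shutdown() {
        workers.shutdown();
    }
}
```

A caller that gets `false` back can return a "try again later" response right away, which is usually kinder to the system than letting work pile up without limit.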

The speaker also spent some time talking about failure recovery. You should plan for failures at the component level (a piece of the system) and the system level (the entire system). Build some redundancy into your system so you have the ability to fail over to a redundant component if one component fails. If you can’t fail over to another component, then you need to build recoverability into your component, so that it can handle problems by itself. Critical data should be replicated so one component can pick up where another left off in the event of a failure. This however adds overhead to the system. So, only the critical data should be replicated.

At the end of the talk, the speaker left us with the “secret” to scalability: simplification. The simpler your system is, the easier it will be to scale. If you can’t get it to work on the whiteboard, then there is no way it will work in production.

Spring Framework 2.5: New and Notable

I’ve been working with Spring for quite a while now, and we are currently using 2.0. This presentation gave a 10,000 foot overview of what is coming in version 2.5 of the Spring framework.

  • As always, 2.5 will be backwards compatible with previous 2.x releases.
  • Greater annotation support.
  • Enhanced support of the testing framework provided by Spring. Using the test framework, you can easily test your configuration, your database connections, and even the database transactions you plan on executing.
  • Support for Java 6, Java EE 5, and OSGi.
  • This release of Spring will be the last release to support Java 1.4.
  • Support for OSGi allows for greater modularization.