Using Page Objects for More Readable Feature Specs

Feature specs are great for making sure that a web application works from end to end. But feature specs are code, and like all other code, the specs must be readable and maintainable if you are to get a return on your investment. Those who have worked with large Capybara test suites know that they can quickly become a stew of CSS selectors. Tests become difficult to read and maintain when CSS selectors are sprinkled throughout the suite, and even more so when the name of a CSS selector doesn’t accurately reflect what the element actually is or does.

Introducing Page Objects

This is where Page Objects come in. Per Martin Fowler:

A page object wraps an HTML page, or fragment, with an application-specific API, allowing you to manipulate page elements without digging around in the HTML.

Page Objects are great for describing the page structure and elements in one place, a class, which can then be used by your tests to interact with the page. This makes your tests easier to read and understand, which in turn makes them easier to maintain. If used correctly, changing the id or class of an element that is used heavily in your feature specs should only require updating the page object, and nothing else.

Let’s look at an example. Here is a basic test that uses the Capybara DSL to interact with a page:
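Something along these lines (the URL, field names, and selectors are illustrative, not the original example):

feature "Managing company offices" do
  scenario "adding an office" do
    visit "/company/offices/new"

    fill_in "office_name", with: "Chicago"
    fill_in "office_address", with: "11 E Madison St"
    find(".office-form .save-button").click

    expect(page).to have_css(".office-list li", text: "Chicago")
  end
end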

Let’s take a look at that same spec re-written to use SitePrism, an outstanding implementation of Page Objects in Ruby:
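Here is a sketch of what that might look like; the CompanyOfficePage class name comes from the discussion below, while the URL and selectors are the same illustrative ones used above.

class CompanyOfficePage < SitePrism::Page
  set_url "/company/offices/new"

  element :name_field,    "#office_name"
  element :address_field, "#office_address"
  element :save_button,   ".office-form .save-button"
  element :office_list,   ".office-list"
end

feature "Managing company offices" do
  let(:office_page) { CompanyOfficePage.new }

  scenario "adding an office" do
    office_page.load

    office_page.name_field.set "Chicago"
    office_page.address_field.set "11 E Madison St"
    office_page.save_button.click

    expect(office_page.office_list).to have_text("Chicago")
  end
end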

In the SitePrism version of the spec, the majority of the CSS selectors have moved from the test into the CompanyOfficePage class. This not only makes the selectors easier to change, but it also makes the test easier to read now that the spec is simply interacting with an object. In addition, it makes it easier for other specs to work with this same page since they no longer need knowledge about the structure of the page or the CSS selectors on the page.

While SitePrism’s ability to neatly abstract away most knowledge about CSS selectors from the spec is pretty handy, that is only the beginning.

Repeating Elements, Sections, and Repeating Sections

Page Objects really shine when you completely model the page elements that your spec needs to interact with. Not all pages are as simple as the one in the above example. Some have repeating elements, some have sections of related elements, and some have repeating sections of related elements. SitePrism contains functionality for modeling all of these. Defining the proper repeating elements, sections, and repeating sections in your page object allows you to interact with those elements and sections via SitePrism APIs that make your specs much easier to write, read, and understand.
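As a rough illustration (the selectors and the OfficeSection class are made up for this example), here is what repeating elements, a section, and repeating sections look like in a page object:

class OfficeSection < SitePrism::Section
  element :name,    ".office-name"
  element :address, ".office-address"
end

class CompanyPage < SitePrism::Page
  set_url "/company"

  # repeating element: one entry per matching delete link on the page
  elements :delete_links, ".office .delete-link"

  # section: a group of related elements, scoped to a single root selector
  section :headquarters, OfficeSection, ".headquarters"

  # repeating sections: one OfficeSection per matching root element
  sections :offices, OfficeSection, ".office"
end

# In a spec, a repeating section behaves like an array of page objects:
#   expect(company_page.offices.map { |office| office.name.text }).to include("Chicago")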

Testing for Existence and Visibility

SitePrism provides a series of methods for each element defined in the page object. Amongst the most useful of these methods are easy ways to test for the existence and visibility of an element.

Using the Capybara DSL, the best way to check for the existence of an element is to expect that a CSS selector that uniquely identifies the element can be found on the page:
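(Here, #blah is a stand-in for whatever selector uniquely identifies the element.)

expect(page).to have_css("#blah")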

This becomes a bit easier with SitePrism, as we no longer have to use the CSS selector if our page object has an element defined:
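(Assuming the page object defines element :blah, "#blah", and @page is an instance of it.)

expect(@page).to have_blah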

Visibility checks are also simpler. Asserting that an element exists in the DOM, and is visible, can be a bit convoluted using the Capybara DSL:
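(Illustrative only; one common way to write it with the raw DSL:)

# visible: true ensures hidden elements don't count, regardless of the
# Capybara.ignore_hidden_elements setting
expect(page).to have_selector("#blah", visible: true)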

Using SitePrism, this simply becomes:
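(Assuming the same blah element definition as above.)

@page.wait_until_blah_visible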

The above code will wait until #blah is added to the DOM and becomes visible. If that does not happen in the time specified by Capybara.default_wait_time, then an exception will be raised and the test will fail.

Waiting for Elements

Using the above visibility check is a great way to ensure that it is safe to proceed with a portion of a test that requires an element to be in the DOM and visible, like a test that would click on a given element. Waiting for an element to be not only in the DOM, but visible, is a great way to eliminate race conditions in tests where the click target is dynamically added to the DOM.

Using Capybara Implicit Waits

By default, SitePrism will not implicitly wait for an element to appear in the DOM when it is referenced in the test. This can lead to flaky tests. Because of this, we highly recommend telling SitePrism to use implicit waits in your project.
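That configuration, using SitePrism’s use_implicit_waits option, typically lives in spec_helper.rb:

SitePrism.configure do |config|
  config.use_implicit_waits = true
end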

Summary

SitePrism is a fantastic tool. If you write feature specs, I highly recommend you check it out. It will make your specs easier to write and maintain.

Note: This article has been cross posted on the UrbanBound product blog.

Tips and Tricks for Debugging and Fixing Slow/Flaky Capybara Specs

In a previous post, I wrote about how the proper use of Capybara’s APIs can dramatically cut back on the number of flaky/slow tests in your test suite. But, there are several other things you can do to further reduce the number of flaky/slow tests, and also debug flaky tests when you encounter them.

Use have_field(“some-text-field”, with: “X”) to check text field contents

Your test may need to ensure that a text field contains a particular value. Such an expectation can be written as:
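It often ends up looking something like this (the field id is illustrative):

expect(find("#some-text-field").value).to eq("X")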

This can be a flaky expectation, especially if the contents of #some-text-field are loaded via an AJAX request. The problem here is that this expectation will check the value of the text field as soon as it hits this line. If the AJAX request has not yet come back and populated the value of the field, this test will fail.

A better way to write this expectation is to use have_field(“some-text-field”, with: “X”):
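(Here, some-text-field is assumed to be the field’s id; have_field locates fields by id, name, or label text.)

expect(page).to have_field("some-text-field", with: "X")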

This expectation will wait Capybara.default_wait_time for the text field to contain the specified content, giving time for any asynchronous responses to complete and update the DOM accordingly.

Disable animations while running the tests

Animations can be a constant source of frustration for an automated test suite. Sometimes you can get around them with proper use of Capybara’s find functionality by waiting for the end state of the animation to appear, but sometimes they can continue to be a thorn in your side.

At UrbanBound, we disable all animations when running our automated test suite. We have found that disabling animations has stabilized our test suite and made our tests clearer, as we no longer have to write code to wait for animations to complete before proceeding with the test. It also speeds up the suite a bit, as the tests no longer need to wait for animations to finish.

Here is how we went about disabling animations in our application: https://gist.github.com/jwood/90e7bf6873774055b169
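The gist above has the details of our approach. As a general illustration of the idea (not the gist’s code), a Rack middleware along these lines could inject a style tag into every HTML response in the test environment; JavaScript-driven animations, such as jQuery effects, would need to be disabled separately.

# Hypothetical middleware, enabled only for tests, e.g. with
# config.middleware.use DisableAnimations in config/environments/test.rb
class DisableAnimations
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, response = @app.call(env)
    return [status, headers, response] unless headers["Content-Type"].to_s.include?("text/html")

    body = ""
    response.each { |part| body << part }
    response.close if response.respond_to?(:close)

    # force all CSS transitions/animations to complete immediately
    body.sub!("</head>", "<style>* { transition: none !important; animation: none !important; }</style></head>")
    headers["Content-Length"] = body.bytesize.to_s
    [status, headers, [body]]
  end
end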

Use has_no_X instead of !have_X

Testing to make sure that an element does not have a class, or that a page does not contain an element, is a common test to perform. Such a test will sometimes be implemented as:
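(Reconstructed for illustration; the selector is made up.)

# has_css? is the predicate behind the have_css matcher
expect(!page.has_css?(".some-element")).to eq(true)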

Here, has_css? (the predicate behind the have_css matcher) will wait for the element to appear on the page. When it does not appear, the expression returns false, which is negated to true, allowing the expectation to pass. However, there is a big problem with the above code: has_css? waits Capybara.default_wait_time for the element to appear, so with the default settings, this expectation will take two whole seconds to run!

A better way to check for the non-existence of an element or a class is to use the has_no_X matcher:
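For example (same illustrative selector):

# backed by has_no_css?, so it returns as soon as the element is absent
expect(page).to have_no_css(".some-element")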

Using to_not will also behave as expected, without waiting unnecessarily:
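(Same illustrative selector again.)

# the negated matcher also uses the has_no_css? logic under the hood,
# so it does not wait the full timeout when the element is already absent
expect(page).to_not have_css(".some-element")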

No sleep for the fast feature test

Calls to sleep are often used to get around race conditions. But, they can considerably increase the amount of time it takes to run your suite. In almost all cases, the sleep can be replaced by waiting for some element or some content to exist on the page. Waiting for an element or some content using Capybara’s built in wait functionality is faster, because you only need to wait the amount of time it takes for that element/content to appear. With a sleep, your test will wait that full amount of time, regardless.
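As a hypothetical before/after (the selector is made up), instead of sleeping and hoping the dynamic content has rendered, wait for the element the test actually needs:

# before: waits two full seconds, every time
sleep 2
click_link "Confirm"

# after: proceeds as soon as the modal appears, and fails after
# Capybara.default_wait_time if it never does
expect(page).to have_css("#confirmation-modal")
click_link "Confirm"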

So, really scrutinize any use of sleep in a feature test. There are a very small number of cases where, for one reason or another, we have not been able to replace a call to sleep with something better. However, these cases are the exception, not the rule. Most of the time, using Capybara’s wait functionality is a much better option.

Reproduce race conditions

Flaky tests are usually caused by race conditions. If you suspect a race condition, one way to reproduce it is to slow down the server’s response time.

We used the following filter in our Rails application’s ApplicationController to slow all requests down by .25 seconds:
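It looked roughly like this (a sketch; the method name is illustrative):

class ApplicationController < ActionController::Base
  # temporary debugging aid: add artificial latency to every request to
  # flush out race conditions in the feature specs
  before_action :simulate_slow_responses if Rails.env.test?

  private

  def simulate_slow_responses
    sleep 0.25
  end
end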

Intentionally slowing down all requests by .25 seconds flushed out a number of race conditions in our test suite, which we were then able to reliably reproduce, and fix.

Capture screenshots on failure

A picture is worth a thousand words, especially when you have no friggin idea why your test is failing. We use the capybara-screenshot gem to automatically capture screenshots when a Capybara test fails. This is especially useful when running on CI, where we don’t have an easy way to actually watch the test run. The screenshots will often provide clues as to why the test is failing, and at a minimum, give us some ideas as to what might be happening.

Write fewer, less granular tests

When writing unit tests, it is considered best practice to make the tests as small and as granular as possible. It makes the tests much easier to understand if each test only tests a specific condition. That way, if the test fails, there is little doubt as to why it failed.

In a perfect world, this would be the case for feature tests too. However, feature tests are incredibly expensive (slow) to setup and run. Because of this, we will frequently test many different conditions in the same feature test. This allows us to do the expensive stuff, like loading the page, only once. Once that page is loaded, we’ll perform as many tests as we can. This approach lets us increase the number of tests we perform, without dramatically blowing out the run time of our test suite.

Sharing is caring

Have any tips or tricks you’d like to share? We’d love to hear them in the comments!

Note: This article has been cross posted on the UrbanBound product blog.

Fix Flaky Feature Tests by Using Capybara’s APIs Properly

A good suite of reliable feature/acceptance tests is a very valuable thing to have. It can also be incredibly difficult to create. Test suites that are driven by tools like Selenium or Poltergeist are usually known for being slow and flaky. And flaky/erratic tests can cause a team to lose confidence in their test suite, and question the value of the specs as a whole. However, much of this slowness and flakiness is due to test authors not using the proper Capybara APIs in their tests, or overusing calls to sleep to get around race conditions.

The Common Problem

In most cases, flaky tests are caused by race conditions: the test expects an element or some content to appear on the page, but that element or content has not yet been added to the DOM. This problem is very common in applications that use JavaScript on the front end to manipulate the page by sending an AJAX request to the server, and changing the DOM based on the response it receives. The time that it takes to respond to a request and process the response can vary. Unless you write your tests to account for this variability, you could end up with a race condition. If the response just happens to come back quickly enough, and there is time to manipulate the DOM, your test will pass. But should the response come a little later, or the rendering take a little longer, your test could end up failing.

Take the following code for example, which clicks a link with the id "foo", and checks to make sure that the message "Loaded successfully" displays in the proper spot on the page.
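Something along these lines (reconstructed from the description above, rather than the verbatim original):

find("#foo").click
expect(first(".message").text).to eq("Loaded successfully")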

There are a few potential problems here. Let’s talk about them below.

Capybara’s Finders, Matchers, and Actions

Capybara provides several tools for working with asynchronous requests.

Finders

Capybara provides a number of finder methods that can be used to find elements on a page. These finder methods will wait up to the amount of time specified in Capybara.default_wait_time (defaults to 2 seconds) for the element to appear on the page before raising an error that the element could not be found. This functionality provides a buffer, giving time for the AJAX request to complete and for the response to be processed before proceeding with the test, and helps eliminate race conditions if used properly. It will also only wait the amount of time it needs to, proceeding with the test as soon as the element has been found.

In the example above, it should be noted that Capybara’s first method will not wait for .message to appear in the DOM. So if it isn’t already there, the test will fail. Using find addresses this issue.
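Rewritten with find (same illustrative snippet):

find("#foo").click
expect(find(".message").text).to eq("Loaded successfully")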

The test will now wait for an element with the class .message to appear on the page before checking to see if it contains "Loaded successfully". But, what if .message already exists on the page? It is still possible that this test will fail because it is not giving enough time for the value of .message to be updated. This is where the matchers come in.

Matchers

Capybara provides a series of Test::Unit / Minitest matchers, along with a corresponding set of RSpec matchers, to simplify writing test assertions. However, these matchers are more than syntactical sugar. They have built in wait functionality. For example, if has_text does not find the specified text on the page, it will wait up to Capybara.default_wait_time for it to appear before failing the test. This makes them incredibly useful for testing asynchronous behavior. Using matchers will dramatically cut back on the number of race conditions you will have to deal with.

Looking at the example above, we can see that the test is simply checking to see if the text of the element with the class .message equals "Loaded successfully". But, the test will perform this check right away. This causes a race condition, because the app may not have had time to receive the response and update the DOM by the time the assertion is run. A much better assertion would be:
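For example:

# have_text waits for the expected text to appear within the element
expect(find(".message")).to have_text("Loaded successfully")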

This assertion will wait Capybara.default_wait_time for the message text to equal "Loaded successfully", giving our app time to process the request, and respond.

Actions

The final item we’ll look at is Capybara’s Actions. Actions provide a much nicer way to interact with elements on the page. They also take into account a few different edge cases that you could run into with some of the different input types. But in general, they provide a shortened way of interacting with page elements, as the action will take care of performing the find.

Looking at the example above, we can re-write the test as such:
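For example:

click_link("foo")
expect(find(".message")).to have_text("Loaded successfully")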

click_link will not just look for something on the page with the id foo; it will restrict its search to links. It performs a find, and then calls click on the element that find returns.

Summary

If you write feature/acceptance tests using Capybara, then you should spend some time getting familiar with Capybara’s Finders, Matchers, and Actions. Learning how to use these APIs effectively will help you steer clear of flaky tests, saving you a whole lot of time and aggravation.

Note: This article has been cross posted on the UrbanBound product blog.

Managing Development Data for a Service Oriented Architecture

A service oriented architecture (SOA) provides many benefits. It allows for better separation of responsibilities. It simplifies deployment by letting you only deploy the services that have changed. It also allows for better scalability, as you can scale out only the services that are being hit the hardest.

However, a SOA does come with some challenges. This blog post addresses one of those challenges: managing a common dataset for a SOA in a development environment.

The Problem

With most SOAs there tends to be some sharing of data between applications. It is common for one application to store a key to data which is owned by another application, so it can fetch more detailed information about that data when necessary. It is also possible that one application may store some data owned by another application locally, to avoid calling the remote service in certain scenarios. Either way, the point is that in most SOAs the applications are interconnected to some degree.

Problems arise with this type of architecture when you attempt to run an application in isolation with an interconnected dataset. At some point, Application A will need to communicate with Application B to perform some task. Unfortunately, simply starting up Application B so Application A can talk to it doesn’t necessarily solve your problem. If Application A is trying to fetch information from Application B by key, and Application B does not have that data (the application datasets are not “in sync”), then the call will obviously fail.

Stubbing Service Calls

Stubbing service calls is one way to approach this issue. If Application A stubs out all calls to Application B, and instead returns some canned response that fulfills Application A‘s expectations, then there is no need to worry about making sure Application B‘s data is in sync. In fact, Application B doesn’t even need to be running. This greatly simplifies your development environment.

Stubbing service calls, however, is very difficult to implement successfully.

First, the stubbed data must meet the expectations of the calling application. In some cases, Application A will be requesting very specific data. Application A, for example, may very well expect the data coming back to contain certain elements or property values. So any stubbing mechanism must be smart enough to know what Application A is asking for, and know how to construct a response that will satisfy those expectations. In other words, the response simply can’t contain random data. This means the stubbing mechanism needs to be much more sophisticated (complex).

Calls that mutate data on the remote service are especially difficult to handle. What happens when the application tries to fetch information that it just changed via another service call? If the requests are stubbed, it may appear that the call to mutate the data had no effect. This could possibly lead to buggy behavior in the application.

Also, if you stub out everything, you’re not really testing the inter-application communication process. Since you’re never actually calling the service, stubs will completely hide any changes made to the API your application uses to communicate with the remote service. This could lead to surprises when actually running your code in an environment that makes a real service call.

Using Production Data

In order for service calls to work properly in a development environment, the services must be running with a common dataset. Most people I’ve spoken with accomplish this by downloading and installing each application’s production dataset for use in development. While this is by far the easiest way to get up and running with a common dataset, it comes with a very large risk.

Production datasets typically contain sensitive information. A lost laptop containing production data could easily turn into a public relations disaster for your company, and more importantly it could lead to severe problems for your customers. If you’ve ever had personal information lost by a 3rd party then you know what this feels like. Even if your hard drive is encrypted, there is still a chance that a thief could gain access to the data (unless some sort of smart card or biometric authentication system is used). The best way to prevent sensitive information from being stolen is to keep it locked up and secure, on a production server.

Using Scrubbed Production Data

Using a production dataset that has been scrubbed of sensitive information is also an option. This approach will get you a standardized dataset, without the risk of potentially losing sensitive information (assuming your data scrubbing process is free of errors).

However, if your dataset is very large, this may not be a feasible option. My MacBook Pro has a 256GB SSD drive. I know of datasets that are considerably larger than 256GB. In addition, you have less control over what your test data looks like, which could make it harder to test certain scenarios.

Creating a Standardized Dataset

The approach we’ve taken at Centro to address this issue is to create a common dataset that individual applications can use to populate their database. The common dataset consists of a series of YAML files, and is stored in a format that is not specific to any particular application. The YAML files are all stored together, with the thought that conflicting data is less likely to be introduced if all of the data lives in the same place.

The YAML files may also contain ERB snippets; we currently use them to specify dates:

- id: 1
  name: Test Campaign
  start_date: <%= 4.months.ago.to_date %>
  end_date: <%= 3.months.ago.to_date %>

Specifying relative dates using ERB, instead of hard coding them, gives us a dataset that will not grow stale with time. Simply re-seeding your database with the common dataset will give you a current dataset.

Manually creating the standardized dataset also lets us construct it in a way that exposes edge cases not often seen in production data, so we can better test how the application handles that data.

Importing the Standardized Data into the Application’s Database

A collection of YAML files by itself is useless to the application. We need some way of getting that data out of the YAML files and into the application’s database.

Each of our applications has a Rake task that reads the YAML files that contain the data it cares about, and imports that data into the database by creating an instance of the model object that represents the data.

This process can be fairly cumbersome. Since the data in the YAML files are stored in a format that is not specific to any particular application, attribute names will often need to be modified in order to match the application’s data model. It is also possible that attributes in the standardized dataset will need to be dropped, since they are unused by this particular application.

We solved this and related issues by building a small library that is responsible for reading the YAML files and providing the Rake tasks with an easy-to-use API for building model objects from the data they contain. The library provides methods to iterate over the standardized data, map attribute names, remove unused attributes, and find related data (perhaps in another data file). This API greatly simplifies the application’s Rake task.

In the code below, we are iterating over all of the data in the campaigns.yml file, and creating an instance of our Campaign object with that data.

require 'application_seeds'

namespace :application_seeds do
  desc 'Dump the development database and load it with standardized application seed data'
  task :load, [:dataset] => ['db:drop', 'db:create', 'db:migrate', :environment] do |t, args|
    ApplicationSeeds.dataset = args[:dataset]
    seed_campaigns
  end

  def seed_campaigns
    ApplicationSeeds.campaigns.each do |id, attributes|
      ApplicationSeeds.create_object!(Campaign, id, attributes)
    end
  end
end

With the Rake task in place, getting all of the applications up and running with a standardized dataset is as simple as requiring the seed data library (and the data itself) in the application’s Gemfile, and running the Rake task to import the data.

The application_seeds library

The library we created to work with our standardized YAML files, called application_seeds, has been open sourced, and is available on GitHub at https://github.com/centro/application_seeds.

Drawbacks to this Approach

Making it easy to perform real service calls in development can be a double-edged sword. On one hand, it greatly simplifies working with a SOA. On the other hand, it makes it much easier to perform service calls and ignore the potential issues that come with calling out to a remote service. Service calls should be limited, and all calling code should be capable of handling the issues that may result from a call to a remote service (the service being unavailable, high latency, etc.).

Another drawback is that test data is no substitute for real data. No matter how carefully it is constructed, the test dataset will never contain all of the possible combinations that a production dataset will have. So, it is still a good idea to test with a production dataset. However, that testing should be done in a secure environment, where there is no risk of the data being lost (like a staging environment).