CSS Regression

Adopting Test-Driven Development (TDD) is critical for a tech company. TDD serves as a strong safety net against bugs caused by code changes, and it becomes especially important for large code bases that change frequently. At the time of this writing, the Trustious code base comprises around 20,000 commits and over 80,000 lines of code, and that code changes rapidly. The frequent changes are a natural consequence of the Lean methodology. As a company we firmly believe in Lean and are continuously running Build-Measure-Learn cycles. This requires us to make frequent experimental changes to the user experience, which map to frequent changes to the code base (mostly in the front end). With such a rapidly evolving product, writing and maintaining tests that will truly keep us safe could easily get out of hand.

Why do CSS regression?

A large code base that is frequently changed is very difficult to maintain. This is especially true for front end code, where tiny changes have a nasty habit of affecting modules you never thought would be affected. This makes a safety net of tests critical for visual front end changes.

One approach is the use of Selenium (through Capybara for Ruby), which provides a scriptable API to control the browser. For example, you can click buttons and links, fill in forms, verify the existence of content, or execute JavaScript in the context of the loaded page. This is very useful for simulating user behavior by scripting automated scenarios and running them as part of our test suite. But with such automation, how can we assert that a page or a component looks right? Three techniques are available here:

  1. Check that the DOM element in question has a specific id or a set of classes applied to it. This can be done easily with Selenium and ensures that the CSS is applied. However, it doesn't check that the CSS itself is correct, and it is not very robust to changes: if a developer needs to change the HTML structure, the whole test suite has to be examined for the impact of that change.

  2. Read the CSS properties of the DOM element in question using JavaScript. This solves the problem of CSS correctness, but is still not robust against small changes.

  3. Take a screenshot of the page and compare it to a base image. This approach is completely independent of the code base, making it robust to changes. It also allows a truthful comparison between different versions. The concern here is that it requires two manual operations: verifying the comparison result and maintaining the base image.

Trustious went through these three techniques in this chronological order. The first two were naturally needed for many sanity tests. However, with time our test suite became bloated and very difficult to maintain. Then came the third technique, which we now use in parallel with the usual test suite run. Building such a hybrid testing environment has the advantage of keeping your test code much less volatile, and of spotting visual changes sooner and more accurately.

Wraith

For screenshot-based CSS regression testing we opted for Wraith, although there were other options. We ended up picking Wraith for the following reasons:

  1. Easy customization and configuration. Wraith offers you the flexibility to write your own JavaScript code that runs in the browser before taking a screenshot. Details of how this is used are explained below.
  2. Support for responsive testing. It allows you to take screenshots of the same page with different widths. This is very important to us since 30% of our traffic is from mobile.
  3. PhantomJS as Wraith's default headless browser. Headless browsers give a significant speedup over fully-fledged browsers. Another plus is that at Trustious we already use PhantomJS in other endeavors, so we were able to avoid rewriting some tricky code.

Gotcha: When installing Wraith, it's best to get it directly from their GitHub repository to stay up to date. The repository contains many cool features, like taking screenshots in a parallelized fashion (making Wraith runs much faster) and the ability to sort the screenshot comparisons by the size of the diff.
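If you manage dependencies with Bundler, one way to do this is to point your Gemfile at the repository. This is a sketch, assuming a Bundler-based setup:

```ruby
# Gemfile: install Wraith from the BBC News GitHub repository
# instead of the released gem, to pick up the latest features
gem 'wraith', git: 'https://github.com/BBC-News/wraith.git'
```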

Here are some examples of what the output looks like:

[Screenshot from 2014-09-24 14:00:19]

[Screenshot from 2014-09-24 14:01:01]

Tips and tricks

Before going through the examples below, make sure you take a look at the PhantomJS quick start guide and Wraith's README.

The two main files to play with during setup are config.yaml (which specifies the running configuration, such as domains, page widths, and paths) and snap.js (a PhantomJS script). Wraith uses snap.js to produce the screenshots for a given url with a certain width. Here’s how it runs the script:

phantomjs snap.js <input> <view port width> <target image name>

where <input> is substituted by a value from the paths section of config.yaml. This allows you to pass custom configurations to the snapping script. We prefer to pass the input as a comma-separated value, which gives us the freedom to pass arguments that customize the snapshots to be captured. More on this below.
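To make the setup concrete, here is a minimal sketch of a config.yaml following this convention. The domains, widths, and paths below are hypothetical; only the key names follow Wraith's configuration format:

```yaml
# Hypothetical Wraith configuration (all values are placeholders)
domains:
  production: "http://www.example.com"
  staging: "http://staging.example.com"

screen_widths:
  - 320
  - 768
  - 1280

paths:
  home: /
  page1: "/page1,foo2"   # url path plus the name of a JS function to run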

User agent switching based on width

When working on a responsive website, relying solely on CSS media queries is often not enough, as the desktop and mobile views may use different HTML elements. This is why checking the user agent to recognize which platform you are on is useful.

To handle this difference during a CSS regression run, we switch the user agent based on the current screenshot width. This is done as follows:

if (view_port_width == '320') {
  page.settings.userAgent = 'Mobile Mozilla/5.0 .....';
} else {
  page.settings.userAgent = 'Mozilla/5.0 .....';
}

Post JS callback

Let’s say you have a JS function defined in the global script scope and would like to run it inside the page.evaluate callback. If you check the documentation for page.evaluate, you will notice that you can pass a list of arguments after the callback function; these arguments are forwarded to the callback.

So, assume we have two variables defined at the global script level: ran_js and js_func_to_run. The former is a flag recording whether js_func_to_run was executed.

Note that assigning to ran_js inside the callback would never work: the parameter shadows the outer variable, and the callback runs in the sandboxed page context anyway. Instead, return the flag from the callback; page.evaluate hands the return value back to the script:

ran_js = page.evaluate(function(js_func_to_run) {
  js_func_to_run();
  return true;
}, js_func_to_run);

Now that we can run a function inside the page scope, let’s make use of the input convention we mentioned earlier. In this example, the JS function to run will be the second column of the Comma Separated Value (CSV) input.

The screenshot script can have a hash of various functions to run, for example:

var JS_FUNCS = {
  'foo1': function() { console.log("in foo1"); },
  'foo2': function() { console.log("in foo2"); },
  'foo3': function() { console.log("in foo3"); },
  'foo4': function() { console.log("in foo4"); }
};

So if we want to run foo2 on http://example.com/page1, the input would be http://example.com/page1,foo2. Note that in the Wraith configuration file, the path value will be /page1,foo2, since Wraith prepends the domain from the domains section.

Now that we have a CSV input, we can split it back into a url and a function name in the script itself, and then call the function by name through the JS_FUNCS hash defined above:

var parts = system.args[1].split(',');
var url = parts[0];
var fn_name = parts[1];
JS_FUNCS[fn_name]();

Cookie management

Often one needs to test pages that require a user to be logged in. One way to tackle this is to write JS code that simulates the login steps: clicking the right buttons, filling in the form, and submitting it. The problem with this approach is that repeating it for each screenshot takes a lot of time, and it is not robust against code changes.

Another way of simulating login is through fake cookies, since most frameworks identify the current user with a session cookie. In PhantomJS, you can add a pre-stored cookie to the current browser session. A cookie can be thought of as a JSON object that looks like the following:

var cookie = {
  "name": "...",
  "value": "...",
  "domain": "..."
};

This cookie can be added to the current browser session as follows:

phantom.addCookie(cookie);

Looks good. Now you can simulate a logged in user through a cookie object. But where do we get it from? You can write a PhantomJS script that simulates the needed login steps, as in the approach described earlier, and prints the content of page.cookies.

Now that we have all the parts needed to take a screenshot of a page that requires login, how can they be integrated with our CSV-based input scheme?

The input now has the following columns:

url,needs login?,JS function to run

Note that this order allows us to have input strings that contain only a url, a url with the needs login? flag but no JS function, or all three columns.
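As a sketch, the column handling can be captured in a small helper. The function and field names here are illustrative, and we assume the login flag is passed as the literal string 'true':

```javascript
// Split the input argument into its (up to) three columns:
//   url, needs login?, JS function to run
// Missing columns fall back to "no login" and "no function".
function parseInput(input) {
  var parts = input.split(',');
  return {
    url: parts[0],
    needsLogin: parts[1] === 'true',
    fnName: parts[2] || null
  };
}
```

With this shape, an input containing only a url still parses cleanly, matching the flexible column order described above.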

Regarding mobile checks, one could detect whether the current session is mobile or not, and persist this fact in a cookie. That’s why it’s better to have a cookie object for each type of session.
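For illustration, a hypothetical per-session cookie table might look like this. The names, values, and domains are placeholders, not real cookies:

```javascript
// One pre-stored cookie per session type; pick the one matching the viewport.
var SESSION_COOKIES = {
  desktop: { name: 'session_id', value: 'desktop-session-token', domain: '.example.com' },
  mobile:  { name: 'session_id', value: 'mobile-session-token',  domain: '.example.com' }
};

// Reuse the 320px convention from the user agent switching section above.
function cookieForWidth(viewPortWidth) {
  return viewPortWidth === '320' ? SESSION_COOKIES.mobile : SESSION_COOKIES.desktop;
}
```

The selected object can then be passed straight to phantom.addCookie.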

Waiting for AJAX to finish

If you are operating a website that depends on an API to populate its content, waiting for the JS functions to load is not enough; you may also want to wait until all AJAX requests have completed. A nice trick is to use the following check inside the page.evaluate callback:

var jQueryInactive = jQuery.active == 0;
var applicationControllerPresent = $("html").attr("ng-controller") !== undefined;
// Only query the Angular scope when the application controller is present,
// so this also works on pages that do not bootstrap Angular
var angularInactive = !applicationControllerPresent ||
    angular.element($("html")).scope().$apply(function(scope) { return scope.isAjaxActive(); }) == false;

var isAjaxComplete = jQueryInactive && angularInactive;
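The combination logic can be isolated into a pure helper, which makes it easy to sanity-check. The argument names are illustrative: jQueryActive stands for jQuery.active, and angularAjaxActive for the scope's isAjaxActive() result:

```javascript
// A page is "AJAX complete" when jQuery has no in-flight requests and,
// if an Angular application controller is present, Angular reports no
// active AJAX either.
function isAjaxComplete(jQueryActive, hasAppController, angularAjaxActive) {
  var jQueryInactive = jQueryActive === 0;
  return jQueryInactive && (!hasAppController || !angularAjaxActive);
}
```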

Timeout on resources

Rendering a page will proceed even if any of its resources times out. Sometimes a resource that is not important for rendering may time out, causing the screenshot command to freeze. A quick solution is to force Wraith to retry when a critical resource times out, and to ignore resources that would not influence the diff.

page.settings.resourceTimeout = 60 * 1000; // 60 seconds
page.onResourceTimeout = function(e) {
  if (e.url.indexOf('jpg') > 0) {
    return; // ignore images that time out
  }
  console.log(e.errorCode);   // usually a 408
  console.log(e.errorString); // typically 'Network timeout on resource'
  console.log(e.url);         // the url whose request timed out
  phantom.exit(1); // exit with an error code to force a retry
};

Debugging

Using console.log within the page.evaluate callback will not print to PhantomJS’ STDOUT. Use the following snippet for debugging:

page.onConsoleMessage = function(msg, lineNum, sourceId) {
  console.log('CONSOLE: ' + msg +
              ' (from line #' + lineNum + ' in "' + sourceId + '")');
};

When putting the above tricks together, you end up with a single snap.js screenshot script that combines user agent switching, cookie management, the CSV input convention, and the AJAX wait.

What next?

Wraith allows running the comparison on headless browsers other than PhantomJS. For Firefox you can use SlimerJS, which Wraith already supports out of the box. For IE there is TrifleJS, which we still need to try with Wraith.

We encourage you to try this with your own front-end-change-intensive projects. If you have any questions, don’t hesitate to write to us.
