Software Testing (4/4) – Automated UI Tests

Earlier posts in this series focused on the solutions each testing approach offers. This time I’d like to explain my motivation for writing the series: automated UI tests were what finally pushed me to raise awareness of better test coverage.

Automated UI tests are the holy grail of automated software testing. They require working software logic (ensured by unit and integration tests) and a mostly finalized UI. If you’d like UI tests to be part of your software development process, make sure to start implementing them early enough. They will save you a lot of work as soon as the software has to scale to different OS versions, device resolutions and languages, or if you simply want to ensure that the software operates properly.

You can react faster to OS updates or software changes, and instead of just running random software tests on demand you can make sure the software works as intended under different circumstances. If quality and reliability matter, this should be your choice :-)

Cost Catalysts
The MyMedela app, developed by Namics in cooperation with MyMedela AG, is one of our projects. I’d like to give you a brief introduction to its complexity from a software developer’s point of view (in this case, the iOS point of view).

I distinguish four types of cost catalysts. They multiply the cost of supporting every change, whatever the reason for the change may be:

  1. Device and Resolution diversity
  2. Language and Dialect diversity
  3. Operating System version diversity
  4. Use Cases

The MyMedela app was released in 2014, supporting iOS 7 and newer, and of course supported the different device models of the time: iPhone 4s, 5 and 5s (the iPod Touch is left out of scope for simplicity).

Since then, the variety of device types and their resolutions has increased, and all of them have to be supported by the application, because we want existing users to be able to keep using the app and to experience its new features. Currently we support 7 device types (the iPhone 5C is treated as equal to the 5, which is why we count 7 instead of 8).

Back in 2014 the application supported the German market only. Since then the number of supported languages has increased and is still rising. For every new language, the entire application has to be checked and every feature reviewed on its own, to ensure that labels are not truncated, that lines do not break where they are not supposed to, and so on. Furthermore, we have to distinguish not just between languages but also between dialects. (Left: localization issue with text being truncated in the button; right: layout of a UI component not working on a different resolution.)

[Figures: layout of a UI component not working on a different resolution; localization issue with text being truncated in the button]

Over the last two years, Apple has released several major and minor operating system versions. We take the major OS versions and the most relevant minor releases into account, which makes 6 in total, as we are currently on iOS 9.1.
The following matrix shows the average number of compatible iOS versions per device type considered:
[Figure: average number of compatible iOS versions per device type]

I have listed the numbers by date and related cost catalyst in the following table, to give you an overview of the increased complexity.
Furthermore, I’m introducing the number of “appropriate runs”. This measure indicates the minimum number of runs required to achieve the highest test coverage for each use case.
There are two scenarios we want to cover, and both affect the number of runs being performed:

  1. All Languages and Dialects are displayed correctly across all device types
    {Amount of devices & resolutions} x {Amount of languages & dialects}
  2. Any OS version, which shall be supported by the app, is working as intended
    {Amount of compatible OS versions}

The number of appropriate runs is then easily calculated using the following formula:

{Amount of devices & resolutions} x {Amount of languages & dialects} + {Amount of compatible OS versions}

A short side note: system fonts have changed with new OS versions in the past. To ensure that all languages and regions are still displayed as intended, you should run a test across all cost catalysts at least once whenever a new OS version is released. I simply call these “extraordinary runs”, as they are only performed on demand. The required number of runs for this specific case is calculated as follows:

{Amount of devices & resolutions} x {Amount of languages & dialects} x {Amount of compatible OS versions}
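
As a minimal sketch, the two formulas can be written down in Python. The function names and the language count of 5 are assumptions for illustration only; the 7 device types and 6 OS versions are the figures mentioned earlier in this post.

```python
def appropriate_runs(devices, languages, os_versions):
    """Every language on every device type, plus one run per OS version."""
    return devices * languages + os_versions


def extraordinary_runs(devices, languages, os_versions):
    """Every language on every device type on every OS version."""
    return devices * languages * os_versions


# 7 device types and 6 OS versions come from the text;
# 5 languages/dialects is a made-up value for illustration.
print(appropriate_runs(7, 5, 6))    # 41
print(extraordinary_runs(7, 5, 6))  # 210
```

The gap between the two results shows why the full cross product is reserved for new OS releases.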

[Figure: the required number of runs for a specific case]
[Diagram: number of runs per use case]
The formulas illustrate how quickly the complexity of the system grows, in particular for end-user applications where device types and operating systems are barely restricted. Let’s move on to the testing approach. To structure our tests, we rely on use cases to execute user scenarios.

A use case is “a list of action or event steps, typically defining the interactions between a role […] and a system, to achieve a goal” [1].

The use cases are the most crucial type of cost catalyst, and they are defined by the application itself. A software’s success is tightly coupled to a low error rate during usage and to the application responding to the user as specified. Each test case of course increases the time required: the more complex an app becomes and the more use cases are covered, the more time-consuming the test process ends up being.
We highly recommend running through every scenario on each OS and language. So in the end we face the following totals for the two kinds of testing runs:

Total number of “appropriate runs”:

{Amount of use cases} x {Appropriate Runs}

Total number of “extraordinary runs”:

{Amount of use cases} x {Extraordinary Runs}

Let’s do the math. Take a use case such as creating a user account. In the MyMedela app, an automated script needs 43 seconds to complete this task. A user would have to enter a username and password, optionally select a baby image, set the baby’s birth date and so on. Multiply this duration by all combinations from above and you have a person kept busy for ~3 hours (= 240 * 43 s). This assumes no fatigue or errors on the tester’s side and ignores the time required to set up each device according to the specifications. It would not just take a long time, it would also delay new releases whenever there is an OS update.
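
The arithmetic can be checked with a few lines of Python. This is only a sketch: the helper name is an assumption, while the 240 runs and 43 seconds are the figures quoted above.

```python
def total_duration(runs, seconds_per_run):
    """Return the total test time as (hours, minutes, seconds)."""
    total = runs * seconds_per_run
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return hours, minutes, seconds


# 240 runs of 43 seconds each, as in the example above.
print(total_duration(240, 43))  # (2, 52, 0)
```

That is 2 hours and 52 minutes of pure execution time for a single use case.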

How are these tests accomplished in practice? As I’m an iOS developer, I’ll use the iPhone UI testing approach to guide you through this.
First you need something you have probably set up anyway: a server that takes care of these tests. The server must be able to run the product/application you want to test, which also means it needs all OS versions and language settings under consideration.
Once this is set up, you can go on and implement the actual UI test environment. I’ve set up a build target that can be triggered manually or run as a nightly build. This test build executes the UI tests.
What do they look like? I chose four separate kinds of assertions to measure the success of a test:

  1. UI test script passes
  2. Test is able to access all required UI elements in the UI sequence of the use case
  3. Create actual screenshots for each step being performed and compare them with target screenshots
  4. Make screenshots available to others

Let’s break the assertions down a bit:

1. Successful UI test script run
We track the start and end of each test script. If a script does not run through to the end, the test was interrupted and a report is generated.

2. Access related UI elements
We have written a test library that makes it easy to track whether the expected element was accessible to the user. If it was not, an assertion is logged with the related line in the script and the affected element. A simple example: if a button is not yet visible, a user would not be able to tap it either, so the test should not pass.
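
Our library itself is not shown here, but the idea can be sketched in a few lines of Python. Everything in this block is hypothetical: the `require_element` helper, the exception class and the dictionary that stands in for a real screen are illustrations, not our actual implementation.

```python
import inspect
import time


class ElementNotAccessible(AssertionError):
    """Raised when a required UI element never became accessible."""


def require_element(lookup, name, timeout=2.0, interval=0.1):
    """Poll lookup(name) until it returns a visible element or time runs out."""
    caller = inspect.stack()[1]  # remember which script line asked for the element
    deadline = time.monotonic() + timeout
    while True:
        element = lookup(name)
        if element is not None and element.get("visible"):
            return element
        if time.monotonic() >= deadline:
            raise ElementNotAccessible(
                f"{caller.filename}:{caller.lineno}: "
                f"element '{name}' was not accessible"
            )
        time.sleep(interval)


# A dictionary stands in for the screen of a real app.
screen = {"save_button": {"visible": True}, "next_button": {"visible": False}}
require_element(screen.get, "save_button")  # returns the element
```

Asking for `next_button` in this example would raise `ElementNotAccessible` with the calling line and the element name, which is the information a developer needs to find the failing step.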

3. Actual screenshot vs. target screenshot
We ran into the issue that content was not always displayed the way we wanted. For example, all UI elements required for the test to pass were accessible, but other UI elements that are also expected to work were not checked in detail. This is why we additionally introduced an image comparison (left: target screenshot, right: actual screenshot). This way we can ensure that UI elements that do not directly affect the test script are still positioned correctly and available to the user.
The approach to obtaining the target screenshots is quite simple: our automated testing scripts take screenshots anyway. These are reviewed and, if they are appropriate, used as target screenshots until the next change is made. Because the script itself takes the screenshots, we can even verify that an animation is at the correct position at a certain point in time, as the target and the actual screenshot are taken at the same point in the script.
Once all target screenshots have been created, a regular testing run is performed. An automated image comparison runs, and if there are (even slight) differences, the developer is informed about the test and the related step in the test scenario.
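
A simplified sketch of such a comparison, in Python: screenshots are represented as plain grids of RGB tuples so that the example stays self-contained, whereas a real setup would decode the image files with an image library first. The function names, the per-channel tolerance and the allowed ratio are assumptions for illustration.

```python
def diff_ratio(target, actual, tolerance=0):
    """Return the fraction of pixels whose channels differ by more than tolerance."""
    same_shape = len(target) == len(actual) and all(
        len(t_row) == len(a_row) for t_row, a_row in zip(target, actual)
    )
    if not same_shape:
        raise ValueError("screenshots have different dimensions")
    total = 0
    changed = 0
    for target_row, actual_row in zip(target, actual):
        for target_px, actual_px in zip(target_row, actual_row):
            total += 1
            if any(abs(t - a) > tolerance for t, a in zip(target_px, actual_px)):
                changed += 1
    return changed / total if total else 0.0


def screenshots_match(target, actual, tolerance=0, max_ratio=0.0):
    """True if no more than max_ratio of the pixels differ."""
    return diff_ratio(target, actual, tolerance) <= max_ratio


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
target = [[WHITE, WHITE], [WHITE, WHITE]]
actual = [[WHITE, WHITE], [WHITE, BLACK]]  # one of four pixels changed
print(diff_ratio(target, actual))          # 0.25
print(screenshots_match(target, actual))   # False
```

With `max_ratio=0.0` even a single changed pixel fails the comparison; a small tolerance can help if rendering differs slightly between runs.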


4. Availability of testing screens

Our project managers liked watching the automatic application flow on our screens and were fond of all the nice screenshots being created, as they could use them for App Store releases or to verify once again what label changes would look like in the end. So they requested access to the images to easily get an overview of the application and the changes being made. If you have the opportunity, give other testers and project-related people access to these files.

Experience has shown that writing UI tests, and especially updating them whenever the UI changes, is costly. First, you have to set up a reliable testing environment that matches your specifications (OS versions, languages etc.).
Afterwards, you have to write the UI testing scripts, which takes time because a user scenario must still work across different device resolutions and languages. For example, you can’t use absolute coordinates to locate UI elements. In addition, these scripts need to be updated from time to time to cover UI changes. And the more complex a use case is, the longer the testing script takes to run.

Finally, it is highly recommended to always restore a “default state” from which the testing scripts start. If a script fails and breaks off at a certain point, all follow-up scripts could fail as well. We always restore the start state at the beginning and at the end of each script, which unfortunately costs a few extra seconds of execution time every time (scaled up to several hundred runs, this is a lot).
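
The pattern can be sketched in Python as a small wrapper. The names `run_with_default_state` and `restore_default_state` are hypothetical, and the dictionary only stands in for the state of a real app.

```python
def run_with_default_state(script, restore_default_state):
    """Run a test script from a known state and clean up afterwards."""
    restore_default_state()      # start from the default state
    try:
        script()
    finally:
        restore_default_state()  # restore it again, even if the script failed


# A dictionary stands in for the state of the real app.
app_state = {"logged_in": False}


def restore_default_state():
    app_state["logged_in"] = False


def failing_script():
    app_state["logged_in"] = True
    raise AssertionError("button not found")


try:
    run_with_default_state(failing_script, restore_default_state)
except AssertionError:
    pass
print(app_state)  # {'logged_in': False}
```

The `finally` branch is what protects the follow-up scripts: they start from the default state even though the previous script broke off halfway.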

A hint for you: if you can set up a testing architecture for repetitive user interaction flows, do so. The more code you can share, the faster you will be able to implement your test cases and react to UI changes. So develop an appropriate testing architecture with functions you can share across different testing scripts.
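
What sharing a flow might look like, sketched in Python: `FakeApp`, the element identifiers and the two scenarios are all made up for illustration. Elements are addressed by identifier, not by absolute coordinates, so the same flow works on every resolution.

```python
class FakeApp:
    """Stand-in for a real UI driver; it only records the actions it receives."""

    def __init__(self):
        self.actions = []

    def type_into(self, identifier, text):
        self.actions.append(("type", identifier, text))

    def tap(self, identifier):
        self.actions.append(("tap", identifier))


def create_account(app, username, password):
    """Shared flow, reused by every script that needs a fresh account."""
    app.type_into("username_field", username)
    app.type_into("password_field", password)
    app.tap("create_account_button")


def scenario_account_creation(app):
    create_account(app, "anna", "secret")


def scenario_profile_image(app):
    create_account(app, "ben", "secret")  # same flow, different scenario
    app.tap("choose_baby_image_button")
```

If the account creation screen changes, only `create_account` has to be updated, not every script that uses it.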

I tried to keep this as general as possible, even though it was written from an iOS point of view. Don’t worry, though: there are plenty of UI testing options for any kind of software, you just have to write the UI testing scripts. For example:

  • Android: UI Automator Viewer
  • Xamarin: Test Cloud & UITest framework – even cooler with the now released Xamarin 4
  • Web Apps / Browser Testing: Selenium, Appium
  • iOS: Apple Instruments – UI Automation

So, for your next project, don’t ask whether you should use automated software tests. Ask instead which kinds of automated software tests you want to embed: what you gain is maintainability and better quality assurance.
Don’t ask what you have to do for automated software tests. Ask what automated software tests can do for you. ;-)
