February 19, 2016

What can be learnt from A/B testing?

Hiroshi Ushioda
UX Researcher, User Experience Department

Determining Success Through A/B Testing

Because there are many different approaches to website information architecture and design, deciding which one to adopt is not an easy task.

Let's consider, as an example, a Japanese smartphone site on which users can search for hotels and make reservations. Most such sites have a detailed information page where users can check a hotel's services and facilities as well as its accommodation plans. However, because the width of a smartphone screen is limited, the choice comes down to one of the following two layouts:

  • Version A: at the top of the page the hotel's features (photos, information on amenities, etc.) are displayed
  • Version B: at the top of the page a list of the possible accommodation plans for certain dates is displayed

For users who are still undecided about their travel plans, version A might be preferable as it allows them to get an idea of a hotel by taking a look at its facilities. On the other hand, for users who have already chosen a destination and how long to stay, it might be better first of all to be able to check if there is accommodation available that matches their travel schedules - which version B provides.

When design questions like this arise, a method known as A/B testing is used. A/B testing means running both versions of a design simultaneously, comparing them on the basis of access log data, and then adopting the one that performs better. In the hotel reservation example above, the winning design is the version with the higher conversion rate, that is, the higher proportion of visits to the detailed information page that lead to a completed reservation.
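The comparison described above can be sketched in a few lines of Python. The article does not say how "performs better" should be judged, so the two-proportion z-test used here is an assumption (one common choice among several), and the visitor and reservation counts are hypothetical numbers chosen only for illustration.

```python
import math


def conversion_rate(conversions, visitors):
    """Share of visitors to the detail page who completed a reservation."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference B minus A.

    Assumption: a pooled two-proportion z-test; the article itself
    does not prescribe any particular statistical test.
    """
    p_a = conversion_rate(conv_a, n_a)
    p_b = conversion_rate(conv_b, n_b)
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical access-log totals (not taken from the article).
z, p_value = two_proportion_z_test(conv_a=120, n_a=4000, conv_b=156, n_b=4000)
print(f"A: {conversion_rate(120, 4000):.3%}  B: {conversion_rate(156, 4000):.3%}")
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests that the difference in conversion rates is unlikely to be due to chance alone, but it says nothing about why users behaved differently.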

In recent years, with the development of many tools that make it possible for such A/B tests to be performed easily, an increasing number of companies are beginning to implement decision-making based on the use of such experimental methods.

Considering Data from a Different Perspective

If, in the above example, the winning design is version B - which displays a list of the hotel's different accommodation plans at the top of the page - the following action should be taken as a matter of course:

  • the top of the hotel's detailed information page on the smartphone site should display a list of the possible accommodation plans

However, would that be sufficient?

Here it is necessary not only to compare the conversion rates of version A and version B, but also to compare the two versions from other perspectives. For instance, how long visitors stay on the page and how many times they click on photos should be considered in addition to conversion rates. Suppose that, when this is done, version A turns out to outperform version B in terms of visit duration and number of photo clicks, even though its conversion rate is lower. Such an outcome could lead to insights and hypotheses like the following:

  • isn't version A more user-friendly as it provides users who are doing their own comparative review with a full grasp of hotel facilities?
  • aren't users who are still fairly undecided, and unlikely to make a reservation straight away, likely to return to the page later to make a reservation because of their positive experience while browsing?

Furthermore, let's consider things after adding one more point of view – that of 'devices'.

Suppose we compare the two layout versions not only on smartphones but also on PCs, and observe the following:

  • in the case of PC sites, there isn't a significant difference in the conversion rates between version A and version B
  • the conversion rate for completed reservations on the smartphone site is considerably lower than on the PC site

Observations like these could in turn lead to further hypotheses:

  • might the good results of version B on the smartphone site be due to a number of users who were in a hurry to find a hotel?
  • or perhaps many users review hotels on their smartphones in spare moments and then, when it comes to making a reservation, prefer to use a PC keyboard for entering data?
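Looking at the same test from the additional perspectives discussed above (visit duration, photo clicks, and device) amounts to grouping the access log by device and version and computing several metrics per group. The following is a minimal sketch under stated assumptions: the record fields (`device`, `variant`, `reserved`, `seconds_on_page`, `photo_clicks`) and the sample values are hypothetical and do not come from any real log format or from the article.

```python
from collections import defaultdict


def summarize(records):
    """Aggregate per-visit records into metrics per (device, variant)."""
    totals = defaultdict(
        lambda: {"visits": 0, "reservations": 0, "seconds": 0.0, "photo_clicks": 0}
    )
    for record in records:
        group = totals[(record["device"], record["variant"])]
        group["visits"] += 1
        group["reservations"] += 1 if record["reserved"] else 0
        group["seconds"] += record["seconds_on_page"]
        group["photo_clicks"] += record["photo_clicks"]

    summary = {}
    for key, group in totals.items():
        visits = group["visits"]
        summary[key] = {
            "visits": visits,
            "conversion_rate": group["reservations"] / visits,
            "avg_seconds_on_page": group["seconds"] / visits,
            "photo_clicks_per_visit": group["photo_clicks"] / visits,
        }
    return summary


# Hypothetical per-visit records; a real log would hold many more rows.
sample_records = [
    {"device": "smartphone", "variant": "A", "reserved": False, "seconds_on_page": 90, "photo_clicks": 4},
    {"device": "smartphone", "variant": "A", "reserved": False, "seconds_on_page": 70, "photo_clicks": 2},
    {"device": "smartphone", "variant": "B", "reserved": True, "seconds_on_page": 30, "photo_clicks": 0},
    {"device": "smartphone", "variant": "B", "reserved": False, "seconds_on_page": 20, "photo_clicks": 1},
    {"device": "pc", "variant": "A", "reserved": True, "seconds_on_page": 60, "photo_clicks": 3},
    {"device": "pc", "variant": "B", "reserved": True, "seconds_on_page": 50, "photo_clicks": 1},
]

for (device, variant), metrics in sorted(summarize(sample_records).items()):
    print(device, variant, metrics)
```

Keeping all metrics side by side for each group makes it easier to notice cases where one version wins on conversion rate but loses on engagement, which is where new hypotheses tend to come from.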

More insights and hypotheses like these could probably be generated. Although this example is largely based on supposition, I believe it shows that simply looking at the results of an A/B test from a slightly different perspective makes it possible to generate and verify hypotheses about users' points of view, and so to understand users better.

Thus, depending on how it is used and the viewpoints from which its results are analyzed, A/B testing not only determines whether a design is successful, but also gives those involved with websites and user experience an opportunity to acquire knowledge and develop new ways of thinking critically.

A/B Testing Aiming at Continuous Improvement

As I have already stated in previous columns, including “Operation First From the UX Design Perspective” and “Generating Hypotheses through Observational Studies”, with the trends toward mobile computing and device diversification, users' behavior and attitudes are expected to become increasingly complex and difficult to define. For this reason, we should not remain bound to methods that proved successful in the past; rather, it is crucial that we study user behavior and thinking continuously and keep generating new hypotheses.

Of course, using A/B testing as a tool for measuring short-term results also has its merits. However, if a site is to be continuously improved, isn't it more important to gain insights into how users interact with the website than to look only at whether a single A/B test accepts or rejects a design?

Furthermore, for A/B testing to lead to significant findings, comparisons should not be made ad hoc. Instead, they should rest on solid 'experiment and analysis planning' and 'continuous repetition', addressing questions like:

  • what hypothesis should the comparison be based on?
  • what is the A/B test aiming to verify?
  • what data is necessary in order to verify the hypothesis?
  • what can/can't be verified through the performance of a single A/B test?
  • for what cannot be verified by running a single test, what are the next steps to be taken so that results can be observed?
  • for what cannot be examined through A/B testing, is a qualitative approach such as a user survey also necessary?

At Mitsue-Links, under the slogan “Operation First”, we support the continuous improvement of our clients' website operations. To do so, we utilize our accumulated know-how of user experience/usability advancement, familiarity with planning and performing of A/B tests, analysis of data as well as implementation of user surveys and usability tests.