Evaluating an Interactive Design

Rasalingam Ragul
18 min read · Aug 2, 2021

The evaluation methods discussed so far have involved interaction with, or direct observation of, users. This article introduces methods that are based on understanding users through one of the following:

• Knowledge codified in heuristics

• Data collected remotely

• Models that predict users’ performance

None of these methods requires users to be present during the evaluation. Inspection methods often involve a researcher, sometimes known as an expert, role-playing the users for whom the product is designed, analyzing aspects of an interface, and identifying potential usability problems. The most well-known methods are heuristic evaluation and walk-throughs. Analytics involves user interaction logging, and A/B testing is an experimental method. Both analytics and A/B testing are usually carried out remotely. Predictive modeling involves analyzing the various physical and mental operations that are needed to perform particular tasks at the interface and operationalizing them as quantitative measures. One of the most commonly used predictive models is Fitts’ law.
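To make the idea of a predictive model concrete, here is a minimal sketch of Fitts’ law in Python, which estimates the time needed to move to a target from its distance and size. The coefficients a and b are illustrative placeholders; in practice they are fitted to timing data measured for a particular device and user group.

```python
import math

def fitts_movement_time(distance: float, width: float,
                        a: float = 0.1, b: float = 0.15) -> float:
    """Estimate movement time (seconds) to acquire a target.

    Uses the Shannon formulation of Fitts' law:
        MT = a + b * log2(distance / width + 1)
    The coefficients a and b here are illustrative placeholders; they are
    normally fitted to measured timing data for a device and user group.
    """
    index_of_difficulty = math.log2(distance / width + 1)
    return a + b * index_of_difficulty

# Example: a button 20 px wide, 300 px away from the cursor
print(f"{fitts_movement_time(300, 20):.2f} s")
```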

Heuristic Evaluation

What is Heuristic Evaluation?

Heuristic evaluation is a process where experts use rules of thumb to measure the usability of user interfaces in independent walkthroughs and report issues. Evaluators use established heuristics (e.g., Nielsen-Molich’s) and reveal insights that can help design teams enhance product usability from early in development.

“By their very nature, heuristic shortcuts will produce biases.”

— Daniel Kahneman, Nobel Prize-winning economist

Heuristic Evaluation: Ten Commandments for Helpful Expert Analysis

In 1990, web usability pioneers Jakob Nielsen and Rolf Molich published the landmark article “Improving a Human-Computer Dialogue”. It contained a set of principles — or heuristics — which industry specialists soon began to adopt to assess interfaces in human-computer interaction. A heuristic is a fast and practical way to solve problems or make decisions. In user experience (UX) design, professional evaluators use heuristic evaluation to systematically determine a design’s/product’s usability. As experts, they go through a checklist of criteria to find flaws which design teams overlooked. The Nielsen-Molich heuristics state that a system should:

  1. Keep users informed about its status appropriately and promptly.
  2. Show information in ways users understand from how the real world operates, and in the users’ language.
  3. Offer users control and let them undo errors easily.
  4. Be consistent so users aren’t confused over what different words, icons, etc. mean.
  5. Prevent errors — a system should either avoid conditions where errors arise or warn users before they take risky actions (e.g., “Are you sure you want to do this?” messages).
  6. Have visible information, instructions, etc. to let users recognize options, actions, etc. instead of forcing them to rely on memory.
  7. Be flexible so experienced users find faster ways to attain goals.
  8. Have no clutter, containing only relevant information for current tasks.
  9. Provide plain-language help regarding errors and solutions.
  10. List concise steps in lean, searchable documentation for overcoming problems.

Heuristic Evaluation — for Easy-to-use, Desirable Designs

When you apply the Nielsen-Molich heuristics as an expert, you have powerful tools for measuring a design’s usability. However, like any method, heuristic evaluation has its pros and cons.

A vital point is that heuristic evaluation, however helpful, is no substitute for usability testing.

How to Conduct a Heuristic Evaluation

To conduct a heuristic evaluation, you can follow these steps:

  1. Know what to test and how — Whether it’s the entire product or one procedure, clearly define the parameters of what to test and the objective.
  2. Know your users and have clear definitions of the target audience’s goals, contexts, etc. User personas can help evaluators see things from the users’ perspectives.
  3. Select 3–5 evaluators, ensuring their expertise in usability and the relevant industry.
  4. Define the heuristics (around 5–10) — This will depend on the nature of the system/product/design. Consider adopting/adapting the Nielsen-Molich heuristics and/or using/defining others.
  5. Brief evaluators on what to cover in a selection of tasks, suggesting a scale of severity codes (e.g., critical) to flag issues.
  6. 1st Walkthrough — Have evaluators use the product freely so they can identify elements to analyze.
  7. 2nd Walkthrough — Evaluators scrutinize individual elements according to the heuristics. They also examine how these fit into the overall design, clearly recording all issues encountered.
  8. Debrief evaluators in a session so they can collate results for analysis and suggest fixes.
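As a rough illustration of what the debrief in step 8 might produce, here is a minimal Python sketch that collates hypothetical evaluator findings by heuristic and severity, so the team can see where issues cluster. The issue records and the 0–3 severity scale are made up for the example.

```python
from collections import Counter

# Hypothetical findings recorded by evaluators during their walkthroughs.
# severity: 0 = cosmetic, 1 = minor, 2 = major, 3 = critical (illustrative scale)
findings = [
    {"evaluator": "A", "heuristic": "Visibility of system status", "severity": 2,
     "note": "No progress indicator while the playlist is saving"},
    {"evaluator": "B", "heuristic": "Error prevention", "severity": 3,
     "note": "Deleting a playlist has no confirmation step"},
    {"evaluator": "C", "heuristic": "Visibility of system status", "severity": 1,
     "note": "Login spinner is easy to miss"},
]

# Count how many issues were logged against each heuristic ...
by_heuristic = Counter(f["heuristic"] for f in findings)

# ... and surface the most severe issues first for the debrief discussion.
for finding in sorted(findings, key=lambda f: f["severity"], reverse=True):
    print(f'[sev {finding["severity"]}] {finding["heuristic"]}: {finding["note"]}')

print(by_heuristic)
```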

Learn More about Heuristic Evaluation

Several of our courses examine heuristic evaluation closely: https://www.interaction-design.org/courses/the-practical-guide-to-usability and https://www.interaction-design.org/courses/information-visualization-infovis

This is essential reading regarding heuristic evaluation in mobile design: https://www.uxmatters.com/mt/archives/2014/06/empirical-development-of-heuristics-for-touch-interfaces.php

What you might expect (or find surprising) from heuristic evaluation: https://uxmag.com/articles/what-you-really-get-from-a-heuristic-evaluation

Find the refined Nielsen heuristics here: https://www.nngroup.com/articles/ten-usability-heuristics/

Here are a variety of heuristics to consider: https://uxmastery.com/how-to-run-an-heuristic-evaluation/

Walk-Throughs

Walk-throughs offer an alternative approach to heuristic evaluation for predicting user problems without doing user testing. As the name suggests, a walk-through involves walking through a task with the product and noting problematic usability features. There are two main walk-through methods: cognitive walk-throughs and pluralistic walk-throughs.

Cognitive Walkthroughs

What is a cognitive walkthrough?

A cognitive walkthrough is used to evaluate a product’s usability. It focuses on the new user’s perspective by narrowing the scope to the tasks needed to complete specific user goals. The method was created in the early 1990s by Cathleen Wharton, John Rieman, Clayton Lewis, and Peter Polson.

Cognitive walkthroughs are sometimes confused with heuristic evaluations, but, while both methods uncover usability problems and take the users’ point of view, heuristic evaluations typically focus on the product as a whole, not specific tasks.

How to conduct a cognitive walkthrough

At its core, a cognitive walkthrough has three parts:

  1. Identify the user goal you want to examine
  2. Identify the tasks you must complete to accomplish that goal
  3. Document the experience while completing the tasks

Identifying the user goal

A user goal is a big, overarching objective and doesn’t include specific, step-by-step tasks. From a user’s point-of-view, it doesn’t necessarily matter how the goal is accomplished, as long as it gets completed.

For example, I host a dinner party every month. Beforehand, I ask everyone invited to send me 10 songs they love. Then, I use Spotify to create a playlist of those songs to play during the party. As a user, my goal here is to create a playlist with others to play at my dinner party.

Identifying the tasks

Note that Spotify offers a number of ways to accomplish these goals; ideally, you would identify the optimal path and tasks for each interface. In this article, however, I’ll walk through just one possible path.

Goal: Create a Playlist

  • Open Spotify web player
  • Enter user name in user name field
  • Enter password in password field
  • Click the login button
  • Click the Your Library section
  • Click the new playlist button
  • Type a name into the playlist name field
  • Click the create button

Goal: Add a track to the playlist

  • Click search icon
  • Enter track name into the field
  • Click tracks tab
  • Find track in results
  • Hover over track
  • Click “…”
  • Click “add to playlist”
  • Select playlist

Documenting the experience

Since experience is subjective, it’s important to structure how an evaluator documents it so that all walkthroughs use the same criteria. Traditionally, the evaluator asks/answers four questions during each task.

  • Will users understand how to start the task?
  • Are the controls conspicuous?
  • Will users know the control is the correct one?
  • Was there feedback to indicate that the task was completed (or not completed)?

However, I like to add a fifth question that captures task completion. I add this question because it makes it easy to find the tasks that stop users from completing their goal. Often, these tasks become the highest priority and need to be addressed first.

  • Were you able to complete the task?

Template

I’ve created a walkthrough template for the above Spotify example. For each task, answer each question with a yes or no. The worksheet will color-code the answers for you so that a brief scan quickly reveals the problem areas.

Download the Cognitive Walkthrough Template
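For readers who prefer code to a spreadsheet, here is a minimal sketch of the same idea: each task is scored yes/no against the five questions, and any task with a “no” answer is flagged as a problem area. The task names and answers below are the Spotify example filled in with made-up results.

```python
# The five walkthrough questions, in the order discussed above.
QUESTIONS = [
    "Will users understand how to start the task?",
    "Are the controls conspicuous?",
    "Will users know the control is the correct one?",
    "Was there feedback to indicate the task was completed?",
    "Were you able to complete the task?",
]

# Hypothetical yes/no answers for a few of the Spotify tasks (True = yes).
walkthrough = {
    "Click the new playlist button": [True, True, True, True, True],
    "Click '...' on a track":        [True, False, False, True, True],
    "Click 'add to playlist'":       [True, True, True, False, True],
}

# Flag any task with at least one "no" answer as a problem area.
for task, answers in walkthrough.items():
    problems = [q for q, ok in zip(QUESTIONS, answers) if not ok]
    status = "OK" if not problems else f"PROBLEMS: {'; '.join(problems)}"
    print(f"{task}: {status}")
```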

Additional Reading:

  1. How to Conduct a Cognitive Walkthrough (Interaction Design Foundation)
  2. The 4 Questions to Ask in a Cognitive Walkthrough
  3. Cognitive Walkthrough (Practical UX Methods)

Pluralistic Walkthroughs

This type of walkthrough involves multiple groups, including the users (thus the use of ‘pluralistic’ in the name). Representatives of at least three groups are present for the walkthrough:

  • Users (at least two, hopefully more)

  • User experience professionals (one or two; generally serve as moderator and recorder)

  • Programmers (one or two)

Other relevant groups could also be present.

Pluralistic Walkthrough Process

  1. A task is chosen for testing.
  2. Storyboards are prepared for that task (e.g., registration, checkout) and the first storyboard is given to each person present.
  3. Approaching the task as a user would, each writes on the first storyboard (or on a piece of paper) the actions to take:

• press the down arrow key twice to scroll the page, click this empty text box, type text in it, then click this button next to the text box.

4. Once everyone is finished with the first storyboard, they compare notes and discussion begins.

• Users always give their input first.

5. At the end of the discussion, the facilitator shows the ‘correct’ sequence of actions (based on the specs/use cases), and then the next storyboard is distributed to participants.

6. This process of individual analysis, followed by group discussion and then a review of the ‘correct’ actions, is repeated for each new storyboard, one at a time.

7. A list of prioritized usability issues and their recommended solutions is the end result.

Compared with heuristic evaluation, walk-throughs focus more closely on identifying specific user problems at a detailed level.

Web Analytics

What Is Web Analytics?

Web analytics is the measurement and analysis of data to inform an understanding of user behavior across web pages.

Analytics platforms measure activity and behavior on a website, for example: how many users visit, how long they stay, how many pages they visit, which pages they visit, and whether they arrive by following a link or not.

Businesses use web analytics platforms to measure and benchmark site performance and to look at key performance indicators that drive their business, such as purchase conversion rate.

Why Web Analytics Are Important

There’s an old business adage that whatever is worth doing is worth measuring.

Website analytics provide insights and data that can be used to create a better user experience for website visitors.

Understanding customer behavior is also key to optimizing a website for key conversion metrics.

For example, web analytics will show you the most popular pages on your website, and the most popular paths to purchase.

With website analytics, you can also accurately track the effectiveness of your online marketing campaigns to help inform future efforts.

How Web Analytics Work

Most analytics tools ‘tag’ web pages by inserting a snippet of JavaScript into the page’s code.

Using this tag, the analytics tool counts each time the page gets a visitor or a click on a link. The tag can also gather other information like device, browser and geographic location (via IP address).

Web analytics services may also use cookies to track individual sessions and to determine repeat visits from the same browser.

Since some users delete cookies, and browsers have various restrictions around code snippets, no analytics platform can claim full accuracy of their data and different tools sometimes produce slightly different results.
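To make the mechanism concrete, here is a minimal sketch of the collection side: a tiny Python endpoint that a hypothetical page tag could ping, logging the path, user agent, and a session cookie. This illustrates the general idea only; it is not how any particular analytics product is implemented.

```python
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer

class HitCollector(BaseHTTPRequestHandler):
    """Logs one 'hit' per request, the way a page tag reports a pageview."""

    def do_GET(self):
        # Reuse the visitor's session cookie if present, otherwise issue one.
        cookie = self.headers.get("Cookie", "")
        session = cookie.removeprefix("session=") if cookie.startswith("session=") else str(uuid.uuid4())

        # A real tool would also derive geography from the IP and parse the user agent.
        print(f"hit path={self.path} ua={self.headers.get('User-Agent')} session={session}")

        self.send_response(204)  # no response body needed; the tag only reports
        self.send_header("Set-Cookie", f"session={session}")
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), HitCollector).serve_forever()
```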

Sample Web Analytics Data

Web analytics data is typically presented in dashboards that can be customized by user persona, date range, and other attributes. Data is broken down into categories, such as:

Audience Data

  • number of visits, number of unique visitors
  • new vs. returning visitor ratio
  • what country they are from
  • what browser or device they are on (desktop vs. mobile)

Audience Behavior

  • common landing pages
  • common exit page
  • frequently visited pages
  • length of time spent per visit
  • number of pages per visit
  • bounce rate

Campaign Data

  • which campaigns drove the most traffic
  • which websites referred the most traffic
  • which keyword searches resulted in a visit
  • campaign medium breakdown, such as email vs. social media
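As a toy illustration of how a few of the audience metrics above fall out of raw hit data, here is a minimal sketch that computes visits, pages per visit, and bounce rate from made-up session logs.

```python
# Made-up raw data: each session is the ordered list of pages one visit touched.
sessions = [
    ["/home", "/pricing", "/signup"],
    ["/blog/post-1"],                  # single-page visit -> counts as a bounce
    ["/home", "/blog/post-1", "/home"],
    ["/pricing"],
]

visits = len(sessions)
pages_per_visit = sum(len(s) for s in sessions) / visits
bounce_rate = sum(1 for s in sessions if len(s) == 1) / visits

print(f"visits: {visits}")
print(f"pages per visit: {pages_per_visit:.2f}")
print(f"bounce rate: {bounce_rate:.0%}")
```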

Web Analytics Examples

The most popular web analytics tool is Google Analytics, although there are many others on the market offering specialized information such as real-time activity or heat mapping.

The following are some of the most commonly used tools:

  • Google Analytics — the ‘standard’ website analytics tool, free and widely used
  • Piwik (now Matomo) — an open-source solution similar in functionality to Google Analytics and a popular alternative, allowing companies full ownership and control of their data
  • Adobe Analytics — highly customizable analytics platform (Adobe bought analytics leader Omniture in 2009)
  • Kissmetrics — can zero in on individual behavior, i.e. cohort analysis, conversion and retention at the segment or individual level
  • Mixpanel — advanced mobile and web analytics that measure actions rather than pageviews
  • Parse.ly — offers detailed real-time analytics, specifically for publishers
  • CrazyEgg — measures which parts of the page are getting the most attention using ‘heat mapping’

With a wide variety of analytics tools on the market, the right vendor for your company will depend on your specific requirements. Luckily, Optimizely integrates with most of the leading platforms to simplify your data analysis.

A/B Testing

What Is A/B Testing?

A/B testing (also known as split testing or bucket testing) is a method of comparing two versions of a webpage or app against each other to determine which one performs better. A/B testing is essentially an experiment where two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal.

Running an A/B test that directly compares a variation against a current experience lets you ask focused questions about changes to your website or app, and then collect data about the impact of that change.

Testing takes the guesswork out of website optimization and enables data-informed decisions that shift business conversations from “we think” to “we know.” By measuring the impact that changes have on your metrics, you can ensure that every change produces positive results.

How A/B Testing Works

In an A/B test, you take a webpage or app screen and modify it to create a second version of the same page. This change can be as simple as a single headline or button, or be a complete redesign of the page. Then, half of your traffic is shown the original version of the page (known as the control) and half is shown the modified version of the page (the variation).

As visitors are served either the control or variation, their engagement with each experience is measured and collected in an analytics dashboard and analyzed through a statistical engine. You can then determine whether changing the experience had a positive, negative, or no effect on visitor behavior.
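A minimal sketch of the assignment step: hashing a visitor ID so that the same visitor always sees the same variant. The bucketing scheme below is just one common approach, not any particular tool’s implementation.

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str = "homepage-headline") -> str:
    """Deterministically bucket a visitor into 'control' or 'variation'."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return "control" if int(digest, 16) % 2 == 0 else "variation"

# The same visitor always lands in the same bucket across page loads.
print(assign_variant("visitor-42"))
print(assign_variant("visitor-42"))
```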

Why You Should A/B Test

A/B testing allows individuals, teams, and companies to make careful changes to their user experiences while collecting data on the results. This allows them to construct hypotheses and to better understand why certain elements of their experiences impact user behavior. Put another way, their opinions about the best experience for a given goal can be proven wrong through an A/B test.

More than just answering a one-off question or settling a disagreement, A/B testing can be used continually to improve a given experience, steadily improving a single goal such as conversion rate over time.

For instance, a B2B technology company may want to improve their sales lead quality and volume from campaign landing pages. In order to achieve that goal, the team would try A/B testing changes to the headline, visual imagery, form fields, call to action, and overall layout of the page.

Testing one change at a time helps them pinpoint which changes had an effect on their visitors’ behavior, and which ones did not. Over time, they can combine the effect of multiple winning changes from experiments to demonstrate the measurable improvement of the new experience over the old one.

This method of introducing changes to a user experience also allows the experience to be optimized for a desired outcome, and can make crucial steps in a marketing campaign more effective.

By testing ad copy, marketers can learn which version attracts more clicks. By testing the subsequent landing page, they can learn which layout converts visitors to customers best. The overall spend on a marketing campaign can actually be decreased if the elements of each step work as efficiently as possible to acquire new customers.

A/B testing can also be used by product developers and designers to demonstrate the impact of new features or changes to a user experience. Product onboarding, user engagement, modals, and in-product experiences can all be optimized with A/B testing, so long as the goals are clearly defined and you have a clear hypothesis.

A/B Testing Process

The following is an A/B testing framework you can use to start running tests:

  • Collect Data: Your analytics will often provide insight into where you can begin optimizing. It helps to begin with high traffic areas of your site or app, as that will allow you to gather data faster. Look for pages with low conversion rates or high drop-off rates that can be improved.
  • Identify Goals: Your conversion goals are the metrics that you are using to determine whether or not the variation is more successful than the original version. Goals can be anything from clicking a button or link to product purchases and e-mail signups.
  • Generate Hypothesis: Once you’ve identified a goal you can begin generating A/B testing ideas and hypotheses for why you think they will be better than the current version. Once you have a list of ideas, prioritize them in terms of expected impact and difficulty of implementation.
  • Create Variations: Using your A/B testing software (like Optimizely), make the desired changes to an element of your website or mobile app experience. This might be changing the color of a button, swapping the order of elements on the page, hiding navigation elements, or something entirely custom. Many leading A/B testing tools have a visual editor that makes these changes easy. Be sure to QA your experiment to confirm it works as expected.
  • Run Experiment: Kick off your experiment and wait for visitors to participate! At this point, visitors to your site or app will be randomly assigned to either the control or variation of your experience. Their interaction with each experience is measured, counted, and compared to determine how each performs.
  • Analyze Results: Once your experiment is complete, it’s time to analyze the results. Your A/B testing software will present the data from the experiment and show you the difference between how the two versions of your page performed, and whether there is a statistically significant difference.
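For the “Analyze Results” step above, here is a minimal sketch of the kind of comparison a statistical engine performs: a two-proportion z-test on conversion rates for control versus variation. The visitor and conversion counts are made up, and real testing tools layer considerably more sophistication on top of this.

```python
from math import sqrt
from statistics import NormalDist

# Made-up results: (conversions, visitors) for each experience.
control = (120, 2400)      # 5.0% conversion rate
variation = (156, 2400)    # 6.5% conversion rate

p1, n1 = control[0] / control[1], control[1]
p2, n2 = variation[0] / variation[1], variation[1]

# Pooled two-proportion z-test.
p_pool = (control[0] + variation[0]) / (n1 + n2)
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p2 - p1) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided

print(f"control {p1:.1%} vs variation {p2:.1%}, z = {z:.2f}, p = {p_value:.4f}")
print("statistically significant at the 5% level" if p_value < 0.05 else "not significant")
```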

If your variation is a winner, congratulations! See if you can apply learnings from the experiment to other pages of your site, and continue iterating on the experiment to improve your results. If your experiment generates a negative result or no result, don’t fret. Use the experiment as a learning experience and generate new hypotheses that you can test.

Whatever your experiment’s outcome, use your experience to inform future tests and continually iterate on optimizing your app or site’s experience.

A/B Testing & SEO

Google permits and encourages A/B testing and has stated that performing an A/B or multivariate test poses no inherent risk to your website’s search rank. However, it is possible to jeopardize your search rank by abusing an A/B testing tool for purposes such as cloaking. Google has articulated some best practices to ensure that this doesn’t happen:

  • No Cloaking — Cloaking is the practice of showing search engines different content than a typical visitor would see. Cloaking can result in your site being demoted or even removed from the search results. To prevent cloaking, do not abuse visitor segmentation to display different content to Googlebot based on user-agent or IP address.
  • Use rel=”canonical” — If you run a split test with multiple URLs, you should use the rel=”canonical” attribute to point the variations back to the original version of the page. Doing so will help prevent Googlebot from getting confused by multiple versions of the same page.
  • Use 302 Redirects Instead Of 301s — If you run a test that redirects the original URL to a variation URL, use a 302 (temporary) redirect rather than a 301 (permanent) redirect. This tells search engines such as Google that the redirect is temporary, and that they should keep the original URL indexed rather than the test URL.
  • Run Experiments Only As Long As Necessary — Running tests for longer than necessary, especially if you are serving one variation of your page to a large percentage of users, can be seen as an attempt to deceive search engines. Google recommends removing all test variations from your site as soon as a test concludes and avoiding tests that run unnecessarily long.

For more information on A/B testing and SEO, see our Knowledge Base article on how A/B testing impacts SEO.

A media company might want to increase readership, increase the amount of time readers spend on their site, and amplify their articles with social sharing. To achieve these goals, they might test variations on:

  • Email sign-up modals
  • Recommended content
  • Social sharing buttons

A travel company may want to increase the number of successful bookings completed on their website or mobile app, or may want to increase revenue from ancillary purchases. To improve these metrics, they may test variations of:

  • Homepage search modals
  • Search results page
  • Ancillary product presentation

A/B Testing Examples

These A/B testing examples show the types of results the world’s most innovative companies have seen through A/B testing with Optimizely:

Discovery A/B tested the components of their video player to engage with their TV show ‘super fan.’ The result? A 6% increase in video engagement.

ComScore A/B tested logos and testimonials to increase social proof on a product landing page and increased leads generated by 69%.

Predictive Models

What Is Predictive Modeling?

Predictive modeling uses statistical techniques to predict future user behaviors. To understand predictive analytics, you must first understand what a predictive model is. A predictive model uses historical data from various sources. You must first normalize the raw data by cleansing it of anomalies, then preprocess it into a format that facilitates analysis. Finally, you apply a statistical model to the data to draw inferences. Each predictive model comprises various indicators, that is, factors likely to influence future outcomes, which are called independent variables, or predictor variables.
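A minimal sketch of that pipeline, assuming a small tabular dataset and using scikit-learn for the statistical model. The predictor variables and the choice of logistic regression are illustrative, not a prescription.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical historical data: predictor variables per user session
# (pages viewed, minutes on site, prior purchases) and whether they converted.
X = np.array([[3, 1.5, 0], [12, 9.0, 2], [1, 0.2, 0], [8, 6.5, 1],
              [5, 3.0, 0], [15, 12.0, 3], [2, 0.5, 0], [9, 7.0, 2]])
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])

# "Normalize" step kept deliberately simple here: drop rows with missing values.
mask = ~np.isnan(X).any(axis=1)
X, y = X[mask], y[mask]

# Fit the statistical model on historical data, then predict a future outcome.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
model = LogisticRegression().fit(X_train, y_train)

print("held-out accuracy:", model.score(X_test, y_test))
print("conversion probability for a new session:", model.predict_proba([[10, 8.0, 1]])[0, 1])
```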

“Applying a predictive-analytics algorithm to UX design … presents users with relevant information….”

Applying a predictive-analytics algorithm to UX design does not result in changes to a user interface. Instead, the algorithm presents users with relevant information that they need. Here’s a simple illustration of this capability from the ecommerce domain: A user who has recently purchased an expensive mobile phone would likely need to purchase a cover to protect it from dust and scratches. Therefore, that user would receive a recommendation to buy a cover. The ecommerce site might also suggest other accessories such as headphones, memory cards, or antivirus software.

Here are some other examples of predictive modeling. Spam filters use predictive modeling to estimate the probability that a given message is spam. The first mail-filtering program to use naive Bayes spam filtering was Jason Rennie’s ifile program, released in 1996; Bayes’ theorem is used to predict which email messages are spam and which are genuine. Facebook uses DeepText, a form of unsupervised machine learning, to interpret the meaning of users’ posts and comments. For example, if someone writes, “I like blackberries,” they might mean the fruit or the smartphone. In customer relationship management, predictive modeling targets messaging to those customers who are most likely to make a purchase.
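As a toy illustration of the naive Bayes idea behind such spam filters, here is a minimal sketch that scores a message using word frequencies from a tiny hand-made training set. Real filters use far larger corpora, better tokenization, and additional features.

```python
from collections import Counter
from math import log

spam_msgs = ["win money now", "free money offer", "claim your free prize now"]
ham_msgs = ["meeting moved to monday", "here are the notes from the meeting", "lunch on friday"]

def word_counts(messages):
    return Counter(word for msg in messages for word in msg.split())

spam_counts, ham_counts = word_counts(spam_msgs), word_counts(ham_msgs)
spam_total, ham_total = sum(spam_counts.values()), sum(ham_counts.values())
vocab = set(spam_counts) | set(ham_counts)

def log_score(message, counts, total, prior):
    """log P(class) + sum of log P(word | class), with add-one smoothing."""
    score = log(prior)
    for word in message.split():
        score += log((counts[word] + 1) / (total + len(vocab)))
    return score

def is_spam(message):
    prior_spam = len(spam_msgs) / (len(spam_msgs) + len(ham_msgs))
    spam = log_score(message, spam_counts, spam_total, prior_spam)
    ham = log_score(message, ham_counts, ham_total, 1 - prior_spam)
    return spam > ham

print(is_spam("free prize money"))           # expected: True
print(is_spam("notes from monday meeting"))  # expected: False
```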

Predictive User Experience

Envision coffee machines that start brewing just when you think it’s a good time for an espresso, office lights that dim when it’s sunny and workers don’t need them, your favorite music app playing a magical tune depending on your mood, or your car suggesting an alternative route when you hit a traffic jam.

Predictability is the essence of a sustainable business model. In a digital world, with millions of users across the globe, prediction definitely has the power to drive the future of interaction. Feeding a historical dataset into a system that uses machine-learning algorithms to predict outcomes makes prediction possible.

