One of the things the user research team in Parliament does is run usability testing. This is when we invite members of the public to our offices. We’ll ask them to use the software or service that PDS is building, and observe them performing realistic tasks.
We look for the things that didn’t work as we expected. When people couldn’t understand what they were asked to do, or were unable to complete the tasks, we work out why that happened. We then present the issues to the team making the software or service.
When presenting the issues, we often get asked ‘how many people had that issue’. We (usually) refuse to answer that.
In this post, I want to explain why we don’t answer the question ‘how many times did that issue occur?’
Why ask that question
It’s easy to understand why people ask how many times an issue occurred. When presenting our findings, we’ll cover a large number of issues discovered in the usability testing. Teams obviously want to know which ones to deal with first, and they often use this question as a way of prioritising the issues. They believe that if an issue happens more often, it’s more important.
Teams will also be looking for a way to help manage their workload. They may have the impression that if an issue only occurred once, it may be an ‘outlier’ that can safely be ignored.
Why it’s not the right question
There are big risks in using the number of occurrences as a way of prioritising (or disregarding) usability issues. If researchers let that happen, teams risk making bad decisions.
Usability research is a form of qualitative research. It focuses on understanding how and why issues occur. This is different from quantitative research, which measures how often issues occur.
Industry convention, based on research, shows that testing with five users will uncover around 85% of the most severe usability issues. We base our usability testing on this, and run tests with small numbers of users. This makes sure that we’re reliably discovering issues without creating unnecessary duplication or wasting researcher time and public money.
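The 85% figure comes from a simple probability model often attributed to Nielsen and Landauer: if each participant has a probability L of encountering a given issue, then n participants will between them uncover a proportion 1 − (1 − L)ⁿ of issues. A minimal sketch, assuming the commonly cited average discovery rate of L = 0.31 (the function name and values here are illustrative, not from this post):

```python
def proportion_found(n_participants: int, discovery_rate: float = 0.31) -> float:
    """Expected proportion of issues uncovered by n participants.

    Uses the model found(n) = 1 - (1 - L)^n, where L is the chance
    that any single participant encounters a given issue. L = 0.31 is
    the average rate commonly cited from Nielsen and Landauer's work.
    """
    return 1 - (1 - discovery_rate) ** n_participants

for n in (1, 3, 5, 10):
    print(f"{n} participants: {proportion_found(n):.0%} of issues found")
```

With L = 0.31, five participants find roughly 84–85% of issues, while doubling the sample to ten only raises that to about 98% — which is why small rounds of testing are considered good value.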
The point of this usability research is that we detect the issues that exist, and learn why they occur. Teams can then make sensible decisions to fix the issues. It doesn’t give us ‘safe’ information about the frequency of those issues. There’s a statistical calculation that demonstrates this. It shows how often an issue may occur in the real world, based on how many times we see it in our research.
The confidence intervals in the graph show how often an issue that occurred in the usability lab could occur in the real world in a test with five users.
Even if an issue occurred only once in testing, in the real world it might occur for 65% of users. And if an issue occurred five times in testing, it might only occur 60% of the time in the real world, which is less often than the issue that occurred only once in testing.
Drawing the conclusion that an issue which occurred once is less likely to happen than the issue that occurred five times can therefore be wrong. So it’s dangerous to make decisions based on how many times something happened for a small number of users (particularly when comparing issues that occurred two or three times).
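The graph from the original post isn’t reproduced here, but the underlying calculation is a confidence interval for a proportion with a sample of five. A sketch using the adjusted Wald interval, which is recommended for small usability samples; the exact figures in the post may come from a slightly different interval, so treat these numbers as illustrative:

```python
import math

def adjusted_wald(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% adjusted Wald interval for a proportion, suited to small samples.

    Adds z^2/2 pseudo-successes and z^2 pseudo-trials before applying
    the ordinary Wald formula, then clips the bounds to [0, 1].
    """
    p = (successes + z * z / 2) / (n + z * z)
    margin = z * math.sqrt(p * (1 - p) / (n + z * z))
    return max(0.0, p - margin), min(1.0, p + margin)

# An issue seen once in five sessions could plausibly affect well over
# half of real-world users, while an issue seen in all five sessions
# could affect barely half. The intervals overlap heavily, which is why
# ranking issues by their lab counts is unsafe.
print(adjusted_wald(1, 5))  # upper bound ≈ 0.64
print(adjusted_wald(5, 5))  # lower bound ≈ 0.51
```

The overlap between the two intervals is the whole point: with only five participants, the observed counts simply can’t separate a rare issue from a common one.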
Even when you're aware of these risks, there’s an unconscious bias towards favouring the issue that occurred more often in the lab. This is why we try to avoid answering questions about how often issues occurred.
Although ‘how often it occurred’ isn’t a good basis for prioritisation, it’s very important that teams have the ability to prioritise the issues that we discover.
We do this using a matrix, created by userfocus, based on factors that are safe to assess from this type of research. These include whether the issue occurred on a critical task, how difficult it was to overcome, and whether users could learn to work around it when encountering it a second time. This is a much safer way of prioritising issues than how often they happened.
Dealing with outliers
Outliers are people who take part in usability research and who don’t accurately represent the ‘typical’ users of the software. They can therefore encounter unrealistic problems. This happens occasionally, and is often an issue with how the participant recruitment was done.
However, you need to be wary of disregarding participants, and their issues, as outliers.
Our teams should already understand who their users are from research during discovery. We then check participants through ‘screening’. This is where we ask them questions about their background and behaviour to make sure that the people taking part represent those who’ll use the service in the real world.
A participant shouldn’t be assumed to be an outlier just because they had issues that only occurred for them. That’s perfectly normal for this type of research.
Measuring how often something will happen
Sometimes it’s important to find out ‘how often’ things occur. This is what quantitative research can help with, through analytics or surveys for example. However, quantitative methods rarely tell you why issues occurred or how users overcame them. This makes it difficult for teams to fix the things they see.
We've found that the best solution is to use both quantitative and qualitative methods during a project’s development, carefully chosen based on what information the team needs to know at that time.
Read more about the work of the user research team.