UI Design Newsletter – May, 2007


Insights from Human Factors International

In This Issue:

Why "how many users" is just the wrong question – Rethinking the requirements for valid usability tests

HFI Chief Scientist, Kath Straub, PhD, CUA, revisits the question about the number of users required for an effective usability test.

The Pragmatic Ergonomist

Dr. Eric Schaffer, Ph.D., CPE, founder and CEO of HFI offers practical advice.

 
Why "how many users" is just the wrong question
   

Death. Taxes.
How-many-users.

Every day in offices around the world usability professionals ask and are asked this question: How many users do we need for our usability test? It's an important question. We want to find most of the problems, and especially the most severe ones. So, we need to test enough people. But usability testing is expensive, and the cost of testing increases with each participant. So, we don't want to test too many, either.

On the one hand, synthesizing the received theoretical wisdom suggests that there is an answer to this question. And the answer is "5" (Virzi, 1992; Nielsen and Landauer, 1993). That is, based on a probabilistic formula, you will need to test 5 users to find about 85% of the problems that will trip up 1/3 or more of your users. The number 5 is very concrete. Practitioners like it. 5 is easy to remember.
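The probabilistic formula behind this figure is usually written as 1 - (1 - p)^n, where p is the chance that a single participant runs into a given problem and n is the number of participants. Below is a minimal sketch in Python, assuming every problem is equally likely to be hit by every user; the function name is ours, chosen for illustration.

```python
def proportion_found(p: float, n: int) -> float:
    """Expected share of problems found by n users.

    Assumes each user independently hits any given problem
    with the same probability p (a simplifying assumption).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # A problem is missed only if all n users miss it.
    return 1.0 - (1.0 - p) ** n


# Problems that trip up one third of users, for several test sizes.
for n in (1, 3, 5, 10):
    print(n, round(proportion_found(1 / 3, n), 3))
```

With p = 1/3 and n = 5 the formula gives roughly 0.87, in line with the "about 85%" rule of thumb. Note how quickly the curve flattens: going from 5 to 10 users adds far less than going from 1 to 5.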

On the other hand, this question gets debated every year at the CHI conference. You can count on it. Like death and taxes. The same debate. Given that the UX community (re-)debates this every year, it seems that the wisdom has not been so well received.

Blue! No, Green!...
No, 5!

That the number 5 has such staying power says something interesting about human memory and the way people reason. The 5-formula can work. But, like tossing a coin, it's probabilistic. If you keep flipping a coin over and over, it will come up heads half the time. But it can also come up tails nine times in a row.

Similarly, if you run enough usability tests with 5 users, on average you will find most of the errors most of the time. But if you run only one test (or just a few) with 5 users, it's possible that you will uncover fewer errors than the formula projects. (See Spool and Schroeder, 2001; Faulkner, 2003; or, if you are less ambitious, the May, 2004 newsletter.)
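The gap between the long-run average and any single test can be made concrete with a small simulation. The sketch below is illustrative only: the number of problems (20), the detection probability (1/3), the number of simulated tests, and the seed are all assumed values, and every problem is treated as equally easy to find, which real interfaces rarely satisfy.

```python
import random


def simulate_test(num_problems, p, n_users, rng):
    """Return the fraction of problems found in one simulated test."""
    found = 0
    for _ in range(num_problems):
        # A problem counts as found if at least one user hits it.
        if any(rng.random() < p for _ in range(n_users)):
            found += 1
    return found / num_problems


def simulate_many(trials=1000, num_problems=20, p=1 / 3, n_users=5, seed=0):
    """Run many independent 5-user tests with a seeded generator."""
    rng = random.Random(seed)
    return [simulate_test(num_problems, p, n_users, rng) for _ in range(trials)]


results = simulate_many()
mean = sum(results) / len(results)
print(f"mean {mean:.2f}, min {min(results):.2f}, max {max(results):.2f}")
```

Across many runs the mean settles near the value the formula predicts, but the minimum and maximum show how far an individual 5-user test can land from it.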

There are other challenges with the 5-formula. For instance, to calculate the number of testing participants you need, a priori you need to know how many problems there are to find. If you knew that, likely you wouldn't need to test to find them, eh?

Reach beyond...
# of users

Not surprisingly, the debate churned on in San Jose (CHI 2007). But this year, Lindgaard and Chattratichart (2007) threw down a different gauntlet. The obstacle to solving the problem, they said, is the question. "How many users" is the wrong way to think about it.

In usability testing, we are looking for mismatches between the site/app model and the user's mental model on the key and critical tasks. Framed this way, the criterion that determines how many problems get uncovered is how many tasks participants try, not how many participants there are.

To test their claim, Lindgaard and Chattratichart reanalyzed the usability testing data from CUE-4* (Molich, 2003 – Workshop Reference). Within that project, 9 highly experienced teams used think-aloud techniques to independently test the same site. The teams received identical input from the coordinators (site objectives, problem criteria, testing focus). Each team shaped their own testing plan and protocol, conducted the testing, and aggregated the findings into a pre-determined feedback format.

Lindgaard and Chattratichart looked for similarities and differences across the methods and findings reported by each team. Specifically, they were seeking relationships between test design (e.g., # users, # tasks) and number of problems identified.

Their study reports that there was no reliable correlation between the number of users tested and the number of usability problems uncovered. Testing more users did not ensure that more problems would be discovered. Further, although each of the 9 teams tested 5 users or more, they reported only 7-43% of the known problems, not the 85% predicted by the 5-formula.

In contrast, their analysis showed a significant positive correlation between the number of tasks evaluated and the number of problems uncovered. That is, the more tasks a team included in their testing protocol, the more problems they uncovered.

They conclude that other things being equal (e.g., quality of recruiting), the better predictor of the productivity of usability testing is the number of tasks participants (try to) complete, not the number of participants who try to complete them.

______________
* The CUE Studies (Molich and Dumas, in press; Molich, Kaasgaard and Karyukin, 2004, among others) compare methods and findings of different teams conducting the same usability test. CUE findings show that different usability testing teams evaluating the same interface report different numbers of usability problems, often with very little overlap in the problems identified. There's clearly more to it than number of users.

   
The Pragmatic Ergonomist, Dr. Eric Schaffer
   

 

This result is fantastic! It's like trying to find potholes in a city. Not every car hits every pothole in the road. So you need to send a number of cars down each road. But it is even more important to send cars down a larger NUMBER of roads. The key seems to be in more tasks, not just more users. The problem is that you can only run a given number of tasks with a single test participant. More than 60 or perhaps 90 minutes of testing won't work well.

I propose a "Lindgaard-Chattratichart Testing Strategy." Test 3 different groups of participants. Put maybe 6 to 12 people in each group. Then have each group do a different basket of tasks. This will allow us to test a LOT of different tasks and should get a far better level of reliability.
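To see why spreading participants across task baskets might pay off, here is a rough back-of-the-envelope sketch in Python. It is not a model from the paper: it assumes problems are spread evenly across tasks, each problem shows up in exactly one task, and every participant hits a given problem with the same probability p. The task counts and group sizes are hypothetical.

```python
def expected_coverage(total_tasks, basket_size, groups, users_per_group, p=1 / 3):
    """Expected share of all problems found under the stated assumptions.

    Each group works through its own basket of `basket_size` tasks;
    problems in tasks nobody attempts can never be found.
    """
    tasks_covered = min(total_tasks, basket_size * groups)
    # Chance that at least one user in a group hits a given problem.
    detect = 1.0 - (1.0 - p) ** users_per_group
    return (tasks_covered / total_tasks) * detect


# Hypothetical site with 30 tasks; one session fits a basket of 10 tasks.
one_group = expected_coverage(30, 10, groups=1, users_per_group=18)
three_groups = expected_coverage(30, 10, groups=3, users_per_group=6)
print(f"18 users, one basket:    {one_group:.2f}")
print(f"3 x 6 users, 3 baskets:  {three_groups:.2f}")
```

Under these assumptions the same 18 participants uncover far more when split into three groups, because the extra users on a single basket mostly re-find problems already seen while two thirds of the tasks go untested. This is the pothole picture in numbers: more roads, not just more cars.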

References

Faulkner, L. Beyond the five-user assumption: Benefits of increased sample sizes in usability testing. Behavior Research Methods, Instruments & Computers, 35, 3, Psychonomic Society (2003), 379-383.

Lindgaard, G. and Chattratichart, J. Usability Testing: What Have We Overlooked? CHI 2007 Proceedings, ACM Press (2007).

Molich, R. & Dumas, J. S. Comparative Usability Evaluation (CUE-4). Behaviour & Information Technology, Taylor & Francis (in press).

Molich, R. & Jeffries, R. Comparative expert review. In Proceedings CHI 2003, Extended Abstracts, ACM Press (2003), 1060-1061.

Molich, R., Ede, M. R., Kaasgaard, K., & Karyukin, B. Comparative usability evaluation. Behaviour & Information Technology, 23, 1, Taylor & Francis (2004), 65-74.

Nielsen, J., & Landauer, T. K. A mathematical model of the finding of usability problems. In Proceedings of INTERCHI 1993, ACM Press (1993), 206-213.

Spool, J. & Schroeder, W. Testing Websites: Five users is nowhere near enough. In Proceedings CHI 2001, Extended Abstracts, ACM Press (2001), 285-286.

Virzi, R.A. Refining the test phase of usability evaluation: How many subjects is enough? Human Factors, 34, HFES (1992), 457-468.

The HFI User Interface Design Update Newsletter discusses the latest research in the field of usability. To learn more about the practical application of recent usability research and how it impacts user-centered design, we invite you to attend our Putting Research into Practice course.