By Jon Kolko
We commonly hear this concern when conducting a design research program: that the number of people we spoke with can’t possibly be large enough to accurately inform an emergent design and innovation strategy.
Innovation has risk, and it’s natural that most companies want to understand and mitigate that risk through research. Selecting a group of research respondents who are statistically representative, map appropriately to the larger population, and are indicative of the opinions and behavior of that larger set of people is a fundamental principle of most scientific or marketing research. We’ve been taught that “good research” requires a big sample size, and even if we don’t know the math behind the statistics, we recognize that a tiny sample is unlikely to be valuable in making large statements—good research means that the people you talk to represent the people you didn’t talk to.
But that model doesn’t work for design research. The rules for what constitutes good research methodology in the social sciences, medicine, and other professions are simply inadequate for informing design strategy, because design research is a fundamentally different form of research. Expectations around it need to be different, and the most noticeable change is in sample size and selection criteria. This is challenging because the other professions have a more established history of best practice for experimental research.
Small-sample behavioral research runs counter to these traditions and norms, and to our common sense. The world is big, and it seems crazy to assume that we can make decisions about new products and services for millions based on input from a dozen. But design research is not sloppy. It is exhaustive and exhausting, and its methods take time and care to learn and execute.
Pretend that you are an education researcher, and you are interested in understanding why students quit college before they’ve earned a degree. You’ve observed one of your neighbor’s kids drop out of school after she had a child, and you hypothesize that “students who have a child while in school are more likely to drop out than students who don’t have a child while in school.”
A survey can act as a way of gathering data to explore the validity of your hypothesis. You could send a survey to every single college student (there are about 21 million of them) and ask them if they had a child, and if they dropped out. Then, you would see real data that could indicate the relationship between having children and completing degrees.
If you could do this, you would be able to say things like, “College students who have children in college are 13% more likely to drop out than those who don’t.” This doesn’t indicate causality (that having kids leads to dropping out), but it starts to give good evidence that there is a connection between those ideas.
But sending a survey to every college student is, in practice, impossible. You would have to have a list of every student and their address, send them the survey (at your expense), retrieve it (again, at your expense), and aggregate the data. By the time you had your list, the students would have graduated or dropped out and new ones would have enrolled.
In contrast, statistical sampling is the idea that you can distribute a survey to a smaller but representative sample of a population, rather than to all of them. By surveying a randomly selected group of students rather than all 21 million, you could extrapolate from what you found and make statements like, “We are 95% sure that college students who have children in college are 13%, plus or minus two percentage points, more likely to drop out than those who don’t.”
The statement has a hedge, based on the fact that you talked to a smaller sample. You aren’t 100% sure, because you didn’t canvass every student, and you can’t say 13% definitively, so you give a range of 11% to 15%.
In exchange for that hedge, the cost savings on participant recruitment are huge.
To say, “We are 100% sure it’s 13%”, you would need to survey 21 million students.
To say, “We are 95% sure it’s 11-15%”, you only need to talk to 600.
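The trade-off between sample size and certainty can be sketched with the textbook margin-of-error formula for a single proportion. This is a minimal illustration, not the calculation behind the figures above: it assumes a simple random sample, a very large population, and the normal approximation, and the function names are hypothetical. The example above concerns a difference in dropout likelihood between two groups, which would call for a different calculation.

```python
import math

# z-score for a 95% confidence level under the normal approximation.
Z_95 = 1.96


def margin_of_error(p: float, n: int, z: float = Z_95) -> float:
    """Half-width of the confidence interval for a proportion p observed in n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(p: float, margin: float, z: float = Z_95) -> int:
    """Smallest n whose margin of error is at most `margin` (as a fraction, e.g. 0.02)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    # p = 0.5 is the worst case: it produces the widest interval for a given n.
    for n in (20, 600, 2400):
        moe_points = margin_of_error(0.5, n) * 100
        print(f"n={n}: about +/- {moe_points:.1f} percentage points")
```

Under these simplified assumptions, about 600 respondents give a worst-case margin of roughly plus or minus four percentage points, while 20 respondents give a margin of more than twenty points. The required sample depends on the desired precision and barely at all on the size of the population, which is why a few hundred people can stand in for millions.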
So, the basics of most academic survey-based research depend on a hypothesis, a sample of participants, and some number crunching to identify how sure you are that the people you talked to represent the opinions of everyone else. This basic premise is what most people have in their heads, in some rudimentary form, when they say, “I did some research.” Even if they don’t know the math behind causality and correlation, most critical thinkers understand that by talking to a select group we can predict what a large group of people do, think, or feel, and that the group can be small but not tiny: 600 people are fine; 20 are probably not.
But design research is about provocation. In design research, we’re not talking to small groups of participants to understand large swaths of people. Our research is not intended to mitigate risk, or to temper anxiety and add assurances of validity.
In design research, we spend time with participants to understand them individually, rather than as a stand-in for a whole population. They have opinions, behaviors, and experiences that are unique to them, and our intent is to hear their stories so we can understand and feel those experiences. We learn their eccentricities and begin to get a glimpse into the peculiarities and extremes of their lives. We see how they use products and services, what inspires them, what motivates them to try to accomplish and grow. We build empathy with them.
And this gives us “raw data” that we can use to inspire new design questions and opportunities, new frames of reference for brainstorming, new strawmen for creative exploration, and new ways of thinking outside of what has been considered the norm of best practice.
“The data makes us think and reconsider our entrenched beliefs. It makes us contemplative and helps us see the world in new ways.”
When we conducted research with college students, we heard stories from people like Rita, who has children ages 14, 12, 10, 8, and 1. Her day is spent caring for her kids and working, so her coursework begins at 10 p.m., once the kids go to bed, and she works online until about 2 or 3 a.m. each morning.
We also learned about Maria, whose parents are actively discouraging her from going to school because they want her to get married and have children.
We heard from Haley, who didn’t actually have a child, but instead dropped out to take care of her sister’s newborn, sacrificing her own immediate needs and dreams to help her family.
These stories are rich with emotional value and help us build a structure to the problem space. I have no idea if Maria, Rita, and Haley are common or complete outliers. While unlikely, it’s possible that no other college students in the US have kids like Rita, or stay up all night, or dropped out to help their families.
Viewed through the lens of behavioral science research, a small, non-randomly selected sample calls any findings into question. That lens (correctly) identifies the research as flawed for predicting how a larger group will behave, or what a larger group thinks and feels.
The thing is, it isn’t important.
My research, my research findings, and my synthesized research insights don’t make any mention of how Rita and Haley exemplify a larger population of students. We didn’t conduct research to make extrapolations; we did it to understand and feel, which means sample size and statistical significance are unimportant.
This is a challenge, since design research uses the same terminology (participants, research plan, hypothesis) as traditional behavioral science research, making it nearly impossible to shed the baggage of the existing protocol. But if design research is rigorous, well planned, well conducted, and well captured, then the findings are valid as observations about the participants.
In this model, anomalies and extremities are valuable. The weirder the participant, the better, because it provides so much new fodder and raw material, enabling us to stretch our understanding of the problem space.
Consider another research project:
When we spoke to a young banking customer, they didn’t understand and couldn’t articulate the difference between a savings and checking account.
When we talked to a college student about her debt, she explained that she was counting on her college loans being excused by the government, and that without that, she had no real strategy to repay the $100,000 she owed.
When we asked a millennial about life insurance, she explained that she had been sold both a term and a whole life policy by a friend of her family, and was paying thousands of dollars a year in premiums, even though she had no dependents, no investments, and was having trouble paying rent each month.
In these three examples, the participants or their experiences are anomalous, and that’s the point. Their strange stories and experiences have helped the design team see the problem space of banking, or debt, or insurance differently.
When we consider design research, we need to stop judging the validity of the program on the size of the sample or the bias of the participant selection. Instead, let’s look at the methodology and the rigor and style of the data collection, and use a lens of provocation.
We can judge design research based on these questions:
- Were the participants given the opportunity to be experts?
Being an expert doesn’t mean that they have degrees or credentials; it means that the researchers gave them the runway to voice their expertise. A mother of five is an expert at being a mother. A television aficionado is an expert at watching television. A good research program should place the participants in a context where they can experience and exhibit their expertise.
- Were the participants given methods and tools to communicate their expertise?
People can’t always articulate the things they are good at because those activities have become so automatic. A good research program should give participants specific ways to vocalize or show their expertise, even when they can’t consciously communicate it.
- Were the facilitators encouraging, yet neutral?
A good design research moderator is able to encourage participants to communicate their ideas and feelings, while avoiding judgement, criticism, or opinions. The moderator draws participants out with open-ended questions that carry no agenda.
These are the questions that are relevant in assessing and identifying good design research. When you are challenged on the validity of your findings, make it explicit that the participants were supported and encouraged to show how their work is done and how their life is lived, and that their work will inspire you to make new products and services.