The observer bias (or the Hawthorne effect) refers to the fact that people behave differently when they know they are observed. 

Even though there are some questions about the setup and validity of the studies that introduced this effect, they made a lasting impact on management practices. The Hawthorne effect is also widely used to promote desirable behaviors: for example, to encourage compliance with handwashing by telling people they are being watched in the bathrooms and to increase road safety by displaying dummy CCTV cameras.

A Brief History of the Hawthorne Effect

The name “Hawthorne effect” comes from a series of studies carried out in the 1920s at Hawthorne Works, a Western Electric phone-manufacturing plant; these studies aimed to understand how the amount of light on a work floor impacts productivity.

The researchers split workers into two groups: a test group, which worked in a separate space with changed lighting, and a control group, made up of the rest of the factory workers, who worked in their normal environment. Better lighting produced the expected increases in productivity in the test group; however, when the test group worked in poorer lighting, they still outperformed the others. This increase in productivity regardless of working conditions was an unexpected outcome, which led the researchers to manipulate other variables, from monetary incentives to the number and length of breaks during the workday. No matter which variable was changed, productivity went up in the test group.

At the end of the studies, the increased output was linked to several factors, including the novel working conditions in a small group and the friendlier supervision received from the researchers. Later, in the 1950s, the results were reinterpreted by Henry A. Landsberger, who coined the term “the Hawthorne effect.” Landsberger thought that the test-group participants were more productive because of the unusual extra attention that workers had received due to their participation in the study.

The validity of the Hawthorne effect has been challenged in subsequent studies, which pointed out flaws in the design and execution of the original studies. Nonetheless, scholars have continued to investigate how individuals modify their behavior under observation. When people are singled out for special treatment, allowed the freedom to talk about their problems, and able to build stronger interpersonal relationships, they become more engaged with their work, which positively affects their productivity. That observation influenced the rise of a new school of human relations in management that still shapes many of our work practices; it accounts for the practice of measuring employee satisfaction and for today’s focus on workers as human beings, on communication, and on the importance of group dynamics.

The Hawthorne Effect in UX and How to Mitigate It

To a certain degree, the Hawthorne effect occurs in all user research, from field studies to user testing. It is closely associated with the social-desirability bias, which is well documented among survey respondents and, as we see below, applies to other user-research methods. The critical point is that people tend to answer questions in a manner that makes them appear a “better” version of themselves, inflating the number of positive traits and denying or minimizing those perceived as unfavorable by society. Below we discuss how the Hawthorne effect shows up in UX research and what can be done about it.

Field Studies

Let’s start with this most frequently asked question: “How can I observe people’s natural behavior during a field study if they know I am there?” Indeed, the value of the field study is in being there with the user and observing how they live, work, solve problems, and interact with people and technology. At first, participants might not want to show you their shortcomings, genuine reactions, or any tricks they learned on the job if that could affect their reputation or salary. Who would allow you to observe the back of the shop, where employees carelessly throw items in the basket, cut corners in production, do not fill out all the forms properly, or take an extra coffee break? 

Also, some people have reasons not to trust researchers because of previous negative experiences with a management team that constantly comes up with new, “improved” ways of measuring work. According to a Forbes opinion piece, there are multiple ways in which very productive employees get punished at work, so why should anyone want to stand out or show their vulnerabilities?

Mitigation Strategies

Do not make people feel judged. If the golden rule of usability is not to make people feel stupid, then for field studies, it is not to make people feel judged. Of course, we do not want to make participants feel guilty, unworthy, or ashamed in any sort of study, but in field studies, that judgment-free state is essential for being allowed to see what is actually happening. In field studies, researchers come to the users' natural environment to learn the context for specific activities and to minimize the effect of the artificial lab setting. Still, the researcher's mere presence and interest in the activity make participants aware of their actions, potentially leading them to censor their behaviors. At the same time, people return to their default routines and behaviors quite quickly if they feel comfortable and do not mind the researcher's presence.

Consider longer sessions to allow participants to build a rapport with the facilitator. Trust is essential for building rapport, which is easier to do over time, so during contextual inquiries, we often opt for longer sessions (approx. 60–90 min) and try to have more than one observation session with the same person or team if possible. All of that will contribute to making participants feel more comfortable around the researchers. The more comfortable participants feel with the researchers, the less they feel pressured to alter their behavior.  

User Testing

Were you ever in a qualitative usability study where the participant waited for a long time for a site to load or persisted in doing a task that seemed to go nowhere? Did you think that the participant was doing it only for the sake of the researcher? If so, you’ve probably experienced the Hawthorne effect. 

Even though, at the beginning of a study session, we might tell participants to work as if they were by themselves, the truth is that users rarely interact with a digital product or service while talking aloud and having someone watch over their shoulder. The knowledge that they are being watched may pressure participants into behaviors that they believe are expected of them. For example, it may cause them to be persistent and patient with technology, to read the instructions carefully, or to diligently search a website for relevant information. Participants might also want to save face in front of the moderator and complete the activity successfully.

Mitigation Strategies

Design natural task scenarios that would not be perceived as artificial or completely novel by your target population. Instead, make these scenarios familiar tasks! For example, depending on the specific user audience that you are recruiting from, shopping for a new Ferrari may seem farfetched or realistic. 

Ease people into testing with small talk and subtle feedback while avoiding being overly friendly. In UX, there is a danger that the facilitator's friendliness might be misinterpreted, and participants may attempt to behave in ways that please the facilitator. We can prevent such behaviors by setting clear expectations at the beginning of the study and course-correcting if needed. For example, we might say:

"Please try to imagine you're doing this in real life, and I'm not there. Try to do whatever you would normally do. I'm going to be pretty quiet, and I might ask you some questions at the end of each activity."

Assure participants that their performance will not be shared outside of the study and that their feedback is very helpful.

Explain to participants that you are interested in their honest feedback and behavior. People who are testing a prototype and thinking aloud might assume that their feedback is not valuable unless they stay on the page longer, try harder, and pay more attention than they would otherwise. That's why any usability session often starts with an explanation of the expected behaviors to reassure participants that they do not need to modify their answers to protect our feelings:

"Please tell us honestly what you're thinking. Any feedback, both good and bad, is helpful. I will not be offended by anything you share. We just want to improve the design."

Explain that the study is meant to test the design, not the participants. If you notice participants apologizing for not knowing something or for making a mistake while completing a usability task, take it as an opportunity to intervene and remind them that they are not being tested and that their feedback is very valuable. Remind them at the beginning of the study that they don’t have to save face by doing the task perfectly or knowing all the answers.

"We're not testing you and how much you know about X topic. We're testing the design. If there's something that you think is confusing or you're not sure about, that's really helpful for us to know."

Surveys

Surveys are powerful tools that need to be carefully set up and analyzed, as there is a high potential for bias and misinterpretation. The Hawthorne effect in surveys can take the form of the social-desirability bias, where people want to present themselves in a positive light. For example, questions about habits and consumption (e.g., “How often do you exercise?”, “How many alcoholic beverages did you have in the past 24 hours?”, “How many cigarettes a day do you smoke?”) often elicit answers that reflect what respondents wish were true rather than what they actually do.

Mitigation Strategies

Prompt participants to upload additional materials. For example, in some online surveys, we can ask people to add images showcasing their achievements, progress, challenges, or specific circumstances. While uploading these images might be perceived as an extra burden, the extra files also provide evidence and help us better understand the participant’s context, while the participant gets an opportunity to show what they have done.

Use direct and indirect questioning in your survey design. Some sensitive questions are just harder to answer directly, especially if there is not enough trust: “What if I say something negative about my company and it eventually comes up?” The alternative is to ask indirect questions to get a better sense of the participant’s views. Indirect questions are usually asked in series and take longer; however, they yield better results. For instance, in a study measuring privacy concerns, Braunstein and colleagues found that indirect questions diminish emotional responses and better reflect privacy concerns. Examples of a direct question (DQ) compared to the indirect questions (IQ) used in the study:

DQ: “How private do you consider this information?”

IQ: “How frequently do you check [content type]?”

IQ: “How frequently do you forward or otherwise share (e.g., by printing and giving the printed copy) [content type] with your close friends or close family members?”

Diary Studies

Diary studies require participants to journal and reflect on their experiences. They are often used as a tool to make people aware of certain habits (for example, when we ask individuals who want to lose or gain weight to keep a diary of what they eat and drink or when we use fitness tracking as a strategy to encourage people to exercise).  

In addition to the challenge of keeping participants engaged, like surveys, diary studies are open to social-desirability bias, with people logging only favorable events or engaging in the activity to shape how the researchers perceive them.

Mitigation Strategies

Keep diary-study questionnaires short. Whenever your study design allows it, refrain from overwhelming participants with lengthy, unnecessary questions. This means spending more time planning the diary questions and applying some way to filter out the noise. For example, use Caroline Jarrett’s technique of narrowing down the questions by asking yourself:

  1. What do you want to know?
  2. Why do you want to know it?
  3. What decision will you make based on those answers? 
  4. What number do you need to make a decision? 

Answering these questions helps you arrive at what she calls the “Most Crucial Question (MCQ),” the main missing piece needed to make the business decision that your research is intended to support.

Explain what you want from your participants and why their honest responses matter. Spending time onboarding and maintaining appropriate contact with participants is essential to build trust and accountability. It does not mean micromanaging every diary entry but being there to help with technical difficulties, extending the study if necessary, and serving as a confidential and careful reader who cares about what the participant might share. Remember that the response depends on trust, effort, and reward. Show that you act on the answers you receive by sharing updates, thanking participants, and demonstrating how their feedback made an impact and led to positive change.

Provide reasonable accommodations. As a researcher, you should consider how familiar your participants are with journaling and how journaling could affect their lives and responses. For example, if they frequently use online platforms to document every meal, trip, and mood change, filling out their diary online might not affect their daily routine as much. On the other hand, a person who uses their phone only to make audio and video calls and occasionally play their favorite word game may need a detailed explanation of the diary software that you’ll use. If there is flexibility in the diary-study design, ask participants for their preferences and work around their daily routines to choose the most convenient time for them to fill out the entries. When people are in a hurry or too tired or sleepy to complete their diary entries, they are unlikely to open up and share their thoughts.

Conclusion

Why do we still conduct studies if validity might be an issue? Because data is better than guessing! However, it is important to critically assess the findings and recommendations based on diversified studies and secondary research. We need to be aware that our study design and the wording of the questions can affect users' behavior. Yes, the Hawthorne effect will primarily be associated with people's tendency to alter their behavior when they are being observed. However, it can also lead to benign applications, prompting accountability, awareness, or reflection on certain behaviors (e.g., spending time on a mobile device) so people can manage them and make informed decisions.

In user experience, the observer bias can serve as a foundation for more empathy for the users and for better study design: would you want somebody to watch you when you work or live your life? How can you make that experience less intrusive, more enjoyable, and a safe space for people to share?

References

Braunstein, A., Granka, L. and Staddon, J. (2011) “Indirect content privacy surveys,” Proceedings of the Seventh Symposium on Usable Privacy and Security [Preprint]. Available at: https://doi.org/10.1145/2078827.2078847.

Dickson, W.J. and Roethlisberger, F.J. (1966) Counseling in an organization: A sequel to the Hawthorne Researches. Boston, MA: Division of Research, Graduate School of Business Administration, Harvard University. 

Elton Mayo. British Library. Retrieved from https://www.bl.uk/people/elton-mayo

Jarrett, C. and Krug, S. (2021) Surveys that work: A practical guide for designing better surveys. Brooklyn, NY: Rosenfeld Media.

Latour, B. and Woolgar, S. (1986) Laboratory life: The construction of scientific facts. Princeton, NJ: Princeton University Press.