George Loewenstein chose two popular topics—food and sex—to illustrate potential issues and pitfalls in field experiments in his keynote address on the second day of the "Advances with Field Experiments" conference.
One such issue is external validity—the extent to which findings can reliably be generalized beyond the specific experiment. “I tell my students that lab experiments offer high internal validity, or ability to reach confident causal conclusions, but low external validity. Observational field studies are the opposite, with low internal and high external validity. The gold standard is field experiments, which are high in both. But in this talk, I’m going to call that assumption into question, to some degree.”
He illustrated that with a study that attempted to understand the causal linkages between health, happiness, and frequency of sexual activity. Although there is a lot of research indicating that sex is a huge determinant of happiness and health, it’s unclear which causes which, or if another factor like love causes all three.
“The only way to find out is to run an experiment,” Loewenstein said. “We couldn’t figure out how to run a lab experiment, so we came up with a field experiment.” He recruited healthy monogamous couples between the ages of 35 and 65 who would agree to change their sexual frequency. Participants were paid for completing entrance and exit surveys on health status and behaviors, relationship, and many other factors.
The study succeeded in getting people to have more sex but showed no correlation between sex, health, and happiness. Asked if participant bias was involved, Loewenstein agreed that there were many reasons why study participants might differ from the general population. “That gets to my point about external validity,” he said. “Our study had no impact on anything but the independent variable; we just got them to have more sex. We were not able to show a correlation that has been observed repeatedly in the general population.”
The problem with the study is one common to many field experiments. “We’re moving them away from their optimum. It’s not surprising that this may have a negative effect,” he said. “A better study would have tried to change both the amount of desired and actual sex.”
Turning to food, Loewenstein outlined a series of studies on food labeling that looked at shifting people toward healthier food choices. Labeling is an attractive approach for behavioral and traditional economists because both believe additional information supports decision making.
However, the most significant nutrition labeling study showed at best a mixed impact and only for white females. Food labeling has been shown to be ineffective in reducing fat and cholesterol consumption, although the labels do seem to make people pay more attention to good factors like vitamins.
Motivated by an “incredibly flawed” study at Subway, New York City began requiring calorie labeling on restaurant menus in 2007‒08. Most subsequent research has found that calorie posting has little impact; in New York, calorie consumption actually increased a bit.
Labels showing calorie counts may not help if diners don’t know how many calories they should be consuming, but providing those recommendations also made no difference in how much they ate. Taking numbers out of the equation and labeling foods with a red/yellow/green stoplight system just made people order more of the low-calorie green-light items.
There’s little evidence that providing information changes behavior, and it may have perverse effects, Loewenstein observed. What may be more effective is a nudge: offering a more convenient, preset menu of healthier choices, with a complete menu still available but harder to access. A weak nudge had little effect, because diners simply offset the low-calorie sandwich with drinks and more side items. With a stronger nudge, where it was more difficult to find the full menu, there was no offsetting and a significant decline in calories consumed.
“I think our dependent measures on policy studies are often too narrow,” he concluded. It’s important to consider all behavior. “If people order the low-calorie sandwich, they add high-calorie drinks or order more sides. If they have less for lunch, maybe they are going to snack. You have to think about overall policy.”
In summary, he noted:
Field experiments have limited external validity if they drive people to behave as they wouldn’t normally behave.
Some interventions work—or backfire—for reasons we don’t understand.
Too many policies are implemented without valid field studies.
But tests of policies do suffer from their own set of problems.