Gosh, am I tired of the pop-sociology and pop-psychology studies popping up in my news feed. Almost every one of these has severe limitations that get ignored in the hype.
I’m going to pick on one example. According to a new study by the National Marriage Project, your love life before you get married is crucial to your chances of having a successful marriage. If you have a child before marriage, or a large number of sex partners, or if you begin your relationship with your spouse by hooking up, then, according to this study, your marriage has a lower chance of success.
Sounds plausible, right? But this study suffers from most of the garden-variety empirical problems that economists learn to recognize in their first year in grad school (or even as undergraduates).
The most famous empirical problem -- the one that’s the subject of countless Internet jokes -- is that correlation doesn't equal causation. Like most of the studies you see in your news feed, the National Marriage Project study makes a claim along the lines of “X is associated with Y.” That phrase, “associated with,” means “correlated with,” and even though “associated” doesn’t start with a “c,” you can make all the same jokes. For example, the NMP study says:
“We discovered that having more guests at the wedding is associated with higher marital quality.”
So to improve the quality of American marriages, we should just hire a bunch of people to fill out the guest list at every wedding, right? Right.
There are two basic ways that correlation can lead you astray. The first is reverse causality. There’s a correlation between roosters crowing and the sunrise, but that doesn’t mean roosters summon the sun. In the NMP study, this actually isn't much of a concern.
The second danger is omitted variables. These are things that cause both X and Y separately, but which the person doing the study didn’t think about. The NMP study, for instance, finds that people who wait longer to have sex tend to have higher marital quality. That’s a correlation. Does it mean that if you choose to wait longer to have sex, you will have higher marital quality? Not necessarily!
For example, suppose that there is a group of people in the NMP study sample who had very neglectful parents, or who came from broken homes. These people might tend to have sex earlier in their relationships, because their parents didn't educate them about the dangers of STDs, pregnancy, etc. And suppose that these people also tend to have bad marriages, because their parents didn’t show them a good example. Even if this only represents a small subset of the people in the study, it could drive the entire result.
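If you want to see how a small group like that can manufacture a correlation, here is a toy simulation. Everything in it is made up for illustration: the share of people with neglectful parents, the probabilities, the quality scale and the function name `simulate_omitted_variable` are my assumptions, not anything from the NMP data. The key design choice is that waiting has no effect at all on marital quality in this fake world; only the parents matter.

```python
import random
from statistics import mean


def simulate_omitted_variable(n=20000, seed=0):
    """Toy world where waiting has NO causal effect on marital quality.

    All numbers below are invented for illustration only.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Omitted variable: 20% of people had neglectful parents (assumed).
        neglected = rng.random() < 0.2
        # Neglect makes early sex more likely (assumed probabilities).
        early = rng.random() < (0.8 if neglected else 0.3)
        # Marital quality depends on neglect and noise, never on `early`.
        quality = 5.0 - (3.0 if neglected else 0.0) + rng.gauss(0, 1)
        rows.append((neglected, early, quality))

    def gap(subset):
        # Average quality of those who waited minus those who did not.
        waited = [q for _, early, q in subset if not early]
        rushed = [q for _, early, q in subset if early]
        return mean(waited) - mean(rushed)

    return {
        "naive_gap": gap(rows),
        "gap_neglected": gap([r for r in rows if r[0]]),
        "gap_not_neglected": gap([r for r in rows if not r[0]]),
    }


if __name__ == "__main__":
    for name, value in simulate_omitted_variable().items():
        print(f"{name}: {value:.2f}")
```

The naive comparison shows a sizable gap in favor of waiting, even though waiting does nothing here. Compare people with the same kind of parents and the gap all but disappears.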
In this case, the omitted variable -- bad parents -- isn't something you can control. That is crucial. Defending their conclusions from a Twitter assault by Yours Truly, an NMP researcher argued that emphasizing omitted variables (which he called “selection,” though that is actually a different thing) decreases the importance of personal choice. But that is precisely the reason omitted variables are important -- we want to know the effects of our choices, and if our conclusions are confounded by omitted variables, we might make the wrong choices.
Another big problem that plagues studies like this is selection bias. This is when your sample is not chosen randomly. In the case of this NMP study, it’s a serious flaw.
The NMP study concludes that people who had a child before their current marriage are less likely to have a successful marriage. But those children didn’t arrive randomly out of thin air. They represent the existence of a previous failed relationship!
To make an analogy, suppose I wanted to test whether having taken the bar exam in the past makes you more or less likely to fail it in the future. So I select a bunch of law graduates and measure their scores on the test. Lo and behold, I find that the people who are taking the test for the second or third time tend to fail more than those who are taking it for the first time. I conclude that experience actually makes you worse at the bar exam.
But this conclusion is absurd. The people taking the test for the first time are a random sample, but the people taking the bar exam for the second or third time are retaking it because they all failed it the first time! Even if experience helps you pass the bar, my study will show the exact opposite!
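Here is a toy version of that bar exam study, as a rough sketch. The ability distribution, the amount of luck, the pass mark and the `experience_bonus` parameter are all assumptions I picked for illustration. In this fake world, experience really does help: every retaker gets a bonus added to their score.

```python
import random


def simulate_bar_exam(n=20000, experience_bonus=0.3, seed=0):
    """Toy bar exam where a second attempt genuinely HELPS.

    Ability, luck and the pass mark (zero) are invented for illustration.
    """
    rng = random.Random(seed)
    first_fails = 0
    retake_fails = 0
    retake_fails_no_bonus = 0
    for _ in range(n):
        ability = rng.gauss(0, 1)
        # First attempt: pass if ability plus luck clears the pass mark.
        if ability + rng.gauss(0, 0.5) >= 0:
            continue  # passed, so this person never retakes
        first_fails += 1
        # Only people who failed reach this point: that is the selection.
        luck = rng.gauss(0, 0.5)
        retake_fails += ability + experience_bonus + luck < 0
        # Counterfactual: same person, same luck, no benefit of experience.
        retake_fails_no_bonus += ability + luck < 0
    return {
        "first_time_fail_rate": first_fails / n,
        "retaker_fail_rate": retake_fails / first_fails,
        "retaker_fail_rate_without_experience": retake_fails_no_bonus / first_fails,
    }


if __name__ == "__main__":
    for name, value in simulate_bar_exam().items():
        print(f"{name}: {value:.2f}")
```

Retakers fail more often than first-timers, so the naive study says experience hurts. But the counterfactual line shows those same retakers would have failed even more often without the bonus. The experience helped; the sample was just stacked with people who had already failed.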
In the same way, the NMP study selects a sample of people who have demonstrated a tendency to break up -- i.e., people who have had a kid in the past with another partner -- and compares them to a random sample of the population. It very well might be the case that having experience with child-rearing, or with breakups, actually helps you form a stable relationship in the future, all else being equal. But because of selection bias, the NMP study would still tell you the opposite.
Now, the NMP study’s conclusions still might be right, but if they’re wrong, they could damage the very cause they want to promote. In general, I think the glut of overhyped social-science correlation studies in our news feeds is doing us no favors.
To contact the author on this story:
Noah Smith at firstname.lastname@example.org