Adrianna McIntyre and Austin Frakt have a long piece about the shortcomings of randomized controlled trials, considered the “gold standard” of research, at the Incidental Economist. I asked Jim Manzi, who has literally written the book on randomized controlled trials, to share his thoughts. Below is what he said:
Gina Kolata pointed out in a major New York Times article that the Innovation Center -- created with a $10 billion, 10-year budget under the Patient Protection and Affordable Care Act to discover how to most effectively deliver health care -- has been criticized for not using randomized controlled trials (RCTs) to determine the effectiveness of various programs it has sponsored.
Adrianna McIntyre and Austin Frakt of the influential health-care policy blog the Incidental Economist have written a critique of the article, which links to related critiques by Aaron Carroll and Dan Diamond. All of these pieces begin by accepting that an RCT is the most reliable method for determining the effect of an intervention, but argue that there are reasons that this is not necessarily an appropriate method to be used by the Innovation Center. What I found positive and productive about all of these pieces was their focus on trying to be practical.
I think it’s fair to group the objections to RCTs across these pieces into two big types. First, that RCTs won’t work in this context: They take too long when the Innovation Center is trying to do rapid-cycle evaluation and modify programs as it goes; they are too expensive; or we can’t realistically get patients or providers to agree to participate. And second, that RCTs wouldn’t really tell us what we want to know anyway, because they are good at determining what works under ideal conditions (efficacy) rather than what works in the real world (effectiveness); they do not allow us to generalize from the findings of a demonstration project to what would happen in a general rollout; or, in a context like this, they require hybridization with nonexperimental analysis.
I agree with the weight and seriousness of each of these objections. My agreement is not ad hoc; I wrote a book that tried to describe how businesses have implemented experimental processes that operate in the face of all of these issues. But before making any comment about how the Innovation Center might deploy randomized experiments, let me start with just two background observations.
First, it is usually hard to know what impact some intervention has on an outcome of interest, even after we go out into the world and try it out on some real people, hospitals, markets or whatever, and even when the people running the program on the ground are confident that it is helping. The fundamental reason is that so many other possible causes of the outcome of interest are changing at the same time. Random assignment to test and control is not a nerdy nice-to-have but the bedrock of getting even the direction of the effect of most interventions right.
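A toy simulation can make the point about direction concrete. It is only a sketch under invented assumptions: the harmful true effect of -0.5, the noise levels and the opt-in rule are all hypothetical and do not come from any real program. Units with better baseline outcomes are more likely to opt in, so the naive comparison of participants with non-participants points the wrong way, while random assignment recovers the sign.

```python
import random
from statistics import mean


def simulate(n=20000, true_effect=-0.5, seed=0):
    """Compare a self-selected rollout with random assignment.

    All numbers are hypothetical. Returns (naive_estimate, randomized_estimate),
    each a difference in mean outcome between treated and untreated units.
    """
    rng = random.Random(seed)
    self_selected = {True: [], False: []}
    randomized = {True: [], False: []}

    for _ in range(n):
        # Unobserved baseline quality of the unit (patient, hospital, market).
        baseline = rng.gauss(0, 1)

        # Self-selection: units with a better baseline tend to opt in.
        opts_in = baseline + rng.gauss(0, 1) > 0
        outcome = baseline + true_effect * opts_in + rng.gauss(0, 1)
        self_selected[opts_in].append(outcome)

        # Random assignment: a coin flip, unrelated to baseline.
        assigned = rng.random() < 0.5
        outcome = baseline + true_effect * assigned + rng.gauss(0, 1)
        randomized[assigned].append(outcome)

    naive = mean(self_selected[True]) - mean(self_selected[False])
    rct = mean(randomized[True]) - mean(randomized[False])
    return naive, rct


if __name__ == "__main__":
    naive, rct = simulate()
    print(f"true effect:         {-0.5:+.2f}")
    print(f"naive comparison:    {naive:+.2f}")
    print(f"randomized estimate: {rct:+.2f}")
```

In this setup the naive comparison comes out positive even though the program is harmful by construction, because opting in is correlated with baseline quality.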
Second, almost no innovative programs work, in the sense of reliably demonstrating benefits in excess of costs in replicated RCTs. Only about 10 percent of new social programs in fields like education, criminology and social welfare demonstrate statistically significant benefits in RCTs. When presented with an intelligent-sounding program endorsed by experts in the topic, our rational Bayesian prior ought to be “It is very likely that this program would fail to demonstrate improvement versus current practice if I tested it.”
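The arithmetic behind that prior can be sketched with Bayes' rule. The 10 percent figure is taken from the paragraph above and treated, as a simplification, as the share of programs that truly work; the 80 percent power and 5 percent false-positive rate are conventional assumptions, not figures from the text.

```python
def posterior_works(prior, power=0.80, alpha=0.05):
    """Probability a program truly works, given one significant RCT result.

    prior: assumed share of programs that truly work
    power: assumed chance a working program shows a significant benefit
    alpha: assumed chance a non-working program shows one anyway
    """
    true_positive = prior * power
    false_positive = (1 - prior) * alpha
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    after_one = posterior_works(0.10)
    after_two = posterior_works(after_one)  # an independent replication
    print(f"after one significant trial:   {after_one:.2f}")
    print(f"after a replicated trial:      {after_two:.2f}")
```

Under these assumptions a single significant result moves the probability from 0.10 to about 0.64, and a successful independent replication moves it to about 0.97, which is one way to see why replication carries so much weight.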
In other words, discovering program improvements that really work is extremely hard. We labor in the dark -- scratching and clawing for tiny scraps of causal insight.
So when we come to the idea of applying randomized trials to the work of a place like the Innovation Center, we confront many challenges. I go through this in a lot more detail in "Uncontrolled," but it seems to me that the proper response is to find ways to imperfectly apply the method, rather than to throw up our hands. Here are some practical ideas:
--Invest in infrastructure to lower the cost per test and reduce the elapsed time per test to allow more rapid-cycle innovation. Businesses have had great success with this. Yes, partial randomization failures do occur frequently in the real world, but invest in analytical methods that let you do your best to detect and, when possible, adjust for this without losing all opportunities to randomize.
--Exploit your new lower-cost-per-test infrastructure to radically ramp up the number of tests per unit time so that you can test for context dependence as a partial solution to the problem of how to generalize results.
--Abandon the distinction between efficacy and effectiveness, and focus only on effectiveness -- in plain English, “Test what you would roll.” And so on. In other words, don’t let the perfect be the enemy of the good.
I have a great deal of empathy for people who are trying to make innovation happen on the ground when they are confronted by kibitzers giving them helpful advice about how to up their game. If my mental model for what is meant by “you should use more RCTs” was “do stuff a lot like what a drug company does in a Phase III clinical trial,” I’d probably be pretty negative about it, too. But the choice in front of us is not all or nothing. There are ways to exploit the power of controlled experimentation to make health-care policy a little bit better on the margin. We should take what we can get.
To contact the writer of this article: Megan McArdle at firstname.lastname@example.org.
To contact the editor responsible for this article: Brooke Sample at email@example.com.