“Study after study shows that the earlier a child begins learning, the better he or she does down the road,” said U.S. President Barack Obama in a Feb. 14 speech in Decatur, Georgia. “Every dollar we invest in high-quality early education can save more than seven dollars later on -- boosting graduation rates, reducing teen pregnancy, reducing violent crime.”

Obama wants to help our nation’s children flourish. So do I. So does everyone who is aware of the large number of children who are not flourishing. There are just two problems with his solution: The evidence used to support the positive long-term effects of early childhood education is tenuous, even for the most intensive interventions. And for the kind of intervention that can be implemented on a national scale, the evidence is zero.

Let me begin with the two studies in the early education literature that are so famous you may well have heard of them: the Perry Preschool Project and the Abecedarian Project. The Perry Preschool study took place a half-century ago, in the early 1960s. It treated 58 children ages 3 and 4 years old. The Abecedarian Project took place in the early 1970s. It treated 57 children, starting a few months after birth and continuing through age 5.

Many Caveats

Both programs achieved positive results of the kind that Obama described. But caveats about those results have troubled careful observers of the programs for years, especially when they hear Perry Preschool and Abecedarian cited as proof that early education accomplishes great things. The main problem is the small size of the samples. Treatment and control groups work best when the numbers are large enough that idiosyncrasies in the randomization process even out. When you’re dealing with small samples, even small disparities in the treatment and control groups can have large effects on the results. There are reasons to worry that such disparities existed in both programs.
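To make the small-sample worry concrete, here is a minimal simulation sketch. It is not drawn from either study: the baseline scores are invented (a normal distribution with mean 100 and standard deviation 15), and the choice of 58 children per group borrows Perry Preschool's treatment-group size while assuming, purely for illustration, a control group of the same size. It measures how far apart two randomly assigned groups tend to start out before any treatment is given.

```python
import random
import statistics


def baseline_gap(n_per_group, rng):
    """Randomly split 2*n children into two groups and return the
    difference in mean baseline score (treatment minus control)."""
    # Hypothetical baseline scores; mean 100 and SD 15 are assumptions.
    scores = [rng.gauss(100, 15) for _ in range(2 * n_per_group)]
    rng.shuffle(scores)  # random assignment to the two groups
    treatment = scores[:n_per_group]
    control = scores[n_per_group:]
    return statistics.mean(treatment) - statistics.mean(control)


def spread_of_gaps(n_per_group, trials=1000, seed=0):
    """Standard deviation of the baseline gap across many randomizations."""
    rng = random.Random(seed)
    gaps = [baseline_gap(n_per_group, rng) for _ in range(trials)]
    return statistics.stdev(gaps)


small = spread_of_gaps(58)   # group size on the order of Perry Preschool
large = spread_of_gaps(500)  # a hypothetical large trial

print(f"typical baseline gap, 58 per group:  {small:.2f} points")
print(f"typical baseline gap, 500 per group: {large:.2f} points")
```

Under these assumptions the chance gap between groups is several times wider at 58 per group than at 500, which is the sense in which idiosyncrasies of randomization fail to even out in small samples.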

Another problem is that the evaluations of both Perry Preschool and Abecedarian were overseen by the same committed, well-intentioned people who conducted the demonstration projects. Evaluations of social programs are built around lots of judgment calls -- from deciding how the research is designed to figuring out how to analyze the data. People with a vested interest in the results shouldn’t be put in the position of making those judgments. This is why evaluations undertaken by program staffs usually aren’t taken seriously. These considerations don’t automatically discredit the positive findings produced by Perry Preschool or Abecedarian, but neither program provides evidence one wants to bet the ranch on.

The most concrete reason for doubting the wider applicability of the Perry Preschool and Abecedarian effects is this: A large-scale, high-quality replication of the Abecedarian approach failed to achieve much of anything. Called the Infant Health and Development Program, it was begun in 1985. Like Abecedarian, IHDP identified infants at risk of developmental problems because of low birth weight and supplied similarly intensive intervention. Unlike Abecedarian, IHDP had a large sample (377 in the treatment group, 608 in the control group) spread over several sites assessed by independent researchers. IHDP provided a level of early intervention that couldn’t possibly be replicated nationwide, but it gave us by far the most thorough test of intensive early intervention to date.

Few Differences

The follow-ups at ages 2 and 3 were positive, with large gains in cognitive functioning for the treatment group. But by age 5, those gains had attenuated. Where are things now? In the most recent report, the children in the study had reached 18. For the two-thirds of the sample who weighed no more than 2,000 grams (4.4 pounds) at birth, almost all of the outcome measures weren’t even in the right direction: The control group did slightly better. For those who weighed 2,001 to 2,500 grams at birth, the best news the analysts could find was a pair of positive differences, on a math test and on a self-report of risky behaviors, that reached statistical significance but were substantively small. Combine the results for both groups, and the IHDP showed no significant effects on any of the reported measures -- not cognitive tests, measures of behavior problems and academic achievement, or arrest, incarceration and school-dropout rates.

Should we conclude that the IHDP results were depressed because of infants with serious neurologic complications, and that it would have worked on neurologically normal infants? The researchers ran the analysis on children who were free of significant neurological problems and found no difference.

Another possibility is that the aggregate results were damped down because some of the IHDP children were not socioeconomically disadvantaged. Were the results any better when the disadvantaged members of the sample were analyzed separately? The 18-year follow-up report is silent on that question. I can’t help but assume that if the results for the disadvantaged children had been better, we would have heard about them. Based on the published record, the IHDP results give no reason for optimism about even the most intensive early education approaches.

The disappointing results from the IHDP don’t mean that early education can’t do any good. Other studies of good technical quality have convinced me that the best early education programs sometimes have positive long-term effects, though much more modest than the ones ascribed to Perry Preschool and Abecedarian. That leaves us with one last problem: None of those first-rate programs are replicable on a large scale. The kind of nationwide expansion of early education that Obama wants won’t have the highly motivated administrators and hand-picked staffs that demonstration projects enjoy, and the per-child cost of interventions on the Perry Preschool and Abecedarian model is prohibitively high. If you’re going to have a national program, you’re going to get the kind of early education that Head Start provides.

Rigorous Evaluation

This brings us to the third-grade follow-up of the national impact assessment of Head Start, submitted to the government in October and released to the public late last year. Head Start has been operating since the 1960s. After decades of evaluations that mostly showed no effects, Congress decided in 1998 to mandate a large-scale, rigorous, independent evaluation of Head Start’s impact, including randomized assignment, representative samplings of programs and a comprehensive set of outcomes observed over time.

The evaluators reported 47 outcome measures separately for the 3-year-old and 4-year-old cohorts that were selected for the treatment group, 94 separate results in all. Only six of them showed a statistically significant difference between the treatment and control groups at the .05 level of probability -- just a little more than the number you would expect to occur by chance. The evaluators, recognizing this, applied a statistical test that guards against such “false discoveries.” Out of the 94 measures, just two survived that test, one positive and one negative.
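The arithmetic behind "expect to occur by chance" can be sketched in a few lines. The counts (94 results, six significant, a .05 threshold) come from the text above; everything else is an assumption. In particular, the p-values are invented for illustration, the binomial calculation treats the 94 tests as independent (real outcome measures are usually correlated), and the Benjamini-Hochberg procedure shown is one common way to control false discoveries, not necessarily the test the evaluators used.

```python
from math import comb


def expected_false_positives(n_tests, alpha):
    """Expected number of significant results if no true effects exist."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha):
    """P(at least k of n_tests independent null tests reach significance)."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


def benjamini_hochberg(p_values, q=0.05):
    """Indices of results that survive Benjamini-Hochberg at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under its threshold.
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


n_tests, alpha = 94, 0.05
print(f"expected by chance: {expected_false_positives(n_tests, alpha):.1f}")
print(f"P(6 or more by chance): {prob_at_least(6, n_tests, alpha):.2f}")

# Hypothetical p-values: six below .05, the other 88 spread from .06 to 1.
p_values = [0.0002, 0.0004, 0.02, 0.03, 0.04, 0.045]
p_values += [0.06 + 0.94 * i / 87 for i in range(88)]
survivors = benjamini_hochberg(p_values)
print(f"nominally significant: {sum(p < alpha for p in p_values)}")
print(f"surviving correction:  {len(survivors)}")
```

With 94 tests at the .05 level, the expected number of chance findings is 94 x .05, or about 4.7, so six nominally significant results is only slightly above what noise alone would produce. The made-up p-values are chosen so that two results survive the correction, mirroring the pattern reported in the evaluation.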

One aspect of the Head Start study deserves elaboration. The results I gave refer to the sample of children who were selected to be part of the treatment group. But 15 percent of the 3-year-old cohort and 20 percent of the 4-year-old cohort were no-shows -- a provocative finding in itself. When the analysis is limited to children who actually participated in Head Start, some of those outcomes do become statistically significant, though still substantively small. But keep in mind that we’re looking at selection artifacts: Children who end up coming to the program every day have cognitive, emotional or parental assets going for them that children who fail to participate don’t have. This means that if somehow the no-shows could be forced to attend, you couldn’t expect them to get the same benefit as those who participated voluntarily. If you’re asking what impact we could expect by making Head Start available to all the nation’s children who might need it, you have to make the calculation based on giving access to the service.
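A rough sketch of why the access-based number is smaller than the participant-based one: the no-show rates (15 and 20 percent) are from the text, but the effect size of 0.10 standard deviations is a placeholder invented for illustration, and the calculation assumes no-shows get no benefit at all. It also ignores control-group children who may have found their way into Head Start or similar programs, and it does not correct for the selection problem described above.

```python
def access_effect(effect_on_participants, no_show_rate, effect_on_no_shows=0.0):
    """Average effect of *offering* the program, given that a share of
    those offered never attend. All inputs here are illustrative."""
    participation_rate = 1.0 - no_show_rate
    return (
        participation_rate * effect_on_participants
        + no_show_rate * effect_on_no_shows
    )


# Hypothetical effect among children who actually attend (not from the study).
hypothetical_effect = 0.10

# No-show rates reported for the two cohorts.
cohorts = {"3-year-old": 0.15, "4-year-old": 0.20}

results = {
    name: access_effect(hypothetical_effect, rate)
    for name, rate in cohorts.items()
}

for name, value in results.items():
    print(f"{name} cohort: effect of access = {value:.3f} SD")
```

Whatever the true effect on participants, the effect of offering the program is diluted in proportion to the no-show rate, which is why a policy question about national availability has to be answered with the access-based figure.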

Modest Good

So what should we make of all this? The take-away from the story of early childhood education is that the very best programs probably do a modest amount of good in the long run, while the early education program that can feasibly be deployed on a national scale, Head Start, has never demonstrated long-term results in half a century of existence. In the most rigorous evaluation ever conducted, Head Start doesn’t show results that persist even until the third grade.

Let me rephrase this more starkly: As of 2013, no one knows how to use government programs to provide large numbers of small children who are not flourishing with what they need. It’s not a matter of money. We just don’t know how.

Is there anything that money can buy for these children? I am sure that Head Start buys some of them a few hours a day in a safer, warmer and more nurturing environment than the one they have at home. Whenever that’s true, I don’t care about long-term outcomes. Accomplishing just that much is a good in itself. But how often is it true? To what extent does Head Start systematically fail to serve the children who need those few hours of refuge the most?

Asking those questions forces us to confront a reality that politicians and other opinion leaders have ducked for decades: America has far too many children born to men and women who do not provide safe, warm and nurturing environments for their offspring -- not because there’s no money to be found for food, clothing and shelter, but because they are not committed to fulfilling the obligations that child-bearing brings with it.

This head-in-the-sand attitude has to change. If we don’t know how to substitute for absent, uncaring or incompetent parenting with outside interventions, then we have to think about how we increase the odds that children are born to present, caring and competent parents.

How to do that? That’s a topic for another day. My limited goal here is to ask that we not fool ourselves into thinking that expanding early childhood education is going to improve the life chances of the children most in need of help.

(Charles Murray is a scholar at the American Enterprise Institute and the author of “Coming Apart: The State of White America, 1960-2010.” The opinions expressed are his own.)

To contact the writer of this article: Charles Murray at Caroline.Kitchens@aei.org.

To contact the editor responsible for this article: David Shipley at djshipley@bloomberg.net.