Prove It or Lose It


Barack Obama campaigned on the promise to eliminate federal programs that don’t work, change programs that don’t quite measure up and consolidate duplicate programs.


That seemingly simple declaration has turned into the most ambitious attempt yet to determine, through strict scientific methods, which federal programs are effective – and to use those results to set national policy. It’s a concept almost universally embraced – until the harsh spotlight of research turns in one’s own direction.

Youth programs are beginning to feel the heat, as even popular programs such as Teach for America and Reading Is Fundamental face possible cuts in the pending fiscal 2011 budget. And big changes in Head Start and Upward Bound seem destined for the fiscal 2012 budget.

Orszag: If programs don’t work, “we’ll redirect their funds to other more promising efforts.”

Just last month, the director of the president’s Office of Management and Budget (OMB) indicated that the 2012 budget will target duplicative youth-serving programs spread across numerous departments, including mentoring; programs that strengthen students’ interest in math, engineering and the sciences; and employment and training.

The budget-chopping has set off a national twist on NIMBY: Not My Program (NMP). Out have come the lobbyists, Facebook campaigns, pleas to members of Congress, appeals to morality, attacks on research methods and claims of national priority – just about every tactic possible to fight off funding cuts.

But with a need to trim soaring deficits, the Obama administration is demanding that many social programs show that they accomplish what they were designed to do. Programs that have the research to prove they are effective go to the head of the line for funding; those that have shown some positive effects will also be funded, but will have to prove their worth.

Baron: Administration will “deal first with programs that directly affect people’s lives.”

If studies have shown a program doesn’t work, “we’ll redirect their funds to other more promising efforts,” OMB Director Peter Orszag told federal departments and agencies last summer in his official blog.

Orszag reinforced that dictum in a letter to all departments last fall, spelling out how funding priorities for the 2011 budget had to be evidence-based or in the process of being evaluated. In it, he also announced a government-wide network to promote stronger evaluations in all departments and agencies. Some departments, such as Justice, already are providing for evaluations as new programs are introduced.

Some, however, are skeptical about the approach. Douglas Besharov, the first head of the U.S. National Center on Child Abuse and Neglect and now a professor of public policy at the University of Maryland, is a frequent critic of tying social policy to evidence-based research.

“Sometimes it’s a fig leaf for defunding or broadening possible grantees, or undoing an earmark,” Besharov said. “Sometimes it’s just an excuse to dump a lot of money into programs they think they like.”

Besharov: Demanding evidence is sometimes “a fig leaf for defunding or broadening possible grantees.”

Still, any youth-serving agency that receives federal funding – especially those in fields singled out by Orszag, even though he is leaving his post – should be prepared to show results. Program directors need to know what kinds of evaluations are needed and how such studies are carried out. And rather than relying only on tried and true methods, they need to strive for improved outcomes with innovative techniques.

Streamlining funding, starting fights

In coming weeks, as fiscal 2011 appropriations bills work their way through Congress, the administration’s efforts to make such high-profile programs as Teach for America and Reading Is Fundamental prove their effectiveness are likely to become more public and heated – and may portend battles to come concerning youth services.

Rather than the line-item appropriations these programs now command, the Obama administration wants to merge them into large funding streams, forcing them to compete with similar programs for money.

Instead of bolstering the evidence of their effectiveness, Teach for America and Reading Is Fundamental have used their popularity to make direct appeals to members of Congress to save their line-item appropriations. Teach for America says it has lined up 100 congressional representatives to back a $50 million appropriation for the program in 2011, compared with the $18 million it received from the Department of Education last year. (It also won $1.3 million in competitive grants from the Corporation for National and Community Service.)

Teach for America took its campaign to Facebook and onto college campuses through student publications that urged students to contact their elected officials. Although joining Teach for America is one of the most popular aspirations of new college graduates – there were about 10 applicants for each of the program’s 4,500 slots this year – evaluations of the program have shown mixed results and don’t seem to support the group’s overwhelmingly positive public image.

Administration officials say that under the new funding scheme, Teach for America could increase its appropriation if it shows its effectiveness. Budget documents from the Education Department show that programs in the funding stream that would include Teach for America received $138 million in 2010, while the new funding stream (called Teacher and Career Pathways) would total $450 million in 2011.

As for Reading Is Fundamental, programs being placed in the Effective Teaching and Learning: Literacy funding stream received $413 million last year; the new stream would provide $450 million in 2011.

The documents do not spell out how proposed changes in the Elementary and Secondary Education Act, the reauthorization of which is now being considered, will affect the new funding streams.

One reason for nervousness among these targeted programs: While other presidents have vowed to cut waste, the Obama administration has been unusually successful so far. The administration achieved 60 percent of the $17 billion in terminations, reductions and savings OMB proposed for the 2010 budget. The goal for the 2011 budget is $23 billion.

A new way

“The direction the administration is going is to deal first with programs that directly affect people’s lives,” said Jon Baron, president of the Coalition for Evidence-Based Policy.

Baron is encouraged by the administration’s efforts so far, although he laments that so few government programs have been evaluated in depth. The problem is that the purported “success” of many such programs rests on initial studies done early in a program’s life, usually with a small sample of participants. Such findings are rarely confirmed by larger, more structured evaluations, Baron said.

In general, Baron said, few federal programs have been subjected to the kind of rigorous randomized controlled trial (RCT) that is the best measure of a program’s success. In an RCT, each participant has the same chance of being assigned to the treatment group as to the control group – meaning membership in the treatment group is determined not by investigators or program officials, but by random selection.
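The mechanics of random assignment are simple enough to sketch in a few lines of Python. This is a hypothetical illustration only, not the procedure of any federal evaluation; real studies layer on stratification, consent and other safeguards:

```python
import random

def assign_groups(participants, seed=None):
    """Randomly split a participant list into treatment and control groups,
    giving every participant the same chance of landing in either group."""
    rng = random.Random(seed)        # seeded generator for a reproducible split
    shuffled = list(participants)    # copy so the caller's list is untouched
    rng.shuffle(shuffled)            # random order removes selection bias
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]  # (treatment, control)

# Example: 100 hypothetical participants split 50/50 at random
treatment, control = assign_groups(
    ["participant_%d" % i for i in range(100)], seed=1)
```

Because assignment depends only on the shuffle, neither researchers nor program staff can steer promising participants into the treatment group – the property that makes RCT results credible.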

There is still much resistance to the scientific approach for determining the effectiveness of social programs, especially when the results don’t conform to the results that policymakers have envisioned.

After all, this is hardly the first time a presidential administration has said it would award funds based on evidence of program effectiveness. The George W. Bush administration said it would do the same, and cited lack of evidence of its effectiveness as a reason to propose trimming the 21st Century Community Learning Centers’ funding. That set off a major lobbying campaign by after-school advocates, who attacked the study as flawed, and Congress restored the funding.

The Obama administration is trying to take the movement toward evaluations a step further – and is drawing similar backlash.

Upward Bound employees have been fighting for nearly a decade over the large program evaluation released early last year. First, they attempted to have Congress block the evaluation on the grounds that it was immoral to withhold services for the sake of a study (by assigning youths to a control group). After the study was released, showing that the program had no overall impact on increasing the number of minority and at-risk students who go to college, the Upward Bound employees, represented by the Council for Opportunity in Education, mounted a full-fledged attack on its methodology, trying to stave off changes.

Steven Barnett, co-director of the National Institute for Early Education Research at Rutgers University, said not evaluating programs is like “wandering around in the dark,” but that arguing an evaluation, rather than the program, is flawed, is like turning on the light and “discovering you are walking in the wrong direction and wanting to blame the problem on the light.”

Mark Dynarski, vice president and director of the Center for Improving Research Evidence at the research company Mathematica, said a researcher must be neutral in carrying out any study, “very transparent” about the methodology of the evaluation, and then should step out of the way and “let the debate ensue.”

Even if there are significant findings, researchers “don’t want to be over strong” in reporting them, Dynarski said. “Over-strong discussions are a kind of advocacy.” Researchers must “stay in the line of science and let the policymakers pick up from there.”

But Upward Bound programs have been told in no uncertain terms that changes are coming, and there are indications that individual Upward Bound programs may have to battle for grants, possibly with other programs that seek to steer at-risk students to college.

The controversial Upward Bound evaluation’s main conclusion was that programs are skimming students headed to college anyway and not seeking out sufficient numbers of at-risk students. In contrast, the Upward Bound Math and Science Program, which positions students for careers in those areas, was found to be effective.

Education Department officials have signaled that Upward Bound programs need to be more creative and innovative and suggested they apply for funding under the Education Department’s prime funding stream for that: the Investing in Innovation Initiative (i3).

The rise of RCTs

What the White House is doing is part of an evolution that has taken decades to reach social programs. The use of large randomized controlled trials is relatively new, even in medicine. One of the first major randomized controlled trials in American medicine was conducted in the 1950s – to determine the effectiveness and safety of the Salk polio vaccine. At the time, polio was considered the worst disease of the post-World War II era. That trial, conducted in 1954 and involving more than 1.8 million schoolchildren, stands as the largest medical trial in U.S. history.

By the 1970s, Baron said, there were only about 100 randomized controlled trials a year in medicine, compared with about 10,000 a year now.

The National Institutes of Health didn’t begin funding a series of RCTs on medical treatments that had been widely accepted but never scientifically tested – eating fiber as a way to prevent colon cancer, giving over-the-counter cold medicines to children and hormone replacement therapy for women – until more than two decades later. Those studies found, among other things, that fiber had no effect on preventing colon cancer, over-the-counter cold medicines were ineffective for children under 6 years old and hormone replacement therapy for women, which had been portrayed as decreasing the risk of cancer, actually increased the risk. There was heated debate and substantial resistance among the public to accepting all three of these findings.

The jump to social programs came in the 1970s.

The research company now known as MDRC – created in 1974 by the Ford Foundation and several federal agencies as the Manpower Demonstration Research Corp. – pioneered the use of randomized controlled trials in evaluating social programs in the United States. Some of the corporation’s earliest work involved assessing programs to move welfare recipients off the relief rolls and into jobs. That study found that short-term assistance programs moved participants into the workforce and better paying jobs quicker and at less expense than long-term training programs. The evaluation formed the basis of President Bill Clinton’s welfare reforms, passed by Congress in 1996.

But the youth field’s use and acceptance of such large randomized controlled studies has developed slowly.

The U.S. Department of Education’s Institute of Education Sciences, created in 2002, is a standard bearer in applying RCTs to educational programs, often with surprising results. One of its early studies focused on Reading First, a federal initiative that trained teachers to use phonics, among other techniques, to teach young children to read. The evaluation showed that the program had increased the amount of teacher training but had no statistically significant impact on students’ reading skills.

The institute is one of several government entities – including the Office of Management and Budget under the George W. Bush administration, the Justice Department and the Substance Abuse and Mental Health Services Administration – that maintain their own lists of programs deemed effective.

The Coalition for Evidence-Based Policy has more rigid standards: It focuses on programs that are not only successful but ready to be replicated. Its panels of research and program experts, chosen for their fields of expertise, review programs.

But more than 35 years after RCTs were first used to evaluate social programs, the coalition’s list of what it calls effective “interventions” consists of only two dozen programs, spanning the fields of early childhood (four programs), K-12 education (six), youth development (two), crime/violence prevention (three), substance abuse prevention and treatment (two), mental health (one), employment and welfare (five) and international development (one). (See the full list at http://evidencebasedprograms.org/wordpress.)

The new evaluation standard

Although long multimillion-dollar studies of massive programs – such as Head Start and Upward Bound – are the norm, new evaluations are likely to be more tightly focused on differing approaches within a large program.

Baron, whose Washington, D.C.-based coalition has worked with OMB and the Government Accountability Office to help train government departments and agencies in the types of evaluations needed and how they should be conducted, said that in those large studies, poorly performing locations tend to cancel out efforts that show strong results.

The new approach will be to identify both good and poor performers, so that the strong performers can be replicated throughout a program. This targeted approach should mean that evaluations will not cost so much.

“The big expenses are making sense of the data and conducting interviews for such large numbers of participants,” Baron said. He believes more focused evaluations can be performed for as little as $80,000 to perhaps $800,000 tops, which means the $100 million set aside in the 2011 budget would pay for dozens of studies.

How to evaluate the evaluations?

With an increase in the overall number of evaluations, the question arises of how to treat competing information.

In the medical profession, questions about the validity of methodologies and competing evaluations are usually addressed by a Cochrane Review. The Oxford, England-based Cochrane Collaboration reviews primary research in human health care and health policy on a particular topic – such as “Do antibiotics help in alleviating the symptoms of a sore throat?” – to determine the larger overall findings that are supported by the separate studies. These systematic reviews distill the findings, relieving physicians and researchers from having to read dozens or hundreds of independent studies on a single topic.

The reviews (online for nonprofessionals at http://consumers.cochrane.org) assist physicians in determining what intervention is best for prevention, treatment and rehabilitation for thousands of medical conditions.

Compiled by unpaid experts in scores of specific fields, such as lung cancer or neonatal infections, the roughly 4,000 reviews are updated regularly. The collaborative effort began in 1993.

The Campbell Collaboration, a sister organization of the Cochrane Collaboration, seeks to perform similar reviews in the social policy field, but its efforts suffer from a dearth of studies. A recent Campbell review of studies on “Formal System Processing of Juveniles: Effects on Delinquency” covered 29 studies, but most were from the 1990s or earlier – long before the extensive use of programs that seek alternatives to detention and incarceration – limiting its usefulness.

What to do about Head Start?

Many experts say the mettle of Obama’s evidence-based drive will be shown in how he handles Head Start funding, after a massive study released this year showed that any academic advantage won by Head Start participants is gone by the end of the first grade.

Head Start is up for congressional reauthorization in 2012, and the fight over the program is gearing up.

Ron Haskins, senior fellow and co-director of the Center on Children and Families at the Brookings Institution, said Head Start is very popular, and its backers are extremely critical of those who say it doesn’t work.

“Considering how long Head Start has been around and how generously it has been funded, it should have been in the prime of its impact,” Haskins said. Instead, Head Start was found to lag behind similar state programs. Haskins believes the outcomes of various programs within Head Start need to be studied, and the results should be used to improve the program.

“We know that the characteristics of programs that have been highly effective are not characteristic of programs that we have,” Barnett said of Head Start.

In addition, Barnett said, “Head Start has huge performance standards. They would probably do better to cut them to 20 pages and let the programs experiment, to have more freedom to develop more effective programs, and then systematically test those out.”

But the group that represents most Head Start staff members, the National Head Start Association, has not publicly endorsed major changes in Head Start. Instead, it has taken its battle directly to federal legislators, arguing for more money – specifically, an extension of funds added to the program under last year’s Recovery Act. Its plea, however, argues first against eliminating jobs before mentioning the needs of children.

“Without these funds, newly created jobs will be lost and low-income children and families will lose services they are receiving through Head Start and Early Head Start,” the association said in letters to the House and Senate.

Despite Head Start’s widely perceived problems, there seems to be no national effort to do away with the popular program.

“I’m in print saying I think they are doing a terrible job,” Besharov said. “The evidence is, by any standard you use to cite such work, that it doesn’t work. But I don’t see that driving the funding. … I don’t think it should be defunded.”

  • Richard Wexler

    The letter to the President about home visiting raises some fair points.  The letter would have more credibility, however, had one of the authors disclosed the fact that one of the programs she is asking the President to lower “evidence-based” standards to fund is a program she helped develop in the first place.  That same author also seems to apply different standards of evidence to different programs.

     Though there are some excellent scholars in child welfare, this behavior once again illustrates the extent to which “scholarship” in this field is not ready for prime time, because of the extent to which some of its practitioners tolerate bias and a very blurry line between “scholarship” and advocacy.  Details, and other examples, are in the NCCPR Child Welfare Blog here: http://nccpr.blogspot.com/2010/07/evaluating-alternatives-to-foster-care.html

     Richard Wexler

    Executive Director

    National Coalition for Child Protection Reform

    http://www.nccpr.org

  • Kristin A. Moore

    Your cover story on evidence-based programs raises important questions about evaluations (Prove It or Lose It, July/August issue).  We commend the Administration for its attention to research and the focus on evaluation research, in particular.  Creating public policies based on research-based evidence is complex but essential for allocating funds to programs that are proven to work.  Since 2000, Child Trends has been compiling one of the largest compendiums of experimentally evaluated programs for children, with support from the Edna McConnell Clark Foundation, the Stewart Trust and the John S. and James L. Knight Foundation.  LINKS, which stands for Lifecourse Interventions to Nurture Kids Successfully, presents knowledge about social programs found to “work,” or not, to enhance children’s development, in a user-friendly format for policy makers, program designers, and funders.  We currently have identified over 500 experimental evaluations, of which 440 are described on our Web site.

    We have also compiled LINKS Effectiveness charts, which visually show varied program approaches that have impacts on children’s development at different ages.  We are also synthesizing what we have learned from experimentally evaluated programs in the LINKS database to present “what works” in numerous areas, such as summer learning, adolescent reproductive health, and the prevention and treatment of childhood obesity.  All of these products are available at http://www.childtrends.org/WhatWorks

    While LINKS focuses on experimental studies, we have also prepared numerous research briefs that share information about all types of evaluation issues.  One brief uses a triangle to illustrate the varied approaches to evaluation as a hierarchy in terms of rigor, but notes that many approaches to evaluation contribute to understanding what works for children.

  • Jeffrey Butts

    I’ll believe the evidence-based movement is sincere when policymakers support serious investigations into the relationship between punishment and public safety. Evidentiary scrutiny always focuses on education, social services and the work of the “helping professions,” while the actions of police, prosecutors, and correctional agencies are exempt. Until we have a truly level playing field, the evidence-based movement is just another way for the rich to trick the poor into fighting over table scraps.
