Youth-serving programs might take comfort in knowing that even federal agencies feel the pain of funding, designing and conducting program evaluations.
The thorn in the agencies’ sides? The Office of Management and Budget’s (OMB) Program Assessment Rating Tool, widely known as PART.
Launched by President Bush in 2003, PART is designed to provide evidence-based justification to continue or increase funding for effective programs and to eliminate ineffective ones. OMB plans to review 20 percent of all federal programs each year and to have completed at least one review of each program by the end of 2007.
But the inflexibility of this one-size-fits-all evaluation tool has caused considerable confusion and controversy among agency officials over the past three years.
A recent study by the Government Accountability Office (GAO) found that while PART has forced federal program managers to devote more attention and resources to evaluation, it has also brought to the surface difficult questions, such as: Which evaluation measures best inform program improvement? Could inflexible and overly frequent evaluations produce what the GAO calls “superficial findings of limited utility”?
The study – “OMB’s PART Reviews Increased Agencies’ Attention to Improving Evidence of Program Results,” released in October – illustrates the universality of some concerns about evaluations, uncovers some creative solutions to evaluation obstacles and offers relevant advice for programs of all sizes.
How PART Works
PART covers programs that are administered and managed by federal agencies, such as the Runaway and Homeless Youth program of the Department of Health and Human Services (HHS) and the Office of National Drug Control Policy’s Youth Anti-Drug Media Campaign.
The 30-item PART questionnaire is divided into four weighted sections: the program’s purpose and the soundness of its design (20 percent), strategic planning and goal-setting (10 percent), program management, improvement and financial oversight (20 percent), and program accountability and performance results as measured by evaluation (50 percent).
Program managers must answer such questions as: “Does the program address a specific interest, problem or need?” And they must support their answers with research. Each section is scored from zero to 100 based on those answers.
In a complex and little-explained process, OMB analysts use the four weighted scores to judge each program as “effective,” “moderately effective,” “adequate” or “ineffective.” Programs lacking adequate measurements or data get a finding of “results not demonstrated.” Half the 234 programs assessed for fiscal 2004 (the first year PART was applied) received that designation.
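For readers who want to see how the weights play out, the sketch below is only an illustration: it assumes the four section scores are combined as a simple weighted average and uses hypothetical rating cutoffs, since OMB has not published the actual method it uses to turn scores into ratings.

# Illustrative sketch only. Section weights come from the GAO description of PART;
# the combining rule and the rating cutoffs below are assumptions, not OMB's method.
SECTION_WEIGHTS = {
    "purpose_and_design": 0.20,
    "strategic_planning": 0.10,
    "program_management": 0.20,
    "results_and_accountability": 0.50,
}

def overall_part_score(section_scores):
    """Combine 0-100 section scores into one weighted overall score."""
    return sum(SECTION_WEIGHTS[name] * section_scores[name] for name in SECTION_WEIGHTS)

def rating(score):
    """Map an overall score to a rating band (hypothetical cutoffs)."""
    if score >= 85:
        return "effective"
    if score >= 70:
        return "moderately effective"
    if score >= 50:
        return "adequate"
    return "ineffective"

# Example: strong purpose, planning and management, but weak evidence of results.
# Because results carry half the weight, the overall score drops sharply.
example = {
    "purpose_and_design": 90,
    "strategic_planning": 80,
    "program_management": 85,
    "results_and_accountability": 40,
}
score = overall_part_score(example)  # 0.2*90 + 0.1*80 + 0.2*85 + 0.5*40 = 63.0
print(score, rating(score))          # 63.0 adequate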
Impetus for Evaluation
The GAO found that, just as at nonprofits, “evaluation generally competes with other program and department activities for resources,” and managers tend to be reluctant to reallocate those resources. But impending PART reviews stimulated agencies to increase their evaluation capacity by identifying appropriate outcome measures and credible data sources.
All four agencies reviewed for the study – HHS, the Department of Energy (DOE), the Small Business Administration (SBA) and the Department of Labor (DOL) – indicated that the PART reviews “brought agency management attention – and sometimes funds – to getting these evaluations done,” the GAO reported.
In other words, sometimes it takes a mandate, or the threat of funding cuts, to push managers to commit the attention, time and resources needed for an evaluation.
But conducting an evaluation doesn’t require a sudden, drastic change, said Stephanie Shipman, assistant director of the Center for Evaluation Methods and Issues at GAO. Programs can lay solid groundwork for later evaluations by using program measurements to keep track of how they’re making a difference in the lives of their clients, she said.
“When agencies start with that tack, they find ways to actually improve services without a whole lot of cost,” she said.
What Gets in the Way
Finding money to pay for evaluation is one of the two major barriers to the successful completion of programs’ PART reviews, agency officials said.
GAO found that tight funding often compelled agencies to delay or narrow the scope of their evaluations. For example, SBA officials chose to conduct evaluations sequentially as funds became available. DOL suggested focusing only on program components with substantial costs, or those whose effectiveness was most uncertain. DOE implemented a peer review system to lower evaluation costs.
The other major barrier was obtaining valid measurements of program outcomes.
Researchers found that some federal programs had no formal mechanism for collecting or analyzing data. HHS noted that data collection was especially difficult if it depended on the cooperation of state and local offices to gather and submit information from the field. Other agencies, such as the SBA, were collecting data on outputs, such as the number of clients counseled, rather than on outcomes, such as changes in employment or income levels.
DOL officials struggled with conceptualizing cost-benefit outcomes for safety regulations. They, like officials of many programs that serve high-risk youth, had trouble calculating the value of a saved human life.
GAO found that the “centralized coordination” of evaluation efforts helped agencies. Shipman noted that for smaller programs, that kind of support could come from a professional association or a collaborative network focused on evaluation issues.
Just What Is It You Want?
The PART guidelines are clear: When it comes to supportive evidence, only independent evaluations of a program’s effectiveness will be accepted.
But agency officials told GAO that they’ve had little other guidance over the years, and that OMB has not adequately addressed their repeated requests for clarifications on acceptable study design and scope.
OMB recently acknowledged that evaluation by randomized controlled trials – the method it strongly encourages – might not suit all programs. It now urges programs to consult evaluation experts “when choosing or vetting rigorous evaluations.”
OMB also recommends that agencies seek guidance on evaluation plans and goals early in the process, which not all agencies do.
One reason: Program managers told GAO they had concerns about OMB’s “increased focus on process” and were “more interested in learning how to improve program performance than in meeting an OMB checklist” designed to determine their program’s relevance or value, the study said.
GAO noted that this resentment, and some programs’ resistance to accepting OMB’s guidance, contributed to OMB’s refusal to accept the evaluations those programs cited in their PART reviews.
HHS (which reports spending about $2.6 billion a year on research) was particularly frustrated by OMB’s rejection of what HHS argued was a rigorous independent evaluation of a refugee program conducted before the agency’s PART review. OMB analysts told GAO researchers that the HHS evaluation “did not show the mechanism by which the program achieved [its] outcomes.” While HHS deemed the program successful, OMB slapped it with a finding of “results not demonstrated.”
Thus, the GAO study documents an impasse familiar to programs of all shapes and sizes: the desire of an oversight body for evaluation focused on a program’s design and its ability to address and solve problems, pitted against the needs of programs for evaluations focused on resource allocation and the effectiveness of activities.
“This has been a tension for the two decades I’ve been involved in evaluation,” Shipman said.
Shipman said ongoing communication is the key to moving beyond the impasse, echoing GAO’s encouragement of discussions among stakeholders to help ensure that evaluation findings are “timely, relevant, credible and used.”
Contact: GAO (202) 512-2700. The study is available at www.gao.gov/highlights/d0667high.pdf.