TLDR
An evaluation plan that lists outputs as outcomes — 'we will serve 300 clients' rather than '60% of clients will achieve stable housing within 6 months' — tells the program officer the organization has not thought about whether its program works, which is the question the evaluation is supposed to answer. Federal programs including AmeriCorps (above program-specific award thresholds) and HRSA 330 impose formal evaluation requirements; most foundation grants require only internal evaluation with a validated instrument.
AmeriCorps grants require independent evaluation for awards above program-specific thresholds. HRSA 330 performance measure reporting is mandatory for all federally qualified health center grantees. Office of Juvenile Justice and Delinquency Prevention (OJJDP) grants must align with the agency's evidence tiers for juvenile justice program models.
For most nonprofit grant proposals, the requirement is less formal — but the evaluation section still determines whether the proposal clears a scoring threshold. Understanding the difference between what satisfies a federal rubric and what satisfies a foundation’s standard is the first step in writing an evaluation plan you can actually execute.
What Program Officers Actually Look for in an Evaluation Plan
Program officers read evaluation plans to answer three questions: Do you know the difference between outputs and outcomes? Do you have a plausible way to measure the outcomes you promised? And can your organization actually collect the data without a research team?
Proposals that use the word “outcomes” but describe only outputs (“we will provide 1,200 units of service”) fail the first question. Program officers who have reviewed 50 proposals in a cycle recognize the output-disguised-as-outcome immediately — it is the most common evaluation plan failure.
Proposals that promise rigorous randomized controlled trial designs for a $75,000 two-year community program fail the third question in the opposite direction. An evaluation methodology that requires random assignment, a comparison group, and academic statistical analysis implies resources far beyond what a small nonprofit can sustain. Reviewers discard aspirational evaluation plans as quickly as inadequate ones.
The evaluation plan that works is honest about the organization’s evaluation capacity, names a specific and validated measurement instrument, and describes data collection procedures that program staff can execute within their normal workload.
Process vs. Outcome Evaluation: The Distinction That Matters
Process evaluation answers: Did we do what we said we would do?
Process measures are outputs: number of participants enrolled, number of sessions delivered, attendance rate, staff training hours completed, percentage of participants who completed all program components. Process evaluation confirms program fidelity — that the activities described in the proposal actually happened at the stated intensity.
Process evaluation is necessary but not sufficient. A program can deliver every activity exactly as planned and still produce no meaningful change in participants. Process evaluation cannot tell you whether the program worked.
Outcome evaluation answers: Did participants change in the way the program was designed to produce?
Outcome measures describe change: knowledge gained, skills demonstrated, behavior changed, condition improved. The change must be attributable to program participation, must be measurable with a defined instrument, and must be tracked over time.
A complete evaluation plan includes both. Process data tells you the program was delivered; outcome data tells you whether it worked. Federal proposals that require “program performance measures” typically require both types, with specific measures defined in the NOFO (HRSA 330 Uniform Data System measures, for example, are non-negotiable).
How to Define Measurable Outcomes Without Hiring an Evaluator
Every measurable outcome statement follows the same format: “[X%] of [population] will [demonstrate specific change] by [timepoint] as measured by [instrument or data source].”
Walk through each element:
Population: The specific subgroup, not “participants.” Example: “adults 18–65 enrolled in the 12-session financial literacy program.”
Specific change: Observable and countable. “Demonstrate improved financial knowledge” is not observable. “Correctly answer 8 of 10 questions on the Financial Literacy Assessment” is observable.
Timepoint: When you will measure. “At 6-month follow-up” or “at program exit” or “at the end of the grant period.”
Instrument: The specific tool you will use to measure. Not “a survey” but “the validated Financial Health Network FinHealthScore® assessment.”
The percentage target (X%) should be based on prior program data if available, or on published efficacy data for the intervention model you are using. A listing in SAMHSA's National Registry of Evidence-based Programs and Practices (NREPP) will include expected effect sizes. If no prior data exists, 60–70% is a defensible default for first-program-year targets, with a note that targets will be recalibrated in year two based on baseline data.
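Teams tracking several outcomes sometimes treat the format as a fill-in template. Below is a minimal drafting-aid sketch in Python; the class, the vague-term list, and the example values are all hypothetical illustrations, not part of any funder's requirements.

```python
from dataclasses import dataclass

# Hypothetical drafting aid for outcome statements in the
# "[X%] of [population] will [change] by [timepoint] as measured by
# [instrument]" format. Not a funder tool, just a sketch.

VAGUE_TERMS = ("improve", "better", "enhance", "increase awareness")

@dataclass
class OutcomeStatement:
    target_pct: int   # e.g. 75
    population: str   # specific subgroup, not "participants"
    change: str       # observable, countable change
    timepoint: str    # e.g. "at 6-month follow-up"
    instrument: str   # named, ideally validated, instrument

    def render(self) -> str:
        return (f"{self.target_pct}% of {self.population} will "
                f"{self.change} {self.timepoint} as measured by "
                f"{self.instrument}.")

    def warnings(self) -> list[str]:
        """Flag the most common reviewer objections."""
        w = []
        if self.population.strip().lower() in ("participants", "clients"):
            w.append("population is generic: name the specific subgroup")
        if any(t in self.change.lower() for t in VAGUE_TERMS):
            w.append("change wording may not be observable/countable")
        if self.instrument.strip().lower() in ("a survey", "survey"):
            w.append("name the specific instrument, not 'a survey'")
        return w

stmt = OutcomeStatement(
    target_pct=75,
    population="adults 18-65 enrolled in the 12-session financial literacy program",
    change="correctly answer 8 of 10 questions",
    timepoint="at 6-month follow-up",
    instrument="the Financial Literacy Assessment (10-item knowledge quiz)",
)
print(stmt.render())
print(stmt.warnings())  # empty list when no red flags are found
```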
The Logic Model Connection: Outputs → Outcomes → Impact
The evaluation plan is built from the outcomes column of the logic model. If you have completed the logic model before writing the evaluation section (which you should have — see the grant proposal template guide for logic model construction), the work is largely done.
The evaluation plan translates each outcome in the logic model into a SMART statement with an instrument assigned to measure it. The mapping looks like:
Logic model outcome: “Increased knowledge of diabetes self-management among enrolled patients”
Evaluation plan outcome statement: “75% of enrolled patients will demonstrate a minimum 20-point improvement on the Diabetes Knowledge Test (DKT2) from pre-program assessment to 12-week post-program follow-up.”
The instrument (DKT2) is validated — it has been used in peer-reviewed research and has established reliability and validity data. Using a validated instrument rather than a staff-created survey strengthens the evaluation plan significantly, because reviewers can verify that the instrument measures what you claim it measures.
When no validated instrument exists for your specific outcome, a staff-created instrument must include a description of how it was developed and any pilot testing conducted. Describing the development process (expert review, pilot with a sample of 10 participants, revision based on comprehension feedback) is more credible than simply stating you will create a survey.
Data Collection Methods That Nonprofits Can Actually Execute
The most credible evaluation plans for mid-sized nonprofit grants use one of three data collection approaches:
Pre/post surveys with a validated instrument. Participants complete the instrument before the program begins and again at a defined follow-up point. Appropriate for programs designed to change knowledge, attitudes, or self-reported behavior. Cost: staff time to administer, score, and analyze — typically 1–3 hours per cohort.
Administrative data from the case management system. For programs that track client outcomes in an existing database (housing placements, employment outcomes, medical appointments kept, criminal recidivism), administrative data is the most defensible source — it is not subject to survey response bias and does not require separate data collection. Cost: staff time to pull and analyze reports — typically 2–5 hours per reporting period.
Follow-up interviews or focus groups with a sample. Qualitative data from 10–20% of participants provides depth that surveys cannot capture. Appropriate as a supplementary method, not as the primary outcome measure. Cost: interview design, 30–60 minutes per interview, transcription, analysis — typically 20–40 hours per data collection round.
Methods that exceed typical organizational capacity and should be avoided unless you have specific infrastructure: randomized controlled trials, matched comparison group designs, multilevel modeling, real-time biomarker collection.
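To make the "staff time to score and analyze" estimate for the pre/post approach concrete, here is a minimal sketch of the core calculation: the share of participants meeting an improvement threshold, using made-up scores and the 20-point target from the earlier DKT2 example. A real run would read scores from the program's survey or case management export.

```python
# Minimal pre/post analysis sketch: share of participants who improved
# by at least a target number of points, compared to the promised goal.
# Scores below are fabricated for illustration only.

paired_scores = [  # (pre_score, post_score), one tuple per participant
    (40, 65), (55, 70), (38, 62), (60, 75), (45, 58),
    (50, 78), (35, 50), (52, 80), (48, 66), (58, 72),
]

IMPROVEMENT_TARGET = 20   # points, e.g. the 20-point DKT2 threshold
GOAL_PCT = 75             # the percentage promised in the proposal

improved = sum(1 for pre, post in paired_scores
               if post - pre >= IMPROVEMENT_TARGET)
pct = 100 * improved / len(paired_scores)

print(f"{improved}/{len(paired_scores)} participants improved "
      f">= {IMPROVEMENT_TARGET} points ({pct:.0f}%)")
print("Goal met" if pct >= GOAL_PCT else "Goal not met")
```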
External vs. Internal Evaluation: When Each Is Required
Internal evaluation — data collected and analyzed by your own staff or with the support of a consultant — is sufficient for foundation grants and most federal grants below the threshold that triggers independent evaluation requirements.
External evaluation — conducted by an independent evaluator with no stake in the program — is required by:
- AmeriCorps grants above program-specific thresholds (AmeriCorps publishes evaluation requirements in each NOFO)
- HRSA programs that specify evaluation as part of the scope of work
- OJJDP and BJA programs that require Tier 2+ evidence under the What Works in Reentry clearinghouse or the CrimeSolutions evidence classification
- NSF and NIH research grants (evaluation is embedded in the research design)
- Some state child welfare and behavioral health grants with legislative evaluation mandates
When external evaluation is required, budget for it from the start. A credible external evaluator for a $300,000 federal program typically costs $20,000–$40,000 depending on scope — roughly 7–13% of the project budget, consistent with the ranges in the budget section below.
What to Write When the Funder Requires a “Rigorous Evaluation Design”
“Rigorous evaluation design” is a phrase that appears in many federal NOFOs and is frequently misread as a requirement for experimental research.
What it actually means varies by agency. Check the NOFO language immediately after the phrase:
- HHS frequently uses “rigorous evaluation” to mean adherence to the HHS Evidence of Effectiveness standards — a five-tier framework that includes internal evaluation with validated instruments as acceptable at Tiers 3–4.
- DOJ/OJJDP uses “evidence-based” to mean the program model appears on CrimeSolutions.gov with a “Promising” or “Effective” rating — a citation question, not an evaluation design question.
- Some NIH programs require evaluation designs that include a comparison condition, defined sample sizes with power analysis, and pre-registration.
Before writing the evaluation section, search the NOFO for “evaluation” and read every paragraph that uses the word. The specifications are there. Writing the evaluation plan before reading those paragraphs is the primary cause of evaluation sections that fail to meet the funder’s requirements.
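One quick way to run that search on a NOFO saved as plain text (the file name here is hypothetical):

```python
# Print every paragraph of a NOFO text file that mentions "evaluation",
# so nothing about the funder's evaluation requirements gets missed.
# "nofo.txt" is a hypothetical plain-text export of the NOFO PDF.

with open("nofo.txt", encoding="utf-8") as f:
    text = f.read()

paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
hits = [p for p in paragraphs if "evaluation" in p.lower()]

for i, p in enumerate(hits, 1):
    print(f"--- match {i} ---")
    print(p)
```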
Budget for Evaluation: How Much to Allocate
Evaluation costs fall into two categories: personnel time and external evaluator costs.
Personnel time: Staff who administer surveys, pull administrative data, prepare evaluation reports, and coordinate with an external evaluator. For an internally managed evaluation with pre/post surveys, budget 5–10% of total project staff time, which translates to roughly 5–10% of the personnel budget.
External evaluator: For programs requiring independent evaluation, typical costs:
- Small program ($100,000–$300,000): $15,000–$30,000 for external evaluator (5–10% of project budget)
- Mid-sized program ($300,000–$1,000,000): $30,000–$80,000 for external evaluator (8–12% of project budget)
- Large federal program ($1,000,000+): $80,000–$150,000+ for external evaluator (8–15% of project budget)
For foundation grants that do not require external evaluation, allocating 3–5% of the project budget to evaluation (primarily staff time) is defensible. Funders that do not require evaluation still respond positively to proposals that include a modest but credible evaluation budget — it signals organizational commitment to learning rather than treating evaluation as a reporting obligation.
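The arithmetic behind all of these figures is simple enough to sanity-check in a few lines. A sketch using the percentage ranges quoted in this section (the tier cutoffs and rates come from the lists above, not from any funder rule):

```python
# Evaluation budget sanity check using the percentage ranges quoted
# above: 5-10% (small), 8-12% (mid-sized), 8-15% (large) for external
# evaluation, and 3-5% for internal-only foundation-grant evaluation.

def evaluation_budget_range(project_budget: float,
                            external_required: bool) -> tuple[float, float]:
    if not external_required:
        lo, hi = 0.03, 0.05          # internal evaluation, mostly staff time
    elif project_budget < 300_000:
        lo, hi = 0.05, 0.10          # small program
    elif project_budget < 1_000_000:
        lo, hi = 0.08, 0.12          # mid-sized program
    else:
        lo, hi = 0.08, 0.15          # large federal program
    return project_budget * lo, project_budget * hi

for budget, ext in [(200_000, False), (300_000, True), (750_000, True)]:
    lo, hi = evaluation_budget_range(budget, ext)
    kind = "external" if ext else "internal"
    print(f"${budget:,} ({kind}): ${lo:,.0f}-${hi:,.0f}")
```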
Definitions

- Outcome evaluation: Systematic measurement of whether program participants achieved the changes in knowledge, attitude, behavior, or condition that the program was designed to produce. Outcome evaluation is distinct from output counting (participants served, sessions delivered) — it asks whether participants changed, not just how many showed up.

- Logic model: A one-page diagram showing the causal relationship between program inputs, activities, outputs, short-term outcomes, and long-term outcomes. Logic models are required by HHS (HRSA, SAMHSA, ACF) and DOJ grant programs and are strongly recommended in foundation proposals. The evaluation plan is built from the outcomes column of the logic model.

- SMART outcome: An outcome statement that is Specific, Measurable, Achievable, Relevant, and Time-bound. Grant reviewers assess evaluation plans against SMART criteria. An outcome is not SMART if it cannot be counted, does not specify the change expected, or does not include a timeframe for measurement.
Frequently asked

What is an evaluation plan in a grant proposal?
An evaluation plan describes how the organization will measure whether the proposed program achieved its intended outcomes. It includes the specific outcomes being measured, the instruments used to collect data, the frequency of measurement, who is responsible for data collection and analysis, and how findings will be used and reported to the funder.

What is the difference between process evaluation and outcome evaluation?
Process evaluation measures whether program activities were implemented as planned — fidelity to the intervention model, sessions delivered, participation rates. Outcome evaluation measures whether participants changed in the ways the program intended. Funders who require “rigorous evaluation design” are asking for outcome evaluation with specified instruments and analysis methods.