Data Details
What's a Program?
In this dataset, a “program” is defined as a statutorily grounded, policy-relevant unit of government activity that can be meaningfully linked to a flow of funds. Programs are constructed by aligning authorizing law, how agencies actually operate, and where money moves through federal financial structures (e.g., outlays, trust funds, or tax expenditures). They are intentionally defined at a middle level of granularity—large enough to reflect distinct policy purposes and funding streams, but not so granular that they fragment into administrative artifacts or accounting line items. This means programs are not taken directly from any single reporting system, but are synthesized to answer a practical question: what does the government do, and how much does it cost?
Methodology
What this is
This model is a reconstruction of FY2024 federal spending that connects three things that are usually fragmented:
- what the government is authorized to do (statute)
- what it says it does (program inventories)
- where the money actually flows (financial accounts, outlays, and revenue loss)
There is no single government dataset that does this end-to-end. This model exists because those systems do not align.
What problem this is solving
The official Office of Management and Budget Federal Program Inventory (FPI) includes thousands of entries, but:
- it mixes levels of abstraction (programs, activities, line items)
- it does not reconcile cleanly to total federal spending
- it is difficult to use to answer basic questions like:
  - “What are we actually spending money on?”
  - “How much does each major function of government cost?”
This model takes a different approach:
Start from what the government is supposed to do, and force a reconciliation to where the money actually goes.
How it was built
The model was constructed in layers.
1. Statute-first program definition
Programs are defined based on authorizing law, not reporting artifacts.
That means:
- fewer, more meaningful rows
- elimination of duplicative or purely administrative entries
2. Program → financial structure mapping
Each program is mapped to a financial anchor:
- Treasury Account Symbols (TAS), where available
- trust funds (e.g., Social Security, Highway Trust Fund)
- appropriation/account group structures
- revenue-loss estimates for tax expenditures
Where a direct mapping was not available, the model uses account-group proxies and explicitly labels them as such.
3. Dollar assignment
Each program is assigned one primary fiscal measure:
- Outlays (preferred, for actual spending)
- Obligations (for credit programs)
- Revenue loss (for tax expenditures)
The model is anchored to total FY2024 federal spending (~$6.75T) and forces reconciliation to that total.
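The reconciliation step can be illustrated with a minimal sketch. It assumes each row already carries one primary amount in millions; the `ProgramRow` type, the `reconciliation_gap` helper, the tolerance, and all figures are hypothetical toy values, not the model's actual code or data.

```python
from dataclasses import dataclass


@dataclass
class ProgramRow:
    program: str
    dollar_basis: str        # "Outlays", "Obligations", or "Revenue Loss"
    amount_millions: float   # the single primary amount for this row


def reconciliation_gap(rows, control_total_millions):
    """Return the control total minus the sum of primary amounts (millions)."""
    return control_total_millions - sum(r.amount_millions for r in rows)


# Toy figures, invented for illustration only.
rows = [
    ProgramRow("Example benefit program", "Outlays", 600.0),
    ProgramRow("Example loan program", "Obligations", 250.0),
    ProgramRow("Example tax credit", "Revenue Loss", 100.0),
]

gap = reconciliation_gap(rows, control_total_millions=1000.0)
is_reconciled = abs(gap) < 0.5  # hypothetical tolerance of half a million

print(gap)            # 50.0
print(is_reconciled)  # False
```

A non-zero gap like this one is the amount that would have to appear in explicit residual rows for the totals to reconcile.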
4. Residual handling (this part matters)
Where data cannot be cleanly mapped:
- dollars are not estimated away
- they are isolated into explicit residual rows
These rows are labeled:
“Unmapped – [Department] … (requires agency clarification)”
This is intentional.
It surfaces where:
- agency reporting is unclear
- financial structures do not align with program definitions
- or public data is insufficient
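One way to picture residual handling is the sketch below. It assumes a control total is available per department, which is an assumption for illustration; the `residual_rows` function and the figures are hypothetical. The "…" in the row label is kept exactly as it appears in the label format above.

```python
from collections import defaultdict


def residual_rows(mapped, department_totals):
    """Build explicit residual rows for dollars not mapped to any program.

    mapped: iterable of (department, program, amount_millions)
    department_totals: dict of department -> control total in millions
    """
    mapped_by_dept = defaultdict(float)
    for dept, _program, amount in mapped:
        mapped_by_dept[dept] += amount

    rows = []
    for dept, total in department_totals.items():
        residual = total - mapped_by_dept[dept]
        if residual != 0:
            # The residual is isolated in its own row, not spread over programs.
            label = f"Unmapped – {dept} … (requires agency clarification)"
            rows.append((dept, label, residual))
    return rows


# Toy figures, invented for illustration only.
mapped = [
    ("Dept A", "Program 1", 70.0),
    ("Dept A", "Program 2", 20.0),
    ("Dept B", "Program 3", 50.0),
]
totals = {"Dept A": 100.0, "Dept B": 50.0}

residuals = residual_rows(mapped, totals)
print(residuals)
```

In this toy case only Dept A gets a residual row, because Dept B's programs already sum to its control total.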
5. Confidence levels
Each row includes a Confidence Level:
- High → directly tied to Treasury accounts or audited data
- Medium-High → structured trust/account mapping
- Medium → account-group or proxy mapping
- Low → unresolved or estimated
- Detail / zero-carry → structural rows without direct dollars
This allows users to distinguish:
- what is known precisely
- what is inferred
- and what is not yet resolvable
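As a rough illustration of how the labels can be used, the sketch below totals dollars by Confidence Level. The level names come from the list above; the `dollars_by_confidence` function, the input shape, and the figures are assumptions made for the example.

```python
from collections import defaultdict

# Level names as listed above, ordered from most to least certain.
CONFIDENCE_ORDER = ["High", "Medium-High", "Medium", "Low", "Detail / zero-carry"]


def dollars_by_confidence(rows):
    """Sum amounts (millions) per confidence level, in CONFIDENCE_ORDER.

    rows: iterable of (confidence_level, amount_millions)
    """
    totals = defaultdict(float)
    for level, amount in rows:
        if level not in CONFIDENCE_ORDER:
            raise ValueError(f"unknown confidence level: {level!r}")
        totals[level] += amount
    # Only report levels that actually occur in the input.
    return {level: totals[level] for level in CONFIDENCE_ORDER if level in totals}


# Toy figures, invented for illustration only.
sample = [("High", 500.0), ("Medium", 200.0), ("High", 100.0), ("Low", 50.0)]
summary = dollars_by_confidence(sample)
print(summary)  # {'High': 600.0, 'Medium': 200.0, 'Low': 50.0}
```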
What’s different from existing government data
This model makes several deliberate tradeoffs:
It prioritizes coverage over uniform precision
- 100% of spending is accounted for
- but not all spending is mapped at the same level of financial detail
It prioritizes structure over reporting conventions
- programs reflect statutory intent, not internal agency labels
It prioritizes transparency over cleanliness
- unresolved areas are shown, not smoothed over
Strengths
1. Full coverage
This model accounts for ~100% of FY2024 federal spending.
There are no hidden gaps—only explicit ones.
2. Cross-system reconciliation
It connects:
- statute
- program inventories
- and financial accounts
This is not available in any single government dataset.
3. Transparency of uncertainty
Instead of pretending precision:
- confidence is labeled
- assumptions are visible
- gaps are isolated
This makes the model more—not less—credible.
4. Usability
The model is structured to answer questions like:
- What are the largest functions of government?
- How much do direct benefits vs. tax expenditures cost?
- Which agencies control the most spending?
- Where is the data weakest?
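Questions like these reduce to grouping rows by a column and summing. The sketch below is one plain-Python way to do that, assuming rows are loaded as dictionaries keyed by column name. The `totals_by` helper, the `primary_amount_millions` key, and the figures are hypothetical.

```python
def totals_by(rows, column, amount_key="primary_amount_millions"):
    """Sum the primary amount per value of `column`, largest first."""
    totals = {}
    for row in rows:
        key = row[column]
        totals[key] = totals.get(key, 0.0) + row[amount_key]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Toy rows, invented for illustration only.
rows = [
    {"Department": "Dept A", "Program Type": "Grant", "primary_amount_millions": 120.0},
    {"Department": "Dept B", "Program Type": "Direct Benefit", "primary_amount_millions": 300.0},
    {"Department": "Dept A", "Program Type": "Direct Benefit", "primary_amount_millions": 80.0},
]

by_department = totals_by(rows, "Department")
by_type = totals_by(rows, "Program Type")

print(by_department)  # [('Dept B', 300.0), ('Dept A', 200.0)]
print(by_type)        # [('Direct Benefit', 380.0), ('Grant', 120.0)]
```

Grouping on the Confidence Level column in the same way would show where the data is weakest.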
Limitations
1. Mixed levels of financial precision
Not all programs are mapped to Treasury accounts.
- some are TAS-level (high precision)
- others rely on account-group proxies
This reflects real limitations in public data—not modeling shortcuts.
2. Credit and loan programs are inherently complex
Programs like student loans:
- use obligations, not outlays
- involve subsidy accounting and long-term flows
These are included, but not fully normalized to cash-flow equivalents.
3. Tax expenditures are estimates
Revenue-loss figures:
- come from modeling, not observed spending
- vary depending on assumptions
They are included for completeness, but are not directly comparable to outlays.
4. Agency reporting inconsistencies
Some residuals exist because:
- agencies report at incompatible levels
- financial accounts do not map cleanly to programs
These are surfaced explicitly.
What this is not
- not an official government dataset
- not an audited financial statement
- not a replacement for detailed budget documents
What this is
A transparent, structured, and fully reconciled model of what the federal government does and how much it spends doing it.
Final note
No single source provides this view.
This model exists because:
- the underlying systems (policy, budgeting, accounting, reporting) are not aligned
- and answering basic questions requires stitching them together
Where the model is precise, it shows that.
Where it is not, it shows that too.
That’s the point.
Definitions
Department
The cabinet-level department or independent agency responsible for the program.
Examples:
- Health and Human Services (HHS)
- Treasury
- Department of Defense
Program
The name of the program, defined based on statute or policy function.
Programs are structured to reflect:
- what the government is authorized to do
- not just how agencies report activities
Agency
The specific agency or sub-agency that administers the program.
Examples:
- Centers for Medicare & Medicaid Services (CMS)
- Internal Revenue Service (IRS)
Statute
The primary law authorizing the program.
This anchors the program to:
- congressional intent
- legal authority
Program Type
The functional category of the program.
Examples include:
- Direct Benefit (e.g., Social Security)
- Grant
- Tax Expenditure
- Credit / Loan / Guarantee
- Procurement / Operations
In Official FPI?
Whether the program appears in official federal program inventories.
Values:
- Fully represented → clearly included in official data
- Partially represented → present but fragmented or inconsistent
- Not represented → missing or not clearly identifiable
💰 Dollar Columns
Outlays (millions)
Actual money spent in FY2024.
- Represents cash leaving the federal government
- Includes benefits, salaries, contracts, and grants
👉 Best measure of real spending
Outlays Source
Where the outlay figure comes from.
Indicates whether the value is:
- directly tied to financial accounts
- derived from mapping
- or estimated
Obligations (millions)
Money the government committed to spend in FY2024.
- Created when a contract, grant, or loan is approved
- May be spent in future years
👉 “Money promised,” not necessarily spent yet
Obligations Source
Where the obligation value comes from.
Budget Authority (millions)
The amount Congress authorized agencies to commit.
- Sets the legal limit on obligations
- Does not mean money was spent
👉 “Spending permission”
Budget Authority Source
Where the budget authority value comes from.
Revenue Loss (millions)
Estimated revenue not collected due to tax policy (tax expenditures).
Examples:
- tax credits
- deductions
- exclusions
Key points:
- Not direct spending
- Based on estimates
👉 Spending through the tax code
Revenue Loss Source
Where the revenue loss estimate comes from.
📊 Core Modeling Columns
Dollar Basis (Primary)
The primary fiscal measure used for the program.
Each program is assigned one:
- Outlays
- Obligations
- Revenue Loss
This ensures:
- no double counting
- consistent totals
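A small sketch of why a single primary measure prevents double counting: a row may have several dollar columns populated, but only the column named by Dollar Basis (Primary) enters the total. The column names follow the definitions in this document; the `primary_amount` helper and the figures are hypothetical.

```python
# Map each Dollar Basis (Primary) value to the column that holds its amount.
BASIS_TO_COLUMN = {
    "Outlays": "Outlays (millions)",
    "Obligations": "Obligations (millions)",
    "Revenue Loss": "Revenue Loss (millions)",
}


def primary_amount(row):
    """Return the amount from the column selected by the row's primary basis."""
    column = BASIS_TO_COLUMN[row["Dollar Basis (Primary)"]]
    return row[column]


# Toy rows, invented for illustration only.
rows = [
    {"Program": "Example loan program", "Dollar Basis (Primary)": "Obligations",
     "Outlays (millions)": 40.0, "Obligations (millions)": 250.0,
     "Revenue Loss (millions)": 0.0},
    {"Program": "Example benefit program", "Dollar Basis (Primary)": "Outlays",
     "Outlays (millions)": 600.0, "Obligations (millions)": 610.0,
     "Revenue Loss (millions)": 0.0},
    {"Program": "Example tax credit", "Dollar Basis (Primary)": "Revenue Loss",
     "Outlays (millions)": 0.0, "Obligations (millions)": 0.0,
     "Revenue Loss (millions)": 100.0},
]

primary_total = sum(primary_amount(r) for r in rows)
naive_total = sum(r[c] for r in rows for c in BASIS_TO_COLUMN.values())

print(primary_total)  # 950.0
print(naive_total)    # 1600.0
```

The naive total adds outlays and obligations for the same program, which is exactly the double counting the primary-basis rule avoids.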
Dollar Quality
How directly the dollar amount reflects observed data.
Values:
- Actual → reported or observed spending
- Estimated → derived from modeling or mapping
- Carry-Zero → structural row with no direct dollars
Confidence Level
How tightly the program’s dollars are tied to financial systems.
Values:
- High → directly tied to Treasury accounts or audited data
- Medium-High → structured trust/account mapping
- Medium → account-group or proxy mapping
- Low → unresolved or estimated
- Detail / zero-carry → structural row without direct dollars
Why Unresolved
Explanation for any residual or unmapped dollar amounts.
Used when:
- data does not align cleanly
- agency reporting is unclear
- financial structures are inconsistent
Dollar Method Note
Description of how the dollar value was derived.
Provides context for:
- mapping approach
- assumptions
- reconciliation logic
🧩 Context / Metadata Columns
Baseline Match Detail
Details on how the program maps to official program inventories.
Used to:
- explain alignment or mismatch
- document differences in structure
Funding Status
Indicates whether the program is active, ongoing, or uncertain.
Source
High-level reference for where the data originated.
Examples:
- government datasets
- agency reports
- modeled estimates
Notes
Additional context or clarifications about the program.
⚠️ Important Notes
1. All dollar amounts are in USD
The dollar columns defined above are labeled in millions. If your version of the file reports full dollars instead, check the column labels before comparing figures.
2. Not all programs use the same type of dollars
- Outlays = actual spending
- Obligations = commitments
- Revenue Loss = tax-based spending
3. Totals use a single “Primary” measure per program
This avoids double counting and ensures full coverage.
4. This dataset prioritizes transparency
- uncertainty is labeled
- gaps are shown
- assumptions are documented