Updated: June 9, 2026

Data & Methodology

Every number on this site is derived from public federal records. This page documents exactly where the data comes from, how it is queried, what the figures mean, and where they can mislead.

Primary Data Source: the NIH RePORTER API

The site is built on the official NIH RePORTER API at api.reporter.nih.gov, specifically the v2 /projects/search endpoint. This is the same public, no-key-required API that powers reporter.nih.gov itself, and it returns funded-project records: project numbers, titles, abstracts, investigators, organizations, award amounts, activity codes, and fiscal years.

  • Funded-project records, abstracts, and award amounts come from the RePORTER v2 API.
  • Activity code definitions and mechanism rules referenced in guides come from official NIH grants pages and funding opportunity notices.
  • Publication lists in the Grant Output tool come from public NCBI/PubMed records linked to grant numbers.

We do not use private, scraped, or paid data sources. If a figure cannot be reproduced from public federal records, it does not belong on this site.
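For readers who want to query the source directly, here is a minimal sketch of a request to the v2 /projects/search endpoint. It is not this site's production code. The payload field names (criteria, fiscal_years, advanced_text_search, exclude_subprojects, offset, limit) and the helper names are assumptions based on the public RePORTER API documentation and should be checked there before use.

```python
import json
import urllib.request

# Endpoint named in the text above. The payload shape below is an assumption.
SEARCH_URL = "https://api.reporter.nih.gov/v2/projects/search"


def build_search_payload(fiscal_year, text, limit=50, offset=0):
    """Build a JSON-serializable search body (hypothetical helper)."""
    return {
        "criteria": {
            "fiscal_years": [fiscal_year],
            "advanced_text_search": {
                "operator": "and",
                "search_field": "projecttitle,abstracttext,terms",
                "search_text": text,
            },
            "exclude_subprojects": True,
        },
        "offset": offset,
        "limit": limit,
    }


def build_request(payload):
    """Wrap the payload in a POST request without sending it."""
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_search_payload(2025, '"gene therapy"')
request = build_request(payload)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

The request is built but not sent, so the sketch can be inspected offline. No API key appears anywhere because the API does not require one.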

How the interactive tools get their data

The interactive tools query the RePORTER API live at request time. When you run a search in Trends, PI Finder, Grant Search, Compare Topics, Institute Fit, Check PI, or Weekly Updates, the result reflects whatever RePORTER returns at that moment, not a cached copy we control.

These server endpoints sit between the browser and the NIH API:

  • /api/recent-grants: projects awarded in a recent window, with abstracts and award amounts.
  • /api/fetch-trends: yearly totals, counts, and activity-code breakdowns for charts.
  • /api/fetch-pis: investigator search by keyword, institute, and funding role.
  • /api/check-pi: status check on a PI's recent awards.

Each endpoint adds retry logic and rate limiting around the upstream API but does not alter the underlying award values.
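As an illustration of that wrapper pattern, the sketch below combines a minimum-interval rate limiter with exponential-backoff retries. It is a generic example, not the site's actual implementation: the class and function names, the retry count, and the delays are all hypothetical.

```python
import time


class MinIntervalLimiter:
    """Allow at most one call per `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self._last = None

    def wait(self):
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now


def call_with_retry(fetch, limiter, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `fetch()` up to `attempts` times, doubling the delay after each failure.

    The upstream payload is returned untouched; nothing here edits award values.
    OSError covers network failures, including urllib.error.URLError.
    """
    for attempt in range(attempts):
        limiter.wait()
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demo with a fake upstream that fails twice, then succeeds.
calls = {"n": 0}
delays = []


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("temporary upstream failure")
    return {"award_amount": 123456}


limiter = MinIntervalLimiter(0.0, sleep=delays.append)
result = call_with_retry(flaky_fetch, limiter, sleep=delays.append)
```

The clock and sleep functions are injectable so the behavior can be checked without real waiting.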

How reference pages get their data (snapshots)

The reference pages for research topics and activity codes work differently from the live tools. Award tables on those pages are generated from a snapshot of RePORTER data that we pull from the same v2 API and refresh periodically. Each page shows its own data-refresh date so you can see how current its table is.

Snapshots keep reference pages fast and stable, but they can trail the live database between refreshes. If a snapshot table and a live RePORTER search disagree, the live RePORTER record is the current one. For time-sensitive questions, use the live tools or reporter.nih.gov directly.

Cleaning and quality checks

Before display, records pass through a small number of conservative processing steps:

  • Subprojects are excluded from topic and activity-code queries through the RePORTER API where that filter applies.
  • Organization names are displayed as RePORTER returns them. Different spellings can therefore split one institution across multiple rows.
  • Missing abstracts and zero or unavailable award amounts are omitted from the display rather than estimated.

Processing never changes award values. Project numbers remain visible so readers can verify individual records in the official database.
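A rough sketch of that display step follows. It assumes flattened records with the keys project_num, org_name, award_amount, and abstract_text (these key names and the helper are illustrative), and it reads "omitted from the display" as blanking the field rather than dropping the record. The second sample record is made up for the example.

```python
def to_display_row(record):
    """Return a display row; missing values become None, never an estimate."""
    amount = record.get("award_amount")
    abstract = (record.get("abstract_text") or "").strip()
    return {
        "project_num": record.get("project_num"),  # always kept for verification
        "org_name": record.get("org_name"),        # shown exactly as received
        "award_amount": amount if amount else None,  # zero or missing -> blank
        "abstract_text": abstract or None,
    }


records = [
    {
        "project_num": "5R01CA123456-03",
        "org_name": "EXAMPLE UNIVERSITY",
        "award_amount": 350000,
        "abstract_text": "Example abstract.",
    },
    {
        "project_num": "1R21CA654321-01",  # made-up sample record
        "org_name": "EXAMPLE UNIV",
        "award_amount": 0,
        "abstract_text": None,
    },
]

rows = [to_display_row(r) for r in records]
distinct_orgs = {row["org_name"] for row in rows}
```

Because organization names pass through unchanged, the two spellings above count as two institutions, which is the row-splitting effect described in the list.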

How keyword searches work, and where they can mislead

Topic-based numbers on this site come from keyword matching against project titles, abstracts, and project terms in RePORTER. Multi-word phrases are sent as quoted phrases so that a search for "gene therapy" matches the phrase rather than any project containing both words separately. This approach is transparent and reproducible, but it has known failure modes:

  • Overcounting. A project that mentions a keyword in its abstract is counted even if the keyword is peripheral to the work. Broad terms ("cancer", "machine learning") inflate totals.
  • Undercounting. A field known by several names is split across keywords, so any single search captures only part of it. A search for "heart failure" misses projects that only say "cardiac dysfunction". Trend comparisons between differently-named fields are inherently rough.
  • Reporting lag. NIH adds and updates records on its own schedule, so the most recent fiscal year almost always understates its final total. A downward tick in the latest year of a trend chart usually means incomplete data, not a funding cut.
  • Subprojects and supplements. Large multi-component awards can appear in RePORTER as multiple records. Topic and activity-code queries exclude subprojects, but supplements and related award records can still affect counts. Mechanism-level comparisons involving large center grants (P30, U54, and similar) should be read carefully.

Because of these limitations, treat topic-level totals as indicators of direction and scale, not precise measurements. Two searches with slightly different keywords can legitimately produce different totals.
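The phrase-quoting rule described at the top of this section can be sketched in a few lines. The function name is hypothetical, and the real tools may normalize input differently.

```python
def to_search_text(keyword):
    """Wrap multi-word keywords in double quotes so they match as a phrase."""
    cleaned = " ".join(keyword.split())  # collapse stray whitespace
    already_quoted = cleaned.startswith('"') and cleaned.endswith('"')
    if " " in cleaned and not already_quoted:
        return f'"{cleaned}"'
    return cleaned


single = to_search_text("cancer")
phrase = to_search_text("gene therapy")
messy = to_search_text("  heart   failure ")
quoted = to_search_text('"gene therapy"')
```

Single words pass through unquoted, and a phrase that is already quoted is not quoted again.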

What "award amount" means

Where this site shows an award amount, it is the total cost reported by RePORTER for the listed award record: direct plus indirect costs for that record's funding period, as published in the federal data. It is not the lifetime value of a multi-year grant unless the tool explicitly aggregates years.

  • Some records, particularly very recent ones, are published without cost data; we flag these rather than estimating.
  • Costs can be revised after initial publication as supplements and adjustments post.
  • The authoritative figure for any award is the official Notice of Award held by the recipient institution; verify there before relying on a number for anything consequential.
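To make the distinction between a single record and a multi-year total concrete, here is a small sketch that sums per-record amounts by core project number. The key names core_project_num and award_amount, the helper name, and the dollar figures are assumptions for illustration. Records without cost data are counted separately instead of being estimated.

```python
from collections import defaultdict


def total_cost_by_core_project(records):
    """Sum reported costs per core project and count records lacking cost data."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for record in records:
        core = record["core_project_num"]
        amount = record.get("award_amount")
        if amount is None:
            missing[core] += 1  # flagged, not estimated
        else:
            totals[core] += amount
    return dict(totals), dict(missing)


# Made-up sample: three yearly records of one grant, the newest without cost data.
sample = [
    {"core_project_num": "R01CA123456", "fiscal_year": 2024, "award_amount": 300000},
    {"core_project_num": "R01CA123456", "fiscal_year": 2025, "award_amount": 310000},
    {"core_project_num": "R01CA123456", "fiscal_year": 2026, "award_amount": None},
]

totals, missing = total_cost_by_core_project(sample)
```

Each record on its own shows one year's total cost. Only the explicit sum approximates a multi-year figure, and it stays incomplete while any record is missing cost data.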

Fiscal years

All years on this site are NIH fiscal years unless labeled otherwise. The federal fiscal year runs October 1 through September 30, so FY2026 covers October 1, 2025 through September 30, 2026. An award made in November 2025 therefore counts toward FY2026, which surprises people comparing our charts to calendar-year figures. Year-over-year comparisons within the site are consistent because every tool uses the same fiscal-year field from RePORTER.
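The date-to-fiscal-year rule is simple enough to state as code. This sketch only illustrates the calendar arithmetic; the site itself reads the fiscal-year field from RePORTER and does not compute it. The function names are hypothetical.

```python
from datetime import date


def federal_fiscal_year(day):
    """Return the federal fiscal year containing `day` (FY starts October 1)."""
    return day.year + 1 if day.month >= 10 else day.year


def fiscal_year_bounds(fiscal_year):
    """Return the first and last calendar dates of a federal fiscal year."""
    return date(fiscal_year - 1, 10, 1), date(fiscal_year, 9, 30)


november_award = federal_fiscal_year(date(2025, 11, 15))
last_day = federal_fiscal_year(date(2026, 9, 30))
next_day = federal_fiscal_year(date(2026, 10, 1))
fy2026_start, fy2026_end = fiscal_year_bounds(2026)
```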

How to sanity-check anything on this site

Every grant record shown here carries an NIH project number (for example, 5R01CA123456-03). You can paste that number into the search box at reporter.nih.gov and see the official record: full abstract, investigators, organization, and funding history. If our display ever disagrees with that record, the official record wins, and we want to hear about it.
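When checking a record, it helps to know what the parts of a project number mean. The sketch below splits the example number into application type, activity code, institute code, serial number, and support year. The pattern is deliberately simplified and the function name is hypothetical; real project numbers have variants it may not cover, so treat a failed match as "unrecognized", not "invalid".

```python
import re

# Simplified pattern for numbers shaped like 5R01CA123456-03, with an
# optional trailing suffix such as S1 or A1. This is an assumption.
PROJECT_NUMBER = re.compile(
    r"^(?P<application_type>\d)"
    r"(?P<activity_code>[A-Z][A-Z0-9]{2})"
    r"(?P<institute_code>[A-Z]{2})"
    r"(?P<serial_number>\d{6})"
    r"-(?P<support_year>\d{2})"
    r"(?P<suffix>[A-Z0-9]*)$"
)


def parse_project_number(text):
    """Return the labeled parts of a project number, or None if unrecognized."""
    match = PROJECT_NUMBER.match(text.strip().upper())
    if match is None:
        return None
    parts = match.groupdict()
    # The core project number drops the application type, year, and suffix.
    parts["core_project_num"] = (
        parts["activity_code"] + parts["institute_code"] + parts["serial_number"]
    )
    return parts


parsed = parse_project_number("5R01CA123456-03")
unrecognized = parse_project_number("not a project number")
```

For the example above, the activity code is R01, the institute code is CA, and the support year is 03.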

For aggregate claims (trend totals, institute breakdowns), you can reproduce our queries yourself using RePORTER's advanced search with the same keyword, since we use the same public API with no private adjustments.

Corrections

If you find a number, label, or claim that does not match the underlying NIH record, email admin@labcat.ai or use the contact page. Include the page URL and, for grant data, the project number. Confirmed errors are corrected on the page; discrepancies caused by reporting lag or keyword limitations are documented as caveats instead.

See our Editorial Guidelines for review standards, the Contributors page for who applies them, and the Privacy Policy for data handling.