
Methodology

Every field on this site, where it comes from, what we did to it, and what we don't know. Cite it.

Projects

A "project" here means any housing development, transit infrastructure, zoning permit, or city capital project that we found in one of the source feeds. One row in the civic.projects table represents one real-world project, deduplicated by (datasource, external_id). The same physical site can show up multiple times across different feeds (a housing site might appear in both Affordable Housing Production and in Zoning Permits) and they get separate rows because they're tracking different things.
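A minimal sketch of the deduplication rule, assuming rows arrive as plain dicts. The datasource labels and the "latest row wins" choice are illustrative assumptions, not taken from the loaders.

```python
from typing import Dict, Iterable, List, Tuple


def dedupe_projects(rows: Iterable[dict]) -> List[dict]:
    """Keep one row per (datasource, external_id); a later row replaces an earlier one."""
    by_key: Dict[Tuple[str, str], dict] = {}
    for row in rows:
        key = (row["datasource"], str(row["external_id"]))
        by_key[key] = row
    return list(by_key.values())


# Hypothetical rows: the same site appears in two feeds and is refreshed once.
rows = [
    {"datasource": "affordable_housing", "external_id": "101", "name": "Site A"},
    {"datasource": "zoning_permits", "external_id": "101", "name": "Site A permit"},
    {"datasource": "affordable_housing", "external_id": "101", "name": "Site A (refreshed)"},
]
projects = dedupe_projects(rows)
```

The two feeds keep separate rows even though the external_id matches, because the key includes the datasource.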

Data sources

Affordable Housing Production
City of Philadelphia / DHCD
~470
Quarterly
Every project funded through the city's affordable-housing program since 2011. We pull via the ArcGIS FeatureServer. fiscal_year_complete is sometimes just a year value (2018); we fold it to January 1 of that year. The source rows carry no AMI breakdown, unit mix, or council district; district and tract are added by spatial join. Status is inferred from development_type + fiscal_year_complete.
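The year-folding step could look roughly like this. It is a sketch, and it assumes that any full dates in the feed arrive as ISO YYYY-MM-DD strings, which the source does not confirm.

```python
from datetime import date, datetime
from typing import Optional, Union


def fold_fiscal_year(value: Union[int, str, None]) -> Optional[date]:
    """Turn a bare year into January 1 of that year; pass full dates through."""
    if value is None:
        return None
    text = str(value).strip()
    if not text:
        return None
    if text.isdigit() and len(text) == 4:
        # Bare year such as 2018: fold to the first day of that year.
        return date(int(text), 1, 1)
    try:
        # Assumed format for full dates; anything else is treated as unknown.
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        return None
```

A date produced this way is a placeholder, so anything that sorts or filters on completion date should treat January 1 values with caution.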
Zoning Permits
City of Philadelphia / L&I
~4,800
Continuous
Issued zoning permits from the city's Carto SQL API. We query the last 24 months, capped at the 5,000 most-recent rows per refresh. We drop permits whose geometry falls outside our Philadelphia bounding box (~1% of rows, usually misgeocoded). Status is a best-effort normalization of L&I's free-text values. The Zoning Board of Adjustment decisions dataset would have been a better fit for this category but it appears to have been retired from OpenDataPhilly without replacement.
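The bounding-box filter amounts to a simple range check. In this sketch the box coordinates are rough illustrative values, not the ones the loader uses, and the permit records are made up.

```python
# Approximate box around Philadelphia as (min_lon, min_lat, max_lon, max_lat).
# These numbers are an assumption for illustration only.
PHILLY_BBOX = (-75.29, 39.86, -74.95, 40.14)


def in_bbox(lon: float, lat: float, bbox=PHILLY_BBOX) -> bool:
    """Return True when the point falls inside the box, edges included."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


permits = [
    {"id": "a", "lon": -75.1635, "lat": 39.9526},  # near City Hall
    {"id": "b", "lon": 0.0, "lat": 0.0},           # misgeocoded to null island
]
kept = [p for p in permits if in_bbox(p["lon"], p["lat"])]
```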
SEPTA Capital Projects
SEPTA
7
Manual / annual
Hand-curated from the published FY26 capital budget book. SEPTA does not publish a machine-readable capital project list. Locations for line-wide projects (Trolley Mod, BSL cars) are plotted at the line midpoint; this is a simplification. Funding amounts are total project life-cycle cost, not annual spend.
Major City Capital Infrastructure
City of Philadelphia + partners
8
Manual / annual
Hand-curated from the published Capital Improvement Program and agency press releases. The OpenDataPhilly Capital Program Projects dataset returns 404, so there is no live feed. Refreshed manually when the CIP is updated. Citywide programs (Rebuild, Green City) are plotted at City Hall as a placeholder.

Context layers

Census tracts
US Census Bureau
408
Annual (Dec)
Boundaries from TIGER for state 42, county 101. Values from the ACS 5-year 2018-2022 release. Rent burden is computed as (households spending 30%+ of income on gross rent) divided by (total renter households). Renter share is renter-occupied / occupied households. Race percentages are B03002 series divided by total population. Some tracts have null values where the source row had no margin of error or was suppressed.
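The ratios above reduce to one guarded division. This sketch uses made-up counts, and returning None for a missing or zero denominator is an assumption about how nulls are handled.

```python
from typing import Optional


def share(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Divide two counts, returning None when either is missing or the base is zero."""
    if numerator is None or denominator is None or denominator <= 0:
        return None
    return numerator / denominator


# Hypothetical tract: 2,000 occupied households, 1,200 of them renters,
# 420 of those renters spending 30%+ of income on gross rent.
rent_burden = share(420, 1200)
renter_share = share(1200, 2000)
suppressed = share(None, 1200)
```

Keeping suppressed tracts as None rather than 0 avoids drawing them as "no rent burden" on the map.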
Council districts
City of Philadelphia
10
Per redistricting
ArcGIS FeatureServer. These are the 2024 boundaries. A project's district is computed by spatial join (ST_Intersects) and stored on the project row so we don't have to re-join at read time. Reruns of the backfill update any project whose location changed.
Registered Community Organizations
City of Philadelphia / L&I
239
Quarterly
RCO polygons typically overlap (a single parcel often falls in 2-4 RCOs). We pick the smallest-area RCO containing the point, on the theory that the smaller boundary is the more local body. This is a simplification: in practice many projects have to notify multiple RCOs.
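A pure-Python sketch of the smallest-containing-polygon rule. It assumes simple polygons without holes and uses planar area, which is only a stand-in for whatever the spatial database computes. The RCO names and shapes are invented.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Ring = List[Point]


def ring_area(ring: Ring) -> float:
    """Planar area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def contains(ring: Ring, point: Point) -> bool:
    """Ray-casting point-in-polygon test for a simple polygon."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def smallest_containing(rcos: Dict[str, Ring], point: Point) -> Optional[str]:
    """Name of the smallest-area polygon containing the point, or None."""
    hits = [name for name, ring in rcos.items() if contains(ring, point)]
    if not hits:
        return None
    return min(hits, key=lambda name: ring_area(rcos[name]))


rcos = {
    "big": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "small": [(4, 4), (6, 4), (6, 6), (4, 6)],
    "far": [(20, 20), (30, 20), (30, 30), (20, 30)],
}
chosen = smallest_containing(rcos, (5, 5))
```

Returning the full hits list alongside the chosen name would make it easier to show every overlapping RCO for a site.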
Elected officials
Hand-curated
18
Manual on election
City Council (10 district + 7 at-large) and the Mayor, with email, phone, office address, and link to the official site. Refreshed manually after every election. Doesn't include state or federal representatives.
Demolition permits
City of Philadelphia / L&I
~1,400
Continuous
L&I demolition permits from the last 3 years. We use this as a proxy for housing stock loss because Philadelphia does not publish eviction filings as open data. A demolition permit doesn't always equal displacement (vacant structures get demolished too), but the trend over time and the spatial clustering pattern still tell you something.

Derived fields

developer: pulled from developer_name on housing rows and contractorname on zoning rows. The same firm may appear under slightly different spellings; we don't normalize beyond trimming whitespace.

status_history: every time the scrape pipeline sees a project's status change vs the last snapshot, we record a new row. This is what powers the "stalled projects" view and the timeline on each project page. Since the table just started populating, the initial backfill stamps every project with its current status; the meaningful history accumulates from there.
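The change detection can be sketched as a diff between two snapshots. The dict shapes and field names here are hypothetical, not the actual table schema.

```python
from datetime import datetime, timezone
from typing import Dict, List


def diff_statuses(
    previous: Dict[str, str], current: Dict[str, str], seen_at: datetime
) -> List[dict]:
    """Return one history row per project whose status differs from the last snapshot."""
    changes = []
    for project_id, status in current.items():
        if previous.get(project_id) != status:
            changes.append(
                {"project_id": project_id, "status": status, "observed_at": seen_at}
            )
    return changes


now = datetime(2026, 1, 1, tzinfo=timezone.utc)

# Initial backfill: with no previous snapshot, every project gets a row.
first = diff_statuses({}, {"p1": "planned", "p2": "under_construction"}, now)

# Later scrape: only the project whose status moved gets a new row.
second = diff_statuses(
    {"p1": "planned", "p2": "under_construction"},
    {"p1": "planned", "p2": "complete"},
    now,
)
```

A project with no rows after the backfill for a long stretch is what a "stalled" view would key on.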

search vector: a Postgres tsvector generated column over name (weight A), address + developer (weight B), and description (weight C). Queries hit the GIN index using websearch_to_tsquery so multi-word and quoted phrase searches both work.

Known gaps

  • Eviction filings are not publicly available as open data. We use demolition permits as a proxy.
  • SEPTA capital projects are hand-curated, not a live feed.
  • The city's Capital Program Projects dataset is gone; major infrastructure projects are also hand-curated.
  • Zoning Board of Adjustment decisions used to be on OpenDataPhilly but the endpoint 404s. We use issued zoning permits instead.
  • Some affordable-housing rows have only a fiscal year, not a real completion date.
  • RCOs overlap; we pick the smallest area. In practice, organizers should always check the full RCO list for a site.
  • State and federal elected officials aren't included.
  • Funding amounts on city infrastructure are project lifetime totals, not annual spend.

Use, license, citation

All underlying data is in the public domain (City of Philadelphia, SEPTA, US Census). Code is MIT. If you cite this site in reporting, the line is:

civic-philly (civic-philly.vercel.app), accessed 2026-05-16.
Aggregated from OpenDataPhilly, US Census ACS 5-year 2018-2022, and SEPTA.

Reproduce

Code at github.com/c-tonneslan/civic-philly. The scripts/ directory contains every loader. Anyone can run them against their own Postgres + PostGIS instance with a Census API key.