A comprehensive inventory of open/public data sources for US (and international) electric grid intelligence. Covers what CommonGrid currently has, what we plan to add, and where the gaps are.
Comprehensive registry of US electric utilities — investor-owned utilities (IOUs), cooperatives, municipal utilities, power marketers, and CCAs. Includes organizational metadata, EIA identifiers, customer counts, peak demand, revenue, sales, BA codes, NERC regions, meter counts, and jurisdictional information.
The 7 US Independent System Operators / Regional Transmission Organizations: CAISO, ERCOT, ISO-NE, MISO, NYISO, PJM, SPP. Includes boundary polygons and regional linkages.
45 structural balancing authorities responsible for maintaining load-generation balance within their control areas. Includes EIA codes and control area boundary polygons.
4. Regions (Service Territories, ISO Regions, CCAs, BA Areas)

| Field | Value |
|---|---|
| Description | Geographic regions representing various grid boundaries: utility service territories, ISO/RTO regions, Community Choice Aggregator (CCA) territories, and balancing authority control areas. |
| Record Count | ~3,000 regions |
| Schema | Region metadata with type classification and boundary linkages |
| Source | HIFLD Electric Retail Service Territories ArcGIS + CEC (California CCAs) + HIFLD Control Areas |

~3,000 GeoJSON polygon files representing geographic boundaries for utility service territories (by EIA ID), ISO/RTO boundaries, CCA territories, and BA control areas.
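
A common use of boundary polygons like these is answering "which territory contains this point?". The following is a minimal sketch of a ray-casting point-in-polygon test over a GeoJSON `Polygon` or `MultiPolygon` geometry. The function names are hypothetical and not part of any CommonGrid API, and it treats coordinates as planar, which is usually acceptable for territory lookups but is an assumption.

```python
from typing import Any, Dict, Sequence

Ring = Sequence[Sequence[float]]  # GeoJSON linear ring: [[lon, lat], ...]


def point_in_ring(lon: float, lat: float, ring: Ring) -> bool:
    """Ray-casting test: count edge crossings of a ray going east from the point."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        # Only edges that straddle the point's latitude can be crossed.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def point_in_geometry(lon: float, lat: float, geometry: Dict[str, Any]) -> bool:
    """True if the point is inside the exterior ring and outside every hole."""
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        return False
    for rings in polygons:
        exterior, holes = rings[0], rings[1:]
        if point_in_ring(lon, lat, exterior) and not any(
            point_in_ring(lon, lat, hole) for hole in holes
        ):
            return True
    return False


# Illustrative geometry (not real territory data): a square with a square hole.
demo_geometry = {
    "type": "Polygon",
    "coordinates": [
        [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],
        [[4, 4], [6, 4], [6, 6], [4, 6], [4, 4]],
    ],
}
```

For thousands of polygons, a spatial index (for example via a GIS library) would likely be a better fit than testing every polygon in turn.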
Wholesale electricity market pricing nodes across all 7 US ISOs/RTOs. Includes trading hubs, load zones, SUBLAPs (CAISO), LAPs, and generation pricing nodes. Generation nodes are cross-referenced from EIA-860 power plant data for geographic coordinates.
Generator-level data for all US power plants ≥1 MW combined nameplate capacity. Includes plant location, generator specs, fuel type, capacity, ownership, in-service dates, energy storage details, wind/solar-specific data, environmental equipment, and proposed/retired generators.

| Field | Value |
|---|---|
| Format | ZIP containing XLSX files (Utility, Plant, Generator, Wind, Solar, Energy Storage, MultiFuel, Owner, Environmental) |
| Coverage | All US states + territories |
| Update Frequency | Annual (final data ~September; preliminary monthly via EIA-860M) |
| License | Public domain (US government) |
| Notes | ~10,000+ generators. Includes proposed generators and recently retired ones. Construction cost data collected since 2013. The preliminary monthly version (EIA-860M) at https://www.eia.gov/electricity/data/eia860m/ provides near-real-time generator inventory updates. |

EIA Form 923 — Power Plant Operations Report

| Field | Value |
|---|---|
| Description | Monthly and annual plant-level data on electricity generation, fuel consumption, fuel stocks, fuel receipts and costs. Covers ~3,034 monthly plants + ~9,528 annual plants. Includes schedules for fuel receipts, generator data, fossil fuel stocks, and environmental data. |

Combines what was previously EIA-906, EIA-920, and FERC-423. Essential for actual generation output, heat rates, and fuel cost analysis. Links to EIA-860 via plant/generator codes.
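
The heat rate and fuel cost analysis mentioned above comes down to simple unit arithmetic. The sketch below assumes fuel consumption is expressed in MMBtu and net generation in MWh; the function names and the sample figures are illustrative, not taken from any EIA-923 file.

```python
def heat_rate_btu_per_kwh(fuel_mmbtu: float, net_generation_mwh: float) -> float:
    """Heat rate = fuel energy in / electricity out, expressed in Btu per kWh."""
    if net_generation_mwh <= 0:
        raise ValueError("net generation must be positive to compute a heat rate")
    btu = fuel_mmbtu * 1_000_000      # 1 MMBtu = 1,000,000 Btu
    kwh = net_generation_mwh * 1_000  # 1 MWh = 1,000 kWh
    return btu / kwh


def fuel_cost_per_mwh(fuel_price_per_mmbtu: float, heat_rate: float) -> float:
    """Fuel cost in $/MWh given a fuel price in $/MMBtu and a heat rate in Btu/kWh."""
    # Btu/kWh divided by 1,000 gives MMBtu/MWh.
    return fuel_price_per_mmbtu * heat_rate / 1_000


# Illustrative plant-month: 700,000 MMBtu burned to produce 100,000 MWh.
example_heat_rate = heat_rate_btu_per_kwh(700_000, 100_000)
example_fuel_cost = fuel_cost_per_mwh(3.0, example_heat_rate)
```

With these sample inputs the heat rate is 7,000 Btu/kWh, and at $3/MMBtu the fuel cost is $21/MWh.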
EIA-860M — Monthly Electric Generator Inventory

| Field | Value |
|---|---|
| Description | Preliminary monthly updates to the generator inventory. Includes newly operating, proposed, and retired generators reported since the last annual EIA-860. Comprehensive list of retirements since 2002. |

Comprehensive plant-level data on emissions (CO₂, NOₓ, SO₂, CH₄, N₂O, Hg, PM₂.₅, NH₃, VOCs), generation, heat input, resource mix, and emission rates for almost all US grid-connected power plants. Aggregated at generator, plant, state, NERC region, BA, and eGRID subregion levels.

| Field | Value |
|---|---|
| Format | XLSX (multiple aggregation levels), plus R scripts for reproduction |
| Coverage | US (plants ≥1 MW combined capacity) |
| Update Frequency | Annual (latest: eGRID2023, released 2024, revised June 2025) |
| License | Public domain (US government) |
| Notes | Gold standard for grid emissions analysis. Eight aggregation levels from individual generators up to NERC region. Underlying data comes from EPA CAMD and EIA Forms 860/923. |

WRI Global Power Plant Database

| Field | Value |
|---|---|
| Description | Global database of ~35,000 power plants across 167 countries. Includes plant name, location (lat/lon), capacity (MW), primary fuel, owner, generation data (2013–2017), and data source. |
| Update Frequency | Irregular (last major update was v1.3.0 in June 2021; may be stale) |
| License | CC-BY 4.0 |
| Notes | Great for international coverage, but US data is sourced from EIA (so somewhat redundant domestically). Last updated 2021; check whether newer versions exist. Generation data runs only through 2017. |

HIFLD Power Plants

| Field | Value |
|---|---|
| Description | Locations of power plants in the US with attributes including plant name, primary fuel, capacity, operator, and geographic coordinates. |
| Source | Homeland Infrastructure Foundation-Level Data (DHS) |

Derived from EIA data. Useful for geospatial analysis but EIA-860 is more authoritative for detailed plant/generator data.
2. Transmission & Substations
HIFLD Electric Power Transmission Lines

| Field | Value |
|---|---|
| Description | Transmission line routes for lines operating at 69 kV to 765 kV. Includes voltage, owner/operator where available. Underground lines included where publicly available. |
| Source | Homeland Infrastructure Foundation-Level Data (DHS) |

⚠️ As of April 2023, HIFLD reclassified the Electric Substations layer as a secured dataset. It is no longer publicly available without special permission via the HIFLD Secure portal. Alternative: use OpenStreetMap data (see below). The transmission lines dataset remains public.
OpenStreetMap Power Infrastructure

| Field | Value |
|---|---|
| Description | Community-mapped power infrastructure including transmission lines, substations, power plants, transformers, and distribution infrastructure worldwide. Visualized at Open Infrastructure Map. |
| Format | OSM PBF (raw); GeoJSON via Overpass API; Shapefile/GeoPackage via Infrageomatics |
| Coverage | Global (quality varies significantly by region: excellent in Europe, good in the US, sparse in developing countries) |
| Update Frequency | Continuous (community-edited) |
| License | ODbL (Open Database License); requires attribution and share-alike |
| Notes | Best open source for global power infrastructure. For US substations, this may be the best remaining open alternative after HIFLD secured theirs. Processing raw OSM data at scale requires tools like Imposm3 or osm2pgsql. Small extracts via Overpass Turbo are straightforward. Tag schema: `power=line`, `power=substation`, `power=plant`, `power=generator`, `power=tower`, etc. |

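
For small extracts, the `power=substation` tag can be queried directly in Overpass QL. The sketch below only builds the query text for a bounding box; the function name is hypothetical, and sending the query (for example as the `data` field of a POST to a public Overpass endpoint) is left to whichever HTTP client you prefer.

```python
def build_substation_query(
    south: float, west: float, north: float, east: float, timeout_s: int = 60
) -> str:
    """Return an Overpass QL query for substations inside a bounding box.

    Overpass bounding boxes are ordered (south, west, north, east).
    """
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout_s}];\n"
        "(\n"
        f'  node["power"="substation"]({bbox});\n'
        f'  way["power"="substation"]({bbox});\n'
        f'  relation["power"="substation"]({bbox});\n'
        ");\n"
        # "out center" adds a single centroid-like point for ways and relations.
        "out center;"
    )


# Illustrative bounding box roughly covering the San Francisco Bay Area.
demo_query = build_substation_query(37.0, -123.0, 38.5, -121.0)
```

Substations in OSM are mapped as nodes, ways, or relations depending on the contributor, which is why the query unions all three element types.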
3. Grid Operations & Market Data
EIA-930 — Hourly Electric Grid Monitor

| Field | Value |
|---|---|
| Description | Hourly demand, generation by fuel type, and interchange data for all US balancing authorities. Provides near-real-time visibility into grid operations. |
| URL | https://api.eia.gov/v2/electricity/rto/ (multiple sub-routes for demand, fuel-type, interchange) |
| Format | JSON API, CSV download, interactive dashboard |
| Coverage | All US balancing authorities in the Lower 48 |
| Update Frequency | Hourly (with ~1-2 hour lag) |
| License | Public domain (US government) |
| Notes | Very useful for near-real-time grid analysis. Sub-BA data also available since 2024. API returns max 5,000 rows per request. Data available back to mid-2015. Already integrated into PUDL. |

EIA Open Data API v2 — Electricity Routes

| Field | Value |
|---|---|
| Description | RESTful API providing access to all EIA electricity data series: retail sales, prices, generation, capacity, fuel consumption, state profiles, and more. Hundreds of data series with flexible querying. |
| Update Frequency | Varies by series (monthly, annual, hourly for RTO data) |
| License | Public domain (US government) |
| Notes | Free API key required (instant registration). Max 5,000 rows per request with pagination. Key routes: `/retail-sales`, `/electric-power-operational-data`, `/rto/`, `/state-electricity-profiles`. This is the programmatic gateway to essentially all EIA electricity data. |

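
Because each request returns at most 5,000 rows, pulling a long hourly series means paging with an offset. The sketch below separates the paging loop from the HTTP call so it can be exercised without a network connection. The query parameter names (`frequency`, `data[0]`, `facets[respondent][]`, `sort[0][column]`, `offset`, `length`) follow the v2 API's URL conventions as best I recall them and should be checked against the EIA documentation; `fetch_page` is a hypothetical callable you supply.

```python
from typing import Callable, List, Tuple

EIA_MAX_ROWS = 5000  # per-request row cap stated above

Page = Tuple[List[dict], int]  # (rows in this page, total rows available)


def build_params(
    api_key: str, respondent: str, start: str, end: str,
    offset: int, length: int = EIA_MAX_ROWS,
) -> List[Tuple[str, str]]:
    """Query parameters for one page of hourly data for one balancing authority."""
    return [
        ("api_key", api_key),
        ("frequency", "hourly"),
        ("data[0]", "value"),
        ("facets[respondent][]", respondent),
        ("start", start),
        ("end", end),
        ("sort[0][column]", "period"),
        ("sort[0][direction]", "asc"),
        ("offset", str(offset)),
        ("length", str(length)),
    ]


def fetch_all(
    fetch_page: Callable[[int, int], Page], page_size: int = EIA_MAX_ROWS
) -> List[dict]:
    """Call fetch_page(offset, length) until every available row is collected."""
    rows: List[dict] = []
    offset = 0
    while True:
        page, total = fetch_page(offset, page_size)
        rows.extend(page)
        offset += len(page)
        # Stop on an empty page as well, in case the reported total is wrong.
        if not page or offset >= total:
            return rows
```

In practice `fetch_page` would issue a GET against the relevant route with `build_params(...)` and return the row list together with the total count reported in the response. Sorting by `period` keeps page boundaries stable between requests.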
Individual ISO Data Portals
Each ISO/RTO publishes its own market and operational data. These are the authoritative primary sources for real-time and historical grid data:
Hourly emissions data (SO₂, NOₓ, CO₂) and operational data (heat input, gross load, steam load) for fossil-fuel-fired units. The most granular temporal emissions data available for US power plants.
Massive dataset (billions of hourly records since 1995). Essential for granular emissions tracking. Already integrated into PUDL. Use the bulk download files for historical analysis — the API is better for targeted queries.
WattTime API

| Field | Value |
|---|---|
| Description | Real-time and historical marginal emission rates for electricity grids. Provides the marginal operating emissions rate (MOER): the emissions impact of consuming one additional unit of electricity at a specific time and location. |
| License | Free tier available (current data only); paid plans for historical data and forecasts |
| Notes | Industry standard for real-time marginal emissions. Free tier provides current signal + basic grid region data. The paid Pro tier is needed for historical data, forecasts, and health damage data. |

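
To make the MOER concept concrete, the sketch below estimates the emissions effect of shifting a flexible load between hours. It assumes MOER values are expressed in lbs CO₂ per MWh; the hourly values are invented for illustration and are not WattTime data, and the function name is hypothetical.

```python
from typing import Sequence


def marginal_emissions_lbs(
    moer_lbs_per_mwh: Sequence[float], load_mwh: Sequence[float]
) -> float:
    """Sum of hourly MOER times hourly energy: the marginal CO2 attributable to the load."""
    if len(moer_lbs_per_mwh) != len(load_mwh):
        raise ValueError("MOER and load series must cover the same hours")
    return sum(m * e for m, e in zip(moer_lbs_per_mwh, load_mwh))


# Illustrative three-hour window (made-up MOER values).
demo_moer = [900.0, 400.0, 1100.0]
baseline_load = [0.0, 0.0, 2.0]  # run 2 MWh in the dirtiest hour
shifted_load = [0.0, 2.0, 0.0]   # same energy moved to the cleanest hour

avoided_lbs = (
    marginal_emissions_lbs(demo_moer, baseline_load)
    - marginal_emissions_lbs(demo_moer, shifted_load)
)
```

With these sample numbers, moving 2 MWh from the third hour to the second avoids 1,400 lbs of CO₂ at the margin.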
Electricity Maps (formerly electricityMap)

| Field | Value |
|---|---|
| Description | Real-time carbon intensity of electricity consumption/production for grid zones worldwide. Shows electricity flow between zones, generation mix, and carbon intensity. |
| License | Free tier (limited); commercial API for production use. Contribution pipeline is open source (MIT). |
| Notes | Their open-source repo contains the parsers and zone definitions. The API itself has rate limits on the free tier. Excellent for international carbon intensity data. Methodology is transparent. |

Singularity Energy — Carbon Analytics

| Field | Value |
|---|---|
| Description | Grid carbon analytics including marginal and average emission rates, power flow tracking, and clean energy matching. |
| License | Commercial API (some free/academic access may be available) |
| Notes | Competes with WattTime. Provides both marginal and average emissions. Check for academic/research pricing. |

5. Rates, Tariffs & Retail Data
NREL Utility Rate Database (URDB)

| Field | Value |
|---|---|
| Description | Comprehensive database of US utility rate structures. Contains detailed tariff information including energy charges, demand charges, time-of-use periods, tiered rates, and more. Used by NREL's System Advisor Model (SAM) and other analysis tools. |
| Update Frequency | Ongoing community updates (not comprehensive; some rates are stale) |
| License | CC-BY (Creative Commons Attribution) |
| Notes | The most comprehensive open rate database available, but coverage is incomplete and some entries are outdated. Requires NREL API key. Rate structures are complex (JSON objects with nested tier/TOU structures). |

NREL Utility Rates API (v3)

| Field | Value |
|---|---|
| Description | Returns annual average utility rates ($/kWh) for residential, commercial, and industrial sectors plus local utility name for a given lat/lon. Simple flat rate lookup. |
| Update Frequency | ⚠️ Data is from 2012 and there are currently no plans to update |
| License | Public (NREL API terms) |
| Notes | Simple API for quick rate lookups by location, but extremely stale (2012 data). Use EIA-861 retail price data for current rates, or URDB for detailed rate structures. |

EIA Electric Sales, Revenue, and Average Price

| Field | Value |
|---|---|
| Description | Retail electricity sales (MWh), revenue ($), customer counts, and average price (¢/kWh) by state, utility, and sector. Monthly and annual data. |

Useful for supplementary rate information. Quality varies by contribution.
6. Renewable Energy & Distributed Energy Resources
NREL National Solar Radiation Database (NSRDB)

| Field | Value |
|---|---|
| Description | Hourly and half-hourly solar irradiance data (GHI, DNI, DHI), meteorological data, and derived solar resource data at 4 km × 4 km grid resolution. Covers 1998–present. |
| Update Frequency | Annual (new years added; historical data may be reprocessed) |
| License | Public domain (US government-funded) |
| Notes | Essential for solar resource assessment and PV modeling. Pairs with NREL's System Advisor Model (SAM). Free API key required. |

NREL Wind Integration National Dataset (WIND Toolkit)

| Field | Value |
|---|---|
| Description | Modeled wind resource data at 5-minute resolution for 126,000+ sites across the US. Includes wind speed, direction, temperature, and pressure at multiple hub heights. |
| Update Frequency | Periodic (historical dataset, 2007–2014 base period) |
| License | Public domain |
| Notes | High-resolution wind resource data for wind energy analysis. Also available on AWS Open Data. |

NREL Annual Technology Baseline (ATB)

| Field | Value |
|---|---|
| Description | Technology cost and performance projections for electricity generation technologies (solar, wind, battery, natural gas, nuclear, etc.) used in energy system modeling. Includes LCOE projections through 2050. |

Standard reference for energy technology cost assumptions. Integrated into PUDL. Essential for capacity expansion modeling and LCOE comparisons.
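
For readers unfamiliar with LCOE comparisons, the following is a simplified sketch of how cost and performance inputs of the kind ATB publishes combine into a levelized cost. It uses a plain capital recovery factor; ATB's own method includes additional financing factors that are omitted here, and all names and sample inputs are illustrative assumptions.

```python
HOURS_PER_YEAR = 8760


def capital_recovery_factor(rate: float, years: int) -> float:
    """Annualize an upfront cost over `years` at discount rate `rate`."""
    if rate == 0:
        return 1.0 / years
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


def simple_lcoe_per_mwh(
    capex_per_kw: float,          # overnight capital cost, $/kW
    fixed_om_per_kw_yr: float,    # fixed O&M, $/kW-yr
    capacity_factor: float,       # fraction between 0 and 1
    rate: float,
    years: int,
    variable_om_per_mwh: float = 0.0,
    fuel_per_mwh: float = 0.0,
) -> float:
    """Simplified LCOE in $/MWh: annualized fixed costs per MWh plus variable costs."""
    if not 0.0 < capacity_factor <= 1.0:
        raise ValueError("capacity factor must be in (0, 1]")
    annual_fixed = capital_recovery_factor(rate, years) * capex_per_kw + fixed_om_per_kw_yr
    annual_mwh_per_kw = capacity_factor * HOURS_PER_YEAR / 1_000.0
    return annual_fixed / annual_mwh_per_kw + variable_om_per_mwh + fuel_per_mwh


# Illustrative inputs only: $1,000/kW, $20/kW-yr, 50% capacity factor, 7% over 30 years.
demo_lcoe = simple_lcoe_per_mwh(1000.0, 20.0, 0.5, 0.07, 30)
```

With these sample inputs the result is roughly $23/MWh. The capacity factor dominates the outcome, which is why the same capital cost yields very different LCOEs across resource classes.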
EIA-861 Net Metering Data

| Field | Value |
|---|---|
| Description | Net metering program statistics by utility, state, and sector. Includes number of customers, installed capacity, and energy sold back to grid. A proxy for distributed solar adoption. |
| Source | US Energy Information Administration (within EIA-861) |

Companion to "Tracking the Sun" but for wind. Includes capacity factors, PPA prices, and technology trends.
7. EV Charging Infrastructure
DOE AFDC Alternative Fuel Station Locator

| Field | Value |
|---|---|
| Description | Comprehensive database of alternative fuel stations in the US and Canada, including EV charging stations. Includes station location, network, connector types, power level (L1/L2/DCFC), access type, and hours. |
| Source | US Department of Energy / NREL Alternative Fuels Data Center |

The authoritative source for EV charging station locations in the US. ~70,000+ EV charging locations. API key required (free from NREL). Filter by fuel type, connector type, network, state, etc.
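
The filters mentioned above map onto query parameters of the station locator API. The sketch below builds a request URL for EV stations in one state. The endpoint and parameter names (`fuel_type`, `state`, `ev_connector_type`, `ev_network`, `limit`) reflect my understanding of the NREL developer API and should be verified against its documentation; the function name is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

AFDC_STATIONS_URL = "https://developer.nrel.gov/api/alt-fuel-stations/v1.json"


def build_ev_station_url(
    api_key: str,
    state: str,
    connector_type: Optional[str] = None,
    network: Optional[str] = None,
    limit: int = 200,
) -> str:
    """Build a station-locator URL restricted to electric charging stations."""
    params = {
        "api_key": api_key,
        "fuel_type": "ELEC",  # electric charging only
        "state": state,
        "limit": limit,
    }
    # Optional filters are added only when requested.
    if connector_type:
        params["ev_connector_type"] = connector_type
    if network:
        params["ev_network"] = network
    return f"{AFDC_STATIONS_URL}?{urlencode(params)}"


# Placeholder key; a real key comes from NREL's free registration.
demo_url = build_ev_station_url("YOUR_API_KEY", "CA", connector_type="J1772COMBO")
```

Building the URL separately from the request makes it easy to log or cache queries before adding an HTTP client.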
Open Charge Map

| Field | Value |
|---|---|
| Description | Community-driven global registry of EV charging locations. |

Best open global source for EV charging data. Quality varies by region. US data may lag AFDC.
8. Energy Storage
DOE Global Energy Storage Database (GESDB)

| Field | Value |
|---|---|
| Description | Database of energy storage projects worldwide. Includes technology type, rated power (kW), energy capacity (kWh), status, application, location, and commissioning date. |

⚠️ The database may not have a straightforward bulk data export. Web scraping or manual download may be needed. Verify current data access options — the site has historically had limited programmatic access.
EIA-860 Energy Storage Data

| Field | Value |
|---|---|
| Description | Generator-level data for energy storage installations (batteries, pumped hydro, flywheels, compressed air) as part of the annual EIA-860 survey. Includes rated power, energy capacity, technology type, and status. |

Most authoritative source for utility-scale energy storage in the US. Track this alongside EIA-860M for near-real-time additions.
9. Interconnection Queues
LBNL Queues — Electricity Markets & Policy

| Field | Value |
|---|---|
| Description | Compiled and cleaned interconnection queue data from all 7 US ISOs/RTOs plus some non-ISO utilities. Tracks proposed generation and storage projects waiting to connect to the grid. Includes capacity, fuel type, status, request date, and estimated in-service date. |

The definitive cleaned dataset for interconnection queue analysis. Raw queue data from individual ISOs is messy and inconsistent; LBNL standardizes it. Essential for understanding the pipeline of future generation additions. As of 2024, more than 2,600 GW of capacity was waiting in US interconnection queues (mostly solar, wind, and storage).
Utility-reported reliability metrics including SAIDI (System Average Interruption Duration Index), SAIFI (System Average Interruption Frequency Index), and CAIDI. Collected via EIA-861 Schedule of Reliability.
Primary source for standardized US utility reliability metrics. Data quality varies — some utilities report with and without major event days separately.
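
The three indices are related by simple ratios, following the standard IEEE 1366 definitions: SAIDI is customer-minutes interrupted per customer served, SAIFI is customer interruptions per customer served, and CAIDI is SAIDI divided by SAIFI. The sketch below computes them from a list of interruption events; the data structure and sample events are illustrative, not an EIA-861 schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Interruption:
    customers_interrupted: int
    duration_minutes: float


def reliability_indices(
    events: Iterable[Interruption], customers_served: int
) -> Dict[str, float]:
    """Compute SAIDI (minutes), SAIFI (interruptions), and CAIDI (minutes)."""
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    events = list(events)
    customer_minutes = sum(e.customers_interrupted * e.duration_minutes for e in events)
    customer_interruptions = sum(e.customers_interrupted for e in events)
    saidi = customer_minutes / customers_served
    saifi = customer_interruptions / customers_served
    # CAIDI is the average restoration time per interrupted customer.
    caidi = saidi / saifi if saifi else 0.0
    return {"SAIDI": saidi, "SAIFI": saifi, "CAIDI": caidi}


# Illustrative year for a utility serving 10,000 customers.
demo_indices = reliability_indices(
    [Interruption(2000, 90.0), Interruption(500, 240.0)],
    customers_served=10_000,
)
```

For the sample events this gives a SAIDI of 30 minutes, a SAIFI of 0.25, and a CAIDI of 120 minutes. Running the same calculation with and without major event days reproduces the two reporting variants mentioned above.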
DOE OE-417 Electric Emergency and Disturbance Reports

| Field | Value |
|---|---|
| Description | Reports of major electric disturbances and unusual occurrences reported to DOE. Includes cause, affected area, customers affected, demand loss, and duration. |

Great for real-time outage monitoring. Historical data available via paid API. Independently operated — not an official government source.
11. Demand Response & Energy Efficiency
EIA-861 Demand Response Data

| Field | Value |
|---|---|
| Description | Demand response program enrollment, energy savings, peak savings (potential and actual), and program costs by utility, state, sector, and balancing authority. |

Best national-level data on demand response programs. Collected since 2013 in current format.
EIA-861 Energy Efficiency Data

| Field | Value |
|---|---|
| Description | Energy efficiency program data including incremental energy savings, peak demand savings, weighted average life cycle, and costs for utility-administered programs. |

Detailed financial and operational data for major US electric utilities (Class A and B). Includes income statements, balance sheets, rate base, depreciation, O&M expenses, sales for resale, purchased power, generating plant statistics, and transmission/distribution data.
Extremely detailed financial data but notoriously difficult to work with in raw format (Visual FoxPro DBF files). Strongly recommend using PUDL's processed version. Essential for utility financial analysis, rate cases, and cost-of-service studies.
FERC Form 714 — Annual Electric Balancing Authority Area and Planning Area Report

| Field | Value |
|---|---|
| Description | Hourly load data, planning area descriptions, and peak demand data for balancing authorities and planning areas. Includes hourly system load profiles for transmission planning. |

Valuable for hourly load shape analysis at the BA/planning area level. Historical data back to 2006+.
FERC Form 2 — Natural Gas Pipeline Annual Report

| Field | Value |
|---|---|
| Description | Financial and operational data for major interstate natural gas pipelines. Relevant to grid analysis because gas pipeline capacity affects gas-fired generation availability. |

Tangential to grid data but relevant for gas-electric coordination analysis. PUDL converts raw VFP to SQLite.
13. International Grid Data
ENTSO-E Transparency Platform (Europe)

| Field | Value |
|---|---|
| Description | Comprehensive European electricity market and grid data. Includes generation by type, cross-border flows, day-ahead prices, installed capacity, load, outages, and transmission constraints for all EU/EEA countries. |
| Source | European Network of Transmission System Operators for Electricity |
| License | Free registration required; data is public under ENTSO-E terms |
| Notes | The European equivalent of combining all US ISO data portals into one. Free API token after registration. Massive amount of data. Python libraries like `entsoe-py` simplify access. |

Ember Global Electricity Data

| Field | Value |
|---|---|
| Description | Monthly and annual electricity generation, capacity, and emissions data for 200+ countries. Provides consistent methodology for global electricity sector analysis. |

Best source for global renewable capacity by country. Complementary to Ember's generation data.
14. Meta-Sources & Aggregators
Catalyst Cooperative PUDL (Public Utility Data Liberation)

| Field | Value |
|---|---|
| Description | Open-source ETL pipeline that cleans, integrates, and standardizes US energy data from multiple federal sources into analysis-ready SQLite and Parquet databases. Currently processes EIA-860, EIA-861, EIA-923, EIA-930, EIA-176, EPA CEMS, FERC Forms 1/2/6/60/714, and NREL ATB. |

Extremely valuable. Instead of manually downloading and cleaning EIA/FERC data, PUDL provides it pre-cleaned with consistent entity IDs, foreign key relationships, and unit conversions. Hundreds of tables. Available on Kaggle and AWS Open Data for easy download. Also archives raw inputs on Zenodo for reproducibility.
Open Energy Data Initiative (OEDI)

| Field | Value |
|---|---|
| Description | DOE's centralized platform for publishing open energy data. Hosts datasets from national labs and DOE programs on topics including solar, wind, buildings, transportation, and grid. |

Good for discovery, but data quality and freshness vary widely. Often better to go to the source agency directly.
Gap Analysis
Data We Need But Don't Have Good Open Sources For
🔴 High Priority Gaps

| Gap | Why It Matters | Current State | Potential Approaches |
|---|---|---|---|
| Distribution-level infrastructure (feeder lines, distribution substations, transformers, pole locations) | Essential for understanding last-mile grid capacity, DER hosting capacity, and outage analysis at the local level | Almost no open data exists. Distribution infrastructure data is held by individual utilities and rarely published. Some utilities publish hosting capacity maps but not raw infrastructure data. | Monitor utility hosting capacity map portals (many CA utilities publish these per Rule 21); scrape where possible. HIFLD's substation data (now secured) included some distribution-level facilities. OSM has some distribution-level data but coverage is very sparse. |
| Real-time or near-real-time outage data (beyond PowerOutage.us) | Critical for reliability analysis, storm response, and customer impact assessment | PowerOutage.us scrapes utility outage maps but is a commercial product. Individual utility outage maps exist but there's no standardized open dataset. DOE OE-417 captures major events but with significant delay. | Build scrapers for individual utility outage map APIs (many utilities use Kubra or OMS platforms with public-facing APIs). Consider partnering with PowerOutage.us or building a similar aggregation layer. |
|  | Needed for bill calculation, DER economics, and rate comparison tools | NREL URDB exists but is incomplete and partially outdated. No single source has all current US utility tariffs in machine-readable format. | Build utility tariff scrapers (tariffs are public documents, usually PDFs on utility websites). Consider OpenEI contributions. Some startups (Genability/Arcadia) have comprehensive databases but they're commercial. |
| Grid interconnection queue data (standardized, real-time) | Understanding the future generation pipeline and grid congestion | LBNL compiles queue data but with lag. Individual ISO queue portals are messy and inconsistent. No open real-time standardized feed exists. | Scrape directly from individual ISO queue portals (each ISO publishes queue data publicly). Supplement with LBNL annual compilation. Consider building a standardized aggregation layer. |
|  |  | ✅ Partially addressed: the `/pricing-nodes` dataset provides 4,065 nodes across all 7 ISOs, with trading hubs, load zones, SUBLAPs, and generation nodes cross-referenced with EIA-860 coordinates. Full pnode lists (e.g., CAISO's 21k+ nodes) available via OASIS but most lack individual coordinates. | Expand by scraping more ISO-specific node-to-geography mappings; PJM and MISO may publish GIS files for nodes. Match remaining CAISO pnodes to substations via HIFLD or OSM data. |
| Grid congestion / curtailment data | Understanding where and when the grid is constrained affects renewable integration and investment decisions | Some ISOs publish congestion reports. CAISO publishes curtailment data. Not standardized. |  |
|  | Significant and growing share of capacity is invisible to grid operators | LBNL Tracking the Sun covers distributed solar in ~30 states. No national comprehensive DER registry exists. EIA-861 net metering data is aggregate only. | Combine LBNL Tracking the Sun + state interconnection data (CA NEM, NY, etc.) + Census ACS data for estimation. Some states have DER registries. |
| Weather data correlated with grid events | Weather is the primary driver of both load and renewable generation | NOAA has excellent weather data but it's not pre-joined with grid data. |  |
|  | Regulatory decisions affect rates, grid investment, and market structure | Every state PUC has its own filing system with different formats and access methods. No national aggregated source. | Would require state-by-state scraping. Some states have good electronic filing systems (CA CPUC, NY PSC, TX PUCT). Enormous effort to standardize. |
| Grid-level load profiles (sub-BA, feeder-level) | Enables granular demand forecasting and DER planning | Very limited. Some utilities publish aggregate load profiles. ISO load data is at the zone level. | Monitor utility data portals. Some utilities participate in DOE's Grid Modernization initiative and publish data. |
| Electricity trade (international) | Important for border regions and understanding continental energy flows | EIA publishes some US-Canada/Mexico trade data. ENTSO-E covers European cross-border flows well. | EIA international data + FERC Form 714 (which includes interchange data). |
| Microgrid installations | Growing segment of grid infrastructure | No comprehensive open database. DOE tracks some projects. | DOE microgrid program reports; industry surveys. |
| Hydrogen electrolyzer locations (emerging) | Growing intersection with grid (large loads) | Very early stage. No comprehensive database. | Monitor DOE hydrogen hub announcements and EIA generator data (electrolyzers starting to appear in EIA-860). |
| Data center locations and power demand | Fastest-growing load segment in many regions | No open comprehensive database of data centers with power consumption. Some reporting via EIA-861M. | Monitor industry reports (Synergy Research, etc.); some data centers appear in interconnection queues. |

Key Observations

- **Federal data is excellent.** EIA, EPA, FERC, and DOE provide world-class open energy data for the US. The main challenge is cleaning and integrating it (which PUDL addresses).
- **The biggest gap is distribution-level data.** Almost everything below the transmission level is proprietary to individual utilities. This is the #1 structural gap in open grid data.
- **Real-time data is getting better.** EIA-930 (hourly grid monitor), individual ISO data portals, and emissions APIs (WattTime, Electricity Maps) provide increasingly good real-time coverage, but using them requires API access and stitching together multiple sources.
- **International data lags US data.** Europe (via ENTSO-E) is the main exception. Most other regions have limited open grid data.
- **HIFLD securing substation data is a setback.** The 2023 decision to restrict substation location data means OpenStreetMap is now the best open source for this critical infrastructure layer.
- **Machine-readable tariffs remain the holy grail.** Despite multiple attempts (URDB, OpenEI), nobody has cracked comprehensive, current, machine-readable tariff data for all US utilities. This is a massive opportunity.

Open-source tools that are useful for working with grid data. These are independent projects — not data sources, but they can help you access and work with the primary sources listed above.
Python library providing standardized access to ISO data portals (CAISO, ERCOT, ISO-NE, MISO, NYISO, PJM, SPP). Returns pandas DataFrames. Note: the team behind it also offers a commercial hosted API at gridstatus.io.
Catalyst Cooperative's Public Utility Data Liberation project. Cleans and integrates FERC, EIA, and EPA data into analysis-ready SQLite/Parquet datasets.