core_epa__assn_eia_epacamd_subplant_ids
Association table providing connections between EPA units and EIA units/generators, at the subplant level.
- Most-recent data:
2024
- Processing:
Data has been cleaned and organized into well-modeled tables that serve as building blocks for downstream wide tables and analyses.
- Source:
EPA -- Mix of multiple EPA sources
- Primary key:
This table has no primary key. The primary key would have been plant_id_eia, generator_id, subplant_id, and emissions_unit_id_epa, but some records have a null generator_id: about 2 percent of all EPA CAMD records are not successfully mapped to EIA generators.
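To see why the four columns fall short of a primary key, it can help to check a sample of rows for null key values and duplicated key tuples. The following is a minimal sketch on invented toy records (the values and the helper name `key_problems` are hypothetical, not taken from the table or from PUDL):

```python
from collections import Counter

# The would-be primary key columns named above.
KEY_COLUMNS = ("plant_id_eia", "generator_id", "subplant_id", "emissions_unit_id_epa")

# Toy records, invented for illustration only.
records = [
    {"plant_id_eia": 3, "generator_id": "1", "subplant_id": 0, "emissions_unit_id_epa": "1"},
    {"plant_id_eia": 3, "generator_id": "2", "subplant_id": 1, "emissions_unit_id_epa": "2"},
    # An EPA unit that was not mapped to any EIA generator.
    {"plant_id_eia": 10, "generator_id": None, "subplant_id": 0, "emissions_unit_id_epa": "CT1"},
]


def key_problems(rows, key_columns=KEY_COLUMNS):
    """Return (rows with a null key column, key tuples that occur more than once)."""
    null_rows = [r for r in rows if any(r[c] is None for c in key_columns)]
    counts = Counter(tuple(r[c] for c in key_columns) for r in rows)
    duplicates = [key for key, n in counts.items() if n > 1]
    return null_rows, duplicates


null_rows, duplicates = key_problems(records)
# Share of rows that could not serve as key rows because of a null.
unmapped_share = len(null_rows) / len(records)
```

In this toy sample the key tuples are unique, but the row with a null generator_id cannot take part in a primary key, which is the situation described above.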
Additional Details
This table is an augmented version of the core_epa__assn_eia_epacamd crosswalk table, which originally comes from the EPA's GitHub repo camd-eia-crosswalk: https://github.com/USEPA/camd-eia-crosswalk.
This table identifies subplants within plant_ids, which are the smallest coherent units for aggregation. A plant_id refers to a legal entity that often contains multiple distinct power plants, even of different technology or fuel types.
EPA CEMS data combines information from several parts of a power plant:
- emissions from smokestacks
- fuel use from combustors
- electricity production from generators
But smokestacks, combustors, and generators can be connected in complex, many-to-many relationships. This complexity makes attribution difficult, for example when allocating pollution to energy producers. Furthermore, heterogeneity within plant_ids makes aggregation to the parent entity difficult or inappropriate.
This table inherits from four sources: the EPA's crosswalk, the IDs in the EPA CAMD core_epacems__hourly_emissions table itself, the core_eia860__assn_boiler_generator table, and the core_eia860__scd_generators table. While the core_epa__assn_eia_epacamd table is the backbone of this table, the EPA CAMD IDs ensure complete coverage of EPA CAMD reporting units. Adding the EIA 860 tables likewise ensures complete coverage of the units reported there.
For more information about how the subplant_id is made, see the documentation for pudl.dagster.assets.core.glue.make_subplant_ids and pudl.dagster.assets.core.glue.update_subplant_ids.
By analyzing the relationships between combustors and generators, as provided in the core_epa__assn_eia_epacamd crosswalk, we can identify distinct power plants within a plant_id. These subplants are the smallest coherent units of aggregation.
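One common way to turn many-to-many links into distinct groups is to treat emissions units and generators as nodes of a graph and take its connected components. The sketch below illustrates that general idea with a small union-find. It is an assumption-laden illustration, not the implementation in make_subplant_ids; the function name `assign_subplants`, the toy IDs, and the choice to ignore plant_id_eia and unit_id_pudl are all hypothetical simplifications.

```python
def assign_subplants(edges):
    """Label connected components of (emissions_unit, generator) links.

    Returns a dict mapping tagged nodes such as ("unit", "S1") or
    ("gen", "G1") to an integer component label.
    """
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tag nodes so that a unit and a generator sharing an ID stay distinct.
    for unit, gen in edges:
        union(("unit", unit), ("gen", gen))

    # Number the components in order of first appearance.
    labels = {}
    result = {}
    for node in parent:
        root = find(node)
        labels.setdefault(root, len(labels))
        result[node] = labels[root]
    return result


# Toy links for a single plant, invented for illustration only.
toy_edges = [("S1", "G1"), ("S1", "G2"), ("S2", "G2"), ("S3", "G3")]
subplants = assign_subplants(toy_edges)
```

Here S1 and S2 share generator G2, so S1, S2, G1, and G2 end up in one group, while S3 and G3 form a second group that could be aggregated separately.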
Columns
- plant_id_eia: The unique six-digit facility identification number, also called an ORISPL, assigned by the Energy Information Administration.
- plant_id_epa: The ORISPL ID used by EPA to refer to the plant. Usually but not always the same as plant_id_eia.
- subplant_id: Sub-plant ID links EPA CEMS emissions units to EIA units.
- unit_id_pudl: Dynamically assigned PUDL unit id. WARNING: This ID is not guaranteed to be static long term as the input data and algorithm may evolve over time.
- emissions_unit_id_epa: Emissions (smokestack) unit monitored by EPA CEMS.
- generator_id: Generator ID is usually numeric, but sometimes includes letters. Make sure you treat it as a string!
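The warning about generator IDs can be illustrated with a small sketch. The CSV content and the helper `naive_int` below are invented for illustration; the point is only that coercing IDs to integers can merge distinct IDs or drop alphanumeric ones.

```python
import csv
import io

# Toy extract, invented for illustration only.
raw = io.StringIO(
    "plant_id_eia,generator_id\n"
    "3,1\n"
    "3,01\n"
    "3,GT1\n"
)

# The csv module yields every field as a string, which is what we want here.
rows = list(csv.DictReader(raw))
generator_ids = [row["generator_id"] for row in rows]


def naive_int(value):
    """Mimic a careless numeric cast: returns None when the ID is not numeric."""
    try:
        return int(value)
    except ValueError:
        return None


# Casting collapses "1" and "01" into the same value and loses "GT1" entirely.
as_ints = [naive_int(g) for g in generator_ids]
```

Kept as strings, the three IDs stay distinct. When loading with pandas, passing something like `dtype={"generator_id": str}` to `read_csv` is one way to avoid automatic numeric inference.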