core_eia__entity_generators

package: pudl

Entity table containing static information about generators compiled from across the EIA-860 and EIA-923.

Processing:

Data has been cleaned and organized into well-modeled tables that serve as building blocks for downstream wide tables and analyses.

Source:

EIA -- Mix of multiple EIA Forms

Primary key:

plant_id_eia, generator_id

Usage Warnings

  • Data has been drawn from several EIA sources which are not always consistent with each other, and PUDL chooses the most consistent or relevant value to facilitate cross-referencing even if that means some values will differ from the raw sources. See Harvesting for details, and see Entity Resolution Methodology for a fuller conceptual overview.

  • Contains information from multiple raw inputs.

Additional Details

This is one of two tables where canonical values for generators are set. It contains values which are expected to remain fixed, while core_eia860__scd_generators contains those which may vary from year to year. EIA reports many attributes in many different tables across EIA-860 and EIA-923. In order to compile tidy, well-normalized database tables, PUDL collects all instances of these values and chooses a canonical value. By default, PUDL chooses the most consistently reported value of a given attribute, as long as that value accounts for at least 70% of the reported instances. If an attribute was reported too inconsistently across the original EIA tables to meet that threshold, it will show up as a null value. See Entity Resolution Methodology for a conceptual overview of this process. All tables downstream of this one inherit the canonical values established here.
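
The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not PUDL's implementation: the function name choose_canonical, the threshold parameter, and the decision to ignore nulls before counting are all assumptions made for the example.

```python
from collections import Counter

def choose_canonical(values, threshold=0.7):
    """Return the most frequently reported value if it meets the threshold, else None.

    Hypothetical helper for illustration. None entries are treated as
    missing and ignored, which is an assumption rather than confirmed
    PUDL behavior.
    """
    reported = [v for v in values if v is not None]
    if not reported:
        return None
    value, count = Counter(reported).most_common(1)[0]
    return value if count / len(reported) >= threshold else None

# Eight of ten reports agree (80%), so the value is kept.
consistent = choose_canonical(["Y"] * 8 + ["N"] * 2)
# Only six of ten reports agree (60%), so the result is null.
inconsistent = choose_canonical(["Y"] * 6 + ["N"] * 4)
```

In the second case the attribute would appear as a null in this table, even though a majority of the raw records agreed.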

Columns
plant_id_eia

The unique six-digit facility identification number, also called an ORISPL, assigned by the Energy Information Administration.

generator_id

Generator ID is usually numeric, but sometimes includes letters. Make sure you treat it as a string!
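
A short sketch of why the string type matters. The IDs below are made up for illustration and are not taken from EIA data. The point is that a numeric cast either fails outright or can silently merge IDs that are distinct as strings.

```python
# Hypothetical generator IDs for a single plant.
generator_ids = ["1", "01", "GT1", "2A"]

def to_int_or_none(generator_id):
    """Try a numeric cast, returning None when the ID contains non-digits."""
    try:
        return int(generator_id)
    except ValueError:
        return None

as_numbers = [to_int_or_none(g) for g in generator_ids]
# "1" and "01" become the same integer, and "GT1" and "2A" cannot be cast.
distinct_as_strings = len(set(generator_ids))
distinct_as_numbers = len({n for n in as_numbers if n is not None})
```

Keeping the column as a string preserves all four IDs, while the numeric cast leaves only one usable value.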

duct_burners

Indicates whether the unit has duct-burners for supplementary firing of the turbine exhaust gas

generator_operating_date

Date the generator began commercial operation. If harvested values are inconsistent, we default to using the most recently reported date.

topping_bottoming_code

If the generator is associated with a combined heat and power system, indicates whether the generator is part of a topping cycle or a bottoming cycle

solid_fuel_gasification

Indicates whether the generator is part of a solid fuel gasification system

pulverized_coal_tech

Indicates whether the generator uses pulverized coal technology

fluidized_bed_tech

Indicates whether the generator uses fluidized bed technology

subcritical_tech

Indicates whether the generator uses subcritical technology

supercritical_tech

Indicates whether the generator uses supercritical technology

ultrasupercritical_tech

Indicates whether the generator uses ultra-supercritical technology

stoker_tech

Indicates whether the generator uses stoker technology

other_combustion_tech

Indicates whether the generator uses other combustion technologies

bypass_heat_recovery

Can this generator operate while bypassing the heat recovery steam generator?

rto_iso_lmp_node_id

The designation used to identify the price node in RTO/ISO Locational Marginal Price reports

rto_iso_location_wholesale_reporting_id

The designation used to report the specific location of the wholesale sales transactions to FERC for the Electric Quarterly Report

associated_combined_heat_power

Indicates whether the generator is associated with a combined heat and power system

original_planned_generator_operating_date

The date the generator was originally scheduled to be operational

can_switch_when_operating

Indicates whether a fuel switching generator can switch fuels while operating.

previously_canceled

Indicates whether the generator was previously reported as indefinitely postponed or canceled

_core_eia__forensics_entity_resolution_generators

package: pudl

Forensic table of the statistics determining how we choose a single consistent value during entity resolution for generators.

Processing:

Data has been cleaned but not tidied/normalized. Published only temporarily and may be removed without notice.

Source:

EIA -- Mix of multiple EIA Forms

Primary key:

This table has no primary key.

Usage Warnings

  • This table is meant for forensic purposes only. It contains all of the values that were used to choose the canonical (or golden-record) value. See Entity Resolution Methodology for a fuller conceptual overview.

  • Contains information from multiple raw inputs.

Additional Details

This is a forensic table containing the input values used to choose canonical values during entity resolution. It is not a cleaned-up table; it is meant for forensic purposes only. If you have a question about why a value is reported in an scd, entity, or out table, you can use this table to find all of the inputs that were used as ingredients to choose the canonical value. Filter by column_name and the entity ID columns (here plant_id_eia and generator_id) to find all of the possible input values.
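
A sketch of that lookup, with plain Python dictionaries standing in for table rows. The rows, the plant and generator IDs, and the helper inputs_for are invented for illustration; only the column names come from this table.

```python
# Made-up rows standing in for the forensic table; only the column names are real.
forensic_rows = [
    {"plant_id_eia": 1, "generator_id": "1", "column_name": "duct_burners", "record_value": "True"},
    {"plant_id_eia": 1, "generator_id": "1", "column_name": "duct_burners", "record_value": "False"},
    {"plant_id_eia": 1, "generator_id": "1", "column_name": "stoker_tech", "record_value": "False"},
    {"plant_id_eia": 1, "generator_id": "2A", "column_name": "duct_burners", "record_value": "True"},
]

def inputs_for(rows, plant_id_eia, generator_id, column_name):
    """Return every input row for one generator and one column (hypothetical helper)."""
    return [
        r for r in rows
        if r["plant_id_eia"] == plant_id_eia
        and r["generator_id"] == generator_id  # compared as a string
        and r["column_name"] == column_name
    ]

duct_burner_inputs = inputs_for(forensic_rows, 1, "1", "duct_burners")
candidate_values = sorted(r["record_value"] for r in duct_burner_inputs)
```

The same filter applies to a dataframe or SQL query; the essential part is matching on both entity ID columns and on column_name.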

Columns
plant_id_eia

The unique six-digit facility identification number, also called an ORISPL, assigned by the Energy Information Administration.

generator_id

Generator ID is usually numeric, but sometimes includes letters. Make sure you treat it as a string!

report_date

Date reported.

valid_until_date

The record in the changelog is valid until this date. The record is valid from the report_date up until but not including the valid_until_date.
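
The validity window is a half-open interval, which can be checked as sketched below. The function name is_valid_on and the example dates are assumptions for illustration, and the sketch does not cover rows where valid_until_date is null.

```python
from datetime import date

def is_valid_on(report_date, valid_until_date, day):
    """True when report_date <= day < valid_until_date (half-open interval)."""
    return report_date <= day < valid_until_date

# Hypothetical record reported for 2020 and superseded at the start of 2021.
start, end = date(2020, 1, 1), date(2021, 1, 1)
first_day = is_valid_on(start, end, date(2020, 1, 1))   # report_date itself is included
last_day = is_valid_on(start, end, date(2020, 12, 31))  # the day before valid_until_date
boundary = is_valid_on(start, end, date(2021, 1, 1))    # valid_until_date is excluded
```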

column_name

The name of the column.

record_value

The original values found in PUDL _core table records that were used as ingredients to the entity resolution process.

entity_occurrences

The number of times this entity (this particular utility, plant, generator, etc.) occurs across the pre-entity resolution tables.

record_occurrences

The number of times this particular record_value occurs across the pre-entity resolution tables in association with this particular entity.

consistent_rate

The portion of the entity's records that were reported with this particular record_value. This is calculated by dividing record_occurrences by entity_occurrences.

is_candidate

Is this record a candidate for being the canonical value? This is based on consistent_rate. By default, PUDL requires a value to be at least 70 percent consistent to pass this consistency check. There are exceptions to the default 70 percent check for columns like plant or utility names, where we always want a value. In those cases we choose the most frequently occurring value regardless of how consistently it was reported.
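
The relationship between record_occurrences, entity_occurrences, consistent_rate and is_candidate can be sketched as follows. This is an illustration under stated assumptions, not PUDL's code: the function flag_candidates, the strict flag that models the "always want a value" exception, and the example names are all hypothetical.

```python
def flag_candidates(rows, threshold=0.7, strict=True):
    """Add consistent_rate and is_candidate to the rows for one entity and column.

    rows: list of dicts with record_value, record_occurrences, entity_occurrences.
    strict=False models the exception where the most frequent value is always
    a candidate (hypothetical flag; ties would flag more than one row).
    """
    top = max((r["record_occurrences"] for r in rows), default=0)
    flagged = []
    for r in rows:
        rate = r["record_occurrences"] / r["entity_occurrences"]
        if strict:
            candidate = rate >= threshold
        else:
            candidate = r["record_occurrences"] == top
        flagged.append({**r, "consistent_rate": rate, "is_candidate": candidate})
    return flagged

# Made-up name values for one entity, reported ten times in total.
name_rows = [
    {"record_value": "Big Creek", "record_occurrences": 6, "entity_occurrences": 10},
    {"record_value": "Big Creek Power", "record_occurrences": 4, "entity_occurrences": 10},
]
strict_flags = [r["is_candidate"] for r in flag_candidates(name_rows)]
relaxed_flags = [r["is_candidate"] for r in flag_candidates(name_rows, strict=False)]
```

Under the default check neither value reaches 70 percent, so no row is a candidate. Under the relaxed check the more frequent value is flagged even though it was reported only 60 percent of the time.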