Pull request overview
This PR updates the industry ETL to align with an end-of-period data convention (e.g., model period 2025 representing 2030 “data year”), while keeping the period labels unchanged for Temoa/database compatibility.
Changes:
- Add a `data_year()` helper and use it to annotate notes with end-of-period “data year” context.
- Update demand GDP-year filtering to include both period years and their end-of-period years for growth calculations.
- Adjust runtime configuration (version/period list), data_id prefixing, and add a `.gitignore` for generated artifacts.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `common.py` | Adds `data_year()` helper to convert model period/vintage to representative end-of-period data year. |
| `demands.py` | Incorporates end-of-period years into GDP filtering and uses `data_year()` when scaling demand values/notes. |
| `techinput.py` | Updates notes to include end-of-period “data year” via `data_year()`. |
| `efficiency.py` | Updates notes to include end-of-period “data year” via `data_year()`. |
| `setup.py` | Changes generated data_id prefix for industry high-resolution IDs. |
| `input/params.yaml` | Updates version and the configured periods list. |
| `.gitignore` | Ignores generated outputs and Python bytecode/cache files. |
Comments suppressed due to low confidence (1)
demands.py:142
- GDP scaling is computed as a ratio between consecutive years in `all_gdp_years` (e.g., 2030/2025), but the demand baseline you multiply by is explicitly 2022 (`base_2022`). This makes `val = base_2022 * scale` inconsistent (it should be scaled relative to the baseline year’s GDP, e.g., GDP(dy)/GDP(2022), or you should change the baseline year to match the GDP anchor). Consider building a year->GDP-level map from `pop_df` and computing `scale = gdp_level[dy] / gdp_level[baseline_year]` (and include the baseline year in the filtered GDP years).

    # ---- GDP scaling dict from CER CEF GNZ ----
    # End-of-period convention: include both period years and their end-of-period
    # data years so we can compute growth rates that span each period.
    end_years = [data_year(p, periods) for p in periods]
    all_gdp_years = sorted(set(periods) | set(end_years))
    gdp_df = pop_df.copy()
    gdp_df = gdp_df[gdp_df['Year'].isin(all_gdp_years)]
    gdp_df = gdp_df[gdp_df['Variable'] == 'Real Gross Domestic Product ($2012 Millions)']
    gdp_df = gdp_df[gdp_df['Scenario'] == 'Global Net-zero']
    gdp_df = gdp_df.sort_values('Year').reset_index(drop=True)
    # gdp_dict[y] = GDP(y) / GDP(previous year in filtered set), first entry = 1.0
    gdp_dict: dict[int, float] = {}
    for i, row in gdp_df.iterrows():
        year = int(row['Year'])
        val = float(row['Value'])
        if i == 0:
            gdp_dict[year] = 1.0
        else:
            prev_val = float(gdp_df.loc[i - 1, 'Value'])
            gdp_dict[year] = (val / prev_val) if prev_val != 0 else 1.0
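
One way to read the suggestion in this suppressed comment is sketched below: scale every data year against the baseline year's GDP level rather than against the previous year in the filtered set. This is a minimal sketch in plain Python, not the ETL's code. The function name `gdp_scale_factors` and the `gdp_level` dict are hypothetical, and the GDP numbers are made-up illustrative values, not CER data. In the ETL the dict would presumably be built from the filtered `pop_df` rows, with the baseline year included in the filter.

```python
def gdp_scale_factors(gdp_level: dict[int, float],
                      target_years: list[int],
                      baseline_year: int = 2022) -> dict[int, float]:
    """Return GDP(year) / GDP(baseline_year) for each target year."""
    if baseline_year not in gdp_level:
        raise KeyError(f"baseline year {baseline_year} missing from GDP data")
    base = gdp_level[baseline_year]
    if base == 0:
        raise ValueError("baseline GDP is zero; cannot scale")
    return {y: gdp_level[y] / base for y in target_years}


# Illustrative GDP levels only (hypothetical, not taken from the CER dataset).
gdp_level = {2022: 2000.0, 2025: 2100.0, 2030: 2300.0}
scales = gdp_scale_factors(gdp_level, [2025, 2030])
```

With factors defined this way, `base_2022 * scales[dy]` stays anchored to the same year as the demand baseline.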

    def data_year(period_or_vintage: int, model_periods: list[int]) -> int:
        """Return the representative data year for a model period/vintage.

        End-of-period convention: each model period uses data from the end of that
        period, i.e. period + 5 years (uniform 5-year step).

        Pre-existing vintages (before the first model period) use same-year data.
        """
        if period_or_vintage < model_periods[0]:
            return period_or_vintage
        else:
            return period_or_vintage + 5

`data_year` will raise an `IndexError` if `model_periods` is an empty list (e.g., if `periods: []` is provided in `params.yaml`). It would be better to validate upfront and raise a clear `ValueError` (or provide a default) so failures are easier to diagnose.
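
A minimal sketch of the upfront validation this comment asks for, keeping the helper's existing behaviour otherwise. The error message wording is an assumption, and the PR author may prefer a default over raising.

```python
def data_year(period_or_vintage: int, model_periods: list[int]) -> int:
    """Return the representative end-of-period data year (period + 5)."""
    # Fail early with a clear message instead of an IndexError on model_periods[0].
    if not model_periods:
        raise ValueError(
            "model_periods is empty; check the periods list in params.yaml"
        )
    # Pre-existing vintages (before the first model period) use same-year data.
    if period_or_vintage < model_periods[0]:
        return period_or_vintage
    return period_or_vintage + 5
```

Like the original, this sketch compares against `model_periods[0]`, so it still assumes the list is sorted in ascending order.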

    - 2045
    - 2050
    #Write the version of the NRCan Comprehensive database you wish to use
    NRCan_year: 2022 #Check the NRCan website and find the most reason year in the database, that is the value to use

Typo in the inline comment: “most reason year” should be “most recent year”.

    - NRCan_year: 2022 #Check the NRCan website and find the most reason year in the database, that is the value to use
    + NRCan_year: 2022 #Check the NRCan website and find the most recent year in the database, that is the value to use

While Temoa indexes periods by period start (e.g., 2025 for 2025-2030), we now plan for the END of each period and use end-of-period data throughout (e.g., 2030 data for the 2025-2030 period, which is still labelled 2025 in the database and in Temoa).
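
To make the convention concrete, the sketch below maps each period label to the data year it draws from and lists the years that must be present in any year-indexed input such as GDP. The `periods` list here is an assumed example (the full list in `params.yaml` is not shown in this review), and the variable names are hypothetical.

```python
PERIOD_STEP = 5  # uniform 5-year step, as stated in the data_year docstring

# Assumed period labels for illustration only.
periods = [2025, 2030, 2035, 2040, 2045, 2050]

# The label stays the period start; the data comes from the period end.
label_to_data_year = {p: p + PERIOD_STEP for p in periods}

# Years needed when both period-start and end-of-period values are used.
years_needed = sorted(set(periods) | set(label_to_data_year.values()))
```

Under this assumed list, the last period (labelled 2050) needs 2055 data, so the source datasets must extend one step beyond the final label.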