Groundwork for DB Version 3.1 and Unit Checking #124

jeff-ws · 2025-09-23T03:22:37Z

This PR sets some groundwork for version 3.1 of the DB and includes a unit checking module

Establishes v3.1 schema which adds 'units' to many tables
Provides a migration tool from v3.0 to v3.1 of the database
Updates many of the requirements to current versions, including adding pint as a requirement for future use with units
Adds a simple comparison tool to utilities that compares a DB to an arbitrary Temoa schema that is helpful in migration tasks

Also included is the basic Unit Checking package in temoa/temoa_model/unit_checking

Unit checks can be run on a DB by:

Use the utility to convert the DB to version 3.1
Select unit checking in the config file (or leave it blank, currently default is True)
Run in perfect foresight or check mode

Summary by CodeRabbit

New Features
- Added full and minimal v3.1 database schemas, a DB migration tool, schema-compare report, and an automated unit-screening workflow with reports and a config flag.
Bug Fixes
- Fixed season reference in curtailment outputs; renamed capacity-share column from max_proportion → min_proportion across example/test DBs.
Refactor
- Standardized output writes to use explicit column lists for v3.x compatibility.
Tests
- Added unit-checker tests.
Chores
- Added Pint and updated dependency pins; adjusted migration CLI default schema path.

- adding timing to post-processing

- queues are now swapping data vs. models - table_writer.py broken up into data pulling portion/writing portion

…s at 20 threads ea.

- commented some functions - removed some erroneous "sizeof" measures

- was hanging while putting in shutdown signals

- works with Utopia - logging cleaned up

Added additional QA steps during data transfer.

… and flex to avoid collisions if both write independently

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)

tests/testing_data/storageville.sql (2)
439-441: Fix wrong FK: season must reference TimeSeason (not TimePeriod).

As in v3/v3.1 schemas; this FK is incorrect here.
-    season      TEXT
-        REFERENCES TimePeriod (period),
+    season      TEXT
+        REFERENCES TimeSeason (season),
528-531: Use TimeSeason for season FK in OutputFlowOut and OutputCurtailment

Replace
season TEXT REFERENCES TimePeriod (period)
with
season TEXT REFERENCES TimeSeason (season)
for the OutputCurtailment and OutputFlowOut table definitions in these files:

tests/testing_data/simple_linked_tech.sql

tests/testing_data/utopia.sql

tests/testing_data/storageville.sql

tests/testing_data/mediumville.sql

tests/testing_data/emissions.sql

tests/testing_data/test_system.sql
data_files/example_dbs/utopia.sql (2)
799-801: Update OutputCurtailment season FK to TimeSeason.

v3 schema now references TimeSeason; example DB should match.
-    season      TEXT
-        REFERENCES TimePeriod (period),
+    season      TEXT
+        REFERENCES TimeSeason (season),
887-889: Update OutputFlowOut season FK to TimeSeason.

Align with schema.
-    season      TEXT
-        REFERENCES TimePeriod (period),
+    season      TEXT
+        REFERENCES TimeSeason (season),
temoa/temoa_model/hybrid_loader.py (1)
943-949: Fix wrong target param: load MaxNewCapacityShare into M.MaxNewCapacityShare (not M.MaxCapacityShare).

Same routing bug as above for the “Max” table.
-            load_element(M.MaxCapacityShare, raw, self.viable_rt, (0, 2))
+            load_element(M.MaxNewCapacityShare, raw, self.viable_rt, (0, 2))
data_files/example_dbs/stepped_demand.sql (1)

11-12: Bump DB_MINOR to 1 for DBs using 3.1 schema semantics

Several SQL assets declare DB_MAJOR=3 and DB_MINOR=0 but include 3.1 schema changes — min_proportion appears in MaxNewCapacityShare. Affected files:

tests/testing_data/utopia.sql

tests/testing_data/storageville.sql

tests/testing_data/emissions.sql

tests/testing_data/test_system.sql

tests/testing_data/mediumville.sql

tests/testing_data/simple_linked_tech.sql

data_files/example_dbs/utopia.sql

data_files/example_dbs/morris_utopia.sql

data_files/example_dbs/stepped_demand.sql

Action: either set DB_MINOR=1 in these files or revert the 3.1 schema change (ensure min_proportion is only on MinNewCapacityShare).
tests/testing_data/test_system.sql (2)
802-804: Fix FK: season should reference TimeSeason, not TimePeriod (OutputCurtailment).

Current FK prevents valid season values and can break inserts on FK‑enforced DBs.

Apply this diff:
-    season      TEXT
-        REFERENCES TimePeriod (period),
+    season      TEXT
+        REFERENCES TimeSeason (season),
891-892: Fix FK: season should reference TimeSeason, not TimePeriod (OutputFlowOut).

Aligns with OutputFlowIn and the canonical schema.
-    season      TEXT
-        REFERENCES TimePeriod (period),
+    season      TEXT
+        REFERENCES TimeSeason (season),
data_files/example_dbs/morris_utopia.sql (1)
11-12: DB version metadata likely inconsistent with schema changes

This example DB sets DB_MINOR=0 but includes the MinNewCapacityShare rename indicative of v3.1 changes. If this file is intended to represent v3.1, bump DB_MINOR to 1 to avoid migration/versioning confusion.

Apply:
-INSERT INTO MetaData VALUES('DB_MINOR',0,'DB minor version number');
+INSERT INTO MetaData VALUES('DB_MINOR',1,'DB minor version number');

🧹 Nitpick comments (12)

data_files/temoa_schema_v3.sql (1)

410-411: Consider adding a sector FK for consistency.

Other Output* tables constrain sector to SectorLabel(sector); OutputCurtailment’s sector is unconstrained here.

Apply:
   sector      TEXT,
+        REFERENCES SectorLabel (sector),

requirements.txt (2)

371-375: Avoid installing both psycopg[binary] and psycopg-binary.

Pick one form to prevent redundant wheels and resolver confusion (prefer psycopg[binary]).

Apply in requirements.in (then re‑compile) to leave only the chosen variant.

534-541: Move sqlalchemy[mypy] to dev dependencies.

Pulls mypy into runtime; reserve for a dev/extra lockfile.

Propose: drop sqlalchemy[mypy] from runtime requirements.in and add it to a dev requirements file; re‑compile.

temoa/utilities/db_migration_v3_to_v3_1.py (4)

191-198: Severity label mismatch: “WARNING” but exits.

Message says WARNING yet aborts; make it ERROR.

-            print(f'WARNING: Column mismatch in {table_name}')
+            print(f'ERROR: Column mismatch in {table_name}')

Apply similarly at Lines 229-236.

164-170: Exclude SQLite internal tables from coverage check.

Prevents false deltas if sqlite_sequence appears.

-table_query_result = con_new.execute('SELECT name FROM sqlite_master WHERE type="table"').fetchall()
+table_query_result = con_new.execute(
+    "SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'"
+).fetchall()

300-308: Tweak FK check output.

Print a neutral header; only shout on failures.

-    print('FK check fails (MUST BE FIXED):')
-    if not data:
-        print('\tNo Foreign Key Failures.  (Good news!)')
-    else:
-        print('\t(Table, Row ID, Reference Table, (fkid) )')
+    print('Foreign key check:')
+    if not data:
+        print('\tNo Foreign Key Failures.  (Good news!)')
+    else:
+        print('\tFAILURES (Table, Row ID, Reference Table, (fkid))')

128-146: Preflight existence checks for source DB and schema file.

Fail fast with clear errors.

 options = parser.parse_args()
 legacy_db: Path = Path(options.source_db)
 schema_file = Path(options.schema)
 
+if not legacy_db.exists():
+    print(f'ERROR: source database not found: {legacy_db}')
+    sys.exit(2)
+if not schema_file.exists():
+    print(f'ERROR: schema file not found: {schema_file}')
+    sys.exit(2)

temoa/temoa_model/hybrid_loader.py (1)

211-211: Remove unreachable statement.

This pass after a return is dead code.
-        pass

tests/testing_data/test_system.sql (1)

799-799: Add missing FK for sector (OutputCurtailment).

Other output tables constrain sector; this one should too for consistency and data integrity.
-    sector      TEXT,
+    sector      TEXT
+        REFERENCES SectorLabel (sector),

temoa/temoa_model/table_writer.py (3)

235-239: Batch insert objectives; avoid per‑row commit.

Reduces round‑trips and speeds up writes.

-        for obj_name, obj_value in obj_vals:
-            qry = 'INSERT INTO OutputObjective (scenario, objective_name, total_system_cost) VALUES (?, ?, ?)'
-            data = (scenario_name, obj_name, obj_value)
-            self.con.execute(qry, data)
-            self.con.commit()
+        rows = [(scenario_name, obj_name, obj_value) for obj_name, obj_value in obj_vals]
+        qry = 'INSERT INTO OutputObjective (scenario, objective_name, total_system_cost) VALUES (?, ?, ?)'
+        self.con.executemany(qry, rows)
+        self.con.commit()

258-260: Lint: S608 false positive; table is constant here.

Safe to suppress on these constant‑table INSERTs.

-        qry = f'INSERT INTO OutputEmission (scenario, region, sector, period, emis_comm, tech, vintage, emission) VALUES {_marks(8)}'
+        qry = f'INSERT INTO OutputEmission (scenario, region, sector, period, emis_comm, tech, vintage, emission) VALUES {_marks(8)}'  # noqa: S608

-        qry = f'INSERT INTO OutputFlowOutSummary (scenario, region, sector, period, input_comm, tech, vintage, output_comm, flow) VALUES {_marks(9)}'
+        qry = f'INSERT INTO OutputFlowOutSummary (scenario, region, sector, period, input_comm, tech, vintage, output_comm, flow) VALUES {_marks(9)}'  # noqa: S608

Also applies to: 404-405

334-356: Avoid dynamic SQL fields; pre-bake whitelisted queries to satisfy S608 and harden.

Current construction is safe (whitelisted), but a static mapping silences S608 and removes string assembly at runtime.

-        table_associations = {
-            FlowType.OUT: ('OutputFlowOut', 'flow'),
-            FlowType.IN: ('OutputFlowIn', 'flow'),
-            FlowType.CURTAIL: ('OutputCurtailment', 'curtailment'),
-        }
-        for flow_type, (table_name, value_field_name) in table_associations.items():
-            fields = ', '.join(
-                (
-                    'scenario',
-                    'region',
-                    'sector',
-                    'period',
-                    'season',
-                    'tod',
-                    'input_comm',
-                    'tech',
-                    'vintage',
-                    'output_comm',
-                    value_field_name,
-                )
-            )
-            qry = f'INSERT INTO {table_name} ({fields}) VALUES {_marks(11)}'
-            self.con.executemany(qry, flows_by_type[flow_type])
+        table_associations = {
+            FlowType.OUT: 'OutputFlowOut',
+            FlowType.IN: 'OutputFlowIn',
+            FlowType.CURTAIL: 'OutputCurtailment',
+        }
+        QUERIES = {
+            'OutputFlowOut': 'INSERT INTO OutputFlowOut (scenario, region, sector, period, season, tod, input_comm, tech, vintage, output_comm, flow) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)',
+            'OutputFlowIn': 'INSERT INTO OutputFlowIn (scenario, region, sector, period, season, tod, input_comm, tech, vintage, output_comm, flow) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)',
+            'OutputCurtailment': 'INSERT INTO OutputCurtailment (scenario, region, sector, period, season, tod, input_comm, tech, vintage, output_comm, curtailment) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)',
+        }
+        for flow_type, table_name in table_associations.items():
+            qry = QUERIES[table_name]
+            self.con.executemany(qry, flows_by_type[flow_type])

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad035bb and 08586e0.

📒 Files selected for processing (18)

data_files/example_dbs/morris_utopia.sql (1 hunks)
data_files/example_dbs/stepped_demand.sql (1 hunks)
data_files/example_dbs/utopia.sql (1 hunks)
data_files/temoa_schema_minimal_v3.sql (1 hunks)
data_files/temoa_schema_minimal_v3_1.sql (1 hunks)
data_files/temoa_schema_v3.sql (2 hunks)
data_files/temoa_schema_v3_1.sql (1 hunks)
requirements.txt (15 hunks)
temoa/temoa_model/hybrid_loader.py (1 hunks)
temoa/temoa_model/table_writer.py (11 hunks)
temoa/utilities/db_migration_v3_to_v3_1.py (1 hunks)
temoa/utilities/db_schema_compare.py (1 hunks)
tests/testing_data/emissions.sql (1 hunks)
tests/testing_data/mediumville.sql (1 hunks)
tests/testing_data/simple_linked_tech.sql (1 hunks)
tests/testing_data/storageville.sql (1 hunks)
tests/testing_data/test_system.sql (1 hunks)
tests/testing_data/utopia.sql (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

temoa/utilities/db_schema_compare.py
data_files/temoa_schema_v3_1.sql

🧰 Additional context used

🧬 Code graph analysis (1)

temoa/temoa_model/table_writer.py (1)

temoa/temoa_model/table_data_puller.py (2)

_marks (48-52)

FlowType (60-67)

🪛 Ruff (0.13.1)

temoa/temoa_model/table_writer.py

258-258: Possible SQL injection vector through string-based query construction

(S608)

355-355: Possible SQL injection vector through string-based query construction

(S608)

404-404: Possible SQL injection vector through string-based query construction

(S608)

temoa/utilities/db_migration_v3_to_v3_1.py

201-201: Possible SQL injection vector through string-based query construction

(S608)

214-214: Possible SQL injection vector through string-based query construction

(S608)

243-243: Possible SQL injection vector through string-based query construction

(S608)

253-253: Possible SQL injection vector through string-based query construction

(S608)

270-270: Consider [*old_cols, 'units'] instead of concatenation

Replace with [*old_cols, 'units']

(RUF005)

279-279: Possible SQL injection vector through string-based query construction

(S608)

291-291: Possible SQL injection vector through string-based query construction

(S608)

🔇 Additional comments (18)

data_files/temoa_schema_v3.sql (1)

414-414: Correct FK target for season in OutputCurtailment.

Switch to TimeSeason(season) is correct and consistent with the rest of the schema.

requirements.txt (2)

2-5: Lockfile now matches Python 3.12 (earlier mismatch resolved).

Header reflects 3.12; good.

349-353: pint addition acknowledged.

Pinned at 0.25; aligns with units groundwork.

temoa/utilities/db_migration_v3_to_v3_1.py (1)

239-245: Good fix: parameterized WHERE for excludes.

Eliminates SQL‑injection risk flagged earlier.

data_files/temoa_schema_minimal_v3_1.sql (2)

38-45: OutputObjective adds units and PK—good.

Matches v3.1 design.

358-362: Correct FK: season now references TimeSeason.

Fix aligns minimal schema with full schema.

tests/testing_data/storageville.sql (1)

849-849: Rename to min_proportion is correct.

Schema and loaders expect min_proportion now.

data_files/example_dbs/utopia.sql (1)

1211-1211: Rename to min_proportion is correct.

Matches schema updates.

data_files/temoa_schema_minimal_v3.sql (1)

349-349: Correct FK target for season in OutputCurtailment.

Now references TimeSeason(season); consistent with other tables.

data_files/example_dbs/stepped_demand.sql (1)

1233-1233: LGTM: MinNewCapacityShare now uses min_proportion.

Matches loader/query expectations and schema intent.

tests/testing_data/mediumville.sql (1)

931-931: LGTM: MinNewCapacityShare uses min_proportion.

tests/testing_data/simple_linked_tech.sql (1)

825-825: LGTM: MinNewCapacityShare uses min_proportion.

tests/testing_data/emissions.sql (1)

838-838: LGTM: MinNewCapacityShare uses min_proportion.

tests/testing_data/utopia.sql (1)

1211-1211: LGTM: MinNewCapacityShare uses min_proportion.

temoa/temoa_model/table_writer.py (2)

323-333: Good fix: merge FLEX into CURTAIL to avoid PK collisions.

Resolves prior UNIQUE violations on OutputCurtailment inserts.

274-276: LGTM: explicit column lists for INSERTs.

Prevents schema‑order coupling; compatible with v3.0/v3.1.

Also applies to: 283-285, 292-294, 531-533, 545-546, 564-565

tests/testing_data/test_system.sql (1)

1237-1237: Rename to min_proportion is correct — approved.

Verified: MinNewCapacityShare CREATE TABLEs use min_proportion (e.g. tests/testing_data/test_system.sql 1228–1245) and hybrid_loader.py selects min_proportion from main.MinNewCapacityShare (hybrid_loader.py:939); max_proportion occurrences remain only for Max* tables, no lingering max_proportion on MinNewCapacityShare.

data_files/example_dbs/morris_utopia.sql (1)

1014-1026: Rename to min_proportion looks correct — no remaining MinNewCapacityShare.max_proportion references

Search found no occurrences of MinNewCapacityShare.max_proportion. Confirmed min_proportion is used for MinNewCapacityShare in:

temoa/temoa_model/hybrid_loader.py ('SELECT ... min_proportion FROM main.MinNewCapacityShare')

temoa/temoa_model/temoa_rules.py (value(M.MinNewCapacityShare[...] ))

temoa/temoa_model/temoa_model.py (Param M.MinNewCapacityShare)

CREATE TABLEs in data_files/temoa_schema_v3(.sql), data_files/temoa_schema_v3_1.sql and data_files/example_dbs/*

max_proportion remains only on Max* tables (expected).

data_files/temoa_schema_v3.sql

temoa/temoa_model/hybrid_loader.py

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

temoa/temoa_model/hybrid_loader.py (1)

944-948: MaxNewCapacityShare: correct target; mirror myopic period gating for consistency.

The loader now targets M.MaxNewCapacityShare correctly. Recommend adding mi window filtering like other period-indexed params.

Apply:

-        if self.table_exists('MaxNewCapacityShare'):
-            raw = cur.execute(
-                'SELECT region, period, tech, group_name, max_proportion FROM main.MaxNewCapacityShare'
-            ).fetchall()
-            load_element(M.MaxNewCapacityShare, raw, self.viable_rt, (0, 2))
+        if self.table_exists('MaxNewCapacityShare'):
+            if mi:
+                raw = cur.execute(
+                    'SELECT region, period, tech, group_name, max_proportion '
+                    'FROM main.MaxNewCapacityShare '
+                    'WHERE period >= ? AND period <= ?',
+                    (mi.base_year, mi.last_demand_year),
+                ).fetchall()
+            else:
+                raw = cur.execute(
+                    'SELECT region, period, tech, group_name, max_proportion FROM main.MaxNewCapacityShare'
+                ).fetchall()
+            load_element(M.MaxNewCapacityShare, raw, self.viable_rt, (0, 2))

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08586e0 and e4a1e20.

📒 Files selected for processing (1)

temoa/temoa_model/hybrid_loader.py (1 hunks)

temoa/temoa_model/hybrid_loader.py

…hould have the read "gated" by myopic index if running in myopic mode

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (5)

temoa/temoa_model/hybrid_loader.py (3)

924-934: Reduce repeated myopic gating boilerplate with a small helper.

The mi/else SELECT pattern repeats across many blocks; extract a helper to keep SQL uniform and reduce copy/paste risk.

Example helper and usage:

+        def select_with_optional_period_bounds(sql_base: str, params: tuple[int, int] | None):
+            if params:
+                return cur.execute(sql_base + ' WHERE period >= ? AND period <= ?', params).fetchall()
+            return cur.execute(sql_base).fetchall()

Apply (illustrative for MinCapacityShare):

-            if mi:
-                raw = cur.execute(
-                    'SELECT region, period, tech, group_name, min_proportion FROM main.MinCapacityShare'
-                    ' WHERE period >= ? AND period <= ?',
-                    (mi.base_year, mi.last_demand_year),
-                ).fetchall()
-            else:
-                raw = cur.execute(
-                    'SELECT region, period, tech, group_name, min_proportion FROM main.MinCapacityShare'
-                ).fetchall()
+            raw = select_with_optional_period_bounds(
+                'SELECT region, period, tech, group_name, min_proportion FROM main.MinCapacityShare',
+                (mi.base_year, mi.last_demand_year) if mi else None,
+            )

1070-1080: Validate output commodity too for AnnualCapacityFactor (use r,p,t,oc validator).

Use viable_rpto to ensure oc keys are in the viable set; current viable_rt ignores oc.

Apply:

-            load_element(M.MinAnnualCapacityFactor, raw, self.viable_rt, (0, 2))
+            load_element(M.MinAnnualCapacityFactor, raw, self.viable_rpto, (0, 1, 2, 3))

1084-1094: Mirror validator change for MaxAnnualCapacityFactor.

Screen on (r,p,t,oc) rather than just (r,t).

Apply:

-            load_element(M.MaxAnnualCapacityFactor, raw, self.viable_rt, (0, 2))
+            load_element(M.MaxAnnualCapacityFactor, raw, self.viable_rpto, (0, 1, 2, 3))

temoa/utilities/db_migration_v3_to_v3_1.py (2)

301-313: Foreign-key check messaging prints “fails” even when passing.

Only print failures header when data is non-empty.

Apply:

-con_new.execute('PRAGMA FOREIGN_KEYS=1;')
+con_new.execute('PRAGMA FOREIGN_KEYS=1;')
 try:
     data = con_new.execute('PRAGMA FOREIGN_KEY_CHECK;').fetchall()
-    print('FK check fails (MUST BE FIXED):')
-    if not data:
-        print('\tNo Foreign Key Failures.  (Good news!)')
-    else:
-        print('\t(Table, Row ID, Reference Table, (fkid) )')
-        for row in data:
-            print(f'\t{row}')
+    if not data:
+        print('Foreign key check: No failures. (Good news!)')
+    else:
+        print('Foreign key check failures (MUST BE FIXED):')
+        print('\t(Table, Row ID, Reference Table, (fkid) )')
+        for row in data:
+            print(f'\t{row}')
 except sqlite3.OperationalError as e:

152-153: Remove unused cursor variable.

cur is assigned but never used.

Apply:

-con_new = sqlite3.connect(new_db_path)
-cur = con_new.cursor()
+con_new = sqlite3.connect(new_db_path)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4a1e20 and 32628e3.

📒 Files selected for processing (2)

temoa/temoa_model/hybrid_loader.py (3 hunks)
temoa/utilities/db_migration_v3_to_v3_1.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.13.1)

temoa/utilities/db_migration_v3_to_v3_1.py

201-201: Possible SQL injection vector through string-based query construction

(S608)

214-214: Possible SQL injection vector through string-based query construction

(S608)

243-243: Possible SQL injection vector through string-based query construction

(S608)

253-253: Possible SQL injection vector through string-based query construction

(S608)

270-270: Consider [*old_cols, 'units'] instead of concatenation

Replace with [*old_cols, 'units']

(RUF005)

281-281: Possible SQL injection vector through string-based query construction

(S608)

293-293: Possible SQL injection vector through string-based query construction

(S608)

🔇 Additional comments (8)

temoa/temoa_model/hybrid_loader.py (6)

924-934: LGTM: Myopic gating and min_proportion for MinCapacityShare are correct.

Query columns and target param align with v3.1; validator and val_loc match existing pattern.

938-949: LGTM: MaxCapacityShare uses proper myopic gating and max_proportion.

Consistent with v3.1 schema and existing validator pattern.

952-963: Correct mapping for MinNewCapacityShare; good fix and gating.

Loads into M.MinNewCapacityShare with min_proportion and myopic gating; matches prior review guidance.

966-977: LGTM: MaxNewCapacityShare query and mapping look correct.

max_proportion, correct target, and myopic gating.

1008-1019: LGTM: MinActivityShare myopic gating and column selection are correct.

Validator and val_loc consistent with other r,p,t,g params.

1022-1033: LGTM: MaxActivityShare myopic gating and column selection are correct.

Matches schema and existing patterns.

temoa/utilities/db_migration_v3_to_v3_1.py (2)

199-216: Query construction uses f-strings; safe here, but keep table/column sources immutable.

Table and column names come from vetted in-code lists/PRAGMA, so S608 is a false positive. Don’t accept user input for these identifiers.

Confirm that:

direct_transfer_tables, add_units_tables, transfer_with_mod keys are not influenced by CLI args or external files; and

you don’t derive table/column identifiers from the legacy DB contents.

270-277: Units-column transfer: verify NULLability/defaults to prevent insert failures.

Migration inserts rows without the new "units" column — this will fail if "units" is NOT NULL without a DEFAULT. Automated scan of data_files/temoa_schema_v3_1.sql returned no "units" definitions (follow-up probe errored). Confirm that for every table in add_units_tables the "units" column is NULLABLE or has a DEFAULT; otherwise include "units" in the INSERTs or change the schema before running the migration.

temoa/utilities/db_migration_v3_to_v3_1.py

…on.py

…ure.

…y numerator/denominator

coderabbitai

Actionable comments posted: 9

🧹 Nitpick comments (5)

temoa/temoa_model/unit_checking/table_checker.py (1)
72-74: Prefer fullmatch for character validation.

Use re.fullmatch instead of search with anchors for clarity.
-        valid_chars = re.search(ACCEPTABLE_CHARACTERS, expr)
+        valid_chars = re.fullmatch(ACCEPTABLE_CHARACTERS, expr)
temoa/temoa_model/unit_checking/screener.py (1)
34-34: Top-level import of PROJECT_ROOT breaks library import.

Move from definitions import PROJECT_ROOT into the __main__ block to avoid import errors when consuming as a package.
-from definitions import PROJECT_ROOT
In the main block:
 if __name__ == '__main__':
+    from definitions import PROJECT_ROOT
     db_path = Path(PROJECT_ROOT) / 'data_files/mike_US/US_9R_8D_v3_stability_v3_1.sqlite'
     screen(db_path, report_path=Path(PROJECT_ROOT) / 'output_files/')
temoa/temoa_model/unit_checking/entry_checker.py (1)
83-87: Guard against SQL injection via table name (whitelist).

Validate table against an allowlist before string-interpolating into SQL. The call sites use internal constants; enforce that here.
-from temoa.temoa_model.unit_checking.common import (
-    UnitsFormat,
-)
+from temoa.temoa_model.unit_checking.common import (
+    UnitsFormat,
+    tables_with_units,
+)
@@
 def gather_from_table(conn: sqlite3.Connection, table: str) -> dict[str, list[int]]:
     """gather all unique "units" entries from a table and collect the row indices"""
 
     res = defaultdict(list)
     with conn:
         cur = conn.cursor()
+        if table not in tables_with_units:
+            raise ValueError(f'Unexpected table name: {table}')
         cur.execute(f'SELECT units FROM {table}')
Please confirm no external/user input reaches table. If it does, we should harden further (quote identifiers or map to known names). Based on learnings
temoa/temoa_model/unit_checking/relation_checker.py (2)
170-176: Prefer logging.exception to capture traceback on SQL failure.
-    except sqlite3.OperationalError:
-        logger.error('failed to process query: %s when processing table %s', query, table_name)
+    except sqlite3.OperationalError:
+        logger.exception('failed to process query: %s when processing table %s', query, table_name)
Also consider asserting table_name from an allowlist to mitigate S608. Based on hints

250-256: Prefer logging.exception on query failure.
-        except sqlite3.OperationalError:
-            logger.error(
+        except sqlite3.OperationalError:
+            logger.exception(
                 'failed to process query: %s when processing table %s', query, ct.table_name
             )

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32628e3 and 74c12a8.

📒 Files selected for processing (13)

docs/source/Documentation.rst (1 hunks)
temoa/extensions/modeling_to_generate_alternatives/mga_sequencer.py (3 hunks)
temoa/extensions/myopic/myopic_sequencer.py (2 hunks)
temoa/temoa_model/temoa_config.py (3 hunks)
temoa/temoa_model/temoa_sequencer.py (4 hunks)
temoa/temoa_model/unit_checking/__init__.py (1 hunks)
temoa/temoa_model/unit_checking/common.py (1 hunks)
temoa/temoa_model/unit_checking/entry_checker.py (1 hunks)
temoa/temoa_model/unit_checking/relation_checker.py (1 hunks)
temoa/temoa_model/unit_checking/screener.py (1 hunks)
temoa/temoa_model/unit_checking/table_checker.py (1 hunks)
temoa/temoa_model/unit_checking/temoa_units.txt (1 hunks)
tests/test_unit_checker.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (7)

temoa/extensions/modeling_to_generate_alternatives/mga_sequencer.py (1)

temoa/temoa_model/unit_checking/screener.py (1)

screen (54-231)

temoa/temoa_model/unit_checking/table_checker.py (2)

temoa/temoa_model/unit_checking/common.py (1)

consolidate_lines (166-173)

temoa/temoa_model/unit_checking/entry_checker.py (3)

validate_units_expression (64-74)

validate_units_format (49-61)

gather_from_table (77-89)

temoa/temoa_model/unit_checking/screener.py (2)

temoa/temoa_model/unit_checking/relation_checker.py (5)

check_efficiency_table (88-152)

make_commodity_lut (53-65)

check_inter_table_relations (155-225)

check_cost_tables (228-341)

make_c2a_lut (68-80)

temoa/temoa_model/unit_checking/table_checker.py (1)

check_table (54-142)

temoa/temoa_model/unit_checking/entry_checker.py (1)

temoa/temoa_model/unit_checking/common.py (1)

UnitsFormat (144-146)

temoa/temoa_model/temoa_sequencer.py (1)

temoa/temoa_model/unit_checking/screener.py (1)

screen (54-231)

temoa/temoa_model/unit_checking/relation_checker.py (2)

temoa/temoa_model/unit_checking/common.py (1)

consolidate_lines (166-173)

temoa/temoa_model/unit_checking/entry_checker.py (2)

validate_units_format (49-61)

validate_units_expression (64-74)

temoa/extensions/myopic/myopic_sequencer.py (1)

temoa/temoa_model/unit_checking/screener.py (1)

screen (54-231)

🪛 LanguageTool

temoa/temoa_model/unit_checking/temoa_units.txt

[grammar] ~1-~1: There might be a mistake here.
Context: ...ew additions needed to for completeness in Temoa model # the units here AUGMENT th...

(QB_NEW_EN)

[grammar] ~4-~4: There might be a mistake here.
Context: ...ion for more info. passenger = [person] seat = [object] vehicle = [object] # se...

(QB_NEW_EN)

[grammar] ~5-~5: There might be a mistake here.
Context: ...o. passenger = [person] seat = [object] vehicle = [object] # see pint's notes a...

(QB_NEW_EN)

[grammar] ~8-~8: There might be a mistake here.
Context: ...at = [object] vehicle = [object] # see pint's notes about currency and conversion bef...

(QB_NEW_EN)

[grammar] ~11-~11: There might be a mistake here.
Context: ...ition to USD. dollar = [currency] = USD euro = [currency] = EUR # Temoa uses et...

(QB_NEW_EN)

🪛 Ruff (0.14.0)

temoa/temoa_model/unit_checking/table_checker.py

141-141: Avoid specifying long messages outside the exception class

(TRY003)

temoa/temoa_model/unit_checking/screener.py

67-67: Avoid specifying long messages outside the exception class

(TRY003)

temoa/temoa_model/unit_checking/entry_checker.py

72-72: Consider moving this statement to an else block

(TRY300)

83-83: Possible SQL injection vector through string-based query construction

(S608)

temoa/temoa_model/unit_checking/relation_checker.py

163-165: Possible SQL injection vector through string-based query construction

(S608)

168-168: Possible SQL injection vector through string-based query construction

(S608)

173-173: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

246-249: Avoid specifying long messages outside the exception class

(TRY003)

250-250: Possible SQL injection vector through string-based query construction

(S608)

254-256: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

tests/test_unit_checker.py

59-59: Unused function argument: expected

(ARG001)

🔇 Additional comments (1)

temoa/temoa_model/unit_checking/common.py (1)

166-173: LGTM: helpful line consolidation utility.

Clear and constrained output length; used consistently.

coderabbitai · 2025-10-17T00:53:09Z

temoa/temoa_model/unit_checking/relation_checker.py

+from mypy.checkexpr import defaultdict
+from mypy.message_registry import NamedTuple
+from pint.registry import Unit


⚠️ Potential issue | 🔴 Critical

Wrong imports: using mypy internals breaks runtime.

Import from stdlib modules.

-from mypy.checkexpr import defaultdict -from mypy.message_registry import NamedTuple +from collections import defaultdict +from typing import NamedTuple

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from mypy.checkexpr import defaultdict

from mypy.message_registry import NamedTuple

from pint.registry import Unit

from collections import defaultdict

from typing import NamedTuple

from pint.registry import Unit

🤖 Prompt for AI Agents

In temoa/temoa_model/unit_checking/relation_checker.py around lines 34–36, the code imports mypy internals and pint.registry which breaks runtime; replace those with standard-library and public-package imports: use "from collections import defaultdict", "from typing import NamedTuple", and import Unit from pint's public API (e.g. "from pint.unit import Unit") so the module uses stable, supported symbols at runtime.

temoa/temoa_model/unit_checking/relation_checker.py

coderabbitai · 2025-10-17T00:53:09Z

temoa/temoa_model/unit_checking/relation_checker.py

+        for idx, (tech, raw_units_expression) in enumerate(rows, start=1):
+            # convert to pint expression
+            cost_units, measure_units = None, None
+            # screen for empty/missing raw inputs
+            if not raw_units_expression:
+                label = f'{ct.table_name}:  Unprocessed row (missing units): {raw_units_expression}'
+                table_grouped_errors[label].append(idx)
+                continue
+            valid, (raw_cost, raw_units) = validate_units_format(
+                raw_units_expression, RATIO_ELEMENT
+            )
+            if valid:
+                cost_valid, cost_units = validate_units_expression(raw_cost)
+                units_valid, measure_units = validate_units_expression(raw_units)
+            else:
+                cost_valid, units_valid = False, False
+            if not (cost_valid and units_valid):
+                label = f'{ct.table_name}:  Unprocessed row (invalid units--see earlier tests): {raw_units_expression}'
+                table_grouped_errors[label].append(idx)
+                continue
+
+            # Test 1: Look for cost commonality
+            if common_cost_unit is None:
+                # try to establish it
+                # check that what we have captured is in the currency dimension == "clean"
+                # dev note:  multiplying by 1 allows us to use the check_units_expression() function
+                if (1 * cost_units).check('[currency]'):
+                    common_cost_unit = cost_units
+                else:
+                    # something is wrong, hopefully it was just this entry?
+                    # mark it, dump it, and try again...
+                    error_msgs.append(
+                        f'{ct.table_name}:  Unprocessed row (unreducible cost units): {cost_units} at row: {idx}'
+                    )
+                    continue
+            else:  # use the common units to test
+                if cost_units != common_cost_unit:
+                    label = f'{ct.table_name}:  Non-standard cost found (expected common cost units of {common_cost_unit}) got: {cost_units}'
+                    table_grouped_errors[label].append(idx)
+
+            # Test 2:  Check the units of measure to ensure alignment with the tech's output units
+            # find the referenced commodity units from the tech or commodity depending on table structure...
+            if ct.commodity_reference:
+                commodity_units = commodity_lut.get(ct.commodity_reference)
+                if not commodity_units:
+                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ct.commodity_reference}) '
+                    table_grouped_errors[label].append(idx)
+                    continue
+            else:
+                tech_io = tech_lut.get(tech)
+                if tech_io:
+                    commodity_units = tech_io.output_units
+                else:
+                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {tech}) '
+                    table_grouped_errors[label].append(idx)
+                    continue
+
+            # pull the C2A factor if this table is capacity-based and determine the "match units" which should
+            # match the commodity units in the table, after removing the "per period" time factor, if it exists
+            c2a_units = None
+            if ct.capacity_based:
+                c2a_units = c2a_lut.get(tech, ureg.dimensionless)  # default is dimensionless
+                # apply to convert
+                measure_units *= c2a_units * ureg.year
+
+            if ct.period_based:
+                measure_units /= ureg.year  # remove the "per year" from this denominator
+
+            matched = measure_units == commodity_units
+
+            if not matched:
+                tech_reference = ct.commodity_reference if ct.commodity_reference else tech
+                label = (
+                    f'{ct.table_name}:  Non-matching measure unit found in cost denominator for tech/commodity {tech_reference}: {raw_units_expression}'
+                    f'\n    Commodity units: {commodity_units}, Discovered (after conversions): {measure_units}'
+                    f'\n    Conversions:  c2a units: {c2a_units*ureg.year if c2a_units else "N/A"}{", `per period` removed" if ct.period_based else ""}\n   '
+                )
+                table_grouped_errors[label].append(idx)


⚠️ Potential issue | 🔴 Critical

Fix Cost table commodity lookup and identifier handling.

The first selected column is a commodity when commodity_reference is set. Current code looks up units using the column name (e.g., 'emis_comm') instead of the row value, yielding false “unknown commodity” errors and bad comparisons.

- for idx, (tech, raw_units_expression) in enumerate(rows, start=1): + for idx, (ref, raw_units_expression) in enumerate(rows, start=1): @@ - if ct.commodity_reference: - commodity_units = commodity_lut.get(ct.commodity_reference) + if ct.commodity_reference: + # `ref` is the commodity value from the selected column + commodity_units = commodity_lut.get(ref) if not commodity_units: - label = f'{ct.table_name}: Unprocessed row (unknown commodity: {ct.commodity_reference}) ' + label = f'{ct.table_name}: Unprocessed row (unknown commodity: {ref}) ' table_grouped_errors[label].append(idx) continue else: - tech_io = tech_lut.get(tech) + tech_io = tech_lut.get(ref) if tech_io: commodity_units = tech_io.output_units else: - label = f'{ct.table_name}: Unprocessed row (unknown tech: {tech}) ' + label = f'{ct.table_name}: Unprocessed row (unknown tech: {ref}) ' table_grouped_errors[label].append(idx) continue @@ - if ct.capacity_based: - c2a_units = c2a_lut.get(tech, ureg.dimensionless) # default is dimensionless + if ct.capacity_based: + c2a_units = c2a_lut.get(ref, ureg.dimensionless) # default is dimensionless # apply to convert measure_units *= c2a_units * ureg.year @@ - if not matched: - tech_reference = ct.commodity_reference if ct.commodity_reference else tech + if not matched: + tech_reference = ref label = ( f'{ct.table_name}: Non-matching measure unit found in cost denominator for tech/commodity {tech_reference}: {raw_units_expression}' f'\n Commodity units: {commodity_units}, Discovered (after conversions): {measure_units}' f'\n Conversions: c2a units: {c2a_units*ureg.year if c2a_units else "N/A"}{", `per period` removed" if ct.period_based else ""}\n ' )

Optionally assert ct.table_name against a whitelist to mitigate S608. Based on hints

🤖 Prompt for AI Agents

In temoa/temoa_model/unit_checking/relation_checker.py around lines 260-337 the commodity lookup uses ct.commodity_reference (a column name) instead of the actual row value, causing false "unknown commodity" errors; change the lookup to use the row's commodity identifier (use commodity_lut.get(tech) when ct.commodity_reference is set) and adjust the error label to report the offending row value; also ensure tech_reference selection and subsequent messaging still use the correct identifier (tech when commodity_reference is present), and optionally assert ct.table_name against a whitelist to address S608 as hinted.

temoa/temoa_model/unit_checking/screener.py

temoa/temoa_model/unit_checking/table_checker.py

coderabbitai · 2025-10-17T00:53:10Z

temoa/temoa_model/unit_checking/table_checker.py

+                ref = {expr: converted_units[0] / converted_units[1]}
+                res.update(ref)
+            else:
+                raise ValueError('Unknown units format: %s', format_type)


⚠️ Potential issue | 🟡 Minor

Incorrect ValueError formatting; message not interpolated.

Use f-string; also satisfies TRY003 hint.

- raise ValueError('Unknown units format: %s', format_type) + raise ValueError(f'Unknown units format: {format_type}')

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

raise ValueError('Unknown units format: %s', format_type)

raise ValueError(f'Unknown units format: {format_type}')

🧰 Tools

🪛 Ruff (0.14.0)

141-141: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents

In temoa/temoa_model/unit_checking/table_checker.py around line 141, the ValueError is raised using comma-separated arguments which doesn't interpolate the format_type; change it to raise ValueError(f"Unknown units format: {format_type}") using an f-string (satisfies TRY003) so the message includes the actual format_type value.

tests/test_unit_checker.py

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (3)

temoa/temoa_model/unit_checking/relation_checker.py (3)

36-36: Wrong Unit import (Pint public API).

Import Unit from pint.unit, not pint.registry. Otherwise, runtime ImportError risk.

Apply:

-from pint.registry import Unit
+from pint.unit import Unit

116-129: Avoid KeyError when commodity missing; improve diagnostics.

Use .get() and handle unknown commodities in messages.

-        invalid_input_flag = input_units != comm_units[ic]
-        invalid_output_flag = output_units != comm_units[oc]
+        expected_in = comm_units.get(ic)
+        expected_out = comm_units.get(oc)
+        invalid_input_flag = expected_in is None or input_units != expected_in
+        invalid_output_flag = expected_out is None or output_units != expected_out
@@
-            if invalid_input_flag:
-                msg += f'\n    Invalid input units: {comm_units[ic]}'
-            if invalid_output_flag:
-                msg += f'\n    Invalid output units: {comm_units[oc]}'
+            if invalid_input_flag:
+                msg += f'\n    Invalid input units: {expected_in if expected_in is not None else "<unknown commodity>"}'
+            if invalid_output_flag:
+                msg += f'\n    Invalid output units: {expected_out if expected_out is not None else "<unknown commodity>"}'

260-336: Cost table: wrong identifier used for lookups and messages.

When commodity_reference is set, the first selected column is the commodity value, not the column name; also the loop var is misnamed. Fix lookups and labels; use the row value consistently. Adjust C2A tech key accordingly.

-        for idx, (tech, raw_units_expression) in enumerate(rows, start=1):
+        for idx, (ref, raw_units_expression) in enumerate(rows, start=1):
@@
-            if ct.commodity_reference:
-                commodity_units = commodity_lut.get(ct.commodity_reference)
+            if ct.commodity_reference:
+                # `ref` is the commodity value from the selected column
+                commodity_units = commodity_lut.get(ref)
                 if not commodity_units:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ct.commodity_reference}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
             else:
-                tech_io = tech_lut.get(tech)
+                tech_io = tech_lut.get(ref)
                 if tech_io:
                     commodity_units = tech_io.output_units
                 else:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {tech}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
@@
-            if ct.capacity_based:
-                c2a_units = c2a_lut.get(tech, ureg.dimensionless)  # default is dimensionless
+            if ct.capacity_based:
+                c2a_units = c2a_lut.get(ref, ureg.dimensionless)  # default is dimensionless
                 # apply to convert
                 measure_units *= c2a_units * ureg.year
@@
-            if not matched:
-                tech_reference = ct.commodity_reference if ct.commodity_reference else tech
+            if not matched:
+                tech_reference = ref
                 label = (
                     f'{ct.table_name}:  Non-matching measure unit found in cost denominator for tech/commodity {tech_reference}: {raw_units_expression}'
                     f'\n    Commodity units: {commodity_units}, Discovered (after conversions): {measure_units}'
                     f'\n    Conversions:  c2a units: {c2a_units*ureg.year if c2a_units else "N/A"}{", `per period` removed" if ct.period_based else ""}\n   '
                 )

🧹 Nitpick comments (6)

temoa/temoa_model/unit_checking/relation_checker.py (6)
88-91: Fix type hint for commodity units.

These are pint Units, not strings.
-def check_efficiency_table(
-    conn: sqlite3.Connection, comm_units: dict[str, str]
-) -> tuple[dict[str, IOUnits], list[str]]:
+def check_efficiency_table(
+    conn: sqlite3.Connection, comm_units: dict[str, Unit]
+) -> tuple[dict[str, IOUnits], list[str]]:
163-169: Dynamic SQL identifiers: validate/whitelist to mitigate injection (S608).

table_name and identifiers should be validated against a known set (schema constants) before interpolation. Add an allowlist check.

Based on hints

Example helper (add once, then use before building the query):
def assert_known_identifier(name: str, allowed: set[str]) -> str:
    if name not in allowed:
        raise ValueError(f'Unknown identifier: {name}')
    return name
Ensure table_name comes from an allowlist (e.g., schema constants) and not user input, then keep the f-string.

173-175: Prefer logger.exception to preserve traceback.

Use exception() in except blocks.

Based on hints
-        logger.error('failed to process query: %s when processing table %s', query, table_name)
+        logger.exception('Failed to process query: %s when processing table %s', query, table_name)
250-250: Dynamic SQL identifiers: validate/whitelist (S608).

ct.table_name and ct.commodity_reference must be validated (allowlist) before interpolation.

Based on hints

Validate identifiers from CostTableData construction (known schema), then use the f-string safely. If user-controlled, reject early.

254-256: Preserve traceback on DB errors.

Switch to logger.exception for better debugging.

Based on hints
-            logger.error(
+            logger.exception(
                 'failed to process query: %s when processing table %s', query, ct.table_name
             )
75-80: Variable naming nit: use tech for C2A LUT clarity.

The first column is tech; rename local var for readability.
-    for comm, units in rows:
+    for tech, units in rows:
         valid, group = validate_units_format(units, SINGLE_ELEMENT)
         if valid:
             valid, units = validate_units_expression(group[0])
-            res[comm] = units
+            res[tech] = units

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74c12a8 and 1d85dba.

📒 Files selected for processing (1)

temoa/temoa_model/unit_checking/relation_checker.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

temoa/temoa_model/unit_checking/relation_checker.py (2)

temoa/temoa_model/unit_checking/common.py (1)

consolidate_lines (166-173)

temoa/temoa_model/unit_checking/entry_checker.py (2)

validate_units_format (49-61)

validate_units_expression (64-74)

🪛 Ruff (0.14.0)

temoa/temoa_model/unit_checking/relation_checker.py

163-165: Possible SQL injection vector through string-based query construction

(S608)

168-168: Possible SQL injection vector through string-based query construction

(S608)

173-173: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

246-249: Avoid specifying long messages outside the exception class

(TRY003)

250-250: Possible SQL injection vector through string-based query construction

(S608)

254-256: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

temoa/utilities/db_migration_v3_to_v3_1.py (1)
193-199: Fix severity label: "WARNING" should be "ERROR".

Past review indicated this mismatch label should be corrected. Since the script closes connections and exits, this is a fatal error, not a warning. The same pattern at lines 231-237 correctly uses "ERROR".

Apply this diff for consistency:
-            print(f'WARNING: Column mismatch in {table_name}')
+            print(f'ERROR: Column mismatch in {table_name}')

🧹 Nitpick comments (2)

temoa/utilities/db_migration_v3_to_v3_1.py (2)
178-178: Use positive exit codes for portability.

The script exits with -1 on errors. POSIX conventions expect positive exit codes (1-255), and -1 may be interpreted as 255 on some systems. For consistency with standard practice, use sys.exit(1) for error conditions.

Apply this diff:
-    sys.exit(-1)
+    sys.exit(1)
This same pattern appears in lines 199, 237, and should be updated throughout for consistency.

272-272: Prefer list unpacking over concatenation.

The static analyzer suggests using unpacking syntax for better readability.

Apply this diff:
-        if set(old_cols + ['units']) != set(new_cols):
+        if set([*old_cols, 'units']) != set(new_cols):

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d85dba and ec89d26.

📒 Files selected for processing (1)

temoa/utilities/db_migration_v3_to_v3_1.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.0)

temoa/utilities/db_migration_v3_to_v3_1.py

203-203: Possible SQL injection vector through string-based query construction

(S608)

216-216: Possible SQL injection vector through string-based query construction

(S608)

245-245: Possible SQL injection vector through string-based query construction

(S608)

255-255: Possible SQL injection vector through string-based query construction

(S608)

272-272: Consider [*old_cols, 'units'] instead of concatenation

Replace with [*old_cols, 'units']

(RUF005)

283-283: Possible SQL injection vector through string-based query construction

(S608)

295-295: Possible SQL injection vector through string-based query construction

(S608)

🔇 Additional comments (1)

temoa/utilities/db_migration_v3_to_v3_1.py (1)

300-319: LGTM! Solid finalization sequence.

The commit, vacuum, FK re-enabling, and FK verification sequence is well-structured. The error handling for FK checks provides clear diagnostics, and connection cleanup is properly done.

jeff-ws and others added 30 commits November 22, 2024 10:23

Tinkering with MC worker Queue

0bdcf03

Tinkering with MC worker Queue

b8e739e

Tinkering with MC worker Queue

9f6c359

Tinkering with MC worker Queue

983b898

- adding timing to post-processing

Tinkering with MC worker Queue

912591d

- adding timing to post-processing

Tinkering with MC worker Queue

53e13b8

- adding timing to post-processing

Tinkering with MC worker Queue

4a4b48b

- adding timing to post-processing

Tinkering with MC worker Queue

e443ad7

Tinkering with MC worker Queue

69404ad

Tinkering with MC worker Queue

bd6499d

Tinkering with MC worker Queue

a9db306

- adding timing to post-processing

Tinkering with MC worker Queue

cacbfd5

- adding timing to post-processing

Tinkering with MC worker Queue

9580510

- adding timing to post-processing

Tinkering with MC worker Queue

218c615

- adding timing to post-processing

Merge remote-tracking branch 'origin/main'

ead691f

MC running with new MC Worker

1d7551c

- queues are now swapping data vs. models - table_writer.py broken up into data pulling portion/writing portion

Fixed bug in data_brick.py that was not actually returning an obj.

f31b86b

adjust workers for TRACE

8ad98b0

adding some additional logging, reverting worker settings to 6 worker…

59c1284

…s at 20 threads ea.

adding some additional logging, reverting worker settings to 6 worker…

db7eeb2

…s at 20 threads ea.

adding some additional logging, reverting worker settings to 6 worker…

1ca0677

…s at 20 threads ea.

Clean Up work on MC

dd3b60b

- commented some functions - removed some erroneous "sizeof" measures

Adding T/S log entries

45d1321

Increase size of work queue

7a80fb3

- was hanging while putting in shutdown signals

Rev 2.0 of MC complete

f59e011

- works with Utopia - logging cleaned up

Rev 2.0 of MC complete

06dd688

- works with Utopia - logging cleaned up

Added ability to output Dual Variables to Myopic Mode

eefe33a

Merge branch 'TemoaProject:main' into main

146a31b

Tuned up migration utility

1543cf8

Added additional QA steps during data transfer.

Formatting

60b404c

Update table_writer.py for OutputCurtailment to pre-merge curtailment…

08586e0

… and flex to avoid collisions if both write independently

coderabbitai bot reviewed Sep 23, 2025

View reviewed changes

data_files/temoa_schema_v3.sql Show resolved Hide resolved

temoa/temoa_model/hybrid_loader.py Outdated Show resolved Hide resolved

Fix bugs in loader for shared capacity parameters

e4a1e20

coderabbitai bot reviewed Sep 23, 2025

View reviewed changes

temoa/temoa_model/hybrid_loader.py Outdated Show resolved Hide resolved

Picked up a few more (seldom used?) tables in hybrid_loader.py that s…

32628e3

…hould have the read "gated" by myopic index if running in myopic mode

coderabbitai bot reviewed Sep 23, 2025

View reviewed changes

temoa/utilities/db_migration_v3_to_v3_1.py Outdated Show resolved Hide resolved

temoa/utilities/db_migration_v3_to_v3_1.py Show resolved Hide resolved

temoa/utilities/db_migration_v3_to_v3_1.py Show resolved Hide resolved

jeff-ws added 16 commits September 25, 2025 18:33

Fundamentals in place for unit checking w/ some scaffolding

d9e0f97

First cut at documentation addition for unit checking

8ad2a91

add custom units into default registry, remove MIXED, ETHOS from comm…

95a1ed5

…on.py

working through most inter-table relations

e21fbb3

Lots of DB cleanup to standardize. Augments to the re to improve capt…

6d20169

…ure.

working through capacity-related tables

b70971b

consolidation of error listing by row group in more tables

ead7049

additions, cleanup, and comments for tests and common elements.

b8a3d25

Working OK. Passing initial QA. Before splitting cost table entries b…

708fe59

…y numerator/denominator

Working with RATIO based expression expectation in Cost Tables

70d5deb

minor comment fixes and formatting

39623ec

augment test

eb37aec

correct the units pattern in the custom file

6dd7577

updates to screener to handle empty entries and multiple db files

03d6b22

incorporation into main body

a1b2378

kill verbose output

74c12a8

coderabbitai bot reviewed Oct 17, 2025

View reviewed changes

fix incorrect import packages

1d85dba

jeff-ws changed the title ~~Groundwork for DB Version 3.1~~ Groundwork for DB Version 3.1 and Unit Checking Oct 17, 2025

coderabbitai bot reviewed Oct 17, 2025

View reviewed changes

jeff-ws and others added 2 commits October 16, 2025 18:57

Update temoa/utilities/db_migration_v3_to_v3_1.py

9fbe6d7

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Update temoa/utilities/db_migration_v3_to_v3_1.py

ec89d26

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai bot reviewed Oct 17, 2025

View reviewed changes

fix loop

b494104

	raise ValueError('Unknown units format: %s', format_type)
	raise ValueError(f'Unknown units format: {format_type}')

Groundwork for DB Version 3.1 and Unit Checking #124

Are you sure you want to change the base?

Groundwork for DB Version 3.1 and Unit Checking #124

Uh oh!

Conversation

jeff-ws commented Sep 23, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jeff-ws commented Sep 23, 2025 •

edited by coderabbitai bot

Loading