Conversation
Improve pg_log.py: add docstrings, fix bugs, and improve efficiency
Add Google-style docstrings and fix bugs in core modules
…pg_split pg_cmd.py: - Added docstrings to all methods and class - Fixed unescaped dot in dsid regex (r'^(ds\d|\d)\d\d.\d$' -> r'...\.\d$') - Fixed NoneType concatenation crash in get_dynamic_options when SYSERR is None pg_lock.py: - Added docstrings to all methods and class - Fixed update_partition_lock: pid counter underflow (added max(0, ...)) - Fixed update_partition_lock: potential KeyError on missing 'pid' key - Fixed update_partition_lock: inconsistent early-return value (0 -> None) - Fixed lock_process_info: implicit None check -> explicit is not None pg_opt.py: - Added docstrings to all 40+ methods and class - Fixed set_sudo_uid/set_root_uid: self.params['CACT'] -> self.PGOPT['CACT'] - Fixed set_sudo_uid/set_root_uid: missing closing quote in format strings - Fixed set_option_value: str builtin used instead of local variable dstr - Fixed validate_one_infile: == None -> is None - Fixed get_control_frequency: wrong return value on error (returned unit char instead of error message string) - Fixed append_format_string: unescaped regex metacharacters in fmt pg_split.py: - Added docstrings to all methods and class - Fixed compare_wfile: [[]]*flen shallow copy bug (all fields shared one list) - Fixed compare_wfile: dsrecs.keys() live view -> list(dsrecs.keys()) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Google-style docstrings and fix bugs in pg_cmd, pg_lock, pg_opt, …
There was a problem hiding this comment.
Pull request overview
This PR primarily improves inline documentation across the RDADB helper modules and includes a few small behavioral fixes in option parsing, command tracking, and partition-lock bookkeeping.
Changes:
- Added/expanded docstrings on several core classes and methods (PgSplit, PgOPT, PgLock, PgCMD).
- Fixed a few correctness issues in option validation / parsing (e.g., wrong variable in an error message; normalized error returns; safer regex handling).
- Minor robustness improvements in command/error accumulation and partition-lock counter updates.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/rda_python_common/pg_split.py | Docstrings + small refactor; dsid-scoping helper touched. |
| src/rda_python_common/pg_opt.py | Extensive docstrings; several small bug fixes in option validation and frequency parsing. |
| src/rda_python_common/pg_lock.py | Docstrings + safer partition-lock bookkeeping updates. |
| src/rda_python_common/pg_cmd.py | Docstrings + minor robustness fixes in dscheck lifecycle and SYSERR accumulation. |
| pyproject.toml | Version bump to 2.1.2. |
Comments suppressed due to low confidence (1)
src/rda_python_common/pg_split.py:154
get_dsid_condition()intends to detect existingwid/dsidpredicates, but the regexr'(^|.| )(wid|dsid)\s*='matchesdsid/widsubstrings inside longer identifiers (e.g.mydsid = ...), which can incorrectly skip adding the dataset scope and potentially broaden updates/deletes. Use a word-boundary based check (e.g.\b(wid|dsid)\s*=) and/or allow an optional table prefix (e.g.\b(?:\w+\.)?(wid|dsid)\s*=).
if condition:
if re.search(r'(^|.| )(wid|dsid)\s*=', condition):
return condition
else:
dscnd = "wfile.dsid = '{}' ".format(dsid)
if not re.match(r'^\s*(ORDER|GROUP|HAVING|OFFSET|LIMIT)\s', condition, re.I): dscnd += 'AND '
return dscnd + condition # no where clause, append directly
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if hosts and re.match(r'^\w\d\d\d\d\d\d$', hosts): | ||
| self.pglog(hosts + ": Cannot pass DSID for hostname to submit batch process", self.LGEREX) |
There was a problem hiding this comment.
The DSID-as-hostname guard in init_dscheck() only checks ^\w\d\d\d\d\d\d$, which will (a) miss old-style dataset IDs like ds123.0 and (b) potentially reject non-dsid hostnames that happen to be one letter + 6 digits. Consider using the existing dataset ID detection logic (find_dataset_id / format_dataset_id) to detect both old and new DSID formats, and only reject when the entire hostname string is a valid DSID.
| else: | ||
| val = "invalid frequency '{}', unit must be (Y,M,W,D,H)".format(frequency) | ||
| return (None, unit) | ||
| errmsg = "invalid frequency '{}', unit must be (Y,M,W,D,H)".format(frequency) |
There was a problem hiding this comment.
get_control_frequency() accepts units Y,M,W,D,H,N,S (per the regex and docstring), but the invalid-unit error message still says "unit must be (Y,M,W,D,H)". This is misleading for callers and users; update the message to match the actual accepted units (including N and S) or restrict the accepted set to what the message claims.
| errmsg = "invalid frequency '{}', unit must be (Y,M,W,D,H)".format(frequency) | |
| errmsg = "invalid frequency '{}', unit must be (Y,M,W,D,H,N,S)".format(frequency) |
|
|
||
| Args: | ||
| dcnd (str): SQL condition string scoping the dataset (e.g. | ||
| ``"dsid = 'd123456'""``). |
There was a problem hiding this comment.
Docstring example has an extra trailing quote: "dsid = 'd123456'"". This looks like a typo and can confuse readers/copy-paste users; it should be something like "dsid = 'd123456'".
| ``"dsid = 'd123456'""``). | |
| ``"dsid = 'd123456'"``). |
No description provided.