added validation_config.json #1996
Code Review
This pull request introduces a new validation configuration file, validation_config.json, for the World Bank WDI dataset. It defines rules for checking deleted record counts and performing golden file comparisons. The review feedback highlights path inconsistencies for golden files and a missing directory prefix for input files, which could prevent the validation tool from locating the necessary data.
    "rule_id": "check_goldens_output_csv",
    "validator": "GOLDENS_CHECK",
    "params": {
        "golden_files": "golden_WorldBank.csv",
The path for golden_files is inconsistent with the rule on line 24, which uses the golden_data/ directory. For consistency and to ensure the file is correctly located by the validation tool, consider moving golden_WorldBank.csv to golden_data/golden_WorldBank.csv.
Suggested change:
    - "golden_files": "golden_WorldBank.csv",
    + "golden_files": "golden_data/golden_WorldBank.csv",
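A quick way to spot this kind of inconsistency is to group every `golden_files` entry by its parent directory. The sketch below is illustrative only: it assumes the rules are available as a list of dicts with the `rule_id` and `params` keys seen in the diff, and that `golden_files` holds a single path string. The helper `group_goldens_by_dir` and the second rule are hypothetical and are not part of the validation tool or this PR.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_goldens_by_dir(rules):
    """Map each parent directory to the rule_ids whose golden_files live there."""
    by_dir = defaultdict(list)
    for rule in rules:
        golden = rule.get("params", {}).get("golden_files")
        if golden is None:
            continue  # this rule does not reference a golden file
        by_dir[str(PurePosixPath(golden).parent)].append(rule["rule_id"])
    return dict(by_dir)


# The first rule mirrors the fragment under review. The second is an invented
# stand-in for a rule that already keeps its golden file under golden_data/.
rules = [
    {"rule_id": "check_goldens_output_csv", "validator": "GOLDENS_CHECK",
     "params": {"golden_files": "golden_WorldBank.csv",
                "input_files": "WorldBank.csv"}},
    {"rule_id": "hypothetical_other_goldens_rule", "validator": "GOLDENS_CHECK",
     "params": {"golden_files": "golden_data/golden_other.csv"}},
]

groups = group_goldens_by_dir(rules)
if len(groups) > 1:
    print("golden files are spread across directories:", groups)
```

More than one key in the result means the golden files do not share a directory, which is the situation this comment describes.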
    "validator": "GOLDENS_CHECK",
    "params": {
        "golden_files": "golden_WorldBank.csv",
        "input_files": "WorldBank.csv"
The input_files path seems to be missing the output/ directory prefix. According to worldbank.py (line 507), the output CSV is saved as output/WorldBank.csv. Updating this path will allow the validator to correctly find the generated data.
Suggested change:
    - "input_files": "WorldBank.csv"
    + "input_files": "output/WorldBank.csv"
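Path problems like this can be caught before the validator runs by checking that every referenced file exists relative to the import directory. The following is a minimal sketch under stated assumptions: `find_missing_paths` and `make_rule` are hypothetical helpers, the rule shape is inferred from the diff, and the on-disk layout (`golden_data/` and `output/` under one base directory) is assumed from the review comments rather than confirmed by the repository.

```python
import tempfile
from pathlib import Path


def find_missing_paths(rules, base_dir):
    """Return (rule_id, key, path) for every referenced file absent under base_dir."""
    missing = []
    for rule in rules:
        params = rule.get("params", {})
        for key in ("golden_files", "input_files"):
            rel = params.get(key)
            if rel is not None and not (Path(base_dir) / rel).is_file():
                missing.append((rule["rule_id"], key, rel))
    return missing


def make_rule(golden, inputs):
    """Build a rule dict shaped like the fragment in the diff."""
    return {"rule_id": "check_goldens_output_csv", "validator": "GOLDENS_CHECK",
            "params": {"golden_files": golden, "input_files": inputs}}


with tempfile.TemporaryDirectory() as tmp:
    # Assumed layout: golden file under golden_data/, generated CSV under output/.
    for rel in ("golden_data/golden_WorldBank.csv", "output/WorldBank.csv"):
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True)
        target.write_text("placeholder\n")

    as_submitted = find_missing_paths(
        [make_rule("golden_WorldBank.csv", "WorldBank.csv")], tmp)
    as_suggested = find_missing_paths(
        [make_rule("golden_data/golden_WorldBank.csv", "output/WorldBank.csv")], tmp)

print(as_submitted)  # both paths from the submitted config are reported missing
print(as_suggested)  # empty list: the suggested paths resolve
```

Under that assumed layout, the paths as submitted fail to resolve while the suggested ones do.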
No description provided.