Skip to content

Conversation

@vibhatha
Copy link
Member

No description provided.

sehansi-9 and others added 30 commits June 30, 2025 16:03
sehansi-9 and others added 28 commits August 11, 2025 09:54
… or department names during move operations.
Implement user interface for org gazette processing
…35be21edafe'

git-subtree-dir: gazettes/preprocess
git-subtree-mainline: 9135ef1
git-subtree-split: c489978
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @vibhatha, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a comprehensive solution for processing and managing Sri Lankan government gazette data. It establishes a new directory gazettes/preprocess containing a Python FastAPI backend responsible for parsing gazette JSON inputs, identifying organizational changes (ministry/department structure, person-portfolio assignments), maintaining a versioned state in SQLite databases, and generating CSV outputs. Complementing this, a React frontend provides an intuitive interface for users to interact with the system, visualize the organizational state, review predicted transactions, and commit updates, enhancing the overall data management workflow.

Highlights

  • New Full-Stack Application: This pull request introduces a complete web application for processing Sri Lankan government gazette data, featuring a Python FastAPI backend and a React frontend.
  • Gazette Processing Logic: The Python backend (gztprocessor) is designed to extract, classify, and manage structural changes from gazettes, covering both Ministry/Department (MinDep) and Person-Portfolio assignments.
  • Versioned State Management: A SQLite-based system is implemented to maintain versioned snapshots of the organizational state, enabling historical tracking and potential rollback capabilities for gazette changes.
  • Interactive Frontend Interface: A user-friendly React frontend (gztp-frontend) provides an intuitive interface for users to visualize the current organizational state, add new gazettes, preview detected transactions (adds, moves, terminates, renames), and approve/commit changes.
  • Automated CSV Output Generation: The system automatically generates structured CSV files for all detected transactions, facilitating seamless integration with graph databases like Neo4j for further analysis and visualization.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the GztProcessor application, a comprehensive system with a Python FastAPI backend and a React frontend for processing and managing gazette data. The changes are extensive, adding the entire application structure, including database handlers, state managers, API routes, and UI components. My review focuses on critical areas for improvement, such as configuration management, adherence to React best practices, and backend correctness. Key issues identified include hardcoded URLs and CORS origins which will hinder deployment, improper database initialization in the FastAPI application, and some non-idiomatic React patterns that affect maintainability and performance. Addressing these points will significantly improve the robustness and scalability of the new application.

const handleSubmit = async () => {
if (!gazetteNumber || !gazetteDate || !gazetteType) return;

const endpoint = `http://localhost:8000/mindep/${gazetteType === 'initial' ? 'initial' : 'amendment'}/${gazetteDate}/${gazetteNumber}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The API endpoint URL is hardcoded. This makes the application difficult to deploy to different environments (e.g., staging, production), as the URL will need to be manually changed. It's a best practice to use environment variables for such configurations. You can create a .env file in your frontend's root directory and access variables from it using import.meta.env.VITE_API_BASE_URL.

This issue is also present in other components, such as StateTable.jsx and TransactionPreview.jsx.

Suggested change
const endpoint = `http://localhost:8000/mindep/${gazetteType === 'initial' ? 'initial' : 'amendment'}/${gazetteDate}/${gazetteNumber}`;
const endpoint = `${import.meta.env.VITE_API_BASE_URL || 'http://localhost:8000'}/mindep/${gazetteType === 'initial' ? 'initial' : 'amendment'}/${gazetteDate}/${gazetteNumber}`;

app.include_router(transaction_router)
app.add_middleware(
CORSMiddleware,
allow_origins=["http://localhost:5173"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The CORS allow_origins is hardcoded to http://localhost:5173. This will prevent the frontend from connecting to the API in any environment other than local development (e.g., staging, production). This value should be configured through environment variables to allow for flexibility across different deployment environments.

Comment on lines +217 to +245
const handleGazetteCommitted = (committedIndex) => {
setGazetteWarnings((prev) => {
const gazettes = data.presidents[selectedPresidentIndex]?.gazettes || [];
const newWarnings = prev.slice();

// Clear warning on committed gazette
newWarnings[committedIndex] = false;

// Persist updated warning to drafts table
const committedGazette = gazettes[committedIndex];
axios.post(
`http://localhost:8000/transactions/${committedGazette.number}/warning`,
{ warning: false }
).catch(err => console.error("Failed to update warning:", err));

// Mark all gazettes after committedIndex as needing redo
for (let i = committedIndex + 1; i < gazettes.length; i++) {
newWarnings[i] = true;

// Persist warning to drafts table
axios.post(
`http://localhost:8000/transactions/${gazettes[i].number}/warning`,
{ warning: true }
).catch(err => console.error("Failed to update warning:", err));
}

return newWarnings;
});
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This function triggers API calls (axios.post) inside the setGazetteWarnings state updater function. State updater functions in React should be pure and must not contain side effects. This pattern can lead to unpredictable behavior, race conditions, and makes the component difficult to test and debug.

The side effects (API calls) should be moved out of the state updater. You can either place them in a useEffect hook that triggers when the relevant state changes, or handle them in a separate async function that subsequently calls the state setters.

Comment on lines +12 to +16
if __name__ == "__main__":
init_gov_db()
init_person_db()
init_transaction_db()
print("✅ Databases initialized.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The database initialization logic is placed within an if __name__ == "__main__": block. This block is not executed when the application is run by a production server like Uvicorn, which imports the app object directly. This will cause the application to fail at runtime because the databases are not initialized.

This logic should be moved to a FastAPI lifespan context manager, which is the modern and recommended way to handle startup and shutdown events.

alert(response.data.error); // show the error from backend
return;
}
if (gazetteType == 'initial') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using the loose equality operator == can lead to unexpected bugs due to type coercion. It is a best practice to always use the strict equality operator === to ensure both value and type are compared.

Suggested change
if (gazetteType == 'initial') {
if (gazetteType === 'initial') {

Comment on lines +100 to +104
const handleMinisterNameChange = (index, newName) => {
const updatedData = JSON.parse(JSON.stringify(data));
updatedData.presidents[selectedPresidentIndex].gazettes[selectedGazetteIndex].transactions[index].name = newName;
setData(updatedData);
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Throughout this component, JSON.parse(JSON.stringify(data)) is used to create a deep copy of the entire data object for every small state update. This is inefficient, especially for large state objects, and is not an idiomatic React pattern. It forces React to re-render more of the component tree than necessary.

It's better to use immutable update patterns with the spread syntax to update only the necessary parts of the state tree. This is more performant and aligns with React's best practices.

Comment on lines +36 to +43
move_rows.append({
"transaction_id": transaction_id,
"old_parent": prev_min,
"new_parent": minister["name"],
"child": dept_name,
"type": "AS_DEPARTMENT",
"date": date_str
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The move_rows are generated with a key named type. However, in generate_amendment_csvs, the equivalent field is named rel_type. This inconsistency will create CSV files with different headers for the same kind of data, which will likely break any downstream processing logic. For consistency, this should be changed to rel_type.

Suggested change
move_rows.append({
"transaction_id": transaction_id,
"old_parent": prev_min,
"new_parent": minister["name"],
"child": dept_name,
"type": "AS_DEPARTMENT",
"date": date_str
})
move_rows.append({
"transaction_id": transaction_id,
"old_parent": prev_min,
"new_parent": minister["name"],
"child": dept_name,
"rel_type": "AS_DEPARTMENT",
"date": date_str
})

Comment on lines +76 to +78
writer = csv.DictWriter(f, fieldnames=[
"transaction_id", "old_parent", "new_parent", "child", "type", "date"
])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To align with the suggested change for generating move_rows and to maintain consistency with generate_amendment_csvs, the header field type should be renamed to rel_type.

Suggested change
writer = csv.DictWriter(f, fieldnames=[
"transaction_id", "old_parent", "new_parent", "child", "type", "date"
])
writer = csv.DictWriter(f, fieldnames=[
"transaction_id", "old_parent", "new_parent", "child", "rel_type", "date"
])

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants