fix(paper_review): add GraderError handling in pipeline #91
Conversation
Summary of Changes: Hello @XiaoBoAI, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed: this pull request significantly improves the resilience of the PaperReviewPipeline.
Code Review
This pull request adds error handling for grader evaluations in the PaperReviewPipeline, which is a valuable improvement for robustness. The implementation correctly handles GraderError by logging it and allowing the pipeline to continue. My review focuses on a significant amount of code duplication introduced by this change. I've provided a suggestion to refactor the duplicated logic into a single helper method to improve code clarity and maintainability.
```diff
  correctness = await self.correctness_grader.aevaluate(pdf_data=pdf_data)
- result.correctness = CorrectnessResult(
-     score=correctness.score,
-     reasoning=correctness.reason,
-     key_issues=correctness.metadata.get("key_issues", []),
- )
+ if isinstance(correctness, GraderError):
+     logger.error(f"Correctness grader error: {correctness.error}")
+ else:
+     result.correctness = CorrectnessResult(
+         score=correctness.score,
+         reasoning=correctness.reason,
+         key_issues=correctness.metadata.get("key_issues", []),
+     )
```
While the error handling is a great addition for robustness, this pattern of calling aevaluate, checking for GraderError, and then processing the result is repeated multiple times in this file (for correctness, review, criticality, jailbreaking, and format graders).
To improve maintainability and reduce code duplication, consider refactoring this logic into a helper method. This method could accept the grader, its arguments, and a callback function to process a successful result.
For example, you could define a helper like this:
```python
async def _run_grader(self, grader, on_success, *args, **kwargs):
    """Runs a grader, handles errors, and calls a success callback."""
    grader_name = grader.name.replace("_", " ").capitalize()
    try:
        result = await grader.aevaluate(*args, **kwargs)
        if isinstance(result, GraderError):
            logger.error(f"{grader_name} grader error: {result.error}")
        else:
            on_success(result)
    except Exception as e:
        logger.error(f"An unexpected error occurred in {grader_name} grader: {e}", exc_info=True)
```

And then use it for the correctness check like this:
```python
if self.config.enable_correctness:
    logger.info("Running correctness detection...")

    def on_success(res):
        result.correctness = CorrectnessResult(
            score=res.score,
            reasoning=res.reason,
            key_issues=res.metadata.get("key_issues", []),
        )

    await self._run_grader(
        self.correctness_grader, on_success, pdf_data=pdf_data
    )
```

This approach would make the review_paper and _run_safety_checks methods much cleaner and easier to maintain.
Add error handling for all grader evaluations in PaperReviewPipeline to gracefully handle failures instead of crashing. When a grader returns a GraderError, it is now logged and the pipeline continues with remaining evaluations.
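For readers outside the codebase, the fields used in the diff (score, reason, and metadata on success; error on failure) imply a result protocol roughly like the following. This is a hypothetical sketch inferred from usage, not the actual OpenJudge definitions:

```python
from dataclasses import dataclass, field

@dataclass
class GraderError:
    """Hypothetical failure type: the pipeline logs .error and continues."""
    error: str

@dataclass
class GraderResult:
    """Hypothetical success type, matching the fields read into CorrectnessResult."""
    score: float
    reason: str
    metadata: dict = field(default_factory=dict)
```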
ployts left a comment:
LGTM
OpenJudge Version
[The version of OpenJudge you are working on, e.g. `import openjudge; print(openjudge.__version__)`]

Description
[Please describe the background, purpose, changes made, and how to test this PR]
Checklist
Please check the following items before code is ready to be reviewed.
Run the `pre-commit run --all-files` command.