Clear Stale Persistent Tasks in Stop/Pause API #1629
ankitkala
left a comment
Two major pieces of feedback:
- The PR has a lot of additional code changes that don't seem to be related to the actual change. Can you remove all the unnecessary changes so it's easier to review?
- Stale replication tasks are a problem when you're trying to create the task again (start or resume). I think we should be able to simplify by just handling this during task creation here (need to verify though, with a stack trace from the last few occurrences of the issue).
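To illustrate the second point, here is a minimal sketch of clearing a leftover task at creation time. It assumes a hypothetical in-memory `TaskRegistry`, a hypothetical `startReplicationTask` helper, and a made-up task id scheme; none of these names come from the plugin, and real code would first need to confirm that the existing task is actually stale.

```kotlin
// Hypothetical in-memory stand-in for the cluster's persistent task store.
// None of these names come from the plugin; they only illustrate the idea.
data class PersistentTask(val id: String, val followerIndex: String)

class TaskRegistry {
    private val tasks = mutableMapOf<String, PersistentTask>()

    fun find(id: String): PersistentTask? = tasks[id]

    fun remove(id: String): Boolean = tasks.remove(id) != null

    // Models the conflict: creating a task whose id already exists fails
    // instead of silently replacing the old task.
    fun create(task: PersistentTask) {
        check(task.id !in tasks) { "task ${task.id} already exists" }
        tasks[task.id] = task
    }
}

// Start/resume path: remove any leftover task with the same id, then create.
// Assumption: a task found at this point is stale. Returns true when a
// leftover task had to be removed first.
fun startReplicationTask(registry: TaskRegistry, followerIndex: String): Boolean {
    val id = "replication:index:$followerIndex" // hypothetical id scheme
    val hadStale = registry.find(id) != null
    if (hadStale) registry.remove(id)
    registry.create(PersistentTask(id, followerIndex))
    return hadStale
}
```

With this shape, the stop and pause flows would not need to know about stale tasks at all; whether that is sufficient depends on what the stack traces show.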
src/main/kotlin/org/opensearch/replication/action/index/TransportReplicateIndexAction.kt
src/main/kotlin/org/opensearch/replication/action/index/TransportReplicateIndexAction.kt
/**
 * Handles case where no replication state exists but stale artifacts might remain.
 */
private suspend fun handleMissingReplicationState(indexName: String) {
Why do you want to handle the stale tasks during the stop flow? Shouldn't we only do this during start and resume?
The aim is to prevent stale tasks from occurring, on a best-effort basis, so that the STOP/PAUSE API can be called multiple times. This makes the API idempotent, and every call executes the full cleanup workflow:
Goal: Remove/suspend replication artifacts
Stale artifacts: Leftovers from previous incomplete operations
User intent: "Make sure replication is stopped/paused"
Handling stale artifacts helps achieve the goal
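A minimal sketch of that intent, assuming a hypothetical `FollowerArtifacts` model and `stopReplication` helper (neither is the plugin's real type or method): whether or not replication state is recorded, each call converges on "no state, no tasks", so repeating the call is harmless.

```kotlin
// Hypothetical model of what a follower index can leave behind.
// The names are illustrative and are not the plugin's real types.
class FollowerArtifacts(
    var replicationState: String? = null, // e.g. "RUNNING", or null when absent
    val taskIds: MutableSet<String> = mutableSetOf()
)

// Idempotent stop: with or without recorded state, every call ends with
// no state and no tasks. Returns the task ids removed by this call.
fun stopReplication(artifacts: FollowerArtifacts): List<String> {
    val removed = artifacts.taskIds.sorted()
    artifacts.taskIds.clear()         // best-effort removal of leftover tasks
    artifacts.replicationState = null // nothing left to stop afterwards
    return removed
}
```

The first call on an index with missing state but leftover tasks removes them; a second call finds nothing to do and returns an empty list.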
src/main/kotlin/org/opensearch/replication/action/pause/TransportPauseIndexReplicationAction.kt
96341a5 to 2351fa1
Signed-off-by: Mohit Kumar <[email protected]>
…cel" This reverts commit b85dc35. Signed-off-by: Mohit Kumar <[email protected]>
43629d5 to e904ee1
This reverts commit 2351fa1. Signed-off-by: Mohit Kumar <[email protected]>
f407086 to 759e8dc
9a48b36 to 1cd2550
…narios Signed-off-by: Mohit Kumar <[email protected]>
1cd2550 to 2b60eb0
Description
When CCR is stopped or paused, all index and shard replication tasks should be stopped. But if the stop/pause does not complete successfully, some of the replication tasks might keep running. This can cause a conflict when we restart/resume replication.
We have taken the following actions to rectify this:
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.