feat: job id is incremental #1267
base: main
Conversation
How would this work in a multi-scheduler setup? Maybe add the scheduler id as a first item?
hey @Dandandan it does not; the change is focused more on aligning with Spark, which to my knowledge does not have a multi-scheduler setup. I'm considering ULIDs, which may be a better fit: they are random and sortable. The issue with ULIDs is that they are quite big strings. They would work with multiple schedulers. wdyt?
In the API, it's possible to sort by time (e.g. job start time) as well based on the job information. Why would it need to be in the job id?
When going through logs, for example, it makes it easier to reason about.
There is an issue with having the job id tied to a physical directory: it may make a mess when the scheduler is restarted without restarting the executors, because job data directories could then overlap (shuffle readers may read the wrong spill in the current implementation). So just using an incremental id in the current implementation may cause problems. I was looking at Snowflake-like ids, which are sortable and unique; for them the machine id should be exposed as a configuration parameter. The current approach is not sortable, and there is a slight chance of a name collision (with very low probability).
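For illustration, a Snowflake-like id as mentioned above could be sketched roughly as follows. This is only a minimal sketch, not code from this PR: the type `SnowflakeIdGenerator`, the bit layout (41 bits of Unix milliseconds, 10 bits of machine id, 12 bits of sequence) and the zero-padded hex rendering are all assumptions, and `machine_id` stands in for the configuration parameter discussed here.

```rust
use std::sync::Mutex;
use std::time::{SystemTime, UNIX_EPOCH};

// Assumed layout: 41 bits of milliseconds | 10 bits machine id | 12 bits sequence.
const MACHINE_BITS: u32 = 10;
const SEQUENCE_BITS: u32 = 12;
const MAX_SEQUENCE: u64 = (1 << SEQUENCE_BITS) - 1;

/// Hypothetical Snowflake-style generator (illustrative, not a Ballista API).
pub struct SnowflakeIdGenerator {
    machine_id: u64,
    // (last timestamp in ms, last sequence number used within that ms)
    state: Mutex<(u64, u64)>,
}

impl SnowflakeIdGenerator {
    /// `machine_id` would come from configuration, one distinct value per scheduler.
    pub fn new(machine_id: u64) -> Self {
        assert!(machine_id < (1 << MACHINE_BITS), "machine id must fit in 10 bits");
        Self { machine_id, state: Mutex::new((0, 0)) }
    }

    fn now_ms() -> u64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock is before the Unix epoch")
            .as_millis() as u64
    }

    /// Returns an id that is strictly greater than any id previously returned
    /// by this generator instance.
    pub fn next_id(&self) -> u64 {
        let mut state = self.state.lock().unwrap();
        let (last_ms, last_seq) = *state;
        // Never move backwards, even if the wall clock does.
        let mut ms = Self::now_ms().max(last_ms);
        let seq = if ms == last_ms {
            if last_seq == MAX_SEQUENCE {
                // Sequence exhausted for this millisecond: advance logical time.
                ms += 1;
                0
            } else {
                last_seq + 1
            }
        } else {
            0
        };
        *state = (ms, seq);
        (ms << (MACHINE_BITS + SEQUENCE_BITS)) | (self.machine_id << SEQUENCE_BITS) | seq
    }

    /// Zero-padded hex, so that string order matches numeric order.
    pub fn next_job_id(&self) -> String {
        format!("{:016x}", self.next_id())
    }
}
```

With something like `SnowflakeIdGenerator::new(7).next_job_id()`, ids from one scheduler sort by creation order, and two schedulers with different machine ids cannot produce the same value. A scheduler restart only risks reuse if the wall clock moves backwards across the restart.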
Hm yeah. Not anything against it, but just my thoughts :). I think
Which issue does this PR close?
Closes #.
Rationale for this change
Previously, the job id was generated randomly, without any ordering guarantees, which makes it rather hard to determine the ordering of jobs and returns results in effectively random order on the REST API. Spark generates job ids by incrementing an atomic int.
What changes are included in this PR?
Are there any user-facing changes?
Open points
job_id is used as the directory name where job (shuffle) data is persisted; if we use an incremental id, we may run into problems with leftover directory names after a scheduler restart.
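To make the open point concrete, here is a minimal sketch of an in-memory incremental id, assuming a plain atomic counter that starts from zero on every scheduler start. The type `IncrementalJobIds` and the `job-{n}` format are hypothetical and only serve the illustration; they are not the code in this PR.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical in-memory job id counter (illustrative only).
pub struct IncrementalJobIds {
    next: AtomicU64,
}

impl IncrementalJobIds {
    /// A freshly started scheduler begins counting from zero again,
    /// because nothing is persisted across restarts in this sketch.
    pub fn new() -> Self {
        Self { next: AtomicU64::new(0) }
    }

    /// Returns the next id; the same string would double as the
    /// directory name for the job's shuffle data.
    pub fn next_job_id(&self) -> String {
        // fetch_add returns the previous value, so ids are 0, 1, 2, ...
        let n = self.next.fetch_add(1, Ordering::Relaxed);
        format!("job-{}", n)
    }
}
```

Under these assumptions, a scheduler that has handed out `job-0` and is then restarted will hand out `job-0` again, so an executor that kept running may still hold a directory with that name from the earlier job.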