
@milenkovicm
Contributor

@milenkovicm milenkovicm commented May 21, 2025

Which issue does this PR close?

Closes #.

Rationale for this change

Previously the job id was generated randomly, without any ordering guarantees, which makes it rather hard to determine the ordering of jobs and returns jobs in essentially random order from the REST API. Spark generates job ids by incrementing an atomic integer.

What changes are included in this PR?

  • use an atomic integer to generate the job id; I do not see a valid reason not to do it
  • move job id generation from the task manager to the scheduler (if we later want to make it configurable, it will be easier to do)
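
The atomic-counter approach can be sketched as below. This is a minimal illustration, not the actual Ballista scheduler API; the `JobIdGenerator` name and `job-` prefix are made up for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical sketch of an incremental job id generator.
/// Names are illustrative, not the real scheduler types.
pub struct JobIdGenerator {
    next: AtomicU64,
}

impl JobIdGenerator {
    pub fn new() -> Self {
        Self { next: AtomicU64::new(0) }
    }

    /// `fetch_add` is atomic, so concurrent callers always get
    /// unique, monotonically increasing ids.
    pub fn next_id(&self) -> String {
        format!("job-{}", self.next.fetch_add(1, Ordering::Relaxed))
    }
}

fn main() {
    let gen = JobIdGenerator::new();
    assert_eq!(gen.next_id(), "job-0");
    assert_eq!(gen.next_id(), "job-1");
}
```

Note that the counter restarts from zero on scheduler restart, which is exactly the leftover-directory concern raised in the open points below.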

Are there any user-facing changes?

  • no, though users might be confused by the new job id format

Open points

  • job_id is used as the directory name where job (shuffle) data is persisted; if we use an incremental id, we may run into problems with leftover directories in case of a scheduler restart

@milenkovicm milenkovicm marked this pull request as draft May 21, 2025 19:37
@Dandandan
Contributor

How would this work in a multi-scheduler setup? Maybe add the scheduler id as the first item?

@milenkovicm
Contributor Author

hey @Dandandan it does not; the change is focused more on aligning with Spark, which to my knowledge does not have a multi-scheduler setup.

I'm considering using ULIDs, which may be a better fit: they are random and sortable. The issue with ULIDs is that they are quite big strings. They would work with multiple schedulers.
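
For context, a ULID is a 48-bit millisecond timestamp followed by 80 bits of entropy, Crockford-base32 encoded into a 26-character string. A rough std-only sketch (the entropy is passed in by the caller here; a real implementation would draw it from a CSPRNG, and the `ulid_like` name is made up):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Crockford base32 alphabet: digits then uppercase letters,
// skipping I, L, O, U; ASCII order, so ids sort lexicographically.
const ALPHABET: &[u8; 32] = b"0123456789ABCDEFGHJKMNPQRSTVWXYZ";

/// Sketch of a ULID-style id: high 48 bits hold the millisecond
/// timestamp, low 80 bits hold caller-supplied entropy.
fn ulid_like(entropy: u128) -> String {
    let millis = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_millis() as u128;
    let value = (millis << 80) | (entropy & ((1u128 << 80) - 1));
    let mut out = [0u8; 26];
    for i in 0..26 {
        // encode 5 bits per character, most significant first
        let shift = 5 * (25 - i);
        out[i] = ALPHABET[((value >> shift) & 0x1f) as usize];
    }
    String::from_utf8(out.to_vec()).unwrap()
}

fn main() {
    let a = ulid_like(42);
    let b = ulid_like(42);
    assert_eq!(a.len(), 26);
    // an id created later sorts lexicographically at or after an earlier one
    assert!(b >= a);
}
```

The 26-character length is the "quite big strings" downside mentioned above.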

wdyt ?

@Dandandan
Contributor

> hey @Dandandan it does not; the change is focused more on aligning with Spark, which to my knowledge does not have a multi-scheduler setup.
>
> I'm considering using ULIDs, which may be a better fit: they are random and sortable. The issue with ULIDs is that they are quite big strings. They would work with multiple schedulers.
>
> wdyt ?

In the API, it's already possible to sort by time (e.g. job start time) based on the job information.

Why would it need to be in the job id?

@milenkovicm
Contributor Author

When going through logs, for example, sortable ids make it easier to reason about job order.

@milenkovicm
Contributor Author

There is an issue with tying the job id to a physical directory: it can make a mess when the scheduler is restarted without restarting the executors, making it possible for job data directories to overlap (shuffle readers may read the wrong spill in the current implementation). So just using an incremental id in the current implementation may cause problems.

I was looking at ULIDs; they generate random but sortable ids. The downside is that they are too big (IMHO), and not the easiest to reason about ordering either.

Snowflake-like ids are sortable and unique; for them, a machine id should be exposed as a configuration parameter.
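
A Snowflake-style id packs a timestamp, a machine id, and a per-millisecond sequence into one 64-bit integer. The sketch below uses Twitter's original field widths (41/10/12 bits); the `snowflake` function name and widths are illustrative, not a proposal for specific Ballista constants.

```rust
/// Hypothetical Snowflake-style id:
/// [41-bit ms timestamp | 10-bit machine id | 12-bit sequence].
/// The machine id would come from configuration, as suggested above.
fn snowflake(timestamp_ms: u64, machine_id: u64, sequence: u64) -> u64 {
    assert!(machine_id < (1 << 10));
    assert!(sequence < (1 << 12));
    (timestamp_ms << 22) | (machine_id << 12) | sequence
}

fn main() {
    let a = snowflake(1_000, 3, 0);
    let b = snowflake(1_001, 3, 0);
    // a later timestamp yields a numerically larger, hence sortable, id
    assert!(b > a);
    // distinct machine ids keep concurrent schedulers from colliding
    assert_ne!(snowflake(1_000, 3, 0), snowflake(1_000, 4, 0));
}
```

This addresses the multi-scheduler question: the timestamp gives ordering, while the configured machine id gives uniqueness across schedulers.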

The current approach is not sortable, and there is a slight chance of a name collision (though with very low probability).

@Dandandan
Contributor

Dandandan commented Jun 1, 2025

Hm yeah. Nothing against it, just my thoughts :). I think ULID would be preferable over an atomic / incremental id.

