Bug Description
It looks like the FillingModeFlag can no longer be disabled.
In the Pilot code (Pilot/Pilot/pilotCommands.py, line 1042 in 10330ec), it is set to True when we have a PoolCE (i.e. multiple jobs can be executed in parallel in different processes):
    "-o FillingModeFlag=True",
In the JobAgent, it is also set to True by default: https://github.com/DIRACGrid/DIRAC/blob/b0bc9cfb83af19215870cee0541a4e404471c53b/src/DIRAC/WorkloadManagementSystem/Agent/JobAgent.py#L71
This default looks correct, provided we manage to solve our issues with computing the CPU Time Left in the allocations (see DIRACGrid/DIRAC#8416).
Expected Behavior
Time calculations can sometimes be wrong, so I guess we should be able to configure the FillingModeFlag for a given queue.
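To illustrate what a per-queue setting could look like, here is a minimal sketch. It is not existing DIRAC or Pilot code: the function name `resolve_filling_mode`, the idea of reading `FillingModeFlag` from a dictionary of queue options, and the accepted string values are all assumptions made for illustration. The only point is the precedence: an explicit queue-level value wins, otherwise the current default (True) applies.

```python
from typing import Mapping

# Hypothetical sets of accepted string values (configuration options are
# assumed to arrive as strings).
_TRUE_VALUES = {"true", "yes", "y", "1"}
_FALSE_VALUES = {"false", "no", "n", "0"}


def resolve_filling_mode(queue_options: Mapping[str, str], default: bool = True) -> bool:
    """Return the filling mode for a queue (hypothetical helper).

    An explicit 'FillingModeFlag' in the queue options overrides the default.
    Missing or unrecognised values fall back to the default.
    """
    raw = queue_options.get("FillingModeFlag")
    if raw is None:
        return default
    value = str(raw).strip().lower()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    # Unrecognised value: keep the default rather than guessing.
    return default
```

With this, a queue where time calculations are known to be unreliable could set `FillingModeFlag = False`, while every other queue would keep today's behaviour.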
Actual Behavior
With the current configuration, we have no way to work around time calculation issues.
Additional Context
I am wondering whether the filling mode should always be disabled for simplicity (single-core pilots would execute 1 job and stop, multi-core pilots would fill the slots once and stop) as we move towards more multi-core jobs.
- we should see what we gain/lose from it (in LHCb I see that pilots generally execute more than 1 job, but I also see a significant number of stalled pilots/jobs)
- it could work fine if we had an accurate estimate of the time left in the allocation as well as of the CPU power, which is not the case now (e.g. we can't get the time left in HTCondor allocations)
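The two behaviours discussed above can be sketched as a single decision function. This is only an illustration under stated assumptions, not the JobAgent logic: the name `should_request_job`, its parameters, and the choice to stop when no time-left estimate is available are all hypothetical. The function is meant to be called each time a slot is free.

```python
from typing import Optional


def should_request_job(
    filling_mode: bool,
    jobs_started: int,
    slots: int,
    time_left: Optional[float],
    min_time: float,
) -> bool:
    """Decide whether the pilot should fetch another job (hypothetical sketch).

    jobs_started: number of jobs this pilot has started so far.
    slots: number of parallel slots (1 for a single-core pilot).
    time_left: estimated time left in the allocation, or None if unknown.
    min_time: minimum time left required to start one more job.
    """
    # First pass: the slots are always filled once, whatever the mode.
    if jobs_started < slots:
        return True
    # Filling mode disabled: stop after the first pass.
    if not filling_mode:
        return False
    # Filling mode enabled: assumption here is to stop when the estimate
    # is unavailable, since it cannot be trusted.
    if time_left is None:
        return False
    return time_left >= min_time
```

With filling mode disabled, a single-core pilot (`slots=1`) runs exactly one job and a multi-core pilot starts at most `slots` jobs. With it enabled, the result depends entirely on the quality of `time_left`, which is the weak point described above.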