Suggestions from @justbennet:
...thought you might like to know about a few additional scripts, in case you wanted to add them to the Commands section of SLURM Basics.
First, we wrote my_job_statistics to produce (hopefully) friendlier output than seff.
Second, we wrote my_job_header to print information about the job and its environment that would be useful for debugging or reproducibility.
Third, we wrote my_job_estimate, which used to take a job script as input and report the estimated cost if the job ran for its full walltime. It now seems to take only command-line arguments, but it may still prove useful. I put an example below.
Here is a comparison of the seff/my_job_statistics outputs.
[bennet@gl-build mpi_example]$ seff 16971600
Job ID: 16971600
Cluster: greatlakes
User/Group: bennet/bennet
State: COMPLETED (exit code 0)
Nodes: 2
Cores per node: 2
CPU Utilized: 00:01:30
CPU Efficiency: 47.87% of 00:03:08 core-walltime
Job Wall-clock time: 00:00:47
Memory Utilized: 17.09 MB
Memory Efficiency: 0.42% of 4.00 GB
[bennet@gl-build mpi_example]$ my_job_statistics 16971600
End of job summary for JobID 16971600 on the greatlakes cluster for bennet
Job name: test_mpi
Job start time: 01/16/2021 09:31:23
Job end time: 01/16/2021 09:32:10
Job running time: 00:00:47
State: COMPLETED
Exit code: 0
On nodes: gl[3137-3138]
(2 nodes with 2 cores per node)
CPU Utilized: 00:01:30
CPU Efficiency: 47.87% of 00:03:08 total CPU time (cores * walltime)
Memory Utilized: 17.09 MB
Memory Efficiency: 0.42% of 4.00 GB
Cost: $0.00
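For reference, the efficiency figures above can be reproduced by hand: CPU efficiency is CPU time used divided by cores times walltime. Here is a quick sketch using the values from job 16971600 (the formula is my reading of the output, not taken from the script's source):

```shell
#!/bin/sh
# Reproduce the CPU efficiency reported by seff / my_job_statistics
# for job 16971600 (values copied from the output above).
cores=4            # 2 nodes x 2 cores per node
walltime_s=47      # 00:00:47 wall-clock time
cpu_used_s=90      # 00:01:30 CPU Utilized
core_walltime_s=$(( cores * walltime_s ))   # 188 s = 00:03:08
# POSIX shell has no floating point, so use awk for the percentage
awk -v u="$cpu_used_s" -v t="$core_walltime_s" \
    'BEGIN { printf "CPU Efficiency: %.2f%% of %d s core-walltime\n", 100*u/t, t }'
# prints: CPU Efficiency: 47.87% of 188 s core-walltime
```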
Here is what my_job_header printed at the top of the output. It contains most of the job details we are likely to ask about if you were to submit a support ticket.
Job information
#-------------------------------------------------------------------
SLURM_SUBMIT_HOST gl-build.arc-ts.umich.edu
SLURM_JOB_ACCOUNT hpcstaff
SLURM_JOB_PARTITION standard
SLURM_JOB_NAME test_mpi
SLURM_JOBID 16971600
SLURM_NODELIST gl[3137-3138]
SLURM_JOB_NUM_NODES 2
SLURM_NTASKS 4
SLURM_TASKS_PER_NODE 2(x2)
SLURM_CPUS_PER_TASK 1
SLURM_NPROCS 4
SLURM_MEM_PER_CPU
SLURM_SUBMIT_DIR /home/bennet/mpi_example
scheduling priority (-e) 0
pending signals (-i) 765698
max memory size (kbytes, -m) 2097152
open files (-n) 131072
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
max user processes (-u) 765698
Running on gl3137.arc-ts.umich.edu at Sat Jan 16 09:31:24 EST 2021
Currently Loaded Modules:
1) intel/18.0.5 2) impi/2018.4.274
Your job output begins below the line
#-------------------------------------------------------------------
The sum = 0.866386
. . . .
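For anyone curious how a header like that could be produced, here is a minimal hypothetical sketch; the real my_job_header is site-specific and may differ:

```shell
#!/bin/bash
# Hypothetical sketch of a job-header script; NOT the actual
# my_job_header source. Run near the top of a batch job script.
echo "Job information"
echo "#-------------------------------------------------------------------"
# SLURM_* variables the scheduler sets for this job
env | grep '^SLURM_' | sort
# Resource limits in effect (similar to the ulimit listing above)
ulimit -a
echo "Running on $(hostname) at $(date)"
# Show loaded environment modules, if a module system is present
command -v module >/dev/null 2>&1 && module list
echo "Your job output begins below the line"
echo "#-------------------------------------------------------------------"
```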
Here is an example of the cost estimator in use:
[bennet@gl-build mpi_example]$ my_job_estimate -p standard -n 2 -c 8 -m 4gb -t 1:00:00
Job Detail Summary:
Partition: standard
Total Nodes: 2
Total Cores: 8
Total Memory: 4096.0MB
Walltime: 0 day(s)
1 hour(s)
00 minute(s)
00 second(s)
Cost Estimate:
Total: $0.21 for 1.0 hours.
NOTE: This price estimate assumes your job runs
for the full walltime. Cost is subject to change.
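The estimator presumably multiplies the billable resources by a per-unit rate and the requested walltime. A sketch of that arithmetic follows; the rate below is an assumed example value for illustration, not an actual Great Lakes billing rate:

```shell
#!/bin/sh
# Hypothetical sketch of a full-walltime cost estimate.
# rate_per_core_hour is an ASSUMED example value (USD), not the real rate.
cores=8
hours=1.0
rate_per_core_hour=0.026
awk -v c="$cores" -v h="$hours" -v r="$rate_per_core_hour" \
    'BEGIN { printf "Total: $%.2f for %.1f hours.\n", c*h*r, h }'
```

With the assumed rate, this happens to reproduce the $0.21 figure above, but the real estimator may also weight memory, GPUs, or partition type.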