Validation scripts do not work #5

@oriolerm

I run the validation scripts with this command:
python validate_all.py --config_json_file example_validate.json

Then the output I get is this:

`
Global Parameters:
Link speed gbps: ['100Gbps']
Oversubscription ratio: ['1:1']
Topology sizes: [128]
Cc algo: ['nscc', 'rccc', 'rccc+os_cc', 'nscc+rccc']

Experiments:
Experiment Name: permutation
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'nscc'} and subparameters: {'message_size_bytes': 262144}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B.cm 128 128 262144 0 42
Nodes: 128
Connections: 128
Flowsize: 262144 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B.cm -end 1000000 -sender_cc_only -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B_nscc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'nscc'} and subparameters: {'message_size_bytes': 1048576}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B.cm 128 128 1048576 0 42
Nodes: 128
Connections: 128
Flowsize: 1048576 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B.cm -end 1000000 -sender_cc_only -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B_nscc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'nscc'} and subparameters: {'message_size_bytes': 4194304}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B.cm 128 128 4194304 0 42
Nodes: 128
Connections: 128
Flowsize: 4194304 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B.cm -end 1000000 -sender_cc_only -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B_nscc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'rccc'} and subparameters: {'message_size_bytes': 262144}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B.cm 128 128 262144 0 42
Nodes: 128
Connections: 128
Flowsize: 262144 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B.cm -end 1000000 -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 -force_disable_oversubscribed_cc > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B_rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'rccc'} and subparameters: {'message_size_bytes': 1048576}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B.cm 128 128 1048576 0 42
Nodes: 128
Connections: 128
Flowsize: 1048576 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B.cm -end 1000000 -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 -force_disable_oversubscribed_cc > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B_rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'rccc'} and subparameters: {'message_size_bytes': 4194304}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B.cm 128 128 4194304 0 42
Nodes: 128
Connections: 128
Flowsize: 4194304 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B.cm -end 1000000 -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 -force_disable_oversubscribed_cc > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B_rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'rccc+os_cc'} and subparameters: {'message_size_bytes': 262144}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B.cm 128 128 262144 0 42
Nodes: 128
Connections: 128
Flowsize: 262144 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B.cm -end 1000000 -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B_rccc+os_cc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'rccc+os_cc'} and subparameters: {'message_size_bytes': 1048576}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B.cm 128 128 1048576 0 42
Nodes: 128
Connections: 128
Flowsize: 1048576 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B.cm -end 1000000 -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B_rccc+os_cc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'rccc+os_cc'} and subparameters: {'message_size_bytes': 4194304}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B.cm 128 128 4194304 0 42
Nodes: 128
Connections: 128
Flowsize: 4194304 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B.cm -end 1000000 -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B_rccc+os_cc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'nscc+rccc'} and subparameters: {'message_size_bytes': 262144}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B.cm 128 128 262144 0 42
Nodes: 128
Connections: 128
Flowsize: 262144 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B.cm -end 1000000 -sender_cc -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size262144B_nscc+rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'nscc+rccc'} and subparameters: {'message_size_bytes': 1048576}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B.cm 128 128 1048576 0 42
Nodes: 128
Connections: 128
Flowsize: 1048576 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B.cm -end 1000000 -sender_cc -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size1048576B_nscc+rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out
Running permutation with global parameters: {'link_speed_Gbps': '100Gbps', 'oversubscription_ratio': '1:1', 'topology_sizes': 128, 'cc_algo': 'nscc+rccc'} and subparameters: {'message_size_bytes': 4194304}
Creating CM named python ../connection_matrices/gen_permutation.py experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B.cm 128 128 4194304 0 42
Nodes: 128
Connections: 128
Flowsize: 4194304 bytes
ExtraStartTime: 0.0 us
Random Seed 42
Executing: ../htsim_uec -tm experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B.cm -end 1000000 -sender_cc -topo ../topologies/fat_tree_128_1os.topo -linkspeed 100000 > experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp/permutation_size4194304B_nscc+rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out
No valid runtimes found in file permutation_size4194304B_rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size262144B_nscc+rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size262144B_nscc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size262144B_rccc+os_cc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size4194304B_nscc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size4194304B_rccc+os_cc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size262144B_rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size1048576B_nscc+rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size1048576B_nscc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size1048576B_rccc+os_cc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size4194304B_nscc+rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
No valid runtimes found in file permutation_size1048576B_rccc_os_ratio1:1_size_topo128_link_speed100Gbps.out. Skipping.
Traceback (most recent call last):
  File "/mnt/nvme0/robin/simulations/HTSIM/htsim/sim/datacenter/validation/validate_all.py", line 278, in <module>
    main()
  File "/mnt/nvme0/robin/simulations/HTSIM/htsim/sim/datacenter/validation/validate_all.py", line 275, in main
    launch_experiments(data['experiments'], global_combinations, global_parameters, args)
  File "/mnt/nvme0/robin/simulations/HTSIM/htsim/sim/datacenter/validation/validate_all.py", line 242, in launch_experiments
    handle_experiment(experiment, global_combinations, global_parameters, args)
  File "/mnt/nvme0/robin/simulations/HTSIM/htsim/sim/datacenter/validation/validate_all.py", line 236, in handle_experiment
    analysis_and_plotting.plot_runtimes(directory_tmp, directory, args)
  File "/mnt/nvme0/robin/simulations/HTSIM/htsim/sim/datacenter/validation/analysis_and_plotting.py", line 200, in plot_runtimes
    if ("incast" in df['Experiment'].values):
                    ~~^^^^^^^^^^^^^^
  File "/mnt/nvme0/robin/simulations/HTSIM/htsim/lib/python3.12/site-packages/pandas/core/frame.py", line 4102, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/nvme0/robin/simulations/HTSIM/htsim/lib/python3.12/site-packages/pandas/core/indexes/range.py", line 417, in get_loc
    raise KeyError(key)
KeyError: 'Experiment'
`
It seems that no flow completions are being recorded: every `.out` file is skipped with "No valid runtimes found", so the plotting step has no data to work with and fails with `KeyError: 'Experiment'`.
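One way to narrow this down is to check whether the simulator wrote anything at all to the `.out` files. The sketch below is only a diagnostic aid and not part of the validation scripts; the helper names `summarize_out_files` and `empty_out_files` are made up for illustration, and the directory is the one printed in the log above. An empty file would suggest that `../htsim_uec` failed to start or exited early (the `>` redirection captures stdout only), while a non-empty file without runtimes would point at the flows not finishing or at the parsing step.

```python
from pathlib import Path


def summarize_out_files(tmp_dir):
    """Return (name, size_in_bytes, line_count) for each .out file in tmp_dir."""
    summary = []
    for path in sorted(Path(tmp_dir).glob("*.out")):
        # errors="replace" keeps the scan going even if a file has odd bytes.
        text = path.read_text(errors="replace")
        summary.append((path.name, path.stat().st_size, len(text.splitlines())))
    return summary


def empty_out_files(tmp_dir):
    """Names of .out files that contain no bytes at all."""
    return [name for name, size, _ in summarize_out_files(tmp_dir) if size == 0]


if __name__ == "__main__":
    # Directory taken from the log above; adjust if the run used another one.
    tmp = "experiments/permutation_size128_osratio1:1_linkspeed100Gbps/tmp"
    for name, size, lines in summarize_out_files(tmp):
        print(f"{size:>10} bytes {lines:>8} lines  {name}")
```

Run it from the `validation` directory, where the `experiments/` folder is created.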