Demo to the team on the PyTorch profiler, trace collection, and trace analysis.
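A possible starting point for the trace collection and analysis part of the demo is sketched below. It is a minimal example, not the team's actual workload: the function name `collect_trace`, the `demo_matmul` label, and the toy matmul loop are all placeholders made up for illustration. It uses `torch.profiler.profile` to record a few steps, exports a Chrome-format trace, and returns the per-operator summary table.

```python
def collect_trace(trace_path, steps=5):
    """Profile a few toy steps, export a Chrome trace, return a summary table.

    collect_trace and the "demo_matmul" label are hypothetical names used
    only for this sketch.
    """
    # Imported inside the function so the file still loads where PyTorch
    # is not installed.
    import torch
    from torch.profiler import ProfilerActivity, profile, record_function

    activities = [ProfilerActivity.CPU]
    device = "cpu"
    if torch.cuda.is_available():  # also reports True on ROCm builds
        activities.append(ProfilerActivity.CUDA)
        device = "cuda"

    x = torch.randn(256, 256, device=device)
    with profile(activities=activities, record_shapes=True) as prof:
        for _ in range(steps):
            # record_function adds a named range to the trace timeline.
            with record_function("demo_matmul"):
                x = torch.tanh(x @ x)  # tanh keeps values bounded

    # Trace collection: write a JSON trace that a trace viewer can open.
    prof.export_chrome_trace(trace_path)

    # Trace analysis: aggregate events per operator into a text table.
    return prof.key_averages().table(sort_by="cpu_time_total", row_limit=10)
```

Calling `print(collect_trace("demo_trace.json"))` should print the operator table and leave a trace file that can be opened in Perfetto or `chrome://tracing`.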
Plan to give this in the last week of Jan.
torch.cuda.memory._record_memory_history + torch.cuda.memory._dump_snapshot would be a good addition to the demo.
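A rough sketch of how the two memory calls fit together is below. Both are underscore-prefixed private APIs, so their behavior may differ between PyTorch versions, and whether they work on a given ROCm build is an assumption to verify before the demo. The helper name `dump_memory_snapshot` and the toy allocations are made up for illustration.

```python
def dump_memory_snapshot(snapshot_path, max_entries=100_000):
    """Record allocator history for a few toy allocations and dump a snapshot.

    Returns False when no GPU is visible, True after writing the snapshot.
    dump_memory_snapshot is a hypothetical helper name for this sketch.
    """
    # Imported inside the function so the file still loads where PyTorch
    # is not installed.
    import torch

    if not torch.cuda.is_available():
        return False

    # Start recording allocation and free events (private API).
    torch.cuda.memory._record_memory_history(max_entries=max_entries)
    try:
        # Toy allocations so the snapshot has something to show.
        buffers = [torch.empty(1024, 1024, device="cuda") for _ in range(8)]
        del buffers
        # Write the recorded history to a pickle file (private API).
        torch.cuda.memory._dump_snapshot(snapshot_path)
    finally:
        # Passing enabled=None stops the recording.
        torch.cuda.memory._record_memory_history(enabled=None)
    return True
```

The resulting pickle can be dragged into the memory visualizer at pytorch.org/memory_viz to inspect allocations over time.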
https://amd.atlassian.net/wiki/spaces/~glencao2/pages/331547168/DL+profiling+tools+libs+etc.
We can demo the PyTorch profiler, rocprof, SQTT, and rocprof-compute (formerly Omniperf).