Add Android Startup Perfetto trace support#4776
Conversation
…testing. Still need to add copy of the file to the upload dir if running in helix.
d340e3b to
fc53ef5
Compare
|
Tested successfully locally and initial use is just in manual runs. |
src/scenarios/shared/runner.py
Outdated
| # Wait until the total time taken to start the app is greater than the max startup time + 3 | ||
| # This is to ensure that the trace has captured the entire startup process | ||
| getLogger().info("Ensuring the trace capture has been completed.") | ||
| while time.time() - perfetto_start_time_sec < max_startup_time_sec + 3: |
There was a problem hiding this comment.
where the 3 seconds come from?
Would it make a sense to try verify the trace capturing finished somehow?
This way it feels potentially fragile
There was a problem hiding this comment.
Completely moved away from using the seconds and instead switched to killing the tracing after we finish the startup command. This background start and then kill seems to be the recommended approach after further digging in the docs: https://perfetto.dev/docs/concepts/config#android.
src/scenarios/shared/runner.py
Outdated
| pull_trace_cmd = xharness_adb() + [ | ||
| 'pull', | ||
| perfetto_device_save_file, | ||
| os.path.join(os.getcwd(), const.TRACEDIR, f'perfetto_startup_trace_{self.packagename}_{time.time()}.trace') |
There was a problem hiding this comment.
nit: Would it make sense to use formatted time instead of a number here? Would somebody be checking the files manually?
There was a problem hiding this comment.
Would somebody be checking the files manually?
The idea is that users would upload the captured trace to https://ui.perfetto.dev and analyzed the results there.
There was a problem hiding this comment.
Thanks. Does the filename format matter anyhow for the tool? If not (meaning the filename would not get processed by a machine) than some more human friendly time format make more sense to me. Still a nit, though.
There was a problem hiding this comment.
Does the filename format matter anyhow for the tool?
Unless we do some very weird filename it shouldn't matter. Human friendly file name would be nice :)
There was a problem hiding this comment.
Great idea, I have made the change to instead have a nicely formatted datetime instead. It will be in my next update.
src/scenarios/shared/runner.py
Outdated
|
|
||
| perfetto_cmd = xharness_adb() + [ | ||
| 'shell', | ||
| f'perfetto --background --txt -o {perfetto_device_save_file} --time {max_startup_time_sec + 3}s -b 64mb sched freq idle am wm gfx view binder_driver hal dalvik camera input res memory' |
There was a problem hiding this comment.
where the +3 comes from?
I have no idea about the usual "magnitude" of the max_startup_time_sec - but if it's just some kind of reserve, it might be better to add some percentage to the original value (in case the value might vary significantly)?
There was a problem hiding this comment.
We generally want to run the tracing for the startup duration, that is what is calculated in:
max_startup_time_sec = int(max(int(re.search(r"TotalTime: (\d+)", str(result)).group(1)) for result in allResults) / 1000).
+3s is probably just to introduce some slack if the startup would take unexpectedly longer.
I would maybe add the max_startup_time_sec + 3 as a separate variable value with the comment from below why we use + 3 to make it a bit clearer.
I have no idea about the usual "magnitude" of the max_startup_time_sec
In scenarios we are measuring the startup ranges from hundreds of milliseconds to low seconds. Making it, for example, 1.5x of the max startup should be good enough as well.
There was a problem hiding this comment.
Completely moved away from using the seconds and instead switched to killing the tracing after we finish the startup command. This background start and then kill seems to be the recommended approach after further digging in the docs: https://perfetto.dev/docs/concepts/config#android. Still kept a max time for the tracing at 2x the max found startup, up to 60 seconds. At 60 seconds it throws an error as an arbitrary but, I think reasonable, cut off.
e-kharion
left a comment
There was a problem hiding this comment.
LGTM in general, left few comments/questions
|
|
||
| try: | ||
| # Get the current value of persist.traced.enable | ||
| getLogger().info("Getting current persist.traced.enable value") |
There was a problem hiding this comment.
super nit: I know you follow the pattern here, but I feel that decomposing the code a bit would improve the readability/maintainability significantly. For example separating all the commands into some helper commands library class and only call those here would make the code much nicer in my opinion.
There was a problem hiding this comment.
I agree with this assessment, so I made a task here: #4784.
…time human readable, and switching from catching the trace by taking the max startup and adding 3 seconds to instead just killing perfetto once startup testing is complete.
src/scenarios/shared/runner.py
Outdated
|
|
||
| # Stop perfetto now that the app has started. Sending a Terminate signal should be enough per the longer trace capturing guidance here: https://perfetto.dev/docs/concepts/config#android. | ||
| getLogger().info("Stopping perfetto trace capture") | ||
| stop_perfetto_cmd = xharness_adb() + [ |
There was a problem hiding this comment.
Should the perfetto killall also be in the finally block?
There was a problem hiding this comment.
Additionally, it might make sense to run a precautionary perfetto killall before running the kill command at the top. I know a couple times we have been bitten by processes not being killed.
There was a problem hiding this comment.
Having a kill before the execution is a great idea. I have updated the code per these comments. I included a perfetto kill in the finally block, though am not super worried about it as the time parameter in the call starting the trace should ensure it ends eventually.
…art and to kill as a finally command if we failed before the normal kill spot.
Add support for getting perfetto traces for the android runs. This will start out as a primarily manual thing to help users of our scenario testing when doing manual runs, or manually setup pipeline runs.