Qualcomm AI Engine Direct - Python API Refactor by winskuo-quic · Pull Request #18312 · pytorch/executorch

winskuo-quic · 2026-03-19T07:51:12Z

Summary

The main goal of this PR is to improve the user experience by keeping all example scripts consistent and providing an official config file for QNN APIs.
Previously, users had to manually pass params to APIs such as build_executorch_binary to make them work. However, not all the params were passed through to build_executorch_binary, so some flags did not work, which left users confused. Taking the example below, if a user tries to skip a node in script 1, it will fail.

For this reason, we want to maintain a QnnConfig structure as the official config.
Today, when we introduce a new flag to our APIs and want all scripts to benefit from it, we need to update every example script, which becomes harder to maintain as we support more flags and more scripts. With this feature, we no longer have to update the example scripts when a new flag is introduced; QnnConfig parses the flag itself.

This PR does the following:

Introduce QnnConfig, which takes a parser or a .json file as input.
Migrate our official QNN ExecuTorch APIs to backends/qualcomm/export_utils.py. The reason is that API calls shouldn't live under the examples folder. Furthermore, pip install executorch does not include the examples/qualcomm folder, so these APIs are not exposed to users who install via pip.
The following flags can now be removed from the example scripts, and the backend API will handle the logic: compile_only, pre_gen_pte, skip_push, profile_level, dump_intermediate_outputs, shared_buffer, skip_delegate_node_ids, skip_delegate_node_ops.
Update the README so it aligns with the new behavior.
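
To make the idea concrete, here is a minimal sketch of a config object that can be built either from parsed command-line arguments or from a .json file. It is not the QnnConfig implementation from this PR: the class name `SketchQnnConfig`, the helper `add_common_flags`, the field types, and the defaults are all assumptions for illustration. Only the flag names come from the list above.

```python
import argparse
import json
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class SketchQnnConfig:
    """Hypothetical stand-in for QnnConfig; field types and defaults are assumed."""

    compile_only: bool = False
    pre_gen_pte: Optional[str] = None
    skip_push: bool = False
    profile_level: int = 0
    dump_intermediate_outputs: bool = False
    shared_buffer: bool = False
    skip_delegate_node_ids: List[str] = field(default_factory=list)
    skip_delegate_node_ops: List[str] = field(default_factory=list)

    @classmethod
    def from_dict(cls, values: dict) -> "SketchQnnConfig":
        known = {f.name for f in fields(cls)}
        # Drop keys the config does not own, so scripts can keep their own flags.
        return cls(**{k: v for k, v in values.items() if k in known})

    @classmethod
    def from_json(cls, path: str) -> "SketchQnnConfig":
        with open(path, "r", encoding="utf-8") as f:
            return cls.from_dict(json.load(f))

    @classmethod
    def from_args(cls, args: argparse.Namespace) -> "SketchQnnConfig":
        return cls.from_dict(vars(args))


def add_common_flags(parser: argparse.ArgumentParser) -> None:
    """Register shared flags once instead of in every example script."""
    parser.add_argument("--compile_only", action="store_true")
    parser.add_argument("--skip_push", action="store_true")
    parser.add_argument("--profile_level", type=int, default=0)
    parser.add_argument("--skip_delegate_node_ops", nargs="*", default=[])


parser = argparse.ArgumentParser()
add_common_flags(parser)
parser.add_argument("--model_name", default="demo")  # script-specific flag
args = parser.parse_args(
    ["--compile_only", "--skip_delegate_node_ops", "aten.add.Tensor"]
)
config = SketchQnnConfig.from_args(args)
print(config.compile_only, config.skip_delegate_node_ops)
```

With this shape, a new shared flag only needs a new field and one `add_argument` call in a single place; the example scripts themselves stay unchanged.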

Test plan

Passes all tests under test_qnn_delegate.py.

pytorch-bot · 2026-03-19T07:51:16Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18312

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

Rolling out OSDC (ARC) runners on pull workflow for PyTorch trunk commits

❌ 5 New Failures, 1 Cancelled Job, 5 Unrelated Failures

As of commit 50cee3e with merge base 411ede2:

NEW FAILURES - The following jobs have failed:

pull / android / run-emulator (gh)
The process '/usr/bin/sh' failed with exit code 1
trunk / test-torchao-huggingface-checkpoints (lfm2_5_1_2b, linux.arm64.2xlarge, executorch-ubuntu-22.04-g... / linux-job (gh)
RuntimeError: Command docker exec -t ddbaadb9e3e6bfeb558706b542f775f8132862cf123e37b82e05f3250b3d64ae /exec failed with exit code 1
trunk / test-torchao-huggingface-checkpoints (phi_4_mini, linux.arm64.2xlarge, executorch-ubuntu-22.04-gc... / linux-job (gh)
RuntimeError: Command docker exec -t 0a754a379be8ffb167ad0fa55ea059f2f4f6e5addd4cdcdd2f1b1dbf21014fd8 /exec failed with exit code 1
trunk / test-torchao-huggingface-checkpoints (qwen3_4b, linux.arm64.2xlarge, executorch-ubuntu-22.04-gcc1... / linux-job (gh)
RuntimeError: Command docker exec -t eee5527ab27fcc41938bc7d66a44e946ca1c13da4341109c801dfb85046ae98a /exec failed with exit code 1
trunk / unittest-release / macos / macos-job (gh)
export/tests/test_target_recipes.py::TestTargetRecipes::test_mv3_model

CANCELLED JOB - The following job was cancelled. Please retry:

trunk / test-models-macos-coreml (mv3) / macos-job (gh)
##[error]The operation was canceled.

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

pull / test-models-linux (mv2, portable, linux.2xlarge) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / unittest-arm-backend-with-no-deps (test_pytest_models_tosa) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest / windows / windows-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / unittest-editable / windows / windows-job (gh) (trunk failure)
##[error]The operation was canceled.
trunk / unittest-release / windows / windows-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-03-19T07:51:53Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

abhinaykukkadapu · 2026-03-30T23:01:31Z

backends/qualcomm/export_utils.py

+    backend_options = {
+        QnnExecuTorchBackendType.kGpuBackend: generate_gpu_compiler_spec(),
+        QnnExecuTorchBackendType.kHtpBackend: generate_htp_compiler_spec(
+            use_fp16=False if any([quant_dtype, custom_quantizer]) is not None else True


@winskuo-quic any() always yields a bool, this check should be use_fp16=not any([quant_dtype, custom_quantizer])

Thanks for the catch. This is a lot cleaner. Fixed.
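
The bug discussed in this thread is easy to reproduce in isolation. The sketch below wraps both expressions in small helper functions; the function names and the placeholder value `"int8"` are illustrative assumptions, not code from the PR.

```python
def use_fp16_buggy(quant_dtype, custom_quantizer):
    # any(...) returns True or False, never None, so "is not None" is always
    # True and this expression evaluates to False for every input.
    return False if any([quant_dtype, custom_quantizer]) is not None else True


def use_fp16_fixed(quant_dtype, custom_quantizer):
    # Use fp16 only when neither a quant dtype nor a custom quantizer is given.
    return not any([quant_dtype, custom_quantizer])


print(use_fp16_buggy(None, None))    # False (wrong: nothing requests quantization)
print(use_fp16_fixed(None, None))    # True
print(use_fp16_fixed("int8", None))  # False
```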

abhinaykukkadapu · 2026-03-30T23:03:40Z

examples/qualcomm/oss_scripts/fastvit.py

            backend=backend,
            soc_model=soc_model,
+            backend=qnn_config.backend,


are we supposed to pass both backend params here?

Ah, I think this is a rebase error. Thanks for the catch. Fixed.
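
For context, passing the same keyword twice in one call is rejected by Python at compile time, so a leftover like this would stop the script before it runs. A minimal sketch, where the callee name `build` is a placeholder rather than the function used in fastvit.py:

```python
# The call below mirrors the shape of the flagged lines: "backend" appears twice.
snippet = "build(backend=backend, soc_model=soc_model, backend=qnn_config.backend)"

try:
    # compile() only parses and compiles the expression; nothing is executed,
    # so the undefined names in the snippet do not matter here.
    compile(snippet, "<fastvit-sketch>", "eval")
    error = None
except SyntaxError as exc:
    error = exc

print(type(error).__name__)  # SyntaxError
```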

abhinaykukkadapu · 2026-03-30T23:06:51Z

examples/qualcomm/oss_scripts/moshi/mimi.py

            )
        elif args.pre_gen_pte:
            logging.info("Inference only for QNN Static Decoder")
            qnn_decode_res = inference_static_mimi_decoder(


Function signature mismatch.

Fixed. Thanks

abhinaykukkadapu · 2026-03-30T23:11:55Z

@winskuo-quic thank you for these changes; I definitely think this will help with the user experience. I tried my best to comb through the diff since it is big. I found some syntax issues that the IDE highlighted, but please double-check a few of the examples. Other than that, I think this is good to go once you rebase to resolve conflicts and address the minor things.

meta-codesync · 2026-04-02T04:25:15Z

@abhinaykukkadapu has imported this pull request. If you are a Meta employee, you can view this in D99197179.

abhinaykukkadapu · 2026-04-02T23:43:26Z

@claude review the PR for potential internal CI build breakages and buck file drift. I see these failures on the imported diff internally, for example, deep dive and provide feedback to resolve buck issues.

#link-tree/executorch/backends/qualcomm/tests/test_qnn_delegate.py", line 31, in <module>
    from executorch.backends.qualcomm.export_utils import (
ModuleNotFoundError: No module named 'executorch.backends.qualcomm.export_utils'

claude · 2026-04-02T23:44:04Z

Claude finished @abhinaykukkadapu's task in 12m 1s

Review: Internal CI Build Breakages and Buck File Drift

I've analyzed the PR changes against the Buck/TARGETS build configuration. The reported ModuleNotFoundError for executorch.backends.qualcomm.export_utils is confirmed, and there are multiple Buck file issues that need to be resolved.

Root Cause

The new file backends/qualcomm/export_utils.py (872 lines) was added but no corresponding Buck/TARGETS target was created for it. The Buck build system has no knowledge of this module, so any internal Buck-based build that tries to import it will fail.

Issue 1: Missing Buck target for `backends/qualcomm/export_utils.py` (Critical)

The file backends/qualcomm/export_utils.py is a brand new file added by this PR, but there is no corresponding entry in any of:

backends/qualcomm/BUCK
backends/qualcomm/targets.bzl

A new python_library target needs to be added. Based on the imports in export_utils.py, the target should look something like:

# In backends/qualcomm/BUCK (fbcode section) or backends/qualcomm/targets.bzl
runtime.python_library(
    name = "export_utils",
    srcs = ["export_utils.py"],
    visibility = ["PUBLIC"],
    deps = [
        "//caffe2:torch",
        "//executorch/backends/qualcomm/debugger:qnn_intermediate_debugger",
        "//executorch/backends/qualcomm/quantizer:quantizer",
        "//executorch/backends/qualcomm/serialization:serialization",
        "//executorch/backends/qualcomm/utils:utils",
        "//executorch/exir/backend:utils",
        "//executorch/exir/capture:config",
        "//executorch/exir/passes:memory_planning_pass",
        "//pytorch/ao:torchao",
    ],
)

Issue 2: `backends/qualcomm/tests/BUCK` dependency is stale (Critical)

backends/qualcomm/tests/BUCK:38 still references "//executorch/examples/qualcomm:utils" as a dependency. However, the test files (test_qnn_delegate.py at line 31 and utils.py at line 21) now import from executorch.backends.qualcomm.export_utils instead.

The dep needs to be updated (not just added):

# backends/qualcomm/tests/BUCK, line 38
-        "//executorch/examples/qualcomm:utils",
+        "//executorch/backends/qualcomm:export_utils",

Note: backends/qualcomm/tests/utils.py:45 still imports from executorch.examples.qualcomm.utils import make_output_dir, so the old "//executorch/examples/qualcomm:utils" dep may still be needed alongside the new one. The fix would be:

         "//executorch/examples/qualcomm:utils",
+        "//executorch/backends/qualcomm:export_utils",

Issue 3: `examples/qualcomm/TARGETS` `utils` target has drifted dependencies (Moderate)

The examples/qualcomm/TARGETS still defines a utils target (lines 36-44):

runtime.python_library(
    name = "utils",
    srcs = ["utils.py"],
    deps = [
        "//executorch/backends/qualcomm/partition:partition",
        "//executorch/backends/qualcomm/quantizer:quantizer",
        "//executorch/backends/qualcomm/debugger:qnn_intermediate_debugger",
    ],
)

Since examples/qualcomm/utils.py had most of its code moved to export_utils.py, these deps are now over-specified — the slimmed-down utils.py no longer imports from partition, quantizer, or qnn_intermediate_debugger. These deps should be cleaned up, or users of the old utils target that now actually need the moved code should depend on //executorch/backends/qualcomm:export_utils instead.

Issue 4: `examples/qualcomm/oss_scripts/llama/TARGETS` and `whisper/TARGETS` still reference old target (Moderate)

examples/qualcomm/oss_scripts/llama/TARGETS:153 — "//executorch/examples/qualcomm:utils" is still listed as a dep for llama_lib. If llama.py now imports from export_utils, this needs "//executorch/backends/qualcomm:export_utils" added.
examples/qualcomm/oss_scripts/whisper/TARGETS:31 — same situation for whisper_lib.

Both files' Python sources now import from executorch.backends.qualcomm.export_utils, so they need the new Buck target added as a dep.

Issue 5: `generate_qnn_executorch_compiler_spec` API change — `profile`/`optrace` → `profile_level` (Low, but worth noting)

The function signature in backends/qualcomm/utils/utils.py changed:

-    profile: bool = False,
-    optrace: bool = False,
+    profile_level: int = 0,

This is a breaking API change for any internal callers that use named arguments profile= or optrace=. All call sites in this PR have been updated, but if there are other internal callers not in this diff, they will break. Worth verifying internally.
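
One way to check for remaining callers is a small AST scan over the Python sources. The sketch below is an illustrative helper, not part of the PR; the function name `find_deprecated_calls` and the sample source are assumptions. It only detects keyword usage, so callers that passed the old arguments positionally would need a separate check.

```python
import ast

DEPRECATED_KWARGS = {"profile", "optrace"}
TARGET = "generate_qnn_executorch_compiler_spec"


def find_deprecated_calls(source: str, filename: str = "<string>"):
    """Return (line, kwarg) pairs for calls to TARGET that pass an old kwarg."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `name(...)` and `module.name(...)` call styles.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != TARGET:
            continue
        for kw in node.keywords:
            if kw.arg in DEPRECATED_KWARGS:
                hits.append((node.lineno, kw.arg))
    return hits


sample = '''
spec = generate_qnn_executorch_compiler_spec(
    soc_model=soc, backend_options=opts, profile=True
)
other = utils.generate_qnn_executorch_compiler_spec(soc, opts, profile_level=2)
'''
print(find_deprecated_calls(sample))  # [(2, 'profile')]
```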

Summary of Required Changes

| Priority | File | Action |
| --- | --- | --- |
| Critical | `backends/qualcomm/BUCK` (or `targets.bzl`) | Add `export_utils` python_library target |
| Critical | `backends/qualcomm/tests/BUCK` | Add `//executorch/backends/qualcomm:export_utils` dep to `test_qnn_delegate` target |
| Moderate | `examples/qualcomm/TARGETS` | Clean up stale deps on `utils` target |
| Moderate | `examples/qualcomm/oss_scripts/llama/TARGETS` | Add `//executorch/backends/qualcomm:export_utils` dep to `llama_lib` |
| Moderate | `examples/qualcomm/oss_scripts/whisper/TARGETS` | Add `//executorch/backends/qualcomm:export_utils` dep to `whisper_lib` |
| Low | Internal callers of `generate_qnn_executorch_compiler_spec` | Verify no other callers use old `profile=`/`optrace=` kwargs |

The critical fix is straightforward: create the Buck target for export_utils.py and wire it into the dependency graph. Without this, any Buck-based build (like internal CI) will fail with the ModuleNotFoundError reported above.

abhinaykukkadapu · 2026-04-03T03:55:02Z

@winskuo-quic the buck ones are spot on, can you please patch?

winskuo-quic · 2026-04-03T12:51:46Z

@winskuo-quic the buck ones are spot on, can you please patch?

@abhinaykukkadapu thanks for providing the feedback. I tried to address the issues suggested by claude. Please let me know if your buck build is still failing.
Thanks

abhinaykukkadapu · 2026-04-08T23:07:53Z

@winskuo-quic can you please rebase there seems to be some conflicts. I've merged your minimal inference PR and will do this one after you rebase.

winskuo-quic · 2026-04-09T05:02:58Z

@winskuo-quic can you please rebase there seems to be some conflicts. I've merged your minimal inference PR and will do this one after you rebase.

@abhinaykukkadapu,
I have rebased and resolved the conflict. Thanks for the reminder.

abhinaykukkadapu · 2026-04-09T05:27:53Z

@winskuo-quic Sorry, i see more conflicts.

abhinaykukkadapu · 2026-04-09T18:02:35Z

@winskuo-quic a few things: I think this diff added a few backward-incompatible changes, which are breaking internal CI.

Option 1. Can we maintain the API contracts this time and going forward? If you want this, remove the export_utils renaming and keep the same var names and types for profile and optrace. I like this, to be honest.
Option 2. Alternatively, you can introduce backward-compatible scaffolds. Here is a patch that shows what is actually breaking:

diff --git a/backends/qualcomm/utils/utils.py b/backends/qualcomm/utils/utils.py
--- a/backends/qualcomm/utils/utils.py
+++ b/backends/qualcomm/utils/utils.py
@@ -1,5 +1,6 @@
 import operator
 import os
 import re
 import warnings
 from collections import defaultdict, OrderedDict
+from typing import Any, Callable, Dict, List, Optional, Tuple, Union

 # ... (in generate_qnn_executorch_compiler_spec)

 def generate_qnn_executorch_compiler_spec(
     soc_model: QcomChipset,
     backend_options: QnnExecuTorchBackendOptions,
     debug: bool = False,
     saver: bool = False,
     online_prepare: bool = False,
     dump_intermediate_outputs: bool = False,
     profile_level: int = 0,
     shared_buffer: bool = False,
     is_from_context_binary: bool = False,
     op_package_options: QnnExecuTorchOpPackageOptions = None,
     use_mha2sha: bool = False,
+    # Deprecated parameters - use profile_level instead
+    profile: Optional[bool] = None,
+    optrace: Optional[bool] = None,
 ) -> List[CompileSpec]:
     """
     ...
     """
     _supported_soc_models = {soc_model.value for soc_model in QcomChipset}
     if soc_model not in _supported_soc_models:
         raise ValueError(f"unknown SoC model for QNN: {soc_model}")

+    # Backward compatibility: map deprecated profile/optrace to profile_level
+    if optrace is not None:
+        warnings.warn(
+            "'optrace' is deprecated, use 'profile_level=3' instead.",
+            DeprecationWarning,
+            stacklevel=2,
+        )
+        if optrace and profile_level == 0:
+            profile_level = 3  # kProfileOptrace
+    if profile is not None:
+        warnings.warn(
+            "'profile' is deprecated, use 'profile_level=2' instead.",
+            DeprecationWarning,
+            stacklevel=2,
+        )
+        if profile and profile_level == 0:
+            profile_level = 2  # kProfileDetailed
+
     if profile_level and dump_intermediate_outputs:

diff --git a/examples/qualcomm/utils.py b/examples/qualcomm/utils.py
--- a/examples/qualcomm/utils.py
+++ b/examples/qualcomm/utils.py
@@ -6,6 +6,16 @@
 # TODO: reenable pyre after fixing the issues
 # pyre-ignore-all-errors

+# Backward-compatible re-exports for symbols moved to export_utils.
+# Internal callers still import from this module.
+from executorch.backends.qualcomm.export_utils import (  # noqa: F401
+    build_executorch_binary,
+    make_quantizer,
+    ptq_calibrate,
+    qat_train,
+    SimpleADB,
+)
+
 import csv
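
The mapping logic in the first hunk can be exercised on its own. The sketch below lifts it into a standalone function so the behavior is easy to check; the function name `resolve_profile_level` is hypothetical, and the values 2 and 3 are taken from the comments in the patch above rather than from the QNN headers.

```python
import warnings
from typing import Optional


def resolve_profile_level(
    profile_level: int = 0,
    profile: Optional[bool] = None,
    optrace: Optional[bool] = None,
) -> int:
    """Map the deprecated boolean flags onto profile_level, as in the patch."""
    if optrace is not None:
        warnings.warn(
            "'optrace' is deprecated, use 'profile_level=3' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        if optrace and profile_level == 0:
            profile_level = 3
    if profile is not None:
        warnings.warn(
            "'profile' is deprecated, use 'profile_level=2' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # An explicit profile_level (or one already set by optrace) wins.
        if profile and profile_level == 0:
            profile_level = 2
    return profile_level


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = resolve_profile_level(profile=True)
    both = resolve_profile_level(profile=True, optrace=True)
    explicit = resolve_profile_level(profile_level=1, profile=True)

print(legacy, both, explicit, len(caught))  # 2 3 1 4
```

Note that when both legacy flags are set, optrace takes precedence because it is checked first, and an explicitly passed non-zero profile_level is never overridden.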

winskuo-quic · 2026-04-12T05:17:00Z

@abhinaykukkadapu thanks for sharing the options for the BC fix. Personally, I think option 1 won't be able to fulfill all the needs. We are moving from executorch/examples/qualcomm/utils.py to executorch/backends/qualcomm/export_utils.py because:

When installing via pip install executorch, the package only ships the files under executorch/backends/qualcomm and does not ship the files under executorch/examples/qualcomm. This could leave users of pip install executorch confused.
We have checked all backend vendors and noticed that their official API calls live under executorch/backends/${VENDOR_NAME}, so we want to align with this behavior.

I have just applied the patch from option 2. Please let me know if there are any further questions or concerns about the explanation above.
Thanks

winskuo-quic requested review from abhinaykukkadapu, cccclai and lucylq as code owners March 19, 2026 07:51

meta-cla bot added the CLA Signed label Mar 19, 2026

winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch 2 times, most recently from 4a6594d to 19ff5fc Compare March 23, 2026 01:30

winskuo-quic mentioned this pull request Mar 24, 2026

Qualcomm AI Engine Direct - Minimal Inference Runtime Core Requirment #18434

Merged

abhinaykukkadapu reviewed Mar 30, 2026

winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch 2 times, most recently from 6a50775 to ed6ba51 Compare April 2, 2026 01:45

winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch from ed6ba51 to 8e29c28 Compare April 3, 2026 12:45

winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch from 8e29c28 to 94962f8 Compare April 8, 2026 02:02

winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch 2 times, most recently from 768fee5 to 5179a3a Compare April 9, 2026 02:24

abhinaykukkadapu approved these changes Apr 9, 2026

winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch from 5179a3a to 428fcae Compare April 9, 2026 08:14

winskuo-quic added 2 commits April 11, 2026 21:28

Qualcomm AI Engine Direct - Python API Refactor

6491c14

fix rebase

971d71b

winskuo-quic added 2 commits April 11, 2026 21:28

Rebase error fix

a8e9275

Code Review, BC change

50cee3e

winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch from 428fcae to 50cee3e Compare April 12, 2026 04:58
