Qualcomm AI Engine Direct - Python API Refactor #18312

Open
winskuo-quic wants to merge 4 commits into pytorch:main from CodeLinaro:dev1/winskuo/python_api_refactor

@winskuo-quic
Collaborator

Summary

The main goal of this PR is to improve the user experience by keeping all example scripts consistent and providing an official config file for the QNN APIs.
In the past, users had to manually pass params to APIs such as build_executorch_binary. However, not every param was forwarded to build_executorch_binary, so some flags did not work, which left users confused. In the example below, skipping a node in script 1 fails.
[image]
For this reason, we want to maintain a QnnConfig structure as the official config.
Previously, introducing a new flag to our APIs meant updating every example script so that all of them could benefit, which gets harder to maintain as we support more flags and more scripts. With this feature, QnnConfig parses the new flag itself, so the example scripts do not need to change at all.

This PR does the following:

  1. Introduce QnnConfig, which takes a parser or a .json file as input.
  2. Migrate our official QNN ExecuTorch APIs to backends/qualcomm/export_utils.py, because API calls shouldn't live under the examples folder. Furthermore, pip install executorch does not include the examples/qualcomm folder, so these APIs are not exposed to users who install via pip.
  3. The following flags can now all be removed from example scripts and backend API will handle the logic: compile_only, pre_gen_pte, skip_push, profile_level, dump_intermediate_outputs, shared_buffer, skip_delegate_node_ids, skip_delegate_node_ops.
  4. Update README so it aligns with the new behavior.
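
To illustrate item 1, here is a minimal, hypothetical sketch of a config object that can be built from either an argparse parser or a .json file. It is not the actual QnnConfig from this PR: the class name `ExampleQnnConfig`, its methods, and the choice of fields are assumptions for illustration, reusing a few of the flag names from item 3.

```python
import argparse
import json
from dataclasses import dataclass, fields


@dataclass
class ExampleQnnConfig:
    """Hypothetical stand-in for QnnConfig (not the PR's implementation)."""

    compile_only: bool = False
    skip_push: bool = False
    shared_buffer: bool = False
    profile_level: int = 0

    @staticmethod
    def add_arguments(parser: argparse.ArgumentParser) -> None:
        # The config owns its flags, so example scripts never declare them.
        parser.add_argument("--compile_only", action="store_true")
        parser.add_argument("--skip_push", action="store_true")
        parser.add_argument("--shared_buffer", action="store_true")
        parser.add_argument("--profile_level", type=int, default=0)

    @classmethod
    def from_args(cls, args: argparse.Namespace) -> "ExampleQnnConfig":
        # Ignore script-specific arguments; keep only the config's own fields.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in vars(args).items() if k in known})

    @classmethod
    def from_json(cls, path: str) -> "ExampleQnnConfig":
        with open(path) as f:
            raw = json.load(f)
        unknown = set(raw) - {f.name for f in fields(cls)}
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        return cls(**raw)
```

With this shape, adding a new flag means adding one field and one `add_argument` call in the config class; scripts that call `add_arguments` and `from_args` pick it up without edits.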

Test plan

Passes all tests under test_qnn_delegate.py.

@pytorch-bot

pytorch-bot bot commented Mar 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18312

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 5 New Failures, 1 Cancelled Job, 5 Unrelated Failures

As of commit 50cee3e with merge base 411ede2:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Mar 19, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch 2 times, most recently from 4a6594d to 19ff5fc Compare March 23, 2026 01:30
backend_options = {
QnnExecuTorchBackendType.kGpuBackend: generate_gpu_compiler_spec(),
QnnExecuTorchBackendType.kHtpBackend: generate_htp_compiler_spec(
use_fp16=False if any([quant_dtype, custom_quantizer]) is not None else True
Contributor

@winskuo-quic any() always yields a bool, this check should be use_fp16=not any([quant_dtype, custom_quantizer])
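
A small standalone illustration of the point: `any(...)` returns `True` or `False`, never `None`, so `any(...) is not None` is always `True` and the original expression always evaluates to `False`. The two helper functions below are hypothetical wrappers around the original and suggested expressions, and `"some_dtype"` is a placeholder value.

```python
def use_fp16_original(quant_dtype, custom_quantizer):
    # Expression as written in the diff under review.
    return False if any([quant_dtype, custom_quantizer]) is not None else True


def use_fp16_suggested(quant_dtype, custom_quantizer):
    # Suggested fix: FP16 only when no quantization option is set.
    return not any([quant_dtype, custom_quantizer])


print(use_fp16_original(None, None))           # False (FP16 wrongly disabled)
print(use_fp16_suggested(None, None))          # True
print(use_fp16_suggested("some_dtype", None))  # False
```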

Collaborator Author

Thanks for the catch. This is a lot cleaner. Fixed.

Comment on lines +77 to +79
backend=backend,
soc_model=soc_model,
backend=qnn_config.backend,
Contributor

are we supposed to pass both backend params here?

Collaborator Author

Ah, I think this is a rebase error. Thanks for the catch. Fixed.

)
elif args.pre_gen_pte:
logging.info("Inference only for QNN Static Decoder")
qnn_decode_res = inference_static_mimi_decoder(
Contributor

Function signature mismatch.

Collaborator Author

Fixed. Thanks

@abhinaykukkadapu
Contributor

@winskuo-quic thank you for these changes. I definitely think this will help the user experience. The diff is big, so I tried my best to comb through it. I found some syntax issues that the IDE highlighted, but I'd like you to double-check a few of the examples. Other than that, I think this is good to go once you rebase to resolve conflicts and address the minor things.

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch 2 times, most recently from 6a50775 to ed6ba51 Compare April 2, 2026 01:45
@meta-codesync
Contributor

meta-codesync bot commented Apr 2, 2026

@abhinaykukkadapu has imported this pull request. If you are a Meta employee, you can view this in D99197179.

@abhinaykukkadapu
Contributor

@claude review the PR for potential internal CI build breakages and Buck file drift. I see failures like the one below on the imported diff internally. Please deep dive and provide feedback to resolve the Buck issues.

#link-tree/executorch/backends/qualcomm/tests/test_qnn_delegate.py", line 31, in <module>
    from executorch.backends.qualcomm.export_utils import (
ModuleNotFoundError: No module named 'executorch.backends.qualcomm.export_utils'

@claude

claude bot commented Apr 2, 2026

Claude finished @abhinaykukkadapu's task in 12m 1s


Review: Internal CI Build Breakages and Buck File Drift

I've analyzed the PR changes against the Buck/TARGETS build configuration. The reported ModuleNotFoundError for executorch.backends.qualcomm.export_utils is confirmed, and there are multiple Buck file issues that need to be resolved.


Root Cause

The new file backends/qualcomm/export_utils.py (872 lines) was added but no corresponding Buck/TARGETS target was created for it. The Buck build system has no knowledge of this module, so any internal Buck-based build that tries to import it will fail.


Issue 1: Missing Buck target for backends/qualcomm/export_utils.py (Critical)

The file backends/qualcomm/export_utils.py is a brand new file added by this PR, but there is no corresponding entry in any of:

  • backends/qualcomm/BUCK
  • backends/qualcomm/targets.bzl

A new python_library target needs to be added. Based on the imports in export_utils.py, the target should look something like:

# In backends/qualcomm/BUCK (fbcode section) or backends/qualcomm/targets.bzl
runtime.python_library(
    name = "export_utils",
    srcs = ["export_utils.py"],
    visibility = ["PUBLIC"],
    deps = [
        "//caffe2:torch",
        "//executorch/backends/qualcomm/debugger:qnn_intermediate_debugger",
        "//executorch/backends/qualcomm/quantizer:quantizer",
        "//executorch/backends/qualcomm/serialization:serialization",
        "//executorch/backends/qualcomm/utils:utils",
        "//executorch/exir/backend:utils",
        "//executorch/exir/capture:config",
        "//executorch/exir/passes:memory_planning_pass",
        "//pytorch/ao:torchao",
    ],
)


Issue 2: backends/qualcomm/tests/BUCK dependency is stale (Critical)

backends/qualcomm/tests/BUCK:38 still references "//executorch/examples/qualcomm:utils" as a dependency. However, the test files (test_qnn_delegate.py at line 31 and utils.py at line 21) now import from executorch.backends.qualcomm.export_utils instead.

The dep needs to be updated (not just added):

# backends/qualcomm/tests/BUCK, line 38
-        "//executorch/examples/qualcomm:utils",
+        "//executorch/backends/qualcomm:export_utils",

Note: backends/qualcomm/tests/utils.py:45 still imports from executorch.examples.qualcomm.utils import make_output_dir, so the old "//executorch/examples/qualcomm:utils" dep may still be needed alongside the new one. The fix would be:

         "//executorch/examples/qualcomm:utils",
+        "//executorch/backends/qualcomm:export_utils",


Issue 3: examples/qualcomm/TARGETS utils target has drifted dependencies (Moderate)

The examples/qualcomm/TARGETS still defines a utils target (lines 36-44):

runtime.python_library(
    name = "utils",
    srcs = ["utils.py"],
    deps = [
        "//executorch/backends/qualcomm/partition:partition",
        "//executorch/backends/qualcomm/quantizer:quantizer",
        "//executorch/backends/qualcomm/debugger:qnn_intermediate_debugger",
    ],
)

Since examples/qualcomm/utils.py had most of its code moved to export_utils.py, these deps are now over-specified — the slimmed-down utils.py no longer imports from partition, quantizer, or qnn_intermediate_debugger. These deps should be cleaned up, or users of the old utils target that now actually need the moved code should depend on //executorch/backends/qualcomm:export_utils instead.


Issue 4: examples/qualcomm/oss_scripts/llama/TARGETS and whisper/TARGETS still reference old target (Moderate)

  • examples/qualcomm/oss_scripts/llama/TARGETS:153: "//executorch/examples/qualcomm:utils" is still listed as a dep for llama_lib. If llama.py now imports from export_utils, this needs "//executorch/backends/qualcomm:export_utils" added.
  • examples/qualcomm/oss_scripts/whisper/TARGETS:31 — same situation for whisper_lib.

Both files' Python sources now import from executorch.backends.qualcomm.export_utils, so they need the new Buck target added as a dep.


Issue 5: generate_qnn_executorch_compiler_spec API change: profile/optrace → profile_level (Low, but worth noting)

The function signature in backends/qualcomm/utils/utils.py changed:

-    profile: bool = False,
-    optrace: bool = False,
+    profile_level: int = 0,

This is a breaking API change for any internal callers that use named arguments profile= or optrace=. All call sites in this PR have been updated, but if there are other internal callers not in this diff, they will break. Worth verifying internally.
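
To make the failure mode concrete, here is a toy reproduction. `make_spec` is a hypothetical stand-in for generate_qnn_executorch_compiler_spec that keeps only the renamed parameter (the real function takes more arguments); it shows that a call site still passing `profile=` fails with a `TypeError` at call time rather than at import time.

```python
def make_spec(profile_level: int = 0):
    """Toy stand-in that only has the new-style parameter."""
    return {"profile_level": profile_level}


def call_with_old_kwargs():
    """Simulate a call site that was not updated after the rename."""
    try:
        make_spec(profile=True)
    except TypeError as exc:
        return str(exc)
    return None


print(call_with_old_kwargs())
```

Because the error only appears when the call executes, stale callers on rarely exercised code paths can go unnoticed until they run.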


Summary of Required Changes

| Priority | File | Action |
|---|---|---|
| Critical | backends/qualcomm/BUCK (or targets.bzl) | Add export_utils python_library target |
| Critical | backends/qualcomm/tests/BUCK | Add //executorch/backends/qualcomm:export_utils dep to test_qnn_delegate target |
| Moderate | examples/qualcomm/TARGETS | Clean up stale deps on utils target |
| Moderate | examples/qualcomm/oss_scripts/llama/TARGETS | Add //executorch/backends/qualcomm:export_utils dep to llama_lib |
| Moderate | examples/qualcomm/oss_scripts/whisper/TARGETS | Add //executorch/backends/qualcomm:export_utils dep to whisper_lib |
| Low | Internal callers of generate_qnn_executorch_compiler_spec | Verify no other callers use old profile=/optrace= kwargs |

The critical fix is straightforward: create the Buck target for export_utils.py and wire it into the dependency graph. Without this, any Buck-based build (like internal CI) will fail with the ModuleNotFoundError reported above.


@abhinaykukkadapu
Contributor

@winskuo-quic the buck ones are spot on, can you please patch?

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch from ed6ba51 to 8e29c28 Compare April 3, 2026 12:45
@winskuo-quic
Collaborator Author

> @winskuo-quic the buck ones are spot on, can you please patch?

@abhinaykukkadapu thanks for the feedback. I tried to address the issues suggested by Claude. Please let me know if your Buck build is still failing.
Thanks

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch from 8e29c28 to 94962f8 Compare April 8, 2026 02:02
@abhinaykukkadapu
Contributor

abhinaykukkadapu commented Apr 8, 2026

@winskuo-quic can you please rebase? There seem to be some conflicts. I've merged your minimal inference PR and will do this one after you rebase.

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch 2 times, most recently from 768fee5 to 5179a3a Compare April 9, 2026 02:24
@winskuo-quic
Collaborator Author

> @winskuo-quic can you please rebase there seems to be some conflicts. I've merged your minimal inference PR and will do this one after you rebase.

@abhinaykukkadapu,
I have rebased and resolved the conflict. Thanks for the reminder.

@abhinaykukkadapu
Contributor

@winskuo-quic Sorry, I see more conflicts.

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch from 5179a3a to 428fcae Compare April 9, 2026 08:14
@abhinaykukkadapu
Contributor

abhinaykukkadapu commented Apr 9, 2026

@winskuo-quic a few things: I think this diff added a few backward-incompatible changes that are breaking internal CI.

Option 1. Can we maintain API contracts this time and going forward? If you want this, remove the export_utils renaming and keep the same variable names and types for profile and optrace. I like this one, to be honest.
Option 2. Alternatively, you can introduce backward-compatible scaffolds. Here is a patch that shows what is actually breaking:

diff --git a/backends/qualcomm/utils/utils.py b/backends/qualcomm/utils/utils.py
--- a/backends/qualcomm/utils/utils.py
+++ b/backends/qualcomm/utils/utils.py
@@ -1,5 +1,6 @@
 import operator
 import os
 import re
 import warnings
 from collections import defaultdict, OrderedDict
+from typing import Any, Callable, Dict, List, Optional, Tuple, Union

 # ... (in generate_qnn_executorch_compiler_spec)

 def generate_qnn_executorch_compiler_spec(
     soc_model: QcomChipset,
     backend_options: QnnExecuTorchBackendOptions,
     debug: bool = False,
     saver: bool = False,
     online_prepare: bool = False,
     dump_intermediate_outputs: bool = False,
     profile_level: int = 0,
     shared_buffer: bool = False,
     is_from_context_binary: bool = False,
     op_package_options: QnnExecuTorchOpPackageOptions = None,
     use_mha2sha: bool = False,
+    # Deprecated parameters - use profile_level instead
+    profile: Optional[bool] = None,
+    optrace: Optional[bool] = None,
 ) -> List[CompileSpec]:
     """
     ...
     """
     _supported_soc_models = {soc_model.value for soc_model in QcomChipset}
     if soc_model not in _supported_soc_models:
         raise ValueError(f"unknown SoC model for QNN: {soc_model}")

+    # Backward compatibility: map deprecated profile/optrace to profile_level
+    if optrace is not None:
+        warnings.warn(
+            "'optrace' is deprecated, use 'profile_level=3' instead.",
+            DeprecationWarning,
+            stacklevel=2,
+        )
+        if optrace and profile_level == 0:
+            profile_level = 3  # kProfileOptrace
+    if profile is not None:
+        warnings.warn(
+            "'profile' is deprecated, use 'profile_level=2' instead.",
+            DeprecationWarning,
+            stacklevel=2,
+        )
+        if profile and profile_level == 0:
+            profile_level = 2  # kProfileDetailed
+
     if profile_level and dump_intermediate_outputs:

diff --git a/examples/qualcomm/utils.py b/examples/qualcomm/utils.py
--- a/examples/qualcomm/utils.py
+++ b/examples/qualcomm/utils.py
@@ -6,6 +6,16 @@
 # TODO: reenable pyre after fixing the issues
 # pyre-ignore-all-errors

+# Backward-compatible re-exports for symbols moved to export_utils.
+# Internal callers still import from this module.
+from executorch.backends.qualcomm.export_utils import (  # noqa: F401
+    build_executorch_binary,
+    make_quantizer,
+    ptq_calibrate,
+    qat_train,
+    SimpleADB,
+)
+
 import csv
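
The mapping in the first hunk can be exercised in isolation. The sketch below lifts that logic into a hypothetical helper, `resolve_profile_level` (not a function in the PR), keeping the patch's values (2 for detailed profiling, 3 for optrace) and its precedence: an explicit non-zero profile_level wins, and optrace is checked before profile.

```python
import warnings
from typing import Optional


def resolve_profile_level(
    profile_level: int = 0,
    profile: Optional[bool] = None,
    optrace: Optional[bool] = None,
) -> int:
    """Hypothetical helper mirroring the patch's deprecation mapping."""
    if optrace is not None:
        warnings.warn(
            "'optrace' is deprecated, use 'profile_level=3' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        if optrace and profile_level == 0:
            profile_level = 3  # optrace level, per the patch comment
    if profile is not None:
        warnings.warn(
            "'profile' is deprecated, use 'profile_level=2' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        if profile and profile_level == 0:
            profile_level = 2  # detailed level, per the patch comment
    return profile_level


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(resolve_profile_level(profile=True))                # 2
    print(resolve_profile_level(profile=True, optrace=True))  # 3
    print(resolve_profile_level(profile_level=1, profile=True))  # 1
    print(len(caught))  # 4 deprecation warnings in total
```

Note that passing `profile=False` or `optrace=False` still emits a warning, since the check is `is not None`; only omitting the old arguments is silent.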

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/python_api_refactor branch from 428fcae to 50cee3e Compare April 12, 2026 04:58
@winskuo-quic
Collaborator Author

@abhinaykukkadapu thanks for sharing the options for the BC fix. Personally, I think option 1 won't be able to fulfill all the needs. We are moving from executorch/examples/qualcomm/utils.py to executorch/backends/qualcomm/export_utils.py because:

  1. When users pip install executorch, the package only ships the files under executorch/backends/qualcomm and not those under executorch/examples/qualcomm. This could leave users of pip install executorch confused.
  2. We have checked all backend vendors and noticed that their official API calls live under executorch/backends/${VENDOR_NAME}, so we want to align with this behavior.

I have just applied the patch from option 2. Please let me know if you have any further questions or concerns about the explanation above.
Thanks
