Allow dumping a model's first inference input/output data to pickle files #4027
Conversation
Pull Request Overview
This PR adds functionality to dump the first inference input/output data to pickle files for model validation with minimal datasets. This capability is specifically requested by the NPU development team to facilitate model debugging and validation.
Key changes:
- Added `_dump_first_infer_data` parameter support across launcher and evaluator components to enable first-inference data capture
- Implemented data dumping logic that stores input/output tensors along with associated metadata (annotations, identifiers) to pickle files
- Refined error message formatting in the presenters module to improve readability and consistency
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tools/accuracy_checker/accuracy_checker/presenters.py | Improved error message formatting for metric comparisons with cleaner structure and conditional handling |
| tools/accuracy_checker/accuracy_checker/launcher/openvino_launcher.py | Added first inference data dumping capability with InferData namedtuple and automatic async mode disable |
| tools/accuracy_checker/accuracy_checker/launcher/loaders/loader.py | Added InferDataBatch namedtuple to store annotations, identifiers, and metadata |
| tools/accuracy_checker/accuracy_checker/launcher/loaders/__init__.py | Exported InferDataBatch for use in other modules |
| tools/accuracy_checker/accuracy_checker/evaluators/model_evaluator.py | Implemented first inference data storage with dataset preprocessing |
| tools/accuracy_checker/accuracy_checker/evaluators/custom_evaluators/base_models.py | Added inference data dumping to the base model class |
| tools/accuracy_checker/accuracy_checker/evaluators/custom_evaluators/base_custom_evaluator.py | Added inference data dumping support for custom evaluators |
| tools/accuracy_checker/accuracy_checker/argparser.py | Added CLI argument for specifying inference data dump path |
```python
friendly_outputs = {out_node.get_node().friendly_name: out_res for out_node, out_res in outputs.items()}
self._dump_first_inference(InferData(feed_dict, friendly_outputs))
results.append(friendly_outputs)
```
The dictionary comprehension for friendly_outputs is computed even when inference data dumping is disabled. Since _dump_first_inference returns early when _dump_first_infer_data is None, this creates unnecessary overhead for every inference. Consider moving the dictionary comprehension inside the dump method or checking the flag before creating it.
Suggested change:
```diff
- friendly_outputs = {out_node.get_node().friendly_name: out_res for out_node, out_res in outputs.items()}
- self._dump_first_inference(InferData(feed_dict, friendly_outputs))
- results.append(friendly_outputs)
+ # Only compute friendly_outputs if dumping is enabled or for results
+ if self._dump_first_infer_data is not None:
+     friendly_outputs = {out_node.get_node().friendly_name: out_res for out_node, out_res in outputs.items()}
+     self._dump_first_inference(InferData(feed_dict, friendly_outputs))
+     results.append(friendly_outputs)
+ else:
+     friendly_outputs = {out_node.get_node().friendly_name: out_res for out_node, out_res in outputs.items()}
+     results.append(friendly_outputs)
```
```python
self.reset_memory_state = self.get_value_from_config('reset_memory_state')
self._dump_first_infer_data = self.config.get('_dump_first_infer_data', None)
if self._dump_first_infer_data is not None:
    self.async_mode = False
```
Silently disabling async mode when dumping is enabled could cause unexpected performance differences in testing. Consider logging a warning message to inform users that async mode has been disabled due to inference data dumping.
Suggested change:
```diff
- self.async_mode = False
+ self.async_mode = False
+ warning(
+     "Async mode has been disabled because inference data dumping is enabled "
+     "(_dump_first_infer_data is set). This may affect performance results."
+ )
```
```diff
  @staticmethod
- def store_predictions(stored_predictions, predictions):
+ def store_predictions(stored_predictions, predictions, mode='ab'):
```
The mode parameter changes the method's behavior significantly ('ab' for appending vs 'wb' for overwriting), but this is not validated. Adding 'wb' mode creates a risk of data loss if used incorrectly. Consider creating a separate method for the write-once use case or validating the mode parameter.
Suggested change:
```diff
- def store_predictions(stored_predictions, predictions, mode='ab'):
+ def store_predictions(stored_predictions, predictions, mode='ab'):
+     if mode not in ('ab', 'wb'):
+         raise ValueError(f"Invalid mode '{mode}' for store_predictions. Only 'ab' (append) and 'wb' (overwrite) are allowed.")
```
```diff
  message = "{} {}".format(message, result_message)
- elif abs_threshold <= diff_with_ref[0] or rel_threshold <= diff_with_ref[1]:
+ elif abs_threshold <= diff_with_ref[0] or (rel_threshold and rel_threshold <= diff_with_ref[1]):
      fail_message = "[FAILED: {}]".format(error_text)
```
The condition abs_threshold <= diff_with_ref[0] will fail when abs_threshold is None due to the comparison with a numeric value. This should check abs_threshold is not None and abs_threshold <= diff_with_ref[0] to avoid TypeError.
Suggested change:
```diff
- elif abs_threshold <= diff_with_ref[0] or (rel_threshold and rel_threshold <= diff_with_ref[1]):
+ elif (abs_threshold is not None and abs_threshold <= diff_with_ref[0]) or (rel_threshold and rel_threshold <= diff_with_ref[1]):
```
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull Request Overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
```python
dump_inf_data = {'input': input_data, 'output': output_data}
with open(self._dump_first_infer_data, 'wb') as file:
    pickle.dump(dump_inf_data, file)
```
The _dump_first_infer_data method lacks documentation explaining its purpose, parameters, and the structure of the dumped pickle data. Add a docstring describing the method's behavior and the format of the dumped data.
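A documented helper along those lines could look like the following sketch. The function name and signature are hypothetical — only the dumped `{'input': ..., 'output': ...}` layout is taken from the code quoted above:

```python
import pickle

def dump_first_infer_data(path, feed_dict, outputs):
    """Store the first inference request's input/output data for offline debugging.

    Args:
        path (str or Path): destination pickle file.
        feed_dict (dict): input name -> tensor mapping fed to the request.
        outputs (dict): output name -> result tensor mapping.

    Dumped pickle layout (mirrors the launcher code above):
        {'input': <feed_dict>, 'output': <outputs>}
    """
    with open(path, 'wb') as file:
        pickle.dump({'input': feed_dict, 'output': outputs}, file)
```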
```python
if self._dump_first_infer_data:
    with open(self._dump_first_infer_data, 'wb') as file:
        print_info(f'Storing first inference data to {self._dump_first_infer_data}')
        dump_inf_data = {'input': feed_dict, 'output': res_outputs}
        pickle.dump(dump_inf_data, file)
    self._dump_first_infer_data = None
```
The pickle dumping logic is duplicated between openvino_launcher.py and base_models.py. Extract this into a shared utility function to avoid code duplication and ensure consistent behavior.
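One way to extract the duplicated one-shot logic, sketched with hypothetical names (a module-level helper both call sites could import); it also mirrors the "clear the path after the first dump" behavior shown in the snippet above:

```python
import pickle

def dump_once(state, input_data, output_data, printer=print):
    """Shared one-shot dump helper (hypothetical name): writes
    {'input': ..., 'output': ...} to state['dump_path'] on the first call,
    then clears the path so later inferences skip the work entirely."""
    path = state.get('dump_path')
    if not path:
        return False
    printer(f'Storing first inference data to {path}')
    with open(path, 'wb') as file:
        pickle.dump({'input': input_data, 'output': output_data}, file)
    state['dump_path'] = None  # one-shot: disable after the first dump
    return True
```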
```python
print_info(f"Storing first annotation to {dump_infer_data_path}")
with open(dump_infer_data_path, mode='wb') as content:
```
The method writes to the same file path (dump_infer_data_path) that is used by inference data dumping in openvino_launcher.py and base_models.py. This creates a race condition where annotation data may overwrite inference data or vice versa. Use separate file paths or combine the data into a single write operation.
Suggested change:
```diff
- print_info(f"Storing first annotation to {dump_infer_data_path}")
- with open(dump_infer_data_path, mode='wb') as content:
+ annotation_path = str(Path(dump_infer_data_path).with_suffix('.annotation.pkl'))
+ print_info(f"Storing first annotation to {annotation_path}")
+ with open(annotation_path, mode='wb') as content:
```
```python
return list(subsample_set)

def store_first_annotation(self, annotation, identifier, dump_infer_data_path):
```
The store_first_annotation method lacks documentation explaining its purpose, parameters, and the structure of the stored data. Add a docstring describing the expected types for annotation and identifier, and the format of the pickle data.
Suggested change:
```diff
  def store_first_annotation(self, annotation, identifier, dump_infer_data_path):
+     """
+     Stores the first annotation and its identifier to a pickle file for later inspection or debugging.
+
+     Args:
+         annotation (BaseRepresentation): The annotation object to store. Typically an instance of a subclass of BaseRepresentation.
+         identifier (Any): The identifier associated with the annotation. Usually a string or object uniquely identifying the annotation.
+         dump_infer_data_path (str or Path): The file path where the pickled data will be stored.
+
+     Pickle Data Format:
+         The pickled file will contain a dictionary with the following structure:
+         {
+             'annotation': {
+                 'class_name': <str>,  # Name of the annotation class
+                 ...  # All attributes of the annotation, with numpy arrays converted to lists
+             },
+             'identifier': <Any>  # The identifier value
+         }
+     """
```
```python
print_info(f'Storing first inference data to {self._dump_first_infer_data}')
dump_inf_data = {'input': input_data, 'output': output_data}
with open(self._dump_first_infer_data, 'wb') as file:
```
The file path from _dump_first_infer_data is used directly without validation. Add validation to ensure the path is safe and doesn't allow writing to unauthorized locations through path traversal attacks.
Suggested change:
```diff
- print_info(f'Storing first inference data to {self._dump_first_infer_data}')
- dump_inf_data = {'input': input_data, 'output': output_data}
- with open(self._dump_first_infer_data, 'wb') as file:
+ # Validate the path to prevent path traversal and absolute paths
+ dump_path = Path(self._dump_first_infer_data)
+ safe_base = Path.cwd()
+ try:
+     resolved_path = dump_path.resolve()
+ except Exception as e:
+     warning(f"Failed to resolve dump path: {e}. Skipping dump.")
+     self._dump_first_infer_data = None
+     return
+ if not str(resolved_path).startswith(str(safe_base)):
+     warning(f"Unsafe dump path detected: {resolved_path}. Skipping dump.")
+     self._dump_first_infer_data = None
+     return
+ print_info(f'Storing first inference data to {resolved_path}')
+ dump_inf_data = {'input': input_data, 'output': output_data}
+ with open(resolved_path, 'wb') as file:
```
The NPU development team requested a solution that allows validating models with a minimal dataset.
This change lets accuracy_checker be executed with a `_dump_first_infer_data` parameter passed to openvino_launcher (or to each OpenVINO model from network_info). As a result, the input and output data from the first inference request are stored in the `_dump_first_infer_data` pickle file; additionally, the model evaluator stores the annotations, identifiers, and metadata related to the input data.
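A quick way to inspect the resulting dump offline, sketched below. Only the `{'input': ..., 'output': ...}` layout is taken from the snippets quoted in the review; the file name and the sample tensors are hypothetical stand-ins for the path passed as `_dump_first_infer_data`:

```python
import pickle

# Write a sample dump in the layout the launcher produces.
sample = {
    'input': {'data': [[0.0, 0.1, 0.2]]},
    'output': {'prob': [[0.7, 0.3]]},
}
with open('first_infer.pkl', 'wb') as f:
    pickle.dump(sample, f)

# Inspect the dump: load it and list the tensor names per section.
with open('first_infer.pkl', 'rb') as f:
    dump = pickle.load(f)

names = {section: sorted(dump[section]) for section in ('input', 'output')}
print(names)  # → {'input': ['data'], 'output': ['prob']}
```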