
[Bug] Revise the _remove_state_dict_prefix and _add_state_dict_prefix functions in timm.py to adapt to the case of multiple submodels.#1295

Open
wilxy wants to merge 3 commits into open-mmlab:dev-1.x from wilxy:dev-1.x
Conversation


@wilxy wilxy commented Jan 4, 2023

When using TimmClassifier as the student or teacher model in knowledge distillation algorithms, there are bugs in both save_checkpoint and load_checkpoint.

  1. save_checkpoint
    When saving a checkpoint with save_checkpoint(self.state_dict(), 'xxx.pth'), where self is a knowledge distillation algorithm containing the submodels self.student and self.teacher, self.state_dict() recursively calls the state_dict function of each submodel.
    The _remove_state_dict_prefix function in the TimmClassifier class is registered as a hook to modify the original destination.
    Specifically, _remove_state_dict_prefix creates a new_state_dict, at a different memory address from the original destination, and returns it as the hook_result for the student and teacher submodels. However, the state_dict function of the knowledge distillation algorithm model never receives this modification, so the memory address and contents of destination remain unchanged.
    To fix this, we change _remove_state_dict_prefix to modify the state_dict in place instead of creating a new_state_dict.

  2. load_checkpoint
    When loading a checkpoint of a knowledge distillation algorithm model whose student and teacher are both TimmClassifier, the _add_state_dict_prefix function in the TimmClassifier class is used as a hook to modify the state_dict of each submodel.
    While modifying the student submodel, _add_state_dict_prefix deletes all keys of the teacher submodel, because it removes every old key even when renaming leaves the key unchanged.
    To fix this, we change _add_state_dict_prefix to delete a key only when it differs from its new_key.

…ions in timm.py to adapt to the case of multiple submodels.

CLAassistant commented Jan 4, 2023

CLA assistant check
All committers have signed the CLA.

Collaborator

Ezra-Yu commented Jan 4, 2023

Please sign the CLA so that I can review your PR.

Member

mzr1996 commented Jan 9, 2023

Hello, can you sign the CLA and fix the lint problem? Then we can merge the PR. @wilxy

Author

wilxy commented Jan 10, 2023

Hello, can you sign the CLA and fix the lint problem? Then we can merge the PR. @wilxy

Thanks for the reminder, I've signed the CLA and fixed the lint problem.

Collaborator

Ezra-Yu commented May 6, 2023

Hi @wilxy , Can you migrate this PR to the main branch?

