mcore-adapter从VLM的hfconfig中获取参数异常 #419

@YisuZhou

Description

In the hf_config of a VLM model (the screenshot below shows qwen2-vl), the parameters of the LLM part are nested inside `text_config`. As a result, none of the LLM-related keys can be read from the top level, which breaks subsequent functionality.

[Image: hf_config of qwen2-vl, with the LLM parameters nested under `text_config`]

```
[WARNING] [mcore_adapter.models.converter.template]: key='vocab_size' not exists in hf_config for get_hf_config_value
[WARNING] [mcore_adapter.models.converter.template]: key='intermediate_size' not exists in hf_config for get_hf_config_value
[WARNING] [mcore_adapter.models.converter.template]: key='attention_dropout' not exists in hf_config for get_hf_config_value
[... several more keys omitted ...]
```

This ultimately causes an initialization failure:

```
[rank1]: Traceback (most recent call last):
[rank1]: File "/yisu/LlamaFactory-main/src/llamafactory/launcher.py", line 185, in
[rank1]: run_exp()
[rank1]: File "/yisu/LlamaFactory-main/src/llamafactory/train/tuner.py", line 139, in run_exp
[rank1]: _training_function(config={"args": args, "callbacks": callbacks})
[rank1]: File "/yisu/LlamaFactory-main/src/llamafactory/train/tuner.py", line 98, in _training_function
[rank1]: run_sft_mca(model_args, data_args, training_args, finetuning_args, callbacks)
[rank1]: File "/yisu/LlamaFactory-main/src/llamafactory/train/mca/workflow.py", line 229, in run_sft
[rank1]: model = AutoModel.from_pretrained(model_args.model_name_or_path, training_args)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/opt/conda/lib/python3.12/site-packages/mcore_adapter/models/auto/modeling_auto.py", line 58, in from_pretrained
[rank1]: return model_class.from_pretrained(model_name_or_path, *args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/opt/conda/lib/python3.12/site-packages/mcore_adapter/models/model_factory.py", line 213, in from_pretrained
[rank1]: config = cls.config_class.from_pretrained(model_name_or_path, args)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/opt/conda/lib/python3.12/site-packages/mcore_adapter/models/model_config.py", line 184, in from_pretrained
[rank1]: config.post_init()
[rank1]: File "/opt/conda/lib/python3.12/site-packages/mcore_adapter/models/model_config.py", line 48, in post_init
[rank1]: self.post_init()
[rank1]: File "/opt/conda/lib/python3.12/site-packages/mcore_adapter/models/qwen2_vl/config_qwen2_vl.py", line 28, in post_init
[rank1]: super().post_init()
[rank1]: File "/opt/conda/lib/python3.12/site-packages/mcore_adapter/models/model_config.py", line 379, in post_init
[rank1]: super().post_init()
[rank1]: File "/opt/conda/lib/python3.12/site-packages/megatron/core/transformer/transformer_config.py", line 934, in post_init
[rank1]: self.kv_channels = self.hidden_size // self.num_attention_heads
[rank1]: ~~~~~~~~~~~~~~~~~^^~~~~~~~~~~~~~~~~~~~~~~~~~
[rank1]: ZeroDivisionError: integer division or modulo by zero
```

Pure-LLM models do not have this problem.
