Feature/hps paddleocr vl 1.5 #5017

Merged
Bobholamovic merged 6 commits into PaddlePaddle:develop from scyyh11:feature/hps-paddleocr-vl-1.5
Feb 28, 2026

@scyyh11 scyyh11 commented Feb 26, 2026

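Overview

  • The HPS packaging scripts now support derived pipelines: by reading PIPELINE_APP_ROUTER (e.g. PaddleOCR-VL-1.5), they automatically reuse the source pipeline's server/client/version and replace only pipeline_config.yaml
  • assemble.sh: mounts name_mappings.py and the pipeline config directory into the Docker container; locates the repository root via a path relative to the script
  • assemble.py: parses PIPELINE_APP_ROUTER via ast; --all mode automatically includes mapped pipelines; copies from the source pipeline directory and overwrites the pipeline config
  • Fixes Path.with_suffix() producing an incorrect archive file name for pipeline names that contain a dot (e.g. PaddleOCR-VL-1.5)
  • Adds the PaddleOCR-VL-1.5 SDK download link to the serving deployment docs (Chinese and English)

Testing

  • bash scripts/assemble.sh PaddleOCR-VL-1.5 generates output/paddlex_hps_PaddleOCR-VL-1.5_sdk.tar.gz
  • After extraction, server/pipeline_config.yaml contains pipeline_name: PaddleOCR-VL-1.5 and the models PP-DocLayoutV3 and PaddleOCR-VL-1.5-0.9B
  • server/model_repo/layout-parsing/1/model.py matches the PaddleOCR-VL version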

Add derived pipeline support to the HPS assembly scripts so that
pipelines defined in PIPELINE_APP_ROUTER (e.g. PaddleOCR-VL-1.5)
can automatically reuse the source pipeline's server/client/version
while substituting the correct pipeline_config.yaml.

- assemble.sh: mount name_mappings.py and pipeline configs into the
  Docker container; resolve paths relative to the script location
- assemble.py: parse PIPELINE_APP_ROUTER via ast, include mapped
  pipelines in --all, copy from source dir and overwrite config
- docs: add PaddleOCR-VL-1.5 SDK download link to serving docs

pathlib.Path.with_suffix() treats the dot in names like
PaddleOCR-VL-1.5 as a file extension, producing incorrect archive
names (e.g. paddlex_hps_PaddleOCR-VL-1.tar.gz). Use string
concatenation instead to preserve the full SDK name.
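
The behaviour described in this commit message can be reproduced in a few lines. This is a minimal sketch rather than the code in assemble.py: the variable names and the `output` directory are illustrative, and the SDK name is taken from the test notes in the PR description.

```python
from pathlib import Path

# Illustrative SDK name; the pipeline name contains a dot ("1.5").
sdk_name = "paddlex_hps_PaddleOCR-VL-1.5_sdk"
base = Path("output") / sdk_name

# with_suffix() treats everything after the last dot (".5_sdk") as the
# suffix and replaces it, which truncates the name.
broken = base.with_suffix(".tar.gz")

# Appending the extension to the full name keeps the dot in "1.5".
fixed = base.with_name(base.name + ".tar.gz")

print(broken.name)  # paddlex_hps_PaddleOCR-VL-1.tar.gz
print(fixed.name)   # paddlex_hps_PaddleOCR-VL-1.5_sdk.tar.gz
```

Appending to the name as a string sidesteps suffix detection entirely, so it behaves the same for any future pipeline name that contains dots.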
@scyyh11 scyyh11 requested a review from Bobholamovic February 26, 2026 08:41
paddle-bot bot commented Feb 26, 2026

Thanks for your contribution!

CLIENT_LIB_PATH = BASE_DIR / "paddlex-hps-client"
OUTPUT_DIR = BASE_DIR / "output"
NAME_MAPPINGS_PATH = BASE_DIR / "_name_mappings.py"
PIPELINE_CONFIGS_DIR = BASE_DIR / "_pipeline_configs"

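Suggest following the pattern of the other pipelines and creating a directory for the PaddleOCR-VL-1.5 pipeline under the pipelines directory, containing only a single config file (pipeline_config.yaml). The reason is that the pipeline config file needed for high-stability serving deployment may differ from the one in paddlex/configs.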

- Create pipelines/PaddleOCR-VL-1.5/ with its own pipeline_config.yaml
  so HPS config can diverge from paddlex/configs independently
- Remove _pipeline_configs volume mount from assemble.sh (no longer needed)
- Remove PIPELINE_CONFIGS_DIR from assemble.py, read config from local
  pipeline directory instead
- Add NOTE comment explaining why ast is used to parse PIPELINE_APP_ROUTER
@scyyh11 scyyh11 requested a review from Bobholamovic February 27, 2026 08:46
"""Parse PIPELINE_APP_ROUTER from the mounted name_mappings.py file."""
"""Parse PIPELINE_APP_ROUTER from the mounted name_mappings.py file.

NOTE: We use `ast` to extract the dict value without importing the module,

Suggest placing this after the docstring rather than inside it.
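
For readers unfamiliar with the technique under discussion, the following is a rough sketch of reading a dict literal out of a Python file with `ast`, with the NOTE placed as an inline comment as the reviewer suggests. The function name, the error handling, and the example mapping (derived pipeline name to source pipeline name) are assumptions for illustration; the actual implementation in assemble.py may differ.

```python
import ast


def parse_pipeline_app_router(source: str) -> dict:
    """Return the PIPELINE_APP_ROUTER dict literal defined in `source`."""
    # NOTE: `ast` is used so the dict can be read without importing the module.
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "PIPELINE_APP_ROUTER":
                # literal_eval accepts an AST node and only evaluates literals.
                return ast.literal_eval(node.value)
    raise ValueError("PIPELINE_APP_ROUTER not found")


# Hypothetical file contents; the real mapping may differ.
example_source = 'PIPELINE_APP_ROUTER = {"PaddleOCR-VL-1.5": "PaddleOCR-VL"}\n'
router = parse_pipeline_app_router(example_source)
print(router.get("PaddleOCR-VL-1.5"))
```

Because `ast.literal_eval` only handles literals, this approach works as long as the mapping is written as a plain dict literal in the source file.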

  if mapped_source is not None:
-     mapped_config = PIPELINE_CONFIGS_DIR / f"{pipeline_name}.yaml"
+     mapped_pipeline_dir = PIPELINES_DIR / pipeline_name
+     mapped_config = mapped_pipeline_dir / "pipeline_config.yaml"

Suggest adjusting the logic in this file so that mapped_pipeline_dir and pipeline_dir are merged, with files in mapped_pipeline_dir taking precedence (not limited to pipeline_config.yaml).

scyyh11 and others added 2 commits February 27, 2026 20:35
- Move pipeline_config.yaml to server/ subdirectory to mirror source
  pipeline structure, enabling generic file-level merge
- Use copytree with dirs_exist_ok to overlay mapped pipeline files on
  top of source, so any file can be overridden (not just config)
- Sync config with latest: VLRecognition batch_size=-1, add Serving
- Move ast NOTE from docstring to inline comment
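
The overlay approach named in this commit message can be sketched as follows. This is an illustration under assumptions, not the merged code: `overlay_pipeline`, the temporary directory layout, and the file contents are hypothetical, while `shutil.copytree(..., dirs_exist_ok=True)` is the standard-library call (Python 3.8+) the commit refers to.

```python
import shutil
import tempfile
from pathlib import Path


def overlay_pipeline(source_dir: Path, mapped_dir: Path, target_dir: Path) -> None:
    """Copy source_dir to target_dir, then overlay mapped_dir on top of it."""
    shutil.copytree(source_dir, target_dir)
    if mapped_dir.is_dir():
        # dirs_exist_ok=True merges into the existing tree; files with the
        # same relative path are overwritten by the mapped pipeline's copy.
        shutil.copytree(mapped_dir, target_dir, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout mirroring pipelines/<name>/server/.
    source = root / "pipelines" / "PaddleOCR-VL"
    mapped = root / "pipelines" / "PaddleOCR-VL-1.5"
    (source / "server").mkdir(parents=True)
    (mapped / "server").mkdir(parents=True)
    (source / "server" / "pipeline_config.yaml").write_text("pipeline_name: PaddleOCR-VL\n")
    (source / "server" / "model.py").write_text("# shared server code\n")
    (mapped / "server" / "pipeline_config.yaml").write_text("pipeline_name: PaddleOCR-VL-1.5\n")

    target = root / "output" / "sdk"
    overlay_pipeline(source, mapped, target)
    merged_config = (target / "server" / "pipeline_config.yaml").read_text()
    kept_files = sorted(p.name for p in (target / "server").iterdir())

print(merged_config.strip())
print(kept_files)
```

Mirroring the source pipeline's directory structure in the mapped pipeline is what makes this a plain file-level merge: any file placed at the same relative path overrides the source copy, and everything else is inherited.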
@scyyh11 scyyh11 requested a review from Bobholamovic February 28, 2026 10:10
@Bobholamovic Bobholamovic merged commit 1ffc4a6 into PaddlePaddle:develop Feb 28, 2026
4 of 8 checks passed