feat(vertex): Add Express Mode support to the Vertex AI Provider of the ai-proxy plugin #3301

wydream · 2026-01-06T14:54:02Z

Ⅰ. Describe what this PR did

This PR adds Express Mode support to the Vertex AI Provider of the ai-proxy plugin.

Background

Vertex AI recently introduced Express Mode, a simplified access mode that lets developers start using Vertex AI with only an API key, without configuring Service Account authentication. For details, see the official Vertex AI Express Mode documentation.

Major changes

Implement Express Mode logic (provider/vertex.go):
- Add an Express Mode path template (without project/location)
- Add an isExpressMode() method that determines whether Express Mode is enabled by checking whether apiTokens is configured
- Modify configuration validation: if apiTokens is configured, Express Mode is used and no other configuration is required
- Modify domain handling: Express Mode uses the fixed domain aiplatform.googleapis.com (without a region prefix)
- Modify authentication handling: Express Mode skips the OAuth flow and passes the API key as a URL query parameter
- Modify path generation: Express Mode uses the format /v1/publishers/google/models/{model}:{action}?key={API_KEY}
Update documentation:
- README.md - Add Express Mode configuration instructions and usage examples (Chinese)
- README_EN.md - Add Express Mode configuration instructions and usage examples (English)
Add unit tests (test/vertex.go):
- Configuration parsing tests (standard mode, Express Mode)
- Request header processing tests
- Request body processing tests (chat, embedding, streaming requests, model mapping)
- Response body processing tests
- Streaming response processing tests
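The mode detection and validation rules listed above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in provider/vertex.go: the `vertexConfig` struct, its field names, and the error messages are hypothetical stand-ins that mirror the configuration keys named in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// vertexConfig is a hypothetical stand-in for the provider configuration.
// The fields mirror the config keys named in this PR.
type vertexConfig struct {
	APITokens             []string
	VertexAuthKey         string
	VertexRegion          string
	VertexProjectID       string
	VertexAuthServiceName string
}

// isExpressMode reports whether at least one API token is configured.
func (c *vertexConfig) isExpressMode() bool {
	return len(c.APITokens) > 0
}

// validate accepts an Express Mode config as-is and otherwise requires
// every Service Account field used by standard mode.
func (c *vertexConfig) validate() error {
	if c.isExpressMode() {
		return nil
	}
	if c.VertexAuthKey == "" {
		return errors.New("missing vertexAuthKey in standard mode")
	}
	if c.VertexRegion == "" {
		return errors.New("missing vertexRegion in standard mode")
	}
	if c.VertexProjectID == "" {
		return errors.New("missing vertexProjectId in standard mode")
	}
	if c.VertexAuthServiceName == "" {
		return errors.New("missing vertexAuthServiceName in standard mode")
	}
	return nil
}

func main() {
	express := &vertexConfig{APITokens: []string{"YOUR_API_KEY"}}
	fmt.Println(express.isExpressMode(), express.validate())

	incomplete := &vertexConfig{VertexRegion: "us-central1"}
	fmt.Println(incomplete.isExpressMode(), incomplete.validate())
}
```

Keying the mode off `apiTokens` alone means an existing standard-mode configuration, which has no tokens, takes the same validation path as before.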

Express Mode vs Standard Mode Comparison

| Feature | Standard Mode | Express Mode |
| --- | --- | --- |
| Authentication method | Service Account JSON → JWT → OAuth Access Token | API Key (URL query parameter) |
| Domain name | `{region}-aiplatform.googleapis.com` | `aiplatform.googleapis.com` |
| Path format | `/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:{action}` | `/v1/publishers/google/models/{model}:{action}?key={API_KEY}` |
| Required configuration | vertexAuthKey, vertexRegion, vertexProjectId, vertexAuthServiceName | apiTokens |
| Applicable scenarios | Production environment | Development and testing |
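To make the domain and path rows of the comparison concrete, here is a small sketch that derives the upstream host and path for either mode. The `buildTarget` function and `target` type are hypothetical, and the sketch assumes the `{location}` path segment equals the configured region, which the PR does not state explicitly.

```go
package main

import (
	"fmt"
	"net/url"
)

const (
	expressHost     = "aiplatform.googleapis.com"
	standardHostFmt = "%s-aiplatform.googleapis.com"
	expressPathFmt  = "/v1/publishers/google/models/%s:%s"
	standardPathFmt = "/v1/projects/%s/locations/%s/publishers/google/models/%s:%s"
)

// target holds the upstream host and the request path (including any query).
type target struct {
	Host string
	Path string
}

// buildTarget picks Express Mode when apiKey is non-empty and standard mode
// otherwise. Assumption: the location segment is the same as the region.
func buildTarget(apiKey, region, project, model, action string) target {
	if apiKey != "" {
		query := url.Values{"key": []string{apiKey}}
		return target{
			Host: expressHost,
			Path: fmt.Sprintf(expressPathFmt, model, action) + "?" + query.Encode(),
		}
	}
	return target{
		Host: fmt.Sprintf(standardHostFmt, region),
		Path: fmt.Sprintf(standardPathFmt, project, region, model, action),
	}
}

func main() {
	fmt.Printf("%+v\n", buildTarget("YOUR_API_KEY", "", "", "gemini-2.5-flash", "generateContent"))
	fmt.Printf("%+v\n", buildTarget("", "us-central1", "my-project", "gemini-2.5-flash", "generateContent"))
}
```

Encoding the key through `url.Values` keeps the query string valid even if a key contains characters that need escaping.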

Simplified configuration

Express Mode is configured the same way as other Providers: setting apiTokens automatically enables Express Mode, and no additional configuration items are needed:

provider:
  type: vertex
  apiTokens:
    - "YOUR_API_KEY"

Ⅱ. Does this pull request fix one issue?

N/A - This is a new feature

Ⅲ. Why don't you add test cases (unit test/integration test)?

Complete unit tests have been added in test/vertex.go, covering the following scenarios:

✅ Configuration parsing tests (5 test cases)
✅ Request header processing tests (2 test cases)
✅ Request body processing tests (4 test cases)
✅ Response body processing tests (1 test case)
✅ Streaming response processing tests (1 test case)

All tests pass.

Ⅳ. Describe how to verify it

Method 1: Run unit tests

cd plugins/wasm-go/extensions/ai-proxy
go test -gcflags="all=-N -l" -v -run TestVertex ./...

Method 2: Configuration verification

Express Mode configuration example:

provider:
  type: vertex
  apiTokens:
    - "YOUR_API_KEY"

Send a test request:

curl -X POST http://your-gateway/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Verify that the request was converted correctly:
- Host should be aiplatform.googleapis.com
- Path should be /v1/publishers/google/models/gemini-2.5-flash:generateContent?key=YOUR_API_KEY
- There should be no Authorization header
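The three checks above could be automated with a helper along these lines. It is only a sketch: `checkExpressRequest` is a hypothetical function, and how the upstream host, request URI, and Authorization header are captured (for example from gateway access logs or a mock upstream) is left as an assumption. It also assumes the non-streaming `generateContent` action.

```go
package main

import (
	"fmt"
	"net/url"
)

// checkExpressRequest returns a list of problems found in a captured
// upstream request; an empty result means all three checks passed.
func checkExpressRequest(host, requestURI, authorization, model, apiKey string) []string {
	var problems []string

	if host != "aiplatform.googleapis.com" {
		problems = append(problems, "unexpected host: "+host)
	}

	u, err := url.ParseRequestURI(requestURI)
	if err != nil {
		return append(problems, "unparseable request URI")
	}
	wantPath := "/v1/publishers/google/models/" + model + ":generateContent"
	if u.Path != wantPath {
		problems = append(problems, "unexpected path: "+u.Path)
	}
	if u.Query().Get("key") != apiKey {
		problems = append(problems, "key query parameter missing or wrong")
	}

	if authorization != "" {
		problems = append(problems, "Authorization header should be absent")
	}
	return problems
}

func main() {
	problems := checkExpressRequest(
		"aiplatform.googleapis.com",
		"/v1/publishers/google/models/gemini-2.5-flash:generateContent?key=YOUR_API_KEY",
		"",
		"gemini-2.5-flash",
		"YOUR_API_KEY",
	)
	fmt.Println(len(problems), problems)
}
```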

Ⅴ. Special notes for reviews

Backward compatibility: If apiTokens is not configured, standard mode (Service Account authentication) continues to be used, so existing users are not affected.
Configuration simplification: Express Mode reuses the apiTokens configuration item, consistent with other Providers, so no new configuration fields are introduced.
Claude model support: Express Mode also supports calling Anthropic Claude models through Vertex AI.

Ⅵ. AI Coding Tool Usage Checklist (if applicable)

Please check all applicable items:

For regular updates/changes (not new plugins):
- I have provided the prompts/instructions I gave to the AI Coding tool below
- I have included the AI Coding summary below

AI Coding Summary

Key Decisions:

Use the apiTokens configuration item to determine whether Express Mode is enabled, consistent with other Providers.
Add an isExpressMode() helper method that determines the mode by checking len(apiTokens) > 0.
Express Mode does not create an OAuth client, which avoids unnecessary resource consumption.
The API key is passed as a URL query parameter, in line with Google's official Express Mode API specification.

Major changes:

provider/vertex.go - Add the isExpressMode() method and the Express Mode path template; modify configuration validation, domain handling, path generation, and authentication logic
README.md / README_EN.md - Add bilingual configuration documentation and usage examples
test/vertex.go - Create a complete test suite
main_test.go - Register the Vertex test functions

Important considerations and limitations:

Full backward compatibility with standard mode is maintained.
Simplified configuration: only the apiTokens configuration item is needed to enable Express Mode.
The list of models supported by Express Mode is consistent with standard mode (mainly the Gemini series).
The API key is passed in plain text in the URL, so it must be redacted when URLs are logged.
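Because the key travels in the query string, any code that logs request paths would need to mask it first. The following is a minimal sketch of one way to do that with the standard library; `redactAPIKey` and the `REDACTED` placeholder are hypothetical and not part of this PR.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactAPIKey replaces the value of the "key" query parameter so the
// path can be logged without exposing the API key.
func redactAPIKey(rawPath string) string {
	u, err := url.Parse(rawPath)
	if err != nil {
		// Do not echo input that could not be parsed; it may still hold the key.
		return "[unparseable URL]"
	}
	q := u.Query()
	if _, ok := q["key"]; ok {
		q.Set("key", "REDACTED")
		// Encode re-serializes the query with parameters sorted by name.
		u.RawQuery = q.Encode()
	}
	return u.String()
}

func main() {
	fmt.Println(redactAPIKey("/v1/publishers/google/models/gemini-2.5-flash:generateContent?key=SECRET"))
}
```

Paths without a `key` parameter pass through unchanged, so the same helper could be applied to standard-mode requests as well.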

…hentication
- Introduced new configuration options for Vertex AI Express Mode in provider.
- Updated README files to include details about Express Mode and its usage.
- Added tests for Vertex Express Mode configuration and request/response handling.
- Enhanced existing Vertex provider logic to accommodate Express Mode requirements.

Change-Id: Ib273d0109c32760f3397be48e1ab3a3a47b184fc
Co-developed-by: Cursor <[email protected]>

rinfx

LGTM

- Replaced `vertexExpressMode` and `vertexApiKey` fields with a single `apiTokens` array in the configuration.
- Updated README files to reflect the new configuration structure for Express Mode.
- Adjusted provider logic to accommodate the new API Key handling.
- Modified tests to align with the updated configuration format.

Change-Id: Ib273d0109c32760f3397be48e1ab3a3a47b184fc
Co-developed-by: Cursor <[email protected]>

wydream requested review from johnlanni and rinfx as code owners January 6, 2026 14:54

rinfx approved these changes Jan 7, 2026
