fix(skills): harden skills API review findings by slin1237 · Pull Request #1366 · lightseekorg/smg

slin1237 · 2026-04-23T21:06:17Z

Summary

Enforce configured skills upload, file-size, total-size, and file-count limits before buffering uploads and inside direct service calls.
Sanitize internal skills service, store, and blob errors in API responses while logging backend details server-side.
Gate skills CRUD routes with allowed_operations and make create/version metadata writes atomic at the store boundary.

Validation

cargo check -p smg-skills -p smg --tests
cargo test -p smg-skills
cargo test -p smg --test api_tests skills_api
cargo test -p smg internal_skill_service_errors_hide_backend_details
cargo +nightly fmt --all --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test

Summary by CodeRabbit

Release Notes

New Features
- Configurable upload limits for skills now enforced: maximum file size, file count per version, and total upload size.
- Admin operation authorization checks added to skill management endpoints; requests without proper permissions return 403 Forbidden.
Bug Fixes
- Enhanced error handling with clearer, safer error messages for skill service operations.
- Added validation to prevent configuration overflow when converting size limits.

Signed-off-by: Simo Lin <linsimo.mark@gmail.com>

coderabbitai · 2026-04-23T21:06:23Z

📝 Walkthrough

Walkthrough

Introduces configurable SkillUploadLimits to enforce file and aggregate size constraints on skill uploads via zip and multipart routes. Adds atomic combined write operations to the metadata store trait to ensure skills and versions persist together. Implements admin operation authorization gates across CRUD handlers and refactors multipart parsing with chunked, size-limited readers.

Changes

Cohort / File(s)	Summary
Configuration & Type Definitions `crates/skills/src/config.rs`, `crates/skills/src/lib.rs`	Adds `SkillUploadLimits` struct with MB-to-bytes conversion and validation; extends `SkillsAdminOperation` enum with `CreateAnyTenant` and `UpdateAnyTenant` variants; exports new type via `lib.rs` public re-export.
Skill Service API & Validation `crates/skills/src/api.rs`, `crates/skills/src/validation.rs`	Introduces constructors `single_process_with_limits` and `in_memory_with_limits`; enforces upload limits during normalization via `normalize_skill_bundle_zip_with_limits`; refactors create/upgrade persistence to use atomic `put_skill_with_initial_version` and `put_skill_version_and_update_skill` operations.
Metadata Store & In-Memory Implementation `crates/skills/src/storage.rs`, `crates/skills/src/memory.rs`	Adds two trait methods for atomic skill+version writes; centralizes validation logic in `*_locked` helpers; implements new methods with tenant consistency enforcement and skill-id validation.
Gateway Wiring & Authorization `model_gateway/src/app_context.rs`, `model_gateway/src/config/validation.rs`, `model_gateway/src/routers/skills/handlers.rs`	Parses upload limits from config and passes to service constructors; adds mebibyte-overflow validation; gates all CRUD/list/get handlers behind admin operation checks; refactors multipart parsing to enforce per-file, aggregate, and field-size limits; adds internal error logging without backend detail leakage.
Integration Tests `model_gateway/tests/api/skills_api_test.rs`	Verifies `400` rejection when multipart file exceeds `max_file_size_mb`; verifies `403` rejection when admin operation is not allowed in configured permissions.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Handler as Handler<br/>(Authorization)
    participant Parser as Multipart<br/>Parser
    participant Validator as Normalization<br/>& Limits
    participant Store as Metadata<br/>Store
    
    Client->>Handler: POST /v1/skills (multipart)
    Handler->>Handler: ensure_admin_operation_allowed<br/>CreateAnyTenant?
    alt Operation Not Allowed
        Handler-->>Client: 403 Forbidden
    end
    
    Handler->>Parser: parse_skill_upload<br/>with limits
    Parser->>Parser: Chunked read<br/>enforce max_files,<br/>file_size, tenant_id bytes
    alt Upload Exceeds Limits
        Parser-->>Client: 400 Bad Request
    end
    
    Parser->>Validator: normalize_skill_bundle_zip_with_limits
    Validator->>Validator: Validate zip entries<br/>vs max_upload_size_bytes,<br/>max_file_size_bytes
    alt Validation Fails
        Validator-->>Client: 400 Bad Request
    end
    
    Validator->>Store: put_skill_with_initial_version<br/>(skill, version)
    Store->>Store: Atomic:<br/>validate tenant,<br/>insert skill & version
    alt Store Fails
        Store-->>Client: 500 Internal Error
    end
    
    Store-->>Handler: Success
    Handler-->>Client: 201 Created

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

PR #1332: Modifies multipart parsing and bundle normalization in the upload pipeline with additional limit enforcement on the same code paths.
PR #1249: Provides foundational scaffolding for SkillService, SkillsConfig, and related types that this PR extends with limits and new operations.
PR #1255: Introduces the zip normalization API that this PR adapts with limits-aware overload and threshold parameterization.

Suggested reviewers

CatherineSue
key4ng
claude

Poem

🐰 Hop, hop! With limits set so tight,
Files fly through checks, day and night,
Atomic writes keep skills so true,
Admin gates guard what we do—
No more sprawling uploads wild,
Just bounded bundles, safely filed! 🎯

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 24.14% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'fix(skills): harden skills API review findings' is vague and generic, using the broad term 'harden' and 'review findings' without specifying which aspects are being hardened or what the main changes are.	Consider a more specific title that highlights the primary changes, such as 'fix(skills): enforce upload limits and secure API access with auth gates' or similar.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/skills-api-review-findings

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request introduces configurable upload limits for skill bundles, including constraints on total size, file count, and individual file size. It refactors the SkillService and SkillMetadataStore to enforce these limits during multipart parsing and bundle normalization. Additionally, the PR adds new administrative operations for tenant management and updates the in-memory store to support atomic-like metadata updates. Feedback was provided regarding a potential race condition in the in-memory store that could lead to version collisions.

gemini-code-assist · 2026-04-23T21:09:44Z

+        validate_skill_tenant_locked(&state, &skill)?;
+        put_skill_version_locked(&mut state, version)?;
+        put_skill_locked(&mut state, skill)?;


In put_skill_version_and_update_skill, there is a potential race condition where a version ID generated based on a timestamp could collide under high concurrency. If using UUID v7, ensure the full UUID or a sufficient prefix (e.g., simple()[..24]) is used to ensure uniqueness. The store should enforce uniqueness by checking if the version key already exists before inserting it, returning a conflict error instead of silently overwriting the existing version data.

Suggested change

validate_skill_tenant_locked(&state, &skill)?;

put_skill_version_locked(&mut state, version)?;

put_skill_locked(&mut state, skill)?;

let mut state = self.state.write();

validate_skill_tenant_locked(&state, &skill)?;

let key = (version.skill_id.clone(), version.version.clone());

if state.skill_versions.contains_key(&key) {

return Err(SkillsStoreError::InvalidData(format!(

"version '{}' already exists for skill_id '{}'",

version.version, version.skill_id

)));

}

put_skill_version_locked(&mut state, version)?;

put_skill_locked(&mut state, skill)?;

References

When generating unique IDs from UUID v7, avoid using only the first segment as it is not unique within short time intervals. Use the full UUID or a long enough prefix to ensure uniqueness.

Avoid silently ignoring potential failures or overwriting data; prefer explicit error handling or logging to prevent silent failures.

claude · 2026-04-23T21:10:26Z

+                let remaining_upload_bytes = limits
+                    .max_upload_size_bytes
+                    .saturating_sub(total_file_bytes);
+                let max_field_bytes = remaining_upload_bytes.min(limits.max_file_size_bytes);
+                let contents =
+                    read_field_bytes_limited(field, max_field_bytes, "uploaded skill file").await?;


🟡 Nit: When previous files have consumed the full budget, saturating_sub returns 0, so max_field_bytes = 0. The next non-empty file then triggers read_field_bytes_limited → upload_too_large with the message "uploaded skill file exceeds maximum size of 0 bytes", which is confusing.

An early return when the budget is exhausted would produce a clearer error:

Suggested change

let remaining_upload_bytes = limits

.max_upload_size_bytes

.saturating_sub(total_file_bytes);

let max_field_bytes = remaining_upload_bytes.min(limits.max_file_size_bytes);

let contents =

read_field_bytes_limited(field, max_field_bytes, "uploaded skill file").await?;

let remaining_upload_bytes = limits

.max_upload_size_bytes

.saturating_sub(total_file_bytes);

if remaining_upload_bytes == 0 {

return Err(SkillsApiError::BadRequest {

code: "skill_upload_too_large",

message: format!(

"skill upload exceeds maximum total size of {} bytes",

limits.max_upload_size_bytes

),

});

}

let max_field_bytes = remaining_upload_bytes.min(limits.max_file_size_bytes);

let contents =

read_field_bytes_limited(field, max_field_bytes, "uploaded skill file").await?;

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

model_gateway/src/routers/skills/handlers.rs (1)
292-307: ⚠️ Potential issue | 🟠 Major

Use update permission for version creation.

POST /v1/skills/{skill_id}/versions mutates an existing skill by adding a version and advancing its projection. Gating it with CreateAnyTenant lets callers who are denied updates still modify existing skills.
Suggested fix
-    ensure_admin_operation_allowed(&state, SkillsAdminOperation::CreateAnyTenant)?;
+    ensure_admin_operation_allowed(&state, SkillsAdminOperation::UpdateAnyTenant)?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/skills/handlers.rs` around lines 292 - 307,
create_skill_version_impl currently gates version creation with
ensure_admin_operation_allowed(..., SkillsAdminOperation::CreateAnyTenant), but
adding a version is an update to an existing skill; change the permission check
to use the update permission instead (replace
SkillsAdminOperation::CreateAnyTenant with SkillsAdminOperation::UpdateAnyTenant
in the ensure_admin_operation_allowed call inside create_skill_version_impl) so
callers need update rights to add versions.
crates/skills/src/api.rs (1)
853-910: ⚠️ Potential issue | 🟠 Major

Reserve ZIP framing overhead before buffering files[] uploads.

Line 888 only caps the sum of file.contents.len(), but Lines 857-874 then build a stored ZIP archive, which adds per-entry and central-directory overhead after validation. A bundle that is exactly at max_upload_size_bytes in raw file bytes can still allocate a larger archive buffer here, so the pre-buffering upload cap is still bypassable for multipart uploads. Reject based on an archive-size estimate before writing, or stop as soon as the buffered ZIP exceeds max_upload_size_bytes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/skills/src/api.rs` around lines 853 - 910, The current validation only
sums raw file bytes but normalize_files_upload builds a ZIP (in ZipWriter) which
can exceed max_upload_size_bytes; update the logic to enforce archive-size
limits: either augment validate_files_upload_limits to conservatively estimate
ZIP overhead per entry (local header + central dir + filename length) and fail
when the estimated archive size would exceed limits.max_upload_size_bytes, or
(simpler) instrument normalize_files_upload to check the buffer length after
each start_file/write/finish and immediately return
SkillBundleArchiveError::BundleTooLarge.into() if buffer.get_ref().len() >
limits.max_upload_size_bytes; apply the same SkillBundleArchiveError variant so
callers of normalize_skill_bundle_zip_with_limits see a consistent error.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/skills/handlers.rs`:
- Around line 692-709: The helper ensure_admin_operation_allowed currently only
checks admin.allowed_operations and ignores the admin.enabled flag, so update it
to also require that state.context.router_config.skills.as_ref().and_then(|s|
s.admin.as_ref()).map_or(false, |admin| admin.enabled) (or equivalent) is true
before allowing the operation; if admin.enabled is false, return the same
SkillsApiError::Forbidden (code "skills_operation_not_allowed") to ensure the
global admin-off switch blocks handler access even when allowed_operations is
populated (see ensure_admin_operation_allowed, SkillsAdminConfig::default, and
the admin.allowed_operations check).

---

Outside diff comments:
In `@crates/skills/src/api.rs`:
- Around line 853-910: The current validation only sums raw file bytes but
normalize_files_upload builds a ZIP (in ZipWriter) which can exceed
max_upload_size_bytes; update the logic to enforce archive-size limits: either
augment validate_files_upload_limits to conservatively estimate ZIP overhead per
entry (local header + central dir + filename length) and fail when the estimated
archive size would exceed limits.max_upload_size_bytes, or (simpler) instrument
normalize_files_upload to check the buffer length after each
start_file/write/finish and immediately return
SkillBundleArchiveError::BundleTooLarge.into() if buffer.get_ref().len() >
limits.max_upload_size_bytes; apply the same SkillBundleArchiveError variant so
callers of normalize_skill_bundle_zip_with_limits see a consistent error.

In `@model_gateway/src/routers/skills/handlers.rs`:
- Around line 292-307: create_skill_version_impl currently gates version
creation with ensure_admin_operation_allowed(...,
SkillsAdminOperation::CreateAnyTenant), but adding a version is an update to an
existing skill; change the permission check to use the update permission instead
(replace SkillsAdminOperation::CreateAnyTenant with
SkillsAdminOperation::UpdateAnyTenant in the ensure_admin_operation_allowed call
inside create_skill_version_impl) so callers need update rights to add versions.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 929e29c5-bd40-4c73-b227-77a9e51c3b1d

📥 Commits

Reviewing files that changed from the base of the PR and between 0b969a4 and 45acc08.

📒 Files selected for processing (10)

crates/skills/src/api.rs
crates/skills/src/config.rs
crates/skills/src/lib.rs
crates/skills/src/memory.rs
crates/skills/src/storage.rs
crates/skills/src/validation.rs
model_gateway/src/app_context.rs
model_gateway/src/config/validation.rs
model_gateway/src/routers/skills/handlers.rs
model_gateway/tests/api/skills_api_test.rs

coderabbitai · 2026-04-23T21:20:22Z

+fn ensure_admin_operation_allowed(
+    state: &AppState,
+    operation: SkillsAdminOperation,
+) -> Result<(), SkillsApiError> {
+    let allowed = state
+        .context
+        .router_config
+        .skills
+        .as_ref()
+        .is_some_and(|skills| skills.admin.allowed_operations.contains(&operation));
+    if allowed {
+        return Ok(());
+    }
+
+    Err(SkillsApiError::Forbidden {
+        code: "skills_operation_not_allowed",
+        message: format!("skills admin operation '{operation:?}' is not allowed"),
+    })


⚠️ Potential issue | 🟠 Major

Preserve admin.enabled in the handler-side authorization gate.

This helper only checks allowed_operations. Since SkillsAdminConfig::default() still populates every operation while enabled is false, any accidentally mounted handler bypasses the global admin-off switch. Fold skills.admin.enabled into this check as defense in depth.

Suggested fix

fn ensure_admin_operation_allowed( state: &AppState, operation: SkillsAdminOperation, ) -> Result<(), SkillsApiError> { let allowed = state .context .router_config .skills .as_ref() - .is_some_and(|skills| skills.admin.allowed_operations.contains(&operation)); + .is_some_and(|skills| { + skills.admin.enabled && skills.admin.allowed_operations.contains(&operation) + }); if allowed { return Ok(()); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

fn ensure_admin_operation_allowed(

state: &AppState,

operation: SkillsAdminOperation,

) -> Result<(), SkillsApiError> {

let allowed = state

.context

.router_config

.skills

.as_ref()

.is_some_and(|skills| skills.admin.allowed_operations.contains(&operation));

if allowed {

return Ok(());

}

Err(SkillsApiError::Forbidden {

code: "skills_operation_not_allowed",

message: format!("skills admin operation '{operation:?}' is not allowed"),

})

fn ensure_admin_operation_allowed(

state: &AppState,

operation: SkillsAdminOperation,

) -> Result<(), SkillsApiError> {

let allowed = state

.context

.router_config

.skills

.as_ref()

.is_some_and(|skills| {

skills.admin.enabled && skills.admin.allowed_operations.contains(&operation)

});

if allowed {

return Ok(());

}

Err(SkillsApiError::Forbidden {

code: "skills_operation_not_allowed",

message: format!("skills admin operation '{operation:?}' is not allowed"),

})

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/skills/handlers.rs` around lines 692 - 709, The helper ensure_admin_operation_allowed currently only checks admin.allowed_operations and ignores the admin.enabled flag, so update it to also require that state.context.router_config.skills.as_ref().and_then(|s| s.admin.as_ref()).map_or(false, |admin| admin.enabled) (or equivalent) is true before allowing the operation; if admin.enabled is false, return the same SkillsApiError::Forbidden (code "skills_operation_not_allowed") to ensure the global admin-off switch blocks handler access even when allowed_operations is populated (see ensure_admin_operation_allowed, SkillsAdminConfig::default, and the admin.allowed_operations check).

fix(skills): harden skills API review findings

45acc08

Signed-off-by: Simo Lin <linsimo.mark@gmail.com>

github-actions Bot added tests Test changes model-gateway Model gateway crate changes labels Apr 23, 2026

gemini-code-assist Bot reviewed Apr 23, 2026

View reviewed changes

claude Bot reviewed Apr 23, 2026

View reviewed changes

claude Bot approved these changes Apr 23, 2026

View reviewed changes

slin1237 marked this pull request as ready for review April 23, 2026 21:14

slin1237 requested review from CatherineSue and key4ng as code owners April 23, 2026 21:14

coderabbitai Bot requested changes Apr 23, 2026

View reviewed changes

slin1237 merged commit 26c8495 into main Apr 23, 2026
52 checks passed

slin1237 deleted the fix/skills-api-review-findings branch April 23, 2026 21:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(skills): harden skills API review findings#1366

fix(skills): harden skills API review findings#1366
slin1237 merged 1 commit intomainfrom
fix/skills-api-review-findings

slin1237 commented Apr 23, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 23, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 23, 2026

Uh oh!

claude Bot Apr 23, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

slin1237 commented Apr 23, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

slin1237 commented Apr 23, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 23, 2026 •

edited

Loading