
Conversation

@lvhan028 lvhan028 commented Feb 9, 2026

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it receive feedback more easily. If you do not understand some items, don't worry, just open the pull request and seek help from the maintainers.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
  3. If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

Copilot AI review requested due to automatic review settings February 9, 2026 03:53

Copilot AI left a comment


Pull request overview

This PR refactors API key authorization in both the OpenAI API server and the proxy by replacing per-route Depends(check_api_key) enforcement with a shared ASGI AuthenticationMiddleware, and updates the proxy request forwarding to pass through original request headers/body (to preserve Authorization).

Changes:

  • Removed check_api_key dependency usage and introduced AuthenticationMiddleware (new server_utils.py) for token auth.
  • Updated proxy forwarding to pass through inbound request headers/body, and added a custom streaming response wrapper to surface upstream non-200 responses.
  • Added a JSON Content-Type validator dependency in the OpenAI API server for several endpoints.
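For context, a bearer-token ASGI middleware of the kind this PR introduces can be sketched as follows. This is a minimal illustration, not the actual server_utils.py code; the header parsing and the 401 payload are assumptions.

```python
class AuthenticationMiddleware:
    """Reject requests whose Authorization header lacks a known Bearer token.

    Minimal sketch only; the real server_utils.py implementation may differ.
    """

    def __init__(self, app, tokens):
        self.app = app
        self.tokens = set(tokens)

    def _is_authorized(self, scope) -> bool:
        # ASGI headers are a list of (bytes, bytes) pairs.
        headers = dict(scope.get('headers') or [])
        auth = headers.get(b'authorization', b'').decode('latin-1')
        scheme, _, token = auth.partition(' ')
        return scheme.lower() == 'bearer' and token in self.tokens

    async def __call__(self, scope, receive, send):
        if scope['type'] != 'http':
            return await self.app(scope, receive, send)
        if not self._is_authorized(scope):
            await send({'type': 'http.response.start', 'status': 401,
                        'headers': [(b'content-type', b'application/json')]})
            await send({'type': 'http.response.body',
                        'body': b'{"error": "Unauthorized"}'})
            return
        await self.app(scope, receive, send)
```

Installing it app-wide (rather than per-route `Depends`) is what makes the path-filtering and method-handling concerns raised in the comments below matter.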

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.

  • lmdeploy/serve/proxy/proxy.py: Removes route-level auth dependencies, forwards raw request headers/body to upstream nodes, adds streaming error handling, and conditionally installs auth middleware.
  • lmdeploy/serve/openai/server_utils.py: Adds a new ASGI AuthenticationMiddleware implementation for Bearer token validation.
  • lmdeploy/serve/openai/api_server.py: Removes check_api_key, adds validate_json_request, adjusts the validation exception handler, and conditionally installs auth middleware.
Comments suppressed due to low confidence (1)

lmdeploy/serve/proxy/proxy.py:441

  • NodeManager.stream_generate/generate now require a FastAPI Request and call raw_request.body(), but large parts of the proxy (e.g., DistServe prefill/decode paths) still call these methods with a request dict. This will raise AttributeError at runtime and break DistServe routing. Consider either (a) keeping dict-based helpers for internally-mutated requests (prefill/migration) and adding separate raw_request forwarders, or (b) updating DistServe call sites to build the outbound body/headers explicitly (including auth) without relying on Request.
    async def stream_generate(self, raw_request: Request, node_url: str, endpoint: str):
        try:
            target_url = node_url.rstrip('/') + endpoint
            headers = self._prepare_headers(raw_request)
            body_bytes = await raw_request.body()
            async with aiohttp.ClientSession() as session:
                async with session.post(target_url, headers=headers, data=body_bytes,
                                        timeout=self.aiotimeout) as response:
                    if response.status != 200:
                        error_body = await response.read()
                        raise APIServerException(status_code=response.status, body=error_body)
                    async for line in response.content:
                        if line.strip():
                            yield line + b'\n\n'
        except APIServerException:
            # raise APIServerException again to be caught by the outer layer
            raise
        except (Exception, GeneratorExit, aiohttp.ClientError) as e:  # noqa
            logger.error(f'caught an exception: {e}')
            # exception happened, reduce unfinished num
            yield self.handle_api_timeout(node_url)

    async def generate(self, raw_request: Request, node_url: str, endpoint: str):
        try:
            target_url = node_url.rstrip('/') + endpoint
            headers = self._prepare_headers(raw_request)
            body_bytes = await raw_request.body()
            async with aiohttp.ClientSession() as session:
                async with session.post(target_url, headers=headers, data=body_bytes,
                                        timeout=self.aiotimeout) as response:
                    return await response.text()
        except (Exception, GeneratorExit, aiohttp.ClientError, asyncio.CancelledError) as e:  # noqa  # yapf: disable
            logger.error(f'caught an exception: {e}')
            return self.handle_api_timeout(node_url)
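Option (a) from the comment above could keep a dict-based helper that builds the outbound headers/body explicitly for internally mutated requests. A minimal sketch of the body/header construction (the helper name and the `api_key` parameter are hypothetical):

```python
import json

def build_outbound_request(request_dict, api_key=None):
    """Build explicit headers/body for internally mutated requests
    (e.g. DistServe prefill/decode) instead of forwarding a raw Request.

    `api_key` stands in for however the proxy stores its own credential;
    forwarding it explicitly replaces the Authorization header that a
    raw-request forwarder would have passed through.
    """
    headers = {'Content-Type': 'application/json'}
    if api_key:
        headers['Authorization'] = f'Bearer {api_key}'
    body = json.dumps(request_dict).encode('utf-8')
    return headers, body
```

A dict-based stream_generate/generate variant could then post this `body` with these `headers` via aiohttp, exactly as the raw-request forwarders above do.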



    content_type = raw_request.headers.get('content-type', '').lower()
    media_type = content_type.split(';', maxsplit=1)[0]
    if media_type != 'application/json':
        raise RequestValidationError(errors=["Unsupported Media Type: Only 'application/json' is allowed"])

Copilot AI Feb 9, 2026


validate_json_request raises RequestValidationError for unsupported Content-Type, but RequestValidationError is meant for pydantic validation issues and doesn't carry HTTP status/headers. This also maps the error to 422 instead of the more appropriate 415 Unsupported Media Type. Consider raising HTTPException(status_code=415, ...) (or returning a JSONResponse) and keeping RequestValidationError for schema/field validation only.

Suggested change

-    raise RequestValidationError(errors=["Unsupported Media Type: Only 'application/json' is allowed"])
+    raise HTTPException(
+        status_code=HTTPStatus.UNSUPPORTED_MEDIA_TYPE.value,
+        detail="Unsupported Media Type: Only 'application/json' is allowed",
+    )

            'body': exc.body
        }),
    )
    return JSONResponse(status_code=exc.status_code, content=exc.detail, headers=exc.headers)

Copilot AI Feb 9, 2026


validation_exception_handler assumes RequestValidationError has .status_code/.detail/.headers, but FastAPI's RequestValidationError exposes .errors() and .body (and does not define those attributes). This will raise AttributeError the first time a validation error occurs. Either restore the previous handler logic using exc.errors()/exc.body, or switch to handling HTTPException for the custom 415 case.

Suggested change

-    return JSONResponse(status_code=exc.status_code, content=exc.detail, headers=exc.headers)
+    return JSONResponse(
+        status_code=HTTPStatus.UNPROCESSABLE_ENTITY,
+        content={'detail': exc.errors(), 'body': exc.body},
+    )

Comment on lines 83 to 86
    await send({'type': 'http.response.start', 'status': e.status_code, 'headers': self.raw_headers})
    await send({
        'type': 'http.response.body',
        'body': e.body,

Copilot AI Feb 9, 2026


ProxyStreamingResponse sends error responses using self.raw_headers (typically "text/event-stream"), ignoring APIServerException.headers/content-type. If the upstream returns e.g. 401/403 with JSON, clients will receive the wrong content-type. Consider passing upstream response headers into APIServerException and using them (or at least setting application/json) when sending the error response start.

Suggested change

-await send({'type': 'http.response.start', 'status': e.status_code, 'headers': self.raw_headers})
-await send({
-    'type': 'http.response.body',
-    'body': e.body,
+# Use headers from APIServerException (defaulting to application/json) instead of streaming headers
+error_headers = [(str(k).encode('latin-1'), str(v).encode('latin-1'))
+                 for k, v in (e.headers or {}).items()]
+error_body = e.body if isinstance(e.body, (bytes, bytearray)) else str(e.body).encode('utf-8')
+await send({
+    'type': 'http.response.start',
+    'status': e.status_code,
+    'headers': error_headers,
+})
+await send({
+    'type': 'http.response.body',
+    'body': error_body,

Comment on lines +45 to +48
    if scope['type'] not in ('http', 'websocket') or scope['method'] == 'OPTIONS':
        # scope["type"] can be "lifespan" or "startup" for example,
        # in which case we don't need to do anything
        return self.app(scope, receive, send)

Copilot AI Feb 9, 2026


AuthenticationMiddleware.__call__ unconditionally reads scope['method'] when scope['type'] is 'websocket'. WebSocket scopes typically don't include a 'method' key, so this can raise KeyError and break websocket connections. Use scope.get('method') (or only check OPTIONS when scope['type'] == 'http').

Suggested change

-if scope['type'] not in ('http', 'websocket') or scope['method'] == 'OPTIONS':
-    # scope["type"] can be "lifespan" or "startup" for example,
-    # in which case we don't need to do anything
-    return self.app(scope, receive, send)
+if scope['type'] not in ('http', 'websocket'):
+    # scope["type"] can be "lifespan" or "startup" for example,
+    # in which case we don't need to do anything
+    return self.app(scope, receive, send)
+if scope['type'] == 'http' and scope.get('method') == 'OPTIONS':
+    return self.app(scope, receive, send)



-@router.get('/is_sleeping', dependencies=[Depends(check_api_key)])
+@router.get('/is_sleeping', dependencies=[Depends(validate_json_request)])

Copilot AI Feb 9, 2026


The new validate_json_request dependency is attached to GET /is_sleeping. GET requests often omit Content-Type entirely, so this endpoint will start failing with a validation error even though it has no JSON body. Consider removing this dependency from GET routes (and any routes that don't require a JSON request body), or make validate_json_request conditional on HTTP method and/or presence of a body.

Suggested change

-@router.get('/is_sleeping', dependencies=[Depends(validate_json_request)])
+@router.get('/is_sleeping')
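Alternatively, the dependency could stay attached but become method-aware. A sketch of the core check (the function name is illustrative, not from the diff):

```python
def is_unsupported_media_type(method: str, content_type: str) -> bool:
    """True when a request should be rejected with 415: a body-carrying
    method whose Content-Type is not application/json.

    Bodyless methods such as GET pass regardless of a (possibly absent)
    Content-Type header, which keeps GET /is_sleeping working.
    """
    if method.upper() in ('GET', 'HEAD', 'OPTIONS'):
        return False
    # Strip parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(';', maxsplit=1)[0].strip().lower()
    return media_type != 'application/json'
```

validate_json_request could call this with `raw_request.method` and the `content-type` header and only then raise.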

Comment on lines 1493 to 1496
    if api_keys is not None and (tokens := [key for key in api_keys if key]):
        from .server_utils import AuthenticationMiddleware

        app.add_middleware(AuthenticationMiddleware, tokens=tokens)

Copilot AI Feb 9, 2026


Authentication is now enforced via AuthenticationMiddleware, but it intentionally only checks paths starting with /v1. This means non-/v1 endpoints that previously required an API key (e.g. /generate, /update_weights, /sleep, /wakeup, /pooling, /is_sleeping) become unauthenticated when api_keys are configured. Consider expanding the middleware path check (or adding per-route protection) so all sensitive endpoints remain protected (keeping explicit allowlist exceptions like /health and docs).
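One way to broaden the check is to invert the prefix test into an explicit allowlist. A minimal sketch (the allowlist contents are assumptions, not taken from the diff):

```python
# Illustrative set of endpoints that stay reachable without a token;
# the real allowlist for this server is an assumption.
PUBLIC_PATHS = {'/health', '/docs', '/redoc', '/openapi.json'}

def requires_auth(path: str) -> bool:
    """Protect every endpoint except an explicit allowlist, instead of
    gating only paths that start with /v1."""
    return path not in PUBLIC_PATHS
```

The middleware would then consult `requires_auth(scope['path'])` before enforcing the Bearer check, so /generate, /update_weights, /sleep, /wakeup, /pooling, and /is_sleeping remain protected.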

Comment on lines 818 to 822
 if request.stream is True:
-    response = node_manager.stream_generate(request_dict, node_url, '/v1/completions')
+    response = node_manager.stream_generate(raw_request, node_url, '/v1/completions')
     background_task = node_manager.create_background_tasks(node_url, start)
     return StreamingResponse(response, background=background_task, media_type='text/event-stream')
 else:

Copilot AI Feb 9, 2026


In the Hybrid /v1/completions streaming path, stream_generate can raise APIServerException (e.g., upstream returns non-200). Unlike /v1/chat/completions, this endpoint still uses StreamingResponse, so exceptions will likely surface as an unhandled 500 instead of a clean upstream status/body. Consider using ProxyStreamingResponse here as well (and in other streaming endpoints) for consistent error propagation.
