Motivation
Both built-in policies retry on any exception. This means a ValueError from a malformed response or a KeyError in parsing logic burns through all attempts even though retrying will never help — only transient failures like network timeouts or rate limits warrant a retry.
Proposed Change
Add an optional retryable_exceptions parameter to both concrete policy classes. The default, None, preserves the existing behavior, so this is not a breaking change.
Since workflows-py is LLM-agnostic, the parameter accepts any exception type(s) the caller needs. Examples:
OpenAI
from openai import RateLimitError, APITimeoutError, APIConnectionError
@step(retry_policy=ExponentialBackoffRetryPolicy(
    maximum_attempts=5,
    retryable_exceptions=(RateLimitError, APITimeoutError, APIConnectionError),
))
async def call_openai(self, ev: QueryEvent) -> ResultEvent: ...
Anthropic
from anthropic import RateLimitError, APIConnectionError
@step(retry_policy=ConstantDelayRetryPolicy(
    delay=10,
    maximum_attempts=3,
    retryable_exceptions=(RateLimitError, APIConnectionError),
))
async def call_anthropic(self, ev: QueryEvent) -> ResultEvent: ...
Gemini
from google.api_core.exceptions import ResourceExhausted, ServiceUnavailable
@step(retry_policy=ExponentialBackoffRetryPolicy(
    maximum_attempts=5,
    retryable_exceptions=(ResourceExhausted, ServiceUnavailable),
))
async def call_gemini(self, ev: QueryEvent) -> ResultEvent: ...
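Implementation is ~3 lines per policy: early-return None when retryable_exceptions is set and not isinstance(error, self.retryable_exceptions). The "is set" check matters because isinstance(error, None) raises a TypeError, so the None default has to skip the filter entirely.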
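To make the early-return concrete, here is a minimal standalone sketch. It does not import workflows-py. FilteredConstantDelayPolicy is a hypothetical stand-in class, and the next(elapsed_time, attempts, error) signature (returning a delay in seconds, or None to stop retrying) is my assumption about the policy interface rather than a copy of the library code.

```python
from typing import Optional, Tuple, Type


class FilteredConstantDelayPolicy:
    """Hypothetical stand-in for a constant-delay policy with the proposed filter."""

    def __init__(
        self,
        maximum_attempts: int = 3,
        delay: float = 5,
        retryable_exceptions: Optional[Tuple[Type[BaseException], ...]] = None,
    ) -> None:
        self.maximum_attempts = maximum_attempts
        self.delay = delay
        # None means "retry on any exception", i.e. today's behavior.
        self.retryable_exceptions = retryable_exceptions

    def next(
        self, elapsed_time: float, attempts: int, error: Exception
    ) -> Optional[float]:
        # Proposed addition: stop immediately on errors outside the allow-list.
        if self.retryable_exceptions is not None and not isinstance(
            error, self.retryable_exceptions
        ):
            return None
        # Assumed pre-existing logic: stop once the attempt budget is spent.
        if attempts >= self.maximum_attempts:
            return None
        return self.delay


policy = FilteredConstantDelayPolicy(
    maximum_attempts=3, delay=10, retryable_exceptions=(TimeoutError,)
)
print(policy.next(0.0, 1, TimeoutError("slow upstream")))  # 10
print(policy.next(0.0, 1, ValueError("malformed response")))  # None
```

With the default retryable_exceptions=None the filter is skipped, so existing callers would see no change in behavior.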
Questions
- Is this in scope for the built-in policies, or is the intent that users subclass
RetryPolicy for this level of control?
- Happy to open a PR if the direction looks good — I contributed the
ExponentialBackoffRetryPolicy previously so I know this surface area well.