fix(cli): prevent crash when displaying incomplete test workflow executions#6942
fix(cli): prevent crash when displaying incomplete test workflow executions#6942zucchinidev wants to merge 1 commit intokubeshop:mainfrom
Conversation
| ui.Warn("Name: ", execution.Workflow.Name) | ||
| } else { | ||
| ui.NL() | ||
| ui.Err(errors.New("incomplete execution data received from API: missing Workflow field")) |
There was a problem hiding this comment.
could/should we output any other fields from the execution object that could be helpful in troubleshooting?
There was a problem hiding this comment.
Hi @olensmar,
Thank you for taking the time to review this PR and for the thoughtful feedback. I really appreciate the guidance.
Great question about additional troubleshooting fields. The current implementation follows a graceful degradation approach: when workflow is nil, we display all available execution metadata that doesn't depend on the workflow reference. Currently this includes:
Currently displayed:
- Execution identifiers (ID, name, namespace, number)
- Timing information (scheduledAt, queuedAt, startedAt, finishedAt, duration)
- Execution status and configuration (status, disableWebhooks, tags)
- Initialization errors (if present)
Additional fields that could aid troubleshooting:
- runnerId - Shows which runner processed this execution
- statusAt - When the status last changed (currently we show scheduledAt but not statusAt)
- groupId - Helps identify correlated executions in the same batch/group
- resolvedWorkflow.name - If workflow is nil but resolvedWorkflow exists, it could show the actual executed workflow
The logic is currently at testworkflowexecution_obj.go:54-82 where we display fields inside the if execution.Id != "" block.
Proposed Enhancement (if valuable)
I'd propose adding these 3 specific fields when workflow is missing, as they directly answer key troubleshooting questions:
runnerId(HIGH value)
- Why: Identifies which runner had the issue
- Troubleshooting: "Are all failures from one runner?" → runner-specific config/connectivity problem
- Use case: Correlate with infrastructure logs for that specific runner
statusAt(HIGH value)
- Why: Shows when status last changed
- Troubleshooting: Compare timeline (scheduledAt → statusAt → queuedAt) to determine if workflow was missing from creation or became null later
- Use case: Correlate with API restart times or network outage windows
groupId(MEDIUM value)
- Why: If part of a batch, determines scope of problem
- Troubleshooting: "One execution or entire batch affected?" → different root causes
- Use case: Identify parallel execution issues vs isolated failures
Implementation location: Display these right after the error/hint (line ~60), making them prominent for troubleshooting:
} else {
ui.NL()
ui.Err(errors.New("incomplete execution data received from API: missing Workflow field"))
ui.Warn("Hint: check your API endpoint configuration (HTTP vs HTTPS) and API server accessibility")
ui.NL()
// Troubleshooting context
if execution.RunnerId != "" {
ui.Warn("Runner ID: ", execution.RunnerId)
}
if !execution.StatusAt.IsZero() {
ui.Warn("Status changed at: ", execution.StatusAt.String())
}
if execution.GroupId != "" {
ui.Warn("Group ID: ", execution.GroupId)
}
}This keeps the change minimal (3 fields) while maximizing diagnostic value. Happy to implement this if you think it adds value, or take a different approach based on your experience with these types of issues.
Would you like me to:
- Proceed with this selective enhancement?
- Go with a different set of fields?
- Keep the current implementation as-is (minimal fix)?
Thanks again for the review and guidance!
…utions When API connectivity issues cause incomplete execution data to be received, the CLI would crash with an unhelpful nil pointer error, preventing users from diagnosing the underlying problem. This change ensures graceful degradation by detecting missing workflow references and displaying a clear error message with actionable guidance about common causes such as API endpoint misconfiguration. Users can now view available execution details even when workflow metadata is incomplete, significantly improving the troubleshooting experience. Closes kubeshop#6891
685931a to
7f461c0
Compare
|
@greptile |
Greptile SummaryThis PR fixes a nil pointer panic in the CLI when displaying test workflow executions with incomplete data by adding a nil check for Key changes:
Observations:
Confidence Score: 4/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[printPrettyOutput called] --> B{execution.Workflow != nil?}
B -->|Yes| C[Display workflow name]
B -->|No| D[Display newline]
D --> E[Show error: incomplete execution data]
E --> F[Show hint about API configuration]
C --> G[Continue with execution details]
F --> G
G --> H{execution.Id != empty?}
H -->|Yes| I[Display execution metadata]
H -->|No| J[Skip metadata]
I --> K{Has initialization error?}
J --> K
K -->|Yes| L[Display initialization error]
K -->|No| M[End]
L --> M
Last reviewed commit: 7f461c0 |
| Id: "test-id", | ||
| Name: "test-name", | ||
| Namespace: "test-namespace", | ||
| Workflow: nil, |
There was a problem hiding this comment.
explicit nil is unnecessary - omit Workflow field since zero value is already nil
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| Id: "test-id", | ||
| Name: "test-name", | ||
| Namespace: "test-namespace", | ||
| Workflow: nil, |
There was a problem hiding this comment.
explicit nil is unnecessary - omit Workflow field since zero value is already nil
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
vsukhin
left a comment
There was a problem hiding this comment.
hey @zucchinidev DId you see this issue, because execution.Workflow can't be nil by design, it's created with this fileld?
|
@zucchinidev please address review comments above - looks great otherwise! |
|
Thanks for the review, @vsukhin @olensmar! You're right that under normal conditions the Workflow field should always be populated. However, two things worth noting:
I picked this as a good first issue to contribute to the project; the crash was real and the fix is low-risk, only activating in the error path without changing normal behavior. |
|
ok, @zucchinidev please address pr feedback then. thank you! |
|
hey @zucchinidev will you be able to address the feedback or do you need our help? |
Pull request description
When API connectivity issues cause incomplete execution data to be received, the CLI would crash with an unhelpful nil pointer error, preventing users from diagnosing the underlying problem.
This change ensures graceful degradation by detecting missing workflow references and displaying a clear error message with actionable guidance about common causes such as API endpoint misconfiguration.
Users can now view available execution details even when workflow metadata is incomplete, significantly improving the troubleshooting experience.
Closes #6891
Checklist (choose whats happened)
Breaking changes
Changes
File:
cmd/kubectl-testkube/commands/testworkflows/renderer/testworkflowexecution_obj.goexecution.Workflowbefore accessing its fieldsFile:
cmd/kubectl-testkube/commands/testworkflows/renderer/testworkflowexecution_obj_test.go(new)Fixes
Issue #6891 - CLI panic when displaying test workflow executions with incomplete data
Root Cause
testworkflowexecution_obj.goaccessed execution.Workflow.Name without checking if execution.Workflow was nil, causing a panic when the API returned incomplete data.Testing Performed
testkube create testworkflow -f panic-demo-workflow.yamltestkube run testworkflow panic-demoBefore (Panic - Buggy Code)
Scenario: Execution with workflow: null in database
User Impact: CLI crashes, no execution details shown, no guidance on how to fix.
After (Graceful Error - With Fix)
Scenario 1: Valid execution with workflow reference
✅ Normal operation - works perfectly
Scenario 2: Execution with workflow: null (corrupted data)
✅ Graceful error handling - no panic, clear guidance, execution details still visible
User Impact: Users get actionable error message, can still see execution details, and know where to look for the problem