
fix(llm): fix OpenAI param and default analysis to failed runs #2615

Merged
chmouel merged 2 commits into tektoncd:main from theakshaypant:fix-llm-openai-params on Mar 31, 2026

Conversation

@theakshaypant
Member

📝 Description of the Change

  • fix(openai): Replace the deprecated max_tokens with max_completion_tokens
    in the OpenAI request struct, fixing errors on newer models (gpt-5, o1, o3 series).
    The internal max_tokens config field stays provider-agnostic.
  • fix(llm): Move the failed-pipeline gate from the reconciler into the
    LLM orchestrator. A role can now run on all pipeline completions by
    setting on_cel to "true"; omitting on_cel preserves the original
    failed-only behavior.

🔗 Linked GitHub Issue

Fixes #2576

🧪 Testing Strategy

  • Unit tests
  • Integration tests
  • End-to-end tests
  • Manual testing
  • Not Applicable

Repo manifest

$ kg repo akshay-pac-test-repo -n default -oyaml
apiVersion: pipelinesascode.tekton.dev/v1alpha1
kind: Repository
metadata:
  annotations:
    ...
  creationTimestamp: "2026-03-27T07:50:37Z"
  generation: 10
  name: akshay-pac-test-repo
  namespace: default
  resourceVersion: "14045"
  uid: 2a398763-205b-4d50-84fe-b1117bc2fecd
pipelinerun_status:
  ...
spec:
  settings:
    ai:
      enabled: true
      max_tokens: 1000
      provider: openai
      roles:
      - context_items:
          commit_content: true
          container_logs:
            enabled: false
          error_content: true
          pr_content: true
        name: pipeline-summary
        on_cel: "true"
        output: pr-comment
        prompt: "You are a CI/CD assistant providing pipeline run summaries.\n**Generate
          a brief summary of this pipeline run:**\n## \U0001F4CA Pipeline Run Summary\n###
          Status\n[Success/Failed/Other - based on context]\n### Key Metrics\n- Duration:
          [if available]\n- Tasks Run: [count if available]\n### Highlights\n- [Notable
          successes or issues]\n- [Any warnings or recommendations]\nKeep it brief
          and informative.\n"
      - context_items:
          container_logs:
            enabled: true
            max_lines: 100
          error_content: true
        model: o3
        name: failure-analysis
        output: pr-comment
        prompt: |
          You are a DevOps expert. Analyze this failed pipeline and:
          1. Identify the root cause
          2. Suggest specific fixes
          3. Recommend preventive measures
      secret_ref:
        key: token
        name: openai-api-key
      timeout_seconds: 30
  url: https://github.com/theakshaypant/akshay-pac-test-repo

Comment posted when the PipelineRun fails:
(screenshot)
Comment posted when the PipelineRun succeeds:
(screenshot)

🤖 AI Assistance

AI assistance can be used for various tasks, such as code generation,
documentation, or testing.

Please indicate whether you have used AI assistance
for this PR and provide details if applicable.

  • I have not used any AI assistance for this PR.
  • I have used AI assistance for this PR.

Important

Slop will simply be rejected. If you are using AI assistance, you need to make
sure you understand the generated code and that it meets the project's
standards. You also need to at least know how to run the code and deploy it
(if needed). See startpaac to make it easy
to deploy and test your code changes.

If the majority of the code in this PR was generated by an AI, please add a Co-authored-by trailer to your commit message.
For example:

Co-authored-by: Claude noreply@anthropic.com

✅ Submitter Checklist

  • 📝 My commit messages are clear, informative, and follow the project's How to write a git commit message guide. The Gitlint linter validates this in CI.
  • ✨ I have ensured my commit message prefix (e.g., fix:, feat:) matches the "Type of Change" I selected above.
  • ♻ I have run make test and make lint locally to check for and fix any
    issues. For an efficient workflow, I have considered installing
    pre-commit and running pre-commit install to
    automate these checks.
  • 📖 I have added or updated documentation for any user-facing changes.
  • 🧪 I have added sufficient unit tests for my code changes.
  • 🎁 I have added end-to-end tests where feasible. See README for more details.
  • 🔎 I have addressed any CI test flakiness or provided a clear reason to bypass it.
  • If adding a provider feature, I have filled in the following and updated the provider documentation:
    • GitHub App
    • GitHub Webhook
    • Gitea/Forgejo
    • GitLab
    • Bitbucket Cloud
    • Bitbucket Data Center


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request refactors the LLM analysis triggering logic by moving the default failure check into the analyzer, which allows for more granular control via CEL expressions. Additionally, the OpenAI client was updated to use the MaxCompletionTokens field to align with current API standards. Review feedback focuses on improving the robustness of the PipelineRun status check by explicitly querying the Succeeded condition instead of relying on slice indexing, and recommends explicitly defining condition types in test cases for better clarity and maintenance.

@theakshaypant theakshaypant force-pushed the fix-llm-openai-params branch from d442566 to 7bfe9a5 on March 27, 2026 10:56
@chmouel
Member

chmouel commented Mar 31, 2026

@theakshaypant Please consider applying this change to use GetCondition(apis.ConditionSucceeded) instead of Conditions[0] — this matches the pattern used throughout the rest of the codebase (cleanups.go, annotation_matcher.go, reconciler/, formatting/):

diff --git a/pkg/llm/analyzer.go b/pkg/llm/analyzer.go
index 2fe55a27a..a3e824f15 100644
--- a/pkg/llm/analyzer.go
+++ b/pkg/llm/analyzer.go
@@ -16,6 +16,7 @@ import (
 	tektonv1 "github.com/tektoncd/pipeline/pkg/apis/pipeline/v1"
 	"go.uber.org/zap"
 	corev1 "k8s.io/api/core/v1"
+	apis "knative.dev/pkg/apis"
 )
 
 // AnalysisResult represents the result of an LLM analysis.
@@ -300,7 +301,8 @@ func countFailedResults(results []AnalysisResult) int {
 // If no on_cel is provided, defaults to triggering only for failed PipelineRuns.
 func (a *Analyzer) shouldTriggerRole(role v1alpha1.AnalysisRole, celContext map[string]any, pr *tektonv1.PipelineRun) (bool, error) {
 	if role.OnCEL == "" {
-		return len(pr.Status.Conditions) > 0 && pr.Status.Conditions[0].Status == corev1.ConditionFalse, nil
+		succeededCondition := pr.Status.GetCondition(apis.ConditionSucceeded)
+		return succeededCondition != nil && succeededCondition.Status == corev1.ConditionFalse, nil
 	}
 
 	result, err := cel.Value(role.OnCEL, celContext["body"],
diff --git a/pkg/llm/analyzer_test.go b/pkg/llm/analyzer_test.go
index bb2eb9dca..c6403fa3d 100644
--- a/pkg/llm/analyzer_test.go
+++ b/pkg/llm/analyzer_test.go
@@ -412,7 +412,7 @@ func TestAnalyzer_ShouldTriggerRoleEvaluations(t *testing.T) {
 	}
 
 	failedPR := &tektonv1.PipelineRun{}
-	failedPR.Status.Conditions = append(failedPR.Status.Conditions, apis.Condition{Status: "False"})
+	failedPR.Status.Conditions = append(failedPR.Status.Conditions, apis.Condition{Type: apis.ConditionSucceeded, Status: "False"})
 
 	tests := []struct {
 		name      string
@@ -464,10 +464,10 @@ func TestAnalyzer_ShouldTriggerRole(t *testing.T) {
 	analyzer := NewAnalyzer(run, kinteract, logger)
 
 	failedPR := &tektonv1.PipelineRun{}
-	failedPR.Status.Conditions = append(failedPR.Status.Conditions, apis.Condition{Status: "False"})
+	failedPR.Status.Conditions = append(failedPR.Status.Conditions, apis.Condition{Type: apis.ConditionSucceeded, Status: "False"})
 
 	succeededPR := &tektonv1.PipelineRun{}
-	succeededPR.Status.Conditions = append(succeededPR.Status.Conditions, apis.Condition{Status: "True"})
+	succeededPR.Status.Conditions = append(succeededPR.Status.Conditions, apis.Condition{Type: apis.ConditionSucceeded, Status: "True"})
 
 	tests := []struct {
 		name        string

@chmouel
Member

chmouel commented Mar 31, 2026

/lgtm


@pipelines-as-code pipelines-as-code bot left a comment


Congrats @theakshaypant, your PR has been approved 🎉

✅ Pull Request Approved

Approval Status:

  • Required Approvals: 1
  • Current Approvals: 1

👥 Reviewers Who Approved:

  • @chmouel (permission level: admin, approval status: approved)

📝 Next Steps

  • Ensure all required checks pass
  • Comply with branch protection rules
  • Request a maintainer to merge using the /merge command (or merge it
    directly if you have repository permission).

Automated by the PAC Boussole 🧭

theakshaypant and others added 2 commits March 31, 2026 17:18
Newer OpenAI models (gpt-5, o1, o3 series) reject `max_tokens` and
require `max_completion_tokens` instead. This caused LLM analysis to
fail with HTTP 400 errors for these models.

The internal `max_tokens` config field is kept provider-agnostic;
only the OpenAI API request maps it to `max_completion_tokens`.

Fixes tektoncd#2576

Signed-off-by: Akshay Pant <akpant@redhat.com>
Co-Authored-by: BoseKarthikeyan <bkarth17@ford.com>
Remove the hard-coded failed-only check from the reconciler and
delegate it to the LLM orchestrator. When no on_cel expression is
defined for a role, the orchestrator defaults to triggering only for
failed PipelineRuns. An explicit on_cel expression overrides this,
allowing roles to run on any pipeline outcome.

Signed-off-by: Akshay Pant <akpant@redhat.com>
@theakshaypant theakshaypant force-pushed the fix-llm-openai-params branch from 7a890f3 to c664c4e on March 31, 2026 11:55
@theakshaypant
Member Author

@theakshaypant Please consider applying this change to use GetCondition(apis.ConditionSucceeded) instead of Conditions[0] — this matches the pattern used throughout the rest of the codebase (cleanups.go, annotation_matcher.go, reconciler/, formatting/):

Gotcha! I had moved the condition check directly from the reconciler, so I did not make the change when Gemini requested it. Added the suggested patch in c664c4e. For reference, the original reconciler gate was:

	// Perform LLM analysis only for failed pipeline runs (best-effort, non-blocking)
	// Users can use CEL expressions in role configurations for more fine-grained control
	if len(newPr.Status.Conditions) > 0 && newPr.Status.Conditions[0].Status == corev1.ConditionFalse {

@chmouel chmouel merged commit f8e4343 into tektoncd:main Mar 31, 2026
13 checks passed


Development

Successfully merging this pull request may close these issues.

Newer OpenAI models fail with "max_tokens is not supported, use max_completion_tokens instead"
