
exempt service call from using json #4

Open
btsheehy wants to merge 2 commits into fastly:main from btsheehy:main

btsheehy commented Feb 3, 2026

The fastly service list --json command returns a massive amount of unnecessary data, especially when your account has nearly 100 services.

More specifically, every service includes an array with details on every version it has ever had. We have nearly 100 services, some of which have 700+ versions. Our fastly service list --json currently returns over 1.15 MB of text, or roughly 448,310 tokens, and that number can only grow.

This immediately overloads the LLM context for most models and causes the call to fail.

I've found this fix to work well, as it allows the LLM to truly obtain a simple list of services, then query what's necessary after that.

Perhaps you could instead intercept the json to remove the excess data before passing it to the LLM, but this seemed like the most straightforward approach.

jedisct1 (Collaborator) commented Apr 2, 2026

Thanks for the PR and the detailed writeup, this is a real problem worth solving.

I'm a bit hesitant about hardcoding cmd != "service" in applySmartDefaults though. That function is meant to be command-agnostic, and if other commands develop similar issues we'd end up with a growing list of exceptions.

The thing is, we already have infrastructure to handle this: the executor truncates JSON arrays to 100 items and caches anything over 25KB with a preview + pagination tools.
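For readers unfamiliar with that layer, here is a minimal sketch of what such a truncation step could look like. It is not the project's actual executor code: `truncateArrays`, `shrink`, and both constants are hypothetical names, the choice of Go is itself an assumption, and the sketch assumes the 100-item cap applies to nested arrays as well as top-level ones.

```go
package main

import (
	"encoding/json"
)

const (
	maxArrayItems  = 100       // array cap described in the comment above
	cacheThreshold = 25 * 1024 // 25KB cutoff described in the comment above
)

// truncateArrays walks a decoded JSON value and caps every array at
// maxArrayItems elements, recursing into nested objects and arrays.
// Assumption: nested arrays are capped too; the real executor may differ.
func truncateArrays(v any) any {
	switch t := v.(type) {
	case []any:
		if len(t) > maxArrayItems {
			t = t[:maxArrayItems]
		}
		out := make([]any, len(t))
		for i, item := range t {
			out[i] = truncateArrays(item)
		}
		return out
	case map[string]any:
		out := make(map[string]any, len(t))
		for k, item := range t {
			out[k] = truncateArrays(item)
		}
		return out
	default:
		return v
	}
}

// shrink truncates arrays in raw JSON and reports whether the result is
// still above cacheThreshold, in which case a caller would cache it and
// return a preview instead of the full payload.
func shrink(raw []byte) (out []byte, needsCache bool, err error) {
	var v any
	if err = json.Unmarshal(raw, &v); err != nil {
		return nil, false, err
	}
	out, err = json.Marshal(truncateArrays(v))
	if err != nil {
		return nil, false, err
	}
	return out, len(out) > cacheThreshold, nil
}
```

Under these assumptions, a list of nearly 100 services that each keep up to 100 version entries would likely still exceed the threshold after truncation, so the caching path is the part that would keep the payload out of the context.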

So in theory, that 1.15 MB response should never actually hit the LLM's context. If it is, that's the bug I'd want to fix.

The other concern is that by dropping --json, the LLM gets plain text table output, which is harder to parse reliably.

And if someone explicitly passes --json to service list, they'd still hit the same problem anyway.

I think the better fix here would be to either:

  • Strip the versions array from service list JSON responses before returning them (that's where all the bloat comes from), or
  • Fix whatever's going wrong in the truncation/caching layer so it properly handles this case
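The first option could be sketched roughly as follows. This is an illustration only, not code from the repository: `stripField` is a hypothetical helper, and it assumes the service list response is a top-level JSON array of objects. The exact key casing of the versions field is not confirmed here, so the match is case-insensitive.

```go
package main

import (
	"encoding/json"
	"strings"
)

// stripField removes the named key (matched case-insensitively) from every
// object in a top-level JSON array and returns the re-encoded array.
// Assumption: raw is a JSON array of objects; anything else is an error.
func stripField(raw []byte, field string) ([]byte, error) {
	// RawMessage keeps the remaining values byte-for-byte instead of
	// decoding and re-encoding them.
	var items []map[string]json.RawMessage
	if err := json.Unmarshal(raw, &items); err != nil {
		return nil, err
	}
	for _, item := range items {
		for key := range item {
			if strings.EqualFold(key, field) {
				delete(item, key) // deleting during range is safe in Go
			}
		}
	}
	return json.Marshal(items)
}
```

Calling `stripField(output, "versions")` on the service list output would leave every other field untouched. Because it runs on the response rather than on the command arguments, it would also cover the case where someone passes --json explicitly.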

Would you be up for looking into one of those approaches instead? Happy to help dig into the caching path if that would be useful.
