feat: streaming support for query responses #2

@0xneobyte

Description

Problem

The query method waits for the complete response before returning anything to the caller. For long-form answers this introduces noticeable latency — the caller receives nothing until generation is fully complete. Every major AI SDK (Anthropic, OpenAI, Gemini) exposes streaming as a first-class feature.

Proposed Behaviour

Add a queryStream method that yields response chunks as they arrive rather than waiting for the full response.

```ts
for await (const chunk of client.queryStream({ query: 'What is Python?' })) {
  process.stdout.write(chunk)
}
```
  • Existing query() is unchanged
  • queryStream() returns an async iterable
  • Citations and metadata returned at end of stream
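One way this could be sketched is as an async generator. The snippet below is a minimal illustration, not the final implementation: the chunk source is a stubbed array standing in for the real network stream, and names like `StreamChunk` are assumptions rather than the agreed API.

```typescript
// Hypothetical sketch of queryStream as an async generator.
// The `incoming` array stands in for chunks arriving over HTTP.

interface StreamChunk {
  text: string
  citations?: string[] // populated only on the final chunk
}

async function* queryStream(query: string): AsyncGenerator<StreamChunk> {
  // Stand-in for the real streaming transport.
  const incoming: StreamChunk[] = [
    { text: 'Python is ' },
    { text: 'a programming language.', citations: ['doc-1'] },
  ]
  for (const chunk of incoming) {
    yield chunk
  }
}

async function main() {
  let full = ''
  for await (const chunk of queryStream('What is Python?')) {
    full += chunk.text
  }
  console.log(full) // → Python is a programming language.
}
main()
```

Because the caller consumes an ordinary async iterable, no new consumption API is needed — `for await...of` works out of the box, and the last chunk can carry citations.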

Files to Modify

| File | Change |
| --- | --- |
| src/client.ts | Add queryStream method returning an async iterable |
| src/types.ts | Add streaming chunk type |
| src/index.ts | Export new types |
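For src/types.ts, one possible shape for the streaming chunk is a discriminated union so consumers can distinguish text deltas from the final metadata event. Field names here are assumptions to be settled in review, not the final API.

```typescript
// Hypothetical chunk type for src/types.ts (names are placeholders).
// 'text' chunks carry incremental output; the single 'done' chunk
// carries citations and any end-of-stream metadata.
interface QueryStreamChunk {
  type: 'text' | 'done'
  text?: string // present on 'text' chunks
  citations?: string[] // present on the 'done' chunk
}

const delta: QueryStreamChunk = { type: 'text', text: 'Hello' }
const final: QueryStreamChunk = { type: 'done', citations: ['doc-1'] }
```

A discriminated union keeps the "citations at end of stream" criterion type-safe: narrowing on `type === 'done'` tells the compiler citations may be present.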

Acceptance Criteria

  • queryStream() yields text chunks progressively as they arrive
  • Existing query() behaviour is unchanged
  • Citations accessible at end of stream
  • Stream can be cancelled mid-way without errors
  • Full TypeScript types on all new methods and interfaces
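The cancellation criterion falls out naturally from async generators: breaking out of a `for await` loop invokes the generator's `return()`, which runs any `finally` block, giving the implementation a hook to tear down the underlying request. A minimal sketch (the chunk source and cleanup are stand-ins, not the real transport):

```typescript
// Sketch: breaking out of for-await triggers the generator's
// return(), so cleanup in `finally` runs on mid-stream cancellation.
async function* chunks(): AsyncGenerator<string> {
  try {
    for (const c of ['a', 'b', 'c', 'd']) {
      yield c
    }
  } finally {
    // The real implementation would abort the HTTP request here,
    // e.g. via an AbortController.
    console.log('stream cleaned up')
  }
}

async function readTwo(): Promise<string[]> {
  const seen: string[] = []
  for await (const c of chunks()) {
    seen.push(c)
    if (seen.length === 2) break // cancel mid-stream, no error thrown
  }
  return seen
}

readTwo().then((seen) => console.log(seen)) // → [ 'a', 'b' ]
```

This means "cancelled mid-way without errors" needs no extra API surface; a test can simply `break` and assert the cleanup hook fired.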
