feat: add multi-document support to retrieval and client API by Shreyansh1729 · Pull Request #216 · VectifyAI/PageIndex

Shreyansh1729 · 2026-04-04T06:03:13Z

Summary

Added the ability to query multiple documents simultaneously, addressing Issue #187. This allows for cross-document data retrieval and combined reasoning in RAG applications.

Changes Made

retrieve.py: Refactored get_document, get_document_structure, and get_page_content to accept a single string or a list of strings for doc_id.
client.py: Updated PageIndexClient methods to support Union[str, List[str]] for batch querying.
tests: Added tests/test_multi_doc.py with 5 tests verifying batch metadata, structure, and content retrieval, along with error handling and backward compatibility.
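
The single-or-list handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `get_document_sketch`, `_lookup_one`, the `_DOCS` store, and the error-record shape are all hypothetical, and it assumes results are returned as serialized JSON strings.

```python
import json
from typing import Dict, List, Union

# Hypothetical in-memory store standing in for the real document index.
_DOCS: Dict[str, dict] = {
    "doc-a": {"title": "Report A", "pages": 12},
    "doc-b": {"title": "Report B", "pages": 7},
}


def _lookup_one(doc_id: str) -> dict:
    """Return metadata for one document, or an error record if it is unknown."""
    doc = _DOCS.get(doc_id)
    if doc is None:
        # Assumed error shape; the real functions may report errors differently.
        return {"error": f"Document {doc_id} not found"}
    return {"doc_id": doc_id, **doc}


def get_document_sketch(doc_id: Union[str, List[str]]) -> str:
    """Accept one ID or a list of IDs and return a JSON string."""
    if isinstance(doc_id, str):
        # Single ID: keep the pre-existing result shape so old callers still work.
        return json.dumps(_lookup_one(doc_id))
    # List of IDs: return an object mapping each ID to its own result.
    return json.dumps({d: _lookup_one(d) for d in doc_id})
```

Checking `isinstance(doc_id, str)` first matters because a string is itself iterable; testing for a list-like input first would risk treating a single ID as a sequence of characters.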

Verification

Run tests: export PYTHONPATH=. && pytest tests/test_multi_doc.py
Result: 5 passed.
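
For readers without the repository checked out, the kinds of checks listed above (batch results, error handling, backward compatibility) might look something like the sketch below. `fetch_titles` is a hypothetical stand-in rather than a function from retrieve.py, and raising `KeyError` for unknown IDs is an assumption made only for this example.

```python
from typing import Dict, List, Union


# Hypothetical stand-in for a retrieval function that accepts str or list.
def fetch_titles(doc_id: Union[str, List[str]]) -> Union[str, Dict[str, str]]:
    titles = {"a": "Alpha", "b": "Beta"}
    if isinstance(doc_id, str):
        return titles[doc_id]  # raises KeyError for an unknown ID
    return {d: titles[d] for d in doc_id}


def test_single_id_keeps_old_shape():
    # Backward compatibility: a plain string still yields a plain result.
    assert fetch_titles("a") == "Alpha"


def test_list_returns_mapping():
    # Batch call: the result maps every requested ID to its value.
    assert fetch_titles(["a", "b"]) == {"a": "Alpha", "b": "Beta"}


def test_unknown_id_raises():
    # Error handling: an unknown ID inside a batch surfaces as KeyError here.
    try:
        fetch_titles(["a", "zzz"])
    except KeyError:
        pass
    else:
        raise AssertionError("expected KeyError for unknown ID")
```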

Closes #187

- Use .get() with safe defaults for all LLM response dict accesses
- Optimize extract_toc_content retry loop to grow chat_history incrementally instead of rebuilding with full accumulated response
- Optimize toc_transformer retry loop to use chat_history instead of re-embedding the entire raw TOC and incomplete JSON in each prompt
- Return best-effort results on max retries instead of raising
- Add 14 mock-based tests covering all fix scenarios

Closes VectifyAI#163

- Restore explicit Exception on max retries instead of silent warning
- Move truncation logic before the retry loop so it only runs once on the initial incomplete response, not on every iteration
- Add explicit None guard for physical_index before passing to convert_physical_index_to_int to prevent potential TypeError
- Update test to expect Exception on max retries

- Update retrieve.py functions to support Union[str, List[str]] for doc_id
- If a list of IDs is provided, return a JSON object mapping IDs to results
- Update PageIndexClient methods to support batch querying
- Add 5 comprehensive unit tests for multi-doc support
- Maintain 100% backward compatibility for single-doc requests

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

Your Name added 3 commits March 28, 2026 00:16

claude bot reviewed Apr 4, 2026

Merge upstream/main and resolve conflicts in client.py and test locations

feac79a

Shreyansh1729 wants to merge 4 commits into VectifyAI:main from Shreyansh1729:feat/multi-doc-support
