
fix: harden folder watch feature with file hash dedup, mtime seeding, and stable spinner #1185

Open
AnishSarkar22 wants to merge 21 commits into MODSetter:dev from AnishSarkar22:fix/folder-watch

@AnishSarkar22
Contributor

@AnishSarkar22 AnishSarkar22 commented Apr 8, 2026

Description

  • Refactored the folder watch feature to work with remote backends by replacing direct filesystem indexing with an upload-based approach: the desktop app reads files locally and uploads them to the backend API for processing.
  • Removed legacy local-only folder indexing endpoints that required the backend to have direct filesystem access.
  • Added folderUploadFiles and folderNotifyUnlinked API endpoints for upload-based file indexing and deletion.
  • Implemented seedFolderMtimes IPC API to populate the Electron mtime store after initial folder scans, preventing the Chokidar watcher from re-emitting redundant "add" events for already-indexed files.
  • Updated acknowledgeFileEvents to persist file modification times on acknowledgment, ensuring accurate change detection across app restarts.
  • Added raw file hash (SHA-256) short-circuit in the backend indexer to skip expensive OCR/ETL extraction when raw file bytes haven't changed, running via asyncio.to_thread to avoid blocking the event loop.
  • Fixed folder spinner flickering during batch indexing by introducing an indexing_in_progress flag on the folder metadata, set at batch start and cleared in a finally block, synced to the frontend via Zero for a stable processing indicator.
  • Exposed folder metadata JSONB column in the Zero schema to enable real-time sync of folder-level state to the UI.
  • Optimized empty folder cleanup with subtree ID retrieval during the local folder sync finalization process.
  • Updated watched folder ID recovery in DocumentsSidebar to fall back to backend data when the Electron store is empty.
  • Updated relationship backref to enable passive deletes for document versions.
  • Added chat session and message synchronization hooks.
  • Minor UI refinements: button variants, terminology updates, className formatting, and motion properties in sidebar components.
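
The hash short-circuit and the indexing_in_progress flag described above can be illustrated with a minimal Python sketch. This is not the code from local_folder_indexer.py; the names `sha256_of_file`, `index_batch`, `known_hashes`, `folder_metadata`, and `extract` are hypothetical stand-ins, and a plain dict is assumed in place of the folder's JSONB metadata column.

```python
import asyncio
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash raw file bytes in chunks so large files are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


async def index_batch(paths, known_hashes, folder_metadata, extract):
    """Index a batch of files, skipping those whose raw bytes are unchanged.

    known_hashes:    dict of path string -> previously stored SHA-256 (assumed store)
    folder_metadata: dict standing in for the folder's metadata column (assumed)
    extract:         async callable doing the expensive OCR/ETL step (assumed)
    """
    # Set the flag once at batch start so the UI indicator stays stable.
    folder_metadata["indexing_in_progress"] = True
    processed, skipped = [], []
    try:
        for path in paths:
            # Hashing is blocking I/O, so run it off the event loop.
            file_hash = await asyncio.to_thread(sha256_of_file, path)
            if known_hashes.get(str(path)) == file_hash:
                skipped.append(path)  # raw bytes unchanged: skip extraction
                continue
            await extract(path)
            known_hashes[str(path)] = file_hash
            processed.append(path)
    finally:
        # Cleared even if extraction raises, so the spinner cannot get stuck.
        folder_metadata["indexing_in_progress"] = False
    return processed, skipped
```

Setting the flag per batch rather than per file is what avoids flicker in this sketch: the frontend sees one transition to "in progress" and one transition back, regardless of how many files the batch contains.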

Motivation and Context

FIX #

Screenshots

API Changes

  • This PR includes API changes

Change Type

  • Bug fix
  • New feature
  • Performance improvement
  • Refactoring
  • Documentation
  • Dependency/Build system
  • Breaking change
  • Other (specify):

Testing Performed

  • Tested locally
  • Manual/QA verification

Checklist

  • Follows project coding standards and conventions
  • Documentation updated as needed
  • Dependencies updated as needed
  • No lint/build errors or new warnings
  • All relevant tests are passing

High-level PR Summary

This PR refactors the folder watch feature to support remote backend deployments by replacing direct filesystem scanning with an upload-based approach. The desktop app now reads files locally and uploads them to the backend via new API endpoints (/folder-mtime-check, /folder-upload, /folder-unlink, /folder-sync-finalize), enabling the feature to work across cloud, self-hosted remote, and local deployment modes. Key improvements include optimized mtime-based change detection to skip unchanged files, raw file hash comparison as a pre-filter before expensive content extraction, passive delete cascade for document versions, batch upload with concurrency control, and progress tracking with cancellation support in the UI.
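
As a rough illustration of the mtime pre-filter and the concurrency-limited batch upload mentioned in this summary, here is a small Python sketch. The actual client logic lives in the TypeScript files of this PR and is not reproduced here; `files_needing_upload`, `upload_all`, `upload_one`, the tolerance value, and the concurrency limit are all assumptions made for the example.

```python
import asyncio


def files_needing_upload(local_mtimes, indexed_mtimes, tolerance=0.001):
    """Return paths that are new or whose mtime differs from the stored value.

    local_mtimes:   dict of path -> mtime observed on the desktop (assumed shape)
    indexed_mtimes: dict of path -> mtime recorded at last index (assumed shape)
    """
    changed = []
    for path, mtime in local_mtimes.items():
        stored = indexed_mtimes.get(path)
        if stored is None or abs(mtime - stored) > tolerance:
            changed.append(path)
    return sorted(changed)


async def upload_all(paths, upload_one, max_concurrency=3):
    """Upload every path while keeping at most max_concurrency uploads in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(path):
        async with semaphore:
            return await upload_one(path)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(p) for p in paths))
```

In this sketch the mtime comparison is only a cheap first pass; a file that passes it would still go through the raw hash check on the backend before any extraction runs.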

⏱️ Estimated Review Time: 1-3 hours

💡 Review Order Suggestion
Order File Path
1 surfsense_backend/app/db.py
2 surfsense_desktop/src/ipc/channels.ts
3 surfsense_desktop/src/ipc/handlers.ts
4 surfsense_desktop/src/modules/folder-watcher.ts
5 surfsense_desktop/src/preload.ts
6 surfsense_web/lib/apis/documents-api.service.ts
7 surfsense_web/lib/folder-sync-upload.ts
8 surfsense_web/hooks/use-folder-sync.ts
9 surfsense_web/components/sources/FolderWatchDialog.tsx
10 surfsense_web/components/layout/ui/sidebar/DocumentsSidebar.tsx
11 surfsense_backend/app/routes/documents_routes.py
12 surfsense_backend/app/tasks/connector_indexers/local_folder_indexer.py
13 surfsense_backend/app/tasks/celery_tasks/document_tasks.py
14 surfsense_web/zero/schema/folders.ts
15 surfsense_web/contracts/types/folder.types.ts
16 surfsense_web/components/documents/FolderNode.tsx
17 surfsense_web/components/documents/FolderTreeView.tsx
18 surfsense_web/app/dashboard/[search_space_id]/new-chat/[[...chat_id]]/page.tsx
19 surfsense_web/components/assistant-ui/connector-popup/tabs/all-connectors-tab.tsx
20 surfsense_web/components/layout/ui/dialogs/CreateSearchSpaceDialog.tsx
21 surfsense_web/components/layout/ui/shell/LayoutShell.tsx
22 surfsense_web/components/layout/ui/sidebar/SidebarSlideOutPanel.tsx
⚠️ Inconsistent Changes Detected
| File Path | Warning |
| --- | --- |
| surfsense_web/app/dashboard/[search_space_id]/new-chat/[[...chat_id]]/page.tsx | Imports chat session and message synchronization hooks without any implementation or usage in the diff, which appears unrelated to folder watch functionality |
| surfsense_web/components/assistant-ui/connector-popup/tabs/all-connectors-tab.tsx | Minor text change from 'Document/Files Connectors' to 'File Storage Integrations' seems like incidental UI copy refinement unrelated to the folder watch fix |
| surfsense_web/components/layout/ui/dialogs/CreateSearchSpaceDialog.tsx | Refactoring of the submit button spinner display logic is unrelated to folder watch feature |
| surfsense_web/components/layout/ui/shell/LayoutShell.tsx | Formatting change (className prop indentation) unrelated to folder watch functionality |
| surfsense_web/components/layout/ui/sidebar/SidebarSlideOutPanel.tsx | Animation style change from translateX to width-based animation is a UI refinement unrelated to folder watch |

testing backend test CI workflow
…hance watched folder ID retrieval in DocumentsSidebar
…val and optimizing empty folder cleanup process
…and enhance UI component styling for folder selection
@vercel

vercel bot commented Apr 8, 2026

@AnishSarkar22 is attempting to deploy a commit to the Rohan Verma's projects Team on Vercel.

A member of the Team first needs to authorize it.


@recurseml recurseml bot left a comment

Review by RecurseML

🔍 Review performed on b8d6cd4..f3aa514

✨ No bugs found, your code is sparkling clean

✅ Files analyzed, no issues (23)

surfsense_backend/app/db.py
surfsense_backend/app/routes/documents_routes.py
surfsense_backend/app/tasks/celery_tasks/document_tasks.py
surfsense_backend/app/tasks/connector_indexers/local_folder_indexer.py
surfsense_desktop/src/ipc/channels.ts
surfsense_desktop/src/ipc/handlers.ts
surfsense_desktop/src/modules/folder-watcher.ts
surfsense_desktop/src/preload.ts
surfsense_web/app/dashboard/[search_space_id]/new-chat/[[...chat_id]]/page.tsx
surfsense_web/components/assistant-ui/connector-popup/tabs/all-connectors-tab.tsx
surfsense_web/components/documents/FolderNode.tsx
surfsense_web/components/documents/FolderTreeView.tsx
surfsense_web/components/layout/ui/dialogs/CreateSearchSpaceDialog.tsx
surfsense_web/components/layout/ui/shell/LayoutShell.tsx
surfsense_web/components/layout/ui/sidebar/DocumentsSidebar.tsx
surfsense_web/components/layout/ui/sidebar/SidebarSlideOutPanel.tsx
surfsense_web/components/sources/FolderWatchDialog.tsx
surfsense_web/contracts/types/folder.types.ts
surfsense_web/hooks/use-folder-sync.ts
surfsense_web/lib/apis/documents-api.service.ts
surfsense_web/lib/folder-sync-upload.ts
surfsense_web/types/window.d.ts
surfsense_web/zero/schema/folders.ts

@AnishSarkar22 AnishSarkar22 changed the title from "fix: folder watch feature" to "fix: harden folder watch with file hash dedup, mtime seeding, and stable spinner" Apr 8, 2026
@AnishSarkar22 AnishSarkar22 marked this pull request as ready for review April 8, 2026 12:53
@AnishSarkar22 AnishSarkar22 changed the title from "fix: harden folder watch with file hash dedup, mtime seeding, and stable spinner" to "fix: harden folder watch feature with file hash dedup, mtime seeding, and stable spinner" Apr 8, 2026

1 participant