
feat: Discogs user integration — authenticate, sync collection & wantlist#66

Merged
SimplicityGuy merged 30 commits into main from issue-60
Feb 24, 2026

Conversation

@SimplicityGuy
Owner

@SimplicityGuy SimplicityGuy commented Feb 22, 2026

Overview

Implements Discogs user integration for discogsography: users create an account, connect their Discogs profile via OAuth 1.0a, then sync their personal collection and wantlist to both Neo4j and PostgreSQL.

Closes #60.

Changes

This PR is the umbrella for four sub-tasks implemented across separate branches:

| Step | Issue | Branch | What it adds |
|------|-------|--------|--------------|
| 1 | #56 | claude/issue-56-auth-user-accounts | Auth microservice — register/login/JWT, users + oauth_tokens tables |
| 2 | #57 | claude/issue-57-discogs-oauth | Discogs OAuth 1.0a OOB flow, Redis state store, admin config |
| 3 | #58 | claude/issue-58-collector-service | Collector microservice — Celery sync, PostgreSQL + Neo4j upserts |
| 4 | #59 | claude/issue-59-explore-extensions | Explore extensions — personalized endpoints, collection/wantlist/recommendations |

Sub-task PRs

Merge in order:

  1. feat(auth): Add user account system with JWT authentication #56 — Auth microservice
  2. feat(auth): Implement Discogs OAuth 1.0a account connection #57 — Discogs OAuth 1.0a
  3. feat(collector): Create collector microservice for Discogs collection & wantlist sync #58 — Collector microservice
  4. feat(explore): User collection & wantlist graph queries and UI #59 — Explore extensions

Generated with Claude Code

github-actions Bot and others added 15 commits February 22, 2026 05:19
…cation (#56)

Implements Step 1 of the Discogs user integration (issue #60):

- New `auth` microservice (FastAPI, ports 8004/8005)
  - POST /api/auth/register — user registration with PBKDF2-SHA256 password hashing
  - POST /api/auth/login — authentication with HS256 JWT tokens (stdlib only, no deps)
  - GET /api/auth/me — current user info via JWT Bearer auth
  - GET /health — health check endpoint

- PostgreSQL schema additions (via schema-init):
  - `users` table — email, hashed_password, is_active, timestamps
  - `oauth_tokens` table — per-user Discogs OAuth tokens (prepared for Step 2)
  - `app_config` table — admin-managed Discogs app credentials
  - `user_collections` / `user_wantlists` tables — user personal data (prepared for Steps 3-4)
  - `sync_history` table — sync job tracking

- Common config: added `AuthConfig` dataclass with JWT + Postgres settings
- docker-compose.yml: auth service added with health check on port 8005
- All auth crypto uses Python stdlib (hashlib, hmac, base64) — no new package deps

Co-authored-by: Robert Wlodarczyk <SimplicityGuy@users.noreply.github.com>
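
The stdlib-only PBKDF2-SHA256 hashing mentioned in the Step 1 commit above could look roughly like the sketch below. It is an illustration, not the service's code: the function names, the storage format, the salt length, and the iteration count are all assumptions that the PR does not state.

```python
from __future__ import annotations

import base64
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; the real value is not given in the PR


def hash_password(password: str, *, salt: bytes | None = None) -> str:
    """Return 'iterations$salt_b64$digest_b64' (hypothetical storage format)."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return "$".join(
        [
            str(ITERATIONS),
            base64.b64encode(salt).decode("ascii"),
            base64.b64encode(digest).decode("ascii"),
        ]
    )


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    iterations, salt_b64, digest_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, int(iterations)
    )
    return hmac.compare_digest(candidate, expected)
```

Storing the iteration count next to the salt lets the work factor be raised later without invalidating existing hashes.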
Implements Step 2 of the Discogs user integration (issue #60):

- auth/services/discogs.py — DiscogsOAuth1Auth helpers:
  - HMAC-SHA1 OAuth signature generation (stdlib only)
  - request_oauth_token() — OOB flow with callback_uri="oob"
  - exchange_oauth_verifier() — exchange verifier for access token
  - fetch_discogs_identity() — get /oauth/identity

- New OAuth endpoints in auth/auth.py:
  - GET  /api/oauth/authorize/discogs — start OOB flow, store state in Redis
  - POST /api/oauth/verify/discogs — exchange verifier, store tokens, fetch identity
  - GET  /api/oauth/status/discogs — check connection status
  - DELETE /api/oauth/revoke/discogs — disconnect Discogs account
  - PUT /api/admin/config/{key} — admin endpoint for Discogs app credentials

- AuthConfig extended with redis_url and discogs_user_agent fields
- Redis initialized in auth service lifespan for 10-min OOB state TTL

Co-authored-by: Robert Wlodarczyk <SimplicityGuy@users.noreply.github.com>
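
For readers unfamiliar with OAuth 1.0a, the HMAC-SHA1 signing step named in the Step 2 commit above follows RFC 5849: percent-encode and sort the parameters, build a signature base string, and sign it with a key derived from the consumer secret and token secret. The sketch below is a minimal version of that procedure using only the standard library. `oauth1_signature` is a hypothetical name, and the sketch assumes no duplicate parameter names.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def _pct(value: str) -> str:
    # RFC 5849 encoding: only unreserved characters stay unescaped.
    return quote(value, safe="-._~")


def oauth1_signature(
    method: str,
    url: str,
    params: dict,
    consumer_secret: str,
    token_secret: str = "",
) -> str:
    """Return the base64 HMAC-SHA1 signature for one request (illustrative)."""
    encoded = sorted((_pct(k), _pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in encoded)
    base_string = "&".join([method.upper(), _pct(url), _pct(normalized)])
    # The token secret is empty while requesting the initial request token.
    key = f"{_pct(consumer_secret)}&{_pct(token_secret)}"
    digest = hmac.new(key.encode("ascii"), base_string.encode("ascii"), hashlib.sha1)
    return base64.b64encode(digest.digest()).decode("ascii")
```

In the OOB flow the `params` dict would carry `oauth_callback` set to `"oob"` alongside the usual nonce, timestamp, and consumer key fields.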
…ntlist sync

Implements step 3 of issue #60 — a new `collector` microservice that
syncs each authenticated user's Discogs vinyl collection and wantlist
into both PostgreSQL and Neo4j.

Key additions:
- collector/collector.py — FastAPI service (port 8010/8011) with JWT
  authentication; POST /api/sync triggers background sync, GET
  /api/sync/status returns history
- collector/syncer.py — paginated Discogs API sync with OAuth 1.0a
  signing, upserts to user_collections / user_wantlists tables and
  COLLECTED / WANTS Neo4j relationships on existing Release nodes
- collector/Dockerfile — multi-stage build following project patterns
- collector/pyproject.toml — service-specific dependency manifest
- common/config.py — adds CollectorConfig (postgres + neo4j + jwt)
- schema-init/neo4j_schema.py — adds User.id uniqueness constraint
- docker-compose.yml — adds collector service (ports 8010/8011)
- pyproject.toml — adds collector optional-dependency group; updates
  all extra and workspace membership
- tests/collector/ — unit tests for OAuth helpers and constants

Closes #58

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
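
The paginated sync described in the collector commit above can be pictured as a generator that walks pages until the reported page count is reached. This is a sketch only: `iter_release_ids` and the injected `fetch_page` callable are hypothetical, and the payload shape (a `pagination.pages` field plus a list of items carrying `id`) is an assumption about the Discogs response rather than something shown in this PR.

```python
from typing import Any, Callable, Iterator

Page = dict[str, Any]


def iter_release_ids(
    fetch_page: Callable[[int], Page], items_key: str = "releases"
) -> Iterator[int]:
    """Yield release IDs across all pages, skipping items without an ID."""
    page = 1
    while True:
        payload = fetch_page(page)
        for item in payload.get(items_key, []):
            release_id = item.get("id")
            if release_id is None:
                continue  # nothing to upsert without a release ID
            yield release_id
        total_pages = payload.get("pagination", {}).get("pages", 1)
        if page >= total_pages:
            break
        page += 1
```

Keeping the HTTP call behind `fetch_page` means the same loop could serve both the collection and the wantlist, and makes it easy to test without network access.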
…tlist

Implements step 4 of issue #60 — extends the explore service with
personalized Neo4j-backed endpoints for authenticated users.

New endpoints (Bearer JWT signed with the same secret as the auth/collector services; required except where noted):
  GET /api/user/collection      — paginated list of COLLECTED releases
  GET /api/user/wantlist        — paginated list of WANTS releases
  GET /api/user/recommendations — 'you may also like' from collection artists
  GET /api/user/collection/stats — breakdown by genre, decade, and label
  GET /api/user/status          — check in_collection/in_wantlist for a set
                                  of release IDs (optional auth, defaults false)

Supporting changes:
- explore/user_queries.py — all Neo4j Cypher for personalization; pure
  Neo4j (no PostgreSQL dependency), leverages User node and COLLECTED/WANTS
  relationships written by the collector service
- explore/explore.py — JWT verification via stdlib HMAC-SHA256, optional
  _get_optional_user dependency for decoration, _require_user for protected
  routes; JWT_SECRET_KEY is optional (endpoints return 503 if not configured)
- common/config.py — adds optional jwt_secret_key field to ExploreConfig
- explore/Dockerfile — exposes JWT_SECRET_KEY env var
- docker-compose.yml — sets JWT_SECRET_KEY for explore (must match auth)
- tests/explore/test_user_queries.py — unit tests for JWT verification and
  check_releases_user_status helper

Closes #59

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
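
The stdlib HMAC-SHA256 JWT verification mentioned in the explore commit above might be sketched as follows. The function names and the exact checks (signature first, then `alg`, then `exp`) are assumptions for illustration and may differ from the helpers in explore/explore.py.

```python
from __future__ import annotations

import base64
import hashlib
import hmac
import json
import time


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # JWT segments drop '=' padding; restore it before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(payload: dict, secret: str) -> str:
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url_encode(json.dumps(payload).encode())
    mac = hmac.new(secret.encode(), f"{header}.{body}".encode("ascii"), hashlib.sha256)
    return f"{header}.{body}.{b64url_encode(mac.digest())}"


def verify_jwt(token: str, secret: str) -> dict | None:
    """Return the payload if the signature and expiry are valid, else None."""
    try:
        header_b64, body_b64, sig_b64 = token.split(".")
        signing_input = f"{header_b64}.{body_b64}".encode("ascii")
        expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return None
        header = json.loads(b64url_decode(header_b64))
        payload = json.loads(b64url_decode(body_b64))
    except ValueError:  # covers bad segment count, base64, JSON, and encoding errors
        return None
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    if not isinstance(payload, dict):
        return None
    exp = payload.get("exp")
    if not isinstance(exp, (int, float)) or exp < time.time():
        return None
    return payload
```

Because explore only verifies tokens, it needs the same secret as the issuing service but no database access, which fits the "pure Neo4j" note above.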
When FORCE_REPROCESS is set, list_s3_files was called twice per run:
once in process_discogs_data to detect the version, and again inside
download_discogs_data. This caused two full scrapes of the Discogs
website on every startup.

Added cached_files field to Downloader so the result of the first
scrape is reused within the same process_discogs_data invocation.
The cache is naturally reset on each run since a new Downloader is
created per invocation, ensuring periodic checks always fetch fresh
data from the website.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
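
The per-instance caching described in the commit above amounts to memoizing the first listing on the Downloader object. The sketch below shows the idea in Python purely for illustration. The extractor's actual language and API are not shown in this PR, and the injected `scrape` callable is hypothetical.

```python
from typing import Callable, List, Optional


class Downloader:
    """Illustrative stand-in: caches the file listing for this instance only."""

    def __init__(self, scrape: Callable[[], List[str]]) -> None:
        self._scrape = scrape
        self.cached_files: Optional[List[str]] = None

    def list_s3_files(self) -> List[str]:
        # First call scrapes; later calls on the same instance reuse the result.
        if self.cached_files is None:
            self.cached_files = self._scrape()
        return self.cached_files
```

Since a fresh `Downloader` is created per run, the cache never outlives one invocation, which is the reset behavior the commit message relies on.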
- Fix DLQ poisoning: build separate graphinatorMap/tableInatorMap,
  skipping queues ending in .dlq, so dead-letter queues no longer
  overwrite real data with zeros on every update cycle
- Bar chart now shows two bars per type: graphinator (purple) and
  tableinator (blue) message counts, each with its own scale
- Split single "Processing Rates" panel into two side-by-side panels:
  "Publish Rates" and "Ack Rates", each with a graphinator row (purple)
  and tableinator row (blue) — 16 gauges total
- Each gauge is self-normalizing: fill reflects current rate relative
  to that gauge's own observed max, with a live min–max label appended
  below each gauge label
- Update bar chart legend from "Messages/Ready" to "Graphinator/Tableinator"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…hboard

The dashboard redesign split the single "Processing Rates" section into
separate "Publish Rates (msg/s)" and "Ack Rates (msg/s)" panels, and
removed the #processing-rates-grid element. Update the E2E test to check
for the actual h2 headings and a known rate circle ID instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… badge

- Fix bar chart bars not rendering: bar group containers lacked an
  established height so percentage heights resolved to zero; add h-full
  to each group flex-col and flex-1 to each bar-pair container so CSS
  percentage heights compute correctly against the h-64 chart area
- Fix DLQ toggle doing nothing: wire change event listener to new
  _onDlqToggle() handler; build separate graphinatorDlqMap /
  tableInatorDlqMap alongside regular maps; store currentMaps so the
  toggle can re-render instantly without waiting for the next WebSocket
  update; update bar chart legend to show "Graphinator DLQ /
  Tableinator DLQ" when active
- Fix Neo4j status badge showing "Primary" instead of health status:
  change badge text to "Healthy" / "Unavailable" to match PostgreSQL

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nd GB/MB units

Display DB size in MB (with thousands comma) for values under 1 GB,
and in GB for values >= 1 GB. Switches from pg_size_pretty() to raw
pg_database_size() bytes and formats in Python.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
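
The MB/GB rule from the commit above is small enough to sketch in full. The function name, the binary unit boundaries (1 GB taken as 1024^3 bytes), and the number of decimals are assumptions; only the "MB with thousands comma below 1 GB, GB otherwise" behavior comes from the commit message.

```python
def format_db_size(size_bytes: int) -> str:
    """Format a raw pg_database_size() byte count for display (illustrative)."""
    gib = 1024**3
    mib = 1024**2
    if size_bytes >= gib:
        return f"{size_bytes / gib:.2f} GB"
    # The ',' format option inserts the thousands separator, e.g. 1,000 MB.
    return f"{size_bytes / mib:,.0f} MB"
```

For example, `format_db_size(1000 * 1024**2)` yields `"1,000 MB"`, while `format_db_size(1024**3)` yields `"1.00 GB"`.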
Replace the placeholder rotated-square mark with the new Discogsography
logo: a dark-bordered circle with terracotta, steel-blue, teal, and olive
quadrants, a radial depth shadow, and a white rotated-diamond center.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@SimplicityGuy SimplicityGuy changed the title from "Issue 60" to "feat: Discogs user integration — authenticate, sync collection & wantlist" Feb 22, 2026
SimplicityGuy and others added 12 commits February 22, 2026 15:41
- validate-compose: add auth and collector to expected services list
- test-schema-init: update neo4j statement count (14→15) and postgres
  expected_calls formula to include _USER_TABLES entries
- test-explore: fix AsyncMock setup for driver.session in _make_driver()
  and rename unused `self` to `_self` in _aiter (ARG001)
- code-quality ruff: add `from exc` chaining to bare raises in except
  blocks (B904), add noqa S105/S106 suppressions for password-like
  string literals in tests and constants
- code-quality bandit: add nosec B104 to uvicorn.run host="0.0.0.0"
  lines, nosec B105 to bearer token_type, nosec B107 to token_secret
  default parameter
- mypy: add AsyncGenerator[None] return type to lifespan functions,
  remove stale type: ignore comments, add driver property to
  AsyncResilientNeo4jDriver returning self._connection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… explore pattern

Instead of unwrapping _neo4j.driver to get the raw AsyncDriver and calling
execute_query(), pass the AsyncResilientNeo4jDriver directly and use its
session() method — the same pattern used by explore and dashboard.

This removes the need for the .driver property that was added to
AsyncResilientNeo4jDriver and keeps Neo4j access consistently routed
through the resilient wrapper with its circuit breaker and retry logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
auth → api: the service will grow to host all future API endpoints,
not just authentication.

collector → curator: avoids confusion with the extractor service
(which processes bulk Discogs XML dumps); the curator service manages
a user's personal Discogs collection and wantlist sync.

Renames:
- auth/ → api/, auth/auth.py → api/api.py
- collector/ → curator/, collector/collector.py → curator/curator.py
- tests/auth/ → tests/api/, test_auth_models.py → test_api_models.py
- tests/collector/ → tests/curator/, test_collector_syncer.py → test_curator_syncer.py

Updates all references: imports, config classes (AuthConfig→ApiConfig,
CollectorConfig→CuratorConfig), port constants, docker-compose service
names/images/volumes, pyproject.toml extras and workspace members,
GitHub Actions expected services list, log file paths, and OCI labels.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add api and curator to list-sub-projects matrix for Docker builds
- Add test-api and test-curator jobs to test.yml with Codecov upload
- Update aggregate-results to include both new test jobs
- Add uv pip install for api and curator in code-quality.yml
- Add test-api and test-curator recipes to justfile
- Add api and curator to justfile test-parallel and docker build targets
- Add api and curator to docker-compose.prod.yml with JWT_SECRET_KEY
  and postgres credential overrides, ordered to match base file
- Fix property ordering in docker-compose.yml: networks before healthcheck
  for all infrastructure services, volumes before depends_on for api/curator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Update README.md: add api/curator to Core Services table and
  Mermaid architecture diagram with new nodes and connections
- Update docs/architecture.md: add api/curator to service components
  table, both Mermaid diagrams, component detail sections, health
  checks, Redis cache description, and scalability section
- Update docs/configuration.md: update "Used By" for Neo4j, PostgreSQL,
  and Redis; add JWT Configuration section; add api and curator
  service-specific config blocks; update health checks and env templates
- Update docs/monitoring.md: add api/curator to Services Monitored,
  health check curl commands, and automated monitoring script
- Update docs/quick-start.md: add api/curator to service access table,
  health checks, and run commands
- Update docs/emoji-guide.md: add 🔐 API and 🗂️ Curator identifiers
- Update docs/task-automation.md: add test-api and test-curator to
  Test Group table
- Update CLAUDE.md: add api (8004/8005) and curator (8010/8011) ports
- Create api/README.md: full service documentation with endpoints,
  JWT auth, Discogs OAuth flow, configuration, and DB schema
- Create curator/README.md: full service documentation with endpoints,
  sync flow, JWT validation, configuration, and DB schema

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `just api` recipe to run the API service (user accounts & JWT auth)
- Add `just curator` recipe to run the curator service (collection sync)
- Update docs/task-automation.md Services Group table to accurately
  reflect current justfile: add api (8004) and curator (8010), add
  explore (8006), remove extractor and schema-init which have no
  local run commands in the services group

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `just schema-init` to run the one-shot schema initialiser
- Add `just extractor` as a services-group alias for extractor-run
- Update docs/task-automation.md Services Group table to include both

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace all UK English spellings with US equivalents:
- initialiser → initializer (5 instances)
- initialised → initialized (2 instances)
- initialisation → initialization (1 instance)
- behaviour → behavior (1 instance)
- catalogue → catalog (1 instance)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses Codecov feedback from PR #66 by adding comprehensive unit
tests targeting 85%+ coverage across the 7 flagged files.

Coverage improvements:
- api/api.py: 0% → 85% (auth, register, login, OAuth endpoints)
- api/services/discogs.py: 0% → 100% (OAuth 1.0a, HMAC-SHA1)
- curator/curator.py: 0% → 78% (sync trigger, status, JWT verify)
- curator/syncer.py: 17.83% → 91% (collection/wantlist sync, full sync)
- common/config.py: 32.14% → 97% (ApiConfig, CuratorConfig)
- explore/explore.py: 49.39% → 85% (user endpoints, JWT helpers)
- explore/user_queries.py: 45.94% → 100% (collection, wantlist, recommendations)

New test files:
- tests/api/conftest.py — fixtures (JWT, mock pool, mock Redis, TestClient)
- tests/api/test_api.py — 44 tests for API service endpoints
- tests/api/test_discogs_service.py — 30 tests for Discogs OAuth service
- tests/curator/conftest.py — fixtures (JWT, mock pool, mock Neo4j, TestClient)
- tests/curator/test_curator.py — 48 tests for curator service endpoints

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er tests

Add tests covering previously uncovered branches in explore and curator services:

- explore/explore.py: JWT helpers (_b64url_decode padding, _verify_jwt all
  branches), _require_user dependency (config None→503, no auth→401, bad
  token→401), all five user endpoints (collection, wantlist, recommendations,
  stats, status) with service-not-ready and success paths
- curator/syncer.py: 429 rate-limit retry with sleep for both collection and
  wantlist, skip items missing release_id, multi-page pagination for both

Total coverage: 84% → 94%

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
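
The "429 rate-limit retry with sleep" path that these tests cover could be structured along the lines below. This is a hedged sketch: `get_with_rate_limit_retry`, the injected `send` callable, the attempt limit, and the use of a `Retry-After` header are all assumptions, since the PR does not describe how the syncer chooses its delay.

```python
import time
from typing import Any, Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], Any]  # (status, headers, body)


def get_with_rate_limit_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call send() until it stops returning 429 or attempts run out."""
    status, body = 429, None
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            break
        if attempt < max_attempts:
            # Assumes a numeric Retry-After value when the header is present.
            sleep(float(headers.get("Retry-After", default_delay)))
    return status, body
```

Injecting `sleep` is what lets a unit test assert on the delay without actually waiting.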
Repository owner deleted a comment from codecov Bot Feb 23, 2026
@codecov

codecov Bot commented Feb 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@codecov

codecov Bot commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 85.97285% with 124 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| common/config.py | 32.14% | 57 Missing ⚠️ |
| api/api.py | 84.64% | 37 Missing ⚠️ |
| curator/curator.py | 77.86% | 29 Missing ⚠️ |
| api/setup.py | 98.43% | 1 Missing ⚠️ |

📢 Thoughts on this report? Let us know!

@SimplicityGuy
Owner Author

SimplicityGuy commented Feb 23, 2026

Code Review: Admin Config, Discogs Connectivity & Explore Auth Scope

Admin Config for Discogs Credentials

Storage: The Discogs app credentials (discogs_consumer_key, discogs_consumer_secret) are stored in a PostgreSQL app_config key/value table — not in environment variables. They are read from the DB at runtime on every OAuth initiation via _get_app_config() (api/api.py:371).

Access: CLI tool (resolved), replacing the former HTTP endpoint:

PUT /api/admin/config/{key} (removed)

The HTTP endpoint has been removed. Credentials are now set via a CLI tool that runs directly inside the API container:

# Set credentials
docker exec <container> discogs-setup --consumer-key KEY --consumer-secret SECRET

# View current values (masked)
docker exec <container> discogs-setup --show

# Or via justfile
just configure-discogs MY_KEY MY_SECRET

The discogs-setup entry point is defined in api/pyproject.toml and implemented in api/setup.py. It reads the same POSTGRES_* environment variables already present in the container and upserts both keys in a single transaction. No HTTP surface, no JWT — shell access to the container is already an admin-privileged operation.
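
As a rough picture of the CLI surface described above, an argparse-based parser for the three flags might look like this. Only the flag names and the existence of a masked `--show` mode come from this PR. The `mask` helper, its output format, and the validation rule are assumptions, and the database upsert is omitted.

```python
import argparse
from typing import List, Optional


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the first few characters (hypothetical masking format)."""
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="discogs-setup")
    parser.add_argument("--consumer-key", help="Discogs app consumer key")
    parser.add_argument("--consumer-secret", help="Discogs app consumer secret")
    parser.add_argument("--show", action="store_true", help="print masked values")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed rule: either show current values or set both credentials together.
    if not args.show and not (args.consumer_key and args.consumer_secret):
        parser.error("provide --consumer-key and --consumer-secret, or --show")
    return args
```

Requiring both credentials in one invocation matches the note above that the keys are upserted in a single transaction.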

Security Concern: No Admin Role Enforcement ✅ Resolved

The PUT /api/admin/config/{key} endpoint was protected only by _get_current_user, which accepted any valid JWT. There was no is_admin flag, no role check, and no separate admin token, so any registered user could overwrite the Discogs consumer key/secret.

The endpoint has been removed entirely. Setting app-level credentials no longer has an HTTP surface at all.


Discogs OAuth Integration in explore

The flow is split across two services:

  1. api service handles OAuth 1.0a OOB (api/services/discogs.py):

    • GET /api/oauth/authorize/discogs — starts the flow, returns the Discogs authorization URL + state token
    • POST /api/oauth/verify/discogs — exchanges the user-pasted verifier code for a permanent access token and stores it in oauth_tokens
    • OAuth state (request token secret) is kept in Redis with a 10-minute TTL for CSRF protection
  2. curator service reads those stored tokens and syncs the user's collection/wantlist into Neo4j, writing (User)-[:COLLECTED]->(Release) and (User)-[:WANTS]->(Release) relationships

  3. explore service queries those Neo4j relationships for the personalized endpoints
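
The short-lived OAuth state in step 1 can be illustrated without Redis. The in-memory class below is a stand-in that mimics the semantics described above (a random state token mapped to the request token secret, expiring after 10 minutes). The class name, the single-use `pop`, and the injectable clock are assumptions for illustration; the real service keeps this state in Redis.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class OAuthStateStore:
    """In-memory stand-in for the Redis-backed state store (illustrative)."""

    def __init__(
        self, ttl_seconds: float = 600.0, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def put(self, request_token_secret: str) -> str:
        """Store the secret under a fresh random state token and return it."""
        state = secrets.token_urlsafe(32)
        self._entries[state] = (self._clock() + self._ttl, request_token_secret)
        return state

    def pop(self, state: str) -> Optional[str]:
        """Return the secret once, or None if the state is unknown or expired."""
        entry = self._entries.pop(state, None)
        if entry is None:
            return None
        expires_at, secret = entry
        return secret if self._clock() <= expires_at else None
```

An unguessable, expiring state token means a verifier submitted for a flow the user did not start, or started too long ago, has nothing to match against.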


Does explore Require Login for All Functionality?

No — login is only required for the personalized /api/user/* endpoints. All graph exploration endpoints are fully public.

| Endpoint | Login Required |
|----------|----------------|
| GET /api/autocomplete | No |
| GET /api/explore | No |
| GET /api/expand | No |
| GET /api/node/{id} | No |
| GET /api/trends | No |
| POST /api/snapshot | No |
| GET /api/snapshot/{token} | No |
| GET /api/user/collection | Yes |
| GET /api/user/wantlist | Yes |
| GET /api/user/recommendations | Yes |
| GET /api/user/collection/stats | Yes |
| GET /api/user/status | Optional — gracefully returns all false when unauthenticated |

JWT_SECRET_KEY in ExploreConfig is optional (jwt_secret_key: str | None = None). If not set, personalized endpoints return 503 Service Unavailable rather than silently failing.
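
The three outcomes described above (503 when no secret is configured, 401 for a missing or bad token, and an all-false answer for anonymous status checks) can be expressed framework-free as below. `HTTPError`, `require_user`, `optional_user`, and `release_status` are hypothetical names for this sketch. The real service wires the equivalent logic in as request dependencies.

```python
from typing import Callable, Dict, Iterable, Optional


class HTTPError(Exception):
    """Stand-in for a framework HTTP exception (hypothetical)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code


Verifier = Callable[[str, str], Optional[dict]]


def require_user(
    secret: Optional[str], authorization: Optional[str], verify: Verifier
) -> dict:
    if secret is None:
        raise HTTPError(503, "personalized endpoints are not configured")
    if not authorization or not authorization.startswith("Bearer "):
        raise HTTPError(401, "missing bearer token")
    payload = verify(authorization[len("Bearer "):], secret)
    if payload is None:
        raise HTTPError(401, "invalid token")
    return payload


def optional_user(
    secret: Optional[str], authorization: Optional[str], verify: Verifier
) -> Optional[dict]:
    # Used for decoration only: any failure degrades to "anonymous".
    try:
        return require_user(secret, authorization, verify)
    except HTTPError:
        return None


def release_status(
    user: Optional[dict], release_ids: Iterable[str], lookup: Callable
) -> Dict[str, dict]:
    ids = list(release_ids)
    if user is None:
        return {rid: {"in_collection": False, "in_wantlist": False} for rid in ids}
    return lookup(user, ids)
```

Separating the strict and optional paths keeps the public graph endpoints usable when JWT_SECRET_KEY is unset, while protected routes fail loudly.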


Summary of actionable item: ✅ The admin config HTTP endpoint has been removed and replaced with a CLI tool (discogs-setup) that requires container shell access. The security concern is resolved.

SimplicityGuy and others added 2 commits February 23, 2026 13:36
…ntials

Remove PUT /api/admin/config/{key}, which was protected only by a valid
JWT and thus accessible to any registered user. Replace it with a
standalone discogs-setup CLI script (api/setup.py) that writes directly
to the app_config table and must be run via docker exec on the API
container — an already admin-privileged operation.

- api/setup.py: new CLI entry point with --consumer-key/--consumer-secret
  and --show (masked) modes; reads POSTGRES_* env vars, uses psycopg sync
- api/api.py: remove set_app_config endpoint; update OAuth error message
- api/pyproject.toml: add discogs-setup script entry point
- justfile: add configure-discogs task in [group('setup')]
- tests/api/test_api.py: remove TestAdminConfigEndpoint (4 tests)
- tests/api/test_setup.py: add 18 tests covering the new CLI tool

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Contributor

E2E Coverage (webkit)

Totals Coverage
Statements: 47.02% ( 1106 / 2352 )
Lines: 47.02% ( 1106 / 2352 )


@github-actions
Contributor

E2E Coverage (chromium)

Totals Coverage
Statements: 47.02% ( 1106 / 2352 )
Lines: 47.02% ( 1106 / 2352 )


@github-actions
Contributor

E2E Coverage (webkit - iPhone 15)

Totals Coverage
Statements: 47.02% ( 1106 / 2352 )
Lines: 47.02% ( 1106 / 2352 )


@github-actions
Contributor

E2E Coverage (webkit - iPad Pro 11)

Totals Coverage
Statements: 47.02% ( 1106 / 2352 )
Lines: 47.02% ( 1106 / 2352 )


@github-actions
Contributor

E2E Coverage (firefox)

Totals Coverage
Statements: 47.02% ( 1106 / 2352 )
Lines: 47.02% ( 1106 / 2352 )


@SimplicityGuy SimplicityGuy merged commit 8218b80 into main Feb 24, 2026
54 of 59 checks passed
@SimplicityGuy SimplicityGuy deleted the issue-60 branch February 24, 2026 03:48