pjmagee/starwars-data

Star Wars Data

A .NET Aspire application that scrapes Wookieepedia, processes Star Wars universe data into MongoDB, and serves it through an AI-powered Blazor frontend with interactive visualizations.

Features

AI Assistant (Ask Page)

An AI chat interface powered by OpenAI with tool-calling capabilities. The agent can query the database and render results as:

  • Charts - Bar, Line, Pie, Donut, Stacked Bar, Radar, and Time Series
  • Tables - Paginated browsing by infobox type, or ad-hoc data tables with inline data
  • Relationship Graphs - Family trees and master/apprentice hierarchies
  • Timelines - Temporal events scoped by era, date range, or entity lifetime
  • Infobox Cards - Wiki-style info cards for specific entities
  • Text Summaries - Article excerpts and lore via wiki search RAG

All visualizations include source references linking back to Wookieepedia.

Galactic Map

An interactive 26x20 grid map of the Star Wars galaxy, featuring:

  • Color-coded regions with filtering
  • Drill-down into sectors, systems, planets, and nebulas
  • Detail panes for selected entities
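
As a rough illustration of how a 26x20 grid can be addressed, the sketch below converts a letter-number grid reference into zero-based column and row indices. It assumes columns are labeled A to Z and rows 1 to 20, which fits the 26x20 dimensions but is not confirmed by this README. `parse_grid_ref` is a hypothetical helper, not part of the project.

```python
import re

COLUMNS = 26  # assumed: columns labeled A..Z
ROWS = 20     # assumed: rows labeled 1..20

# One letter, an optional hyphen, then a one- or two-digit row number.
_GRID_REF = re.compile(r"^([A-Za-z])-?(\d{1,2})$")

def parse_grid_ref(ref: str) -> tuple[int, int]:
    """Convert a reference such as "L-9" into zero-based (column, row)."""
    match = _GRID_REF.match(ref.strip())
    if match is None:
        raise ValueError(f"not a grid reference: {ref!r}")
    column = ord(match.group(1).upper()) - ord("A")
    row = int(match.group(2)) - 1
    if not 0 <= row < ROWS:
        raise ValueError(f"row out of range in {ref!r}")
    return column, row
```

Under these assumptions, `parse_grid_ref("L-9")` returns `(11, 8)`, and a reference such as `"Z-21"` raises `ValueError` because it falls outside the 20 rows.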

Timeline Browser

Browse events across the Star Wars timeline with category filtering, year range scoping, and pagination.

Data Tables

Browse all infobox categories (Characters, Planets, Species, Starships, etc.) as paginated tables with dynamic columns.

Wiki Search (RAG)

On-demand text search against the full wiki page corpus using MongoDB regex matching. The AI agent uses this to answer lore and history questions with cited sources.
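
To make the regex-matching idea concrete, here is a minimal sketch of how a user query could be turned into a MongoDB filter document. It is not the project's implementation: the field name `content` and the helper `build_search_filter` are assumptions, and only the `$regex`, `$options`, and `$and` operators are standard MongoDB. User input is escaped so that characters such as `.` or `(` are matched literally.

```python
import re

def build_search_filter(query: str, field: str = "content") -> dict:
    """Build a case-insensitive MongoDB $regex filter that requires every term."""
    terms = query.split()
    if not terms:
        raise ValueError("query must contain at least one term")
    # re.escape keeps user input from being interpreted as regex syntax.
    clauses = [{field: {"$regex": re.escape(t), "$options": "i"}} for t in terms]
    # A single term needs no $and wrapper.
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}
```

The resulting dictionary could be passed to a driver's `find` call. Unanchored regex filters generally cannot use an index efficiently, which is one reason to keep such a search on-demand rather than running it for every request.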

Architecture

StarWarsData.AppHost          # .NET Aspire orchestrator
StarWarsData.ApiService       # ASP.NET Core API + AI agent + Hangfire
StarWarsData.Frontend         # Blazor Interactive Server UI
StarWarsData.Services         # Business logic, toolkits, ETL
StarWarsData.Models           # Shared entities and DTOs
StarWarsData.ServiceDefaults  # OpenTelemetry, health checks, resilience

Data Flow

  1. ETL Pipeline - Hangfire jobs scrape Wookieepedia pages via the MediaWiki API, parse infoboxes, and store raw pages in MongoDB
  2. Processing - Additional jobs create template-based views, timeline events, indexes, and OpenAI embeddings
  3. Daily Sync - A recurring job at 03:00 UTC incrementally syncs changed wiki pages
  4. AI Queries - The AI agent uses tool-calling to query MongoDB (via direct tools + MCP server) and renders results through the frontend
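
The infobox-parsing step in the ETL pipeline can be pictured with the sketch below. It handles only the simplest case, a template whose fields each sit on their own `|key = value` line, and it is an assumption-laden illustration rather than the project's parser: real wikitext nests templates and links, and `parse_infobox` and the sample text are hypothetical.

```python
import re

# Matches lines of the form "|key = value"; the value may be empty.
_FIELD = re.compile(r"^\|\s*([^=|]+?)\s*=\s*(.*)$")

def parse_infobox(wikitext: str) -> dict:
    """Extract the template name and flat `|key = value` fields from an infobox."""
    lines = [line.strip() for line in wikitext.strip().splitlines()]
    if not lines or not lines[0].startswith("{{"):
        raise ValueError("expected wikitext starting with '{{'")
    result = {"template": lines[0][2:].strip(), "fields": {}}
    for line in lines[1:]:
        if line.startswith("}}"):
            break  # end of the template
        match = _FIELD.match(line)
        if match and match.group(2):
            # Fields with empty values are skipped.
            result["fields"][match.group(1)] = match.group(2)
    return result

# Illustrative sample, not copied from an actual wiki page.
sample = """{{Character
|name = Luke Skywalker
|homeworld = [[Tatooine]]
|species =
}}"""

parsed = parse_infobox(sample)
```

Here `parsed["template"]` is `"Character"` and `parsed["fields"]` holds `name` and `homeworld`; the empty `species` field is dropped. A document shaped like `parsed` is the kind of record that could then be stored alongside the raw page.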

AI Agent Pipeline

The agent is built with the Microsoft Agent Framework (Microsoft.Agents.AI) and streams responses over Server-Sent Events (SSE) using the AGUI protocol:

  • Topic Guardrail - Lightweight classifier rejects off-topic queries
  • Tool Registry - ComponentToolkit (7 render tools), DataExplorerToolkit (search/query tools), WikiSearchProvider (RAG), and MongoDB MCP tools (find, aggregate, count)
  • References - All render tools support source references with Wookieepedia URLs
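
The order of operations in the pipeline, guardrail first and tool dispatch second, can be sketched as follows. Everything here is a stand-in: the keyword check replaces the actual classifier, the tool names and handlers are hypothetical, and in the real agent the model chooses which tool to call rather than the caller.

```python
from typing import Callable

# Hypothetical tool registry: names and handlers are illustrative only.
TOOLS: dict[str, Callable[[dict], str]] = {
    "render_table": lambda args: f"table:{args['type']}",
    "wiki_search": lambda args: f"search:{args['query']}",
}

# Stand-in for the topic classifier: a keyword check, not the real guardrail.
ON_TOPIC_HINTS = ("star wars", "jedi", "sith", "wookiee", "tatooine")

def is_on_topic(message: str) -> bool:
    text = message.lower()
    return any(hint in text for hint in ON_TOPIC_HINTS)

def handle(message: str, tool_name: str, args: dict) -> str:
    """Reject off-topic input first, then dispatch to a registered tool."""
    if not is_on_topic(message):
        return "rejected: off-topic"
    tool = TOOLS.get(tool_name)
    if tool is None:
        return f"error: unknown tool {tool_name!r}"
    return tool(args)
```

Running the guardrail before any tool lookup means an off-topic message never reaches the database or the model's tool-calling loop, which is the main point of placing a lightweight check at the front.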

Authentication

Keycloak provides OpenID Connect authentication:

  • Optional social login providers: Google, Microsoft, and Facebook (each requires client ID/secret configuration)
  • Chat session history is stored per user in MongoDB

Tech Stack

Layer            Technology
Orchestration    .NET Aspire
Backend          ASP.NET Core, .NET 10
Frontend         Blazor Interactive Server, MudBlazor
Database         MongoDB
AI               OpenAI (GPT, Embeddings), Microsoft.Agents.AI, AGUI
MCP              MongoDB MCP Server (@mongodb-js/mongodb-mcp-server)
Background Jobs  Hangfire with MongoDB persistence
Auth             Keycloak (OpenID Connect)
Diagrams         Z.Blazor.Diagrams
Observability    OpenTelemetry (traces, metrics, logs)
Analytics        Google Analytics 4

MongoDB Databases

Database                  Purpose
starwars-raw-pages        Raw wiki pages with infoboxes (Pages collection)
starwars-timeline-events  Processed timeline events grouped by template type
starwars-hangfire-jobs    Hangfire job storage

Getting Started

Prerequisites

  • .NET 10 SDK
  • Node.js (for MongoDB MCP server via npx)
  • MongoDB instance
  • OpenAI API key

Configuration

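Set the following in appsettings.json, or supply the same keys as environment variables (with the default .NET configuration providers, nested keys use a double underscore, for example Settings__OpenAiKey):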

{
  "Settings": {
    "OpenAiKey": "<your-openai-api-key>",
    "OpenAiModel": "gpt-4o-mini",
    "StarWarsBaseUrl": "https://starwars.fandom.com/api.php",
    "PagesDb": "starwars-raw-pages",
    "TimelineEventsDb": "starwars-timeline-events",
    "HangfireDb": "starwars-hangfire-jobs"
  }
}
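
Before starting the app, it can help to check that no setting was left empty or as a placeholder. The sketch below does that for the JSON shown above. It assumes all six keys are required, which this README does not state explicitly, and `find_config_problems` is a hypothetical helper rather than part of the project.

```python
import json

# Assumed to be required; taken from the sample configuration above.
REQUIRED_KEYS = ("OpenAiKey", "OpenAiModel", "StarWarsBaseUrl",
                 "PagesDb", "TimelineEventsDb", "HangfireDb")

def find_config_problems(raw_json: str) -> list[str]:
    """Return the names of settings that are missing, empty, or placeholders."""
    settings = json.loads(raw_json).get("Settings", {})
    problems = []
    for key in REQUIRED_KEYS:
        value = settings.get(key, "")
        is_text = isinstance(value, str) and value.strip() != ""
        # Values like "<your-openai-api-key>" are treated as unfilled placeholders.
        is_placeholder = is_text and value.startswith("<") and value.endswith(">")
        if not is_text or is_placeholder:
            problems.append(key)
    return problems
```

Applied to the sample configuration exactly as printed, the function reports only `OpenAiKey`, since its value is still the `<your-openai-api-key>` placeholder.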

For Keycloak social login, set environment variables:

GOOGLE_CLIENT_ID / GOOGLE_CLIENT_SECRET
MICROSOFT_CLIENT_ID / MICROSOFT_CLIENT_SECRET
FACEBOOK_CLIENT_ID / FACEBOOK_CLIENT_SECRET

Running

cd src/StarWarsData.AppHost
dotnet run

The Aspire dashboard will show all resources. The Hangfire dashboard is available at /hangfire on the API service.

API Endpoints

Controller     Path                                 Purpose
Admin          /api/admin/*                         ETL job triggers, index management
Pages          /pages/{id}, /pages/batch            Page retrieval
Categories     /categories/{type}                   Paginated infobox browsing
Search         /search                              Full-text page search
Timeline       /timeline/events, /timeline/eras     Timeline data
GalaxyMap      /galaxymap/grid, /galaxymap/sectors  Galaxy map data
Relationships  /relationships/graph/{id}            Relationship graphs
ChatSessions   /api/ChatSessions                    User chat history
AI (AGUI)      /kernel/stream                       AI agent SSE streaming
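
Because the AI endpoint streams over SSE, a client has to split the response into events. The sketch below shows the generic SSE framing rules only: `data:` lines accumulate, a blank line ends an event, and lines starting with `:` are comments. The request shape for /kernel/stream and the schema of each event payload are not described in this README, so nothing here assumes them; `parse_sse` is a hypothetical helper.

```python
def parse_sse(stream_text: str) -> list[str]:
    """Collect the data payload of each event in an SSE text stream."""
    events, data_lines = [], []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                events.append("\n".join(data_lines))
                data_lines = []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, one leading space after the colon is dropped.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    return events
```

An event that has not yet been terminated by a blank line is not returned, which mirrors how SSE clients wait for the full event before dispatching it. A streaming client would apply the same rules incrementally to chunks as they arrive instead of to one complete string.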
