sultanxdev/sendry

🚀 Sendry — High-Performance Decoupled API Monitoring & Analytics Platform

Decoupled, zero-overhead API observability using RabbitMQ, MongoDB, and PostgreSQL.


🎯 Platform Overview

Sendry is a production-grade API monitoring engine built to capture, aggregate, and visualize high-throughput API metrics without introducing latency overhead or blocking downstream application loops.

Using an event-driven, decoupled ingest pipeline, Sendry receives API hits from a zero-dependency SDK middleware, publishes them to a durable queue in under 2 milliseconds, and persists them asynchronously.

🖼️ System Hero Layout

Sendry Hero


🏗️ System Architecture

Sendry is architected specifically to solve the "monitoring overhead" problem, splitting ingestion, queuing, persistence, and querying into decoupled, scalable layers:

  1. Ingest Endpoint: Express API server receives raw hits, validates client API keys, and quickly pushes them to the queue buffer.
  2. Buffer Queue (RabbitMQ): A message broker that absorbs traffic spikes and guarantees message delivery.
  3. Background Worker (Consumer): A standalone Node.js process that continuously drains the queue, handles retries, and coordinates the dual-write database persistence.
  4. Dual Databases:
    • MongoDB: Optimized for unstructured raw payloads (full request body/headers) with a 30-day auto-expiry (TTL) index.
    • PostgreSQL: Stores structured, aggregated hourly time-series metrics for fast dashboard queries.

Sendry Architecture


🔄 Ingest Sequence Flow

The following sequence details how an API hit is captured, queued, and stored asynchronously without delaying the client's HTTP response.

Ingestion Sequence

Mermaid Sequence Details:

  1. User Client issues an API request to the Monitored App.
  2. Sendry SDK Middleware captures request start time and intercepts the response 'finish' event.
  3. Monitored App immediately returns the response to the user.
  4. SDK Middleware fires an asynchronous, non-blocking POST log request to the Sendry Ingest API.
  5. Ingest API validates the API Key and publishes the payload to the RabbitMQ queue.
  6. Ingest API returns 202 Accepted to the Monitored App in <2ms.
  7. The Background Queue Worker pulls the hit event, parses it, and writes:
    • Raw logs to MongoDB
    • Time-series aggregated counts to PostgreSQL
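
Steps 5 and 6 can be sketched as a request handler whose collaborators are injected. This is a minimal sketch under stated assumptions, not Sendry's actual route code: `createIngestHandler`, `isValidKey`, and `publish` are hypothetical names, and the 401 response for a bad key is an assumption. The `api_hits` queue name and the 202/503 statuses come from this README.

```javascript
// Hypothetical sketch of the ingest step: validate the key, enqueue, reply 202.
// `isValidKey` and `publish` stand in for the real API-key lookup and the
// RabbitMQ producer; neither name is taken from the Sendry source.
function createIngestHandler({ isValidKey, publish }) {
  return async function ingestHandler(req, res) {
    const apiKey = req.headers['x-api-key'];
    if (!apiKey || !(await isValidKey(apiKey))) {
      // Assumed status code for a missing or unknown key.
      return res.status(401).json({ error: 'invalid API key' });
    }
    try {
      // Only the enqueue happens on the request path; database writes are
      // left to the background worker.
      await publish('api_hits', { ...req.body, receivedAt: Date.now() });
      return res.status(202).json({ queued: true });
    } catch (err) {
      // Broker unavailable: answer quickly instead of hanging.
      return res.status(503).json({ error: 'queue unavailable' });
    }
  };
}
```

Injecting `publish` keeps the handler independent of the broker client, which also makes it easy to exercise with a fake queue.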

📊 Database Entity Relationship Diagram

Entity Relationship Diagram

Storage Strategy:

  • MongoDB TTL index: The timestamp field carries a TTL index with expireAfterSeconds: 2592000 (30 days). MongoDB deletes expired raw log documents automatically, so storage does not grow without bound.
  • PostgreSQL upsert aggregation: The background worker issues INSERT INTO metrics ... ON CONFLICT (...) DO UPDATE queries. These fold raw hits into hourly buckets, so dashboard queries scan a small number of pre-aggregated rows.
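
The hourly upsert can be sketched as below. Only the `metrics` table name and the `INSERT ... ON CONFLICT ... DO UPDATE` shape come from this README; every column name, the conflict key, and the rule that 5xx responses count as errors are assumptions for illustration (the real schema lives in scripts/init-postgres.sql). The conflict target would need a matching unique constraint.

```javascript
// Truncate a timestamp to the start of its UTC hour.
function hourBucket(timestamp) {
  const d = new Date(timestamp);
  d.setUTCMinutes(0, 0, 0); // zero minutes, seconds, milliseconds
  return d.toISOString();
}

// Build a parameterized upsert for one hit.
// Column names and the conflict key are hypothetical.
function buildMetricsUpsert(hit) {
  const text = `
    INSERT INTO metrics
      (service_name, endpoint, method, bucket, hit_count, error_count, total_latency_ms)
    VALUES ($1, $2, $3, $4, 1, $5, $6)
    ON CONFLICT (service_name, endpoint, method, bucket) DO UPDATE SET
      hit_count = metrics.hit_count + 1,
      error_count = metrics.error_count + EXCLUDED.error_count,
      total_latency_ms = metrics.total_latency_ms + EXCLUDED.total_latency_ms`;
  const values = [
    hit.serviceName,
    hit.endpoint,
    hit.method,
    hourBucket(hit.timestamp),
    hit.statusCode >= 500 ? 1 : 0, // assumed definition of an "error" hit
    hit.latencyMs,
  ];
  return { text, values };
}
```

Storing a running total and a count, rather than an average, lets each upsert stay a simple addition; the dashboard can divide the two at read time.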

📊 Analytics Dashboard

The React dashboard renders live charts, status breakdowns, and sorted endpoint performance tables using ApexCharts.

Dashboard Preview


🛡️ Production-Grade Resiliency Engineering

A production application must be resilient to database outages, message broker crashes, and load spikes. Sendry implements the following design patterns:

1. Ingestion-Side Circuit Breaker

If RabbitMQ crashes, we must prevent the monitored application's request stack from hanging or dropping. Sendry's Ingest API features a 3-State Circuit Breaker:

  • CLOSED: Traffic flows normally to RabbitMQ.
  • OPEN: If more than 5 RabbitMQ errors occur within 10 seconds, the breaker trips. The Ingest API then skips the queue and immediately returns 503 Service Unavailable without blocking. Because the SDK ignores failed posts, the monitored application keeps serving traffic (fail-open).
  • HALF-OPEN: After a 30-second cooldown, the system sends 2 test messages. If successful, it closes the breaker; otherwise, it trips it back to open.
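
The three states above can be sketched as a small state machine. This is an illustrative sketch, not the class in shared/events: the name `CircuitBreaker`, its method names, and the injectable `now` clock are assumptions. The thresholds (more than 5 errors in 10 seconds, 30-second cooldown, 2 successful probes) follow the description above. Limiting how many probes may be in flight during HALF-OPEN is omitted for brevity.

```javascript
class CircuitBreaker {
  constructor({ failureThreshold = 5, windowMs = 10_000, cooldownMs = 30_000,
                probeCount = 2, now = Date.now } = {}) {
    Object.assign(this, { failureThreshold, windowMs, cooldownMs, probeCount, now });
    this.state = 'CLOSED';
    this.failures = [];      // timestamps of recent failures
    this.openedAt = 0;
    this.probeSuccesses = 0;
  }

  // Returns true if a publish attempt is allowed right now.
  canAttempt() {
    if (this.state === 'OPEN' && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = 'HALF_OPEN'; // cooldown elapsed: let test messages through
      this.probeSuccesses = 0;
    }
    return this.state !== 'OPEN';
  }

  recordSuccess() {
    if (this.state === 'HALF_OPEN' && ++this.probeSuccesses >= this.probeCount) {
      this.state = 'CLOSED';
      this.failures = [];
    }
  }

  recordFailure() {
    const t = this.now();
    if (this.state === 'HALF_OPEN') return this.trip(t); // failed probe reopens
    // Keep only failures inside the sliding window, then add this one.
    this.failures = this.failures.filter((ts) => t - ts < this.windowMs);
    this.failures.push(t);
    if (this.failures.length > this.failureThreshold) this.trip(t);
  }

  trip(t) {
    this.state = 'OPEN';
    this.openedAt = t;
    this.failures = [];
  }
}
```

A caller would check `canAttempt()` before publishing, reply 503 when it returns false, and report the outcome of each publish through `recordSuccess()` or `recordFailure()`.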

2. Consumer Exponential Backoff with Jitter

When database connections drop momentarily, the consumer background worker retries database saves. To prevent a "thundering herd" when services recover, the retry delay grows exponentially with the attempt number and is offset by a random jitter term:

$$\text{delay} = \text{baseDelay} \times 2^{\text{attempt}} \pm \text{jitter}$$
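
A minimal sketch of that formula follows. The function names, the 200 ms base delay, the 100 ms jitter bound, and the default of 3 attempts are illustrative assumptions; the README does not state the values Sendry uses. `random` and `sleep` are injectable so the behaviour can be checked deterministically.

```javascript
// delay = baseDelay * 2^attempt, shifted by a random amount in [-maxJitter, +maxJitter).
function backoffDelay(attempt, { baseDelayMs = 200, maxJitterMs = 100, random = Math.random } = {}) {
  const exponential = baseDelayMs * 2 ** attempt;
  const jitter = (random() * 2 - 1) * maxJitterMs;
  return Math.max(0, Math.round(exponential + jitter));
}

// Retry an async operation, sleeping for a jittered backoff between attempts.
async function withRetry(operation, {
  maxAttempts = 3,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  ...backoffOptions
} = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // No sleep after the final attempt; the error is rethrown instead.
      if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt, backoffOptions));
    }
  }
  throw lastError;
}
```

With the assumed defaults, attempts 0, 1, and 2 wait roughly 200 ms, 400 ms, and 800 ms, each shifted by up to 100 ms so that workers recovering together do not retry in lockstep.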

3. Dead-Letter Queue (DLQ) Safeguard

If a message fails schema validation or exceeds 3 failed write attempts, it is acknowledged (removed from api_hits queue) and routed to a Dead-Letter Queue (api_hits.dlq) with error headers detailing the cause. This prevents toxic payloads from blocking active queue channels.
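
The routing decision can be sketched as a pure function. The queue name `api_hits.dlq` and the limit of 3 come from the paragraph above; the function names, the validation rules, the header names (`x-error-reason`, `x-error-message`), and reading "exceeds 3" as strictly more than 3 are assumptions.

```javascript
const MAX_FAILED_WRITES = 3;
const DLQ_NAME = 'api_hits.dlq';

// Assumed minimal schema, based on the fields the SDK payload carries.
function isValidHit(hit) {
  return hit !== null && typeof hit === 'object'
    && typeof hit.serviceName === 'string'
    && typeof hit.endpoint === 'string'
    && typeof hit.method === 'string'
    && Number.isInteger(hit.statusCode)
    && Number.isFinite(hit.latencyMs);
}

// Decide what the consumer should do with a message.
function routeMessage(hit, failedWrites = 0, lastError = null) {
  if (!isValidHit(hit)) {
    return {
      action: 'dead-letter',
      queue: DLQ_NAME,
      headers: { 'x-error-reason': 'schema-validation-failed' },
    };
  }
  if (failedWrites > MAX_FAILED_WRITES) {
    return {
      action: 'dead-letter',
      queue: DLQ_NAME,
      headers: {
        'x-error-reason': 'max-retries-exceeded',
        'x-error-message': lastError ? lastError.message : 'unknown',
      },
    };
  }
  return { action: 'process' };
}
```

In both dead-letter cases the original message would still be acknowledged on `api_hits`, so a single bad payload cannot be redelivered indefinitely.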


📁 Project Directory Structure

sendry/
├── server/                         # Backend API Server & Background Consumer
│   ├── src/
│   │   ├── server.js               # Express API entry point
│   │   ├── services/
│   │   │   ├── auth/               # JWT authentication & RBAC roles
│   │   │   ├── ingest/             # Ingestion routes & RabbitMQ Event Producer
│   │   │   ├── processor/          # RabbitMQ consumer (standalone background worker)
│   │   │   ├── analytics/          # PostgreSQL analytics metrics queries
│   │   │   └── client/             # API Key creation & client configuration
│   │   └── shared/
│   │       ├── config/             # DB & RabbitMQ connection managers
│   │       ├── models/             # MongoDB Mongoose schemas
│   │       ├── events/             # Circuit Breakers & Retry managers
│   │       └── middlewares/        # JWT & API Key validation filters
│   ├── scripts/
│   │   └── init-postgres.sql       # PostgreSQL table schemas & metrics indexes
│   ├── Dockerfile                  # API production image
│   ├── Dockerfile.consumer         # Consumer background worker image
│   └── docker-compose.yml          # Local infrastructure orchestration
│
├── dashboard/                      # Vite React SPA Dashboard
│   ├── src/
│   │   ├── App.jsx                 # Client routes & Auth gates
│   │   ├── api/                    # Axios REST requests
│   │   ├── components/             # Reusable UI widgets & ApexCharts
│   │   ├── pages/                  # Overview and settings pages
│   │   └── styles/                 # Custom Tailwind CSS global styling
│   └── vercel.json                 # SPA path routing configuration for Vercel
│
└── demo/demo/                      # Sample Monitored Express API
    ├── server.js                   # Mock API server
    └── monitoring.js               # Zero-dependency SDK middleware

⚡ Quick Start

1. Spin up Local Infrastructure (Docker)

Ensure Docker is installed, then spin up database and broker containers:

cd server
docker-compose up -d postgres mongo rabbitmq

2. Configure Environment variables

Create a .env file inside the server/ directory:

cp .env.example .env

Ensure database URI configurations and RabbitMQ connection paths match your credentials.

3. Start Backend Services

Launch the Ingestion API server and the background queue consumer process:

# Terminal 1: Ingestion API
cd server
npm install
npm run dev

# Terminal 2: Queue Consumer Worker
cd server
node src/services/processor/consumer.js

4. Launch the Dashboard

cd dashboard
npm install
npm run dev

5. Onboard Admin

Issue a POST request to register the initial administrator:

POST http://localhost:5000/api/auth/onboard
Content-Type: application/json

{
  "username": "admin",
  "email": "admin@sendry.io",
  "password": "SecurePassword123!"
}

Log in at http://localhost:5173/login, add a client profile, and generate a client API key.


🔌 Developer SDK Integration

To monitor a Node.js/Express application, add the following non-blocking middleware. This version uses axios for the HTTP POST, so install it first (npm install axios):

// monitoring.js - Save in your Express project
import axios from 'axios';

export function monitoringMiddleware(options = {}) {
  const apiKey = options.apiKey || process.env.MONITORING_API_KEY;
  const endpoint = options.endpoint || 'http://localhost:5000/api/hit';
  const serviceName = options.serviceName || 'my-service';

  return function (req, res, next) {
    const start = process.hrtime();

    res.on('finish', () => {
      const diff = process.hrtime(start);
      const latencyMs = (diff[0] * 1e3 + diff[1] * 1e-6);

      const payload = {
        serviceName,
        endpoint: req.route ? req.route.path : req.path,
        method: req.method,
        statusCode: res.statusCode,
        latencyMs: parseFloat(latencyMs.toFixed(2))
      };

      if (!apiKey) return;

      // Fire-and-forget async query: never blocks application flow
      axios.post(endpoint, payload, {
        headers: { 'x-api-key': apiKey },
        timeout: 3000
      }).catch(() => {});
    });

    next();
  };
}

// server.js - Startup entry point
import express from 'express';
import { monitoringMiddleware } from './monitoring.js';

const app = express();

app.use(monitoringMiddleware({
  serviceName: 'order-service',
  apiKey: process.env.MONITORING_API_KEY
}));

app.get('/orders', (req, res) => {
  res.json({ status: 'active' });
});

app.listen(3000);
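
For projects that prefer not to add axios, the same fire-and-forget behaviour can be sketched with the built-in `fetch`. This is a hedged variant, not the SDK shipped in the repository: it assumes Node 18+ (global `fetch` and `AbortSignal.timeout`), and the name `fetchMonitoringMiddleware` and the injectable `fetchImpl` option are hypothetical.

```javascript
// Dependency-free variant of the middleware above (assumes Node 18+).
function fetchMonitoringMiddleware(options = {}) {
  const apiKey = options.apiKey || process.env.MONITORING_API_KEY;
  const endpoint = options.endpoint || 'http://localhost:5000/api/hit';
  const serviceName = options.serviceName || 'my-service';
  const fetchImpl = options.fetchImpl || globalThis.fetch; // hypothetical test hook

  return function (req, res, next) {
    const start = process.hrtime.bigint();

    res.on('finish', () => {
      if (!apiKey) return;
      // Nanoseconds to milliseconds, rounded to two decimals.
      const latencyMs = Math.round(Number(process.hrtime.bigint() - start) / 1e4) / 100;

      const payload = {
        serviceName,
        endpoint: req.route ? req.route.path : req.path,
        method: req.method,
        statusCode: res.statusCode,
        latencyMs,
      };

      // Fire-and-forget: errors and timeouts are swallowed on purpose.
      fetchImpl(endpoint, {
        method: 'POST',
        headers: { 'content-type': 'application/json', 'x-api-key': apiKey },
        body: JSON.stringify(payload),
        signal: AbortSignal.timeout(3000),
      }).catch(() => {});
    });

    next();
  };
}
```

It is registered the same way, for example `app.use(fetchMonitoringMiddleware({ serviceName: 'order-service' }))`.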

📡 API Reference

Auth Endpoint Group

| Method | Route | Description |
| --- | --- | --- |
| POST | /api/auth/onboard | One-time super admin profile registration |
| POST | /api/auth/register | Create developer account |
| POST | /api/auth/login | Log in and receive JWT HTTP-only cookie |
| POST | /api/auth/logout | Revoke session and clear cookies |

🛠️ Challenges & Solutions

1. Cross-Domain Session Cookie Restrictions

  • Challenge: Browsers only send cookies on cross-site requests when they are set with SameSite=None, and they reject SameSite=None cookies that are not also marked Secure and served over HTTPS. This created issues when testing the Vite frontend (http://localhost:5173) against the Express API (http://localhost:5000) and when deploying on Vercel/Render.
  • Solution: Implemented an automated cookie negotiation fallback inside the authentication middleware. The system inspects the environment and dynamically sets sameSite: "lax" and secure: false during local development, while switching to sameSite: "none" and secure: true on production HTTPS domains.
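
The environment switch can be sketched as a single helper. This is an assumption-laden illustration: the name `sessionCookieOptions` is hypothetical, and keying the decision on `NODE_ENV` is an assumption, since the README does not say how the environment is detected. The `sameSite` and `secure` values are the ones listed above, and `httpOnly` reflects the HTTP-only JWT cookie mentioned in the API reference.

```javascript
// Choose cookie attributes for the session cookie based on the environment.
function sessionCookieOptions(nodeEnv = process.env.NODE_ENV) {
  const isProduction = nodeEnv === 'production'; // assumed detection rule
  return {
    httpOnly: true,                          // keep the JWT out of reach of page scripts
    secure: isProduction,                    // Secure requires HTTPS
    sameSite: isProduction ? 'none' : 'lax', // 'none' is only valid together with secure
  };
}
```

The returned object has the shape Express accepts as the third argument to `res.cookie(name, value, options)`.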

2. Preventing Cascade Failures in Monitored Apps

  • Challenge: If the monitoring system slows down or goes offline, monitored client applications should not experience delays or request pileups in their Express routing loops.
  • Solution: Built the SDK middleware around an asynchronous "fire-and-forget" model using the response object's 'finish' event. The Express response is sent to the client first; the metric POST then runs out-of-band with a 3000 ms request timeout and swallowed errors, so a slow or offline monitoring backend never delays request handling.

3. Queue Consumer Idempotency & Duplicate Hits

  • Challenge: Under high-load network retries, the same API hit could be delivered twice to RabbitMQ, causing duplicate analytical entries in PostgreSQL.
  • Solution: Built an in-memory cache inside the consumer background worker using a capped Set. Before writing to the databases, the consumer looks up the hit's unique hash in the set and skips hits it has already seen. The set is capped at 100,000 entries to bound memory use.
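
A capped set of this kind can be sketched in a few lines. The class name `CappedSet` and the oldest-first eviction policy are assumptions; the README states only that the set is capped at 100,000 entries. How the hit's unique hash is derived is not shown in the README, so the key is taken as given.

```javascript
class CappedSet {
  constructor(capacity = 100_000) {
    this.capacity = capacity;
    this.seen = new Set();
  }

  // Returns true the first time a key is seen (and records it),
  // false if the key is already present, i.e. the hit is a duplicate.
  add(key) {
    if (this.seen.has(key)) return false;
    if (this.seen.size >= this.capacity) {
      // A Set iterates in insertion order, so the first value is the oldest.
      const oldest = this.seen.values().next().value;
      this.seen.delete(oldest);
    }
    this.seen.add(key);
    return true;
  }

  get size() {
    return this.seen.size;
  }
}
```

The consumer would write to the databases only when `add(hash)` returns true. A key evicted from the set can be accepted again later, which is one reason the deduplication is best-effort.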

⚖️ System Design Trade-offs

| Decision | Pros | Cons |
| --- | --- | --- |
| Dual-DB Split | Optimal division of labor: MongoDB handles raw payload writes, while PostgreSQL performs structured, time-bucketed metric reads. | Higher operational overhead and hosting cost; requires maintaining two separate database connections and connection pool sizes. |
| RabbitMQ Event Buffer | Extremely low latency overhead (under 2ms); native support for dead-letter exchanges (DLX) and easy channel routing. | Cannot match the high partition throughput and log-compaction capabilities of Apache Kafka for multi-consumer streams. |
| In-Memory Idempotency Set | Low-latency duplicate checks with zero database read overhead. | Set is ephemeral; if the consumer worker crashes and restarts, state is lost, making eventual duplicate records possible. |

💡 Key Learnings

  1. Decoupled Architecture Scalability: Keeping database writes out of the active HTTP request loop is a fundamental pattern for building highly scalable systems. Pushing payloads to a message broker like RabbitMQ ensures the user's response time is unaffected by database write latencies.
  2. Thundering Herd Protection: In cloud environments, brief database dropouts are common. Implementing an exponential backoff retry strategy with a randomized jitter factor is critical to prevent recovered databases from being flooded with a storm of queued retries.
  3. Graceful Fail-Open Systems: Monitoring should never cause service downtime. Implementing a Circuit Breaker that fails open guarantees that if the queue becomes completely unreachable, the monitored application continues serving users normally.

🔮 Future Scope

  • Real-Time Alerting Engine: Integrate Slack, Discord, and PagerDuty webhooks to notify team members automatically when an endpoint's error rate spikes or average latency exceeds a specified threshold.
  • Distributed Tracing (OpenTelemetry): Add support for trace propagation headers (e.g., traceparent), allowing developers to map API hits across microservice boundaries.
  • Auto-Generating API Documentation: Analyze incoming request/response schemas to dynamically construct OpenAPI/Swagger documentation directly from actual API traffic.
  • Multi-Language SDKs: Develop drop-in, zero-dependency middleware packages for other major backend environments, including Python (FastAPI/Django), Go (Gin), and Rust (Actix-web).
