diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS new file mode 100644 index 0000000..89310fa --- /dev/null +++ b/.github/CODEOWNERS @@ -0,0 +1,2 @@ +# Default owner for everything not explicitly listed +* @netwrix/dspm-engineering diff --git a/.gitignore b/.gitignore index ac9e0f1..1a0ff0f 100644 --- a/.gitignore +++ b/.gitignore @@ -1,6 +1,15 @@ +# OS .DS_Store -.idea -.vscode +Thumbs.db + +# IDE +.idea/ +.vscode/ +*.suo +*.user +.vs/ + +# Environment files .env .env.local .env.development.local @@ -8,4 +17,32 @@ .env.production.local .env.development .env.test -.env.production \ No newline at end of file +.env.production + +# Python +__pycache__/ +*.py[cod] +*.pyo +.venv/ +.python-version.local +*.egg-info/ +dist/ +build/ +.pytest_cache/ +.ruff_cache/ +.mypy_cache/ + +# uv +# uv.lock is intentionally committed — do not add it here + +# C# / .NET +bin/ +obj/ +*.user +*.suo +*.userosscache +*.sln.docstates +[Pp]ublish/ +_ReSharper*/ +TestResults/ +*.nupkg diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md new file mode 100644 index 0000000..8e0b694 --- /dev/null +++ b/ARCHITECTURE.md @@ -0,0 +1,149 @@ +# Architecture Overview + +`dspm-connector-templates` provides the runtime scaffolding for all DSPM connector functions. Each template bundles an HTTP server, job-mode runner, OpenTelemetry instrumentation, Redis stop/pause/resume signal handling, and (for connector templates) batched data ingestion. Connector authors implement one file — `handler.py` or `Handler.cs` — and the template handles everything else. + +## 1. 
Project Structure + +``` +/ +├── template/ +│ ├── netwrix-python/ # Python connector template +│ │ ├── index.py # HTTP server + job-mode entrypoint +│ │ ├── redis_signal_handler.py # Redis Streams client for control signals +│ │ ├── state_manager.py # Stop/pause/resume state machine +│ │ ├── pyproject.toml # Python dependencies (managed by uv) +│ │ ├── Dockerfile # Multi-stage container image +│ │ └── function/ +│ │ └── handler.py # Connector author implements this +│ ├── netwrix-internal-python/ # Python template for internal functions +│ │ ├── index.py # HTTP server + job-mode entrypoint +│ │ ├── pyproject.toml +│ │ ├── Dockerfile +│ │ └── function/ +│ │ └── handler.py +│ ├── netwrix-csharp/ # C# connector template +│ │ ├── ConnectorFramework/ # Core runtime library (do not modify) +│ │ │ ├── Program.cs # HTTP + job-mode bootstrap, handler discovery +│ │ │ ├── FunctionContext.cs # Per-request DI context (secrets, tables, spans) +│ │ │ ├── IConnectorHandler.cs # Interface connector handlers must implement +│ │ │ ├── BatchManager.cs # Buffered async data ingestion +│ │ │ ├── StateManager.cs # Stop/pause/resume state machine +│ │ │ ├── RedisSignalHandler.cs +│ │ │ └── ConnectorRequestData.cs +│ │ ├── ConnectorFramework.Tests/ # Unit tests for the framework +│ │ ├── function/ +│ │ │ └── Handler.cs # Connector author implements IConnectorHandler here +│ │ └── Dockerfile +│ └── netwrix-internal-csharp/ # C# template for internal functions +│ ├── Program.cs # HTTP server bootstrap +│ ├── FunctionContext.cs # Simplified context (logging, secrets, OTel) +│ ├── FunctionRequest.cs +│ ├── FunctionResponse.cs +│ └── Dockerfile +├── docs/ +│ └── STOP_PAUSE_RESUME_GUIDE.md # Implementation guide +└── .github/ + └── workflows/ + └── ruff.yml # Python lint/format CI +``` + +## 2. High-Level System Diagram + +```mermaid +graph TD + subgraph "Connector Container (runtime per function)" + A[HTTP Server / Job Runner
index.py · Program.cs] --> B[handler.py / Handler.cs
connector scan logic] + B --> C[FunctionContext / Context
secrets · logging · tracing] + C --> D[BatchManager
500 KB buffer] + C --> E[StateManager
stop · pause · resume] + E --> F[RedisSignalHandler] + end + + subgraph "Platform Services" + G[data-ingestion service] + H["Redis Streams
scan:control:#123;id#125;
scan:status:#123;id#125;"] + I[app-update-execution service] + J[OTLP Collector
Grafana / OpenTelemetry] + K[Core API
connector-api] + end + + D -->|"HTTP POST async
500 KB batches"| G + F <-->|"control signals
status updates"| H + C -->|"execution progress"| I + A -->|"traces / metrics / logs"| J + K -->|"STOP / PAUSE / RESUME"| H +``` + +## 3. Core Components + +### 3.1. netwrix-python + +**Description:** Python connector template for external source and IAM connectors. Runs as a Flask/Waitress HTTP server or a one-shot Kubernetes job, selected by the `EXECUTION_MODE` environment variable. Provides `StateManager` and `RedisSignalHandler` for graceful stop/pause/resume, and a `Context` object with structured logging and OpenTelemetry tracing. + +**Technologies:** Python 3.12, Flask, Waitress, Redis, OpenTelemetry SDK, uv + +**Key files:** `index.py` (entrypoint), `state_manager.py`, `redis_signal_handler.py`, `function/handler.py` (connector author fills this) + +### 3.2. netwrix-internal-python + +**Description:** Simplified Python template for internal platform functions (e.g., `data-ingestion`, `regex-match`, `sensitive-data-orchestrator`). Same HTTP/job dual-mode entrypoint as `netwrix-python` but without connector-specific `StateManager` or `BatchManager`. Provides the same `Context`/`ContextLogger` and OpenTelemetry setup. + +**Technologies:** Python 3.12, Flask, Waitress, OpenTelemetry SDK, uv + +### 3.3. netwrix-csharp + +**Description:** C# (.NET 8) connector template with a full `ConnectorFramework` runtime library. At startup, `Program.cs` discovers the connector's `IConnectorHandler` implementation via reflection and registers it in DI. Supports HTTP mode (ASP.NET Core Minimal API, long-running scans run in background scopes) and job mode (single `HandleJobAsync` invocation). `FunctionContext` provides secrets, per-table `BatchManager` instances, checkpoint state via Redis, execution progress reporting, and OpenTelemetry spans. `StateManager` polls Redis for STOP/PAUSE/RESUME signals and exposes a `CancellationToken` that is cancelled on STOP. 
+ +**Technologies:** .NET 8, ASP.NET Core, StackExchange.Redis, OpenTelemetry .NET SDK + +**Key files:** `ConnectorFramework/Program.cs`, `ConnectorFramework/FunctionContext.cs`, `ConnectorFramework/IConnectorHandler.cs`, `function/Handler.cs` (connector author fills this) + +### 3.4. netwrix-internal-csharp + +**Description:** C# template for internal platform functions. Provides HTTP server bootstrap and a simplified `FunctionContext` with structured logging and secrets, but no `BatchManager` or `StateManager`. Suited for request/response functions that do not perform long-running scans. + +**Technologies:** .NET 8, ASP.NET Core, OpenTelemetry .NET SDK + +## 4. Data Stores + +### 4.1. Redis + +**Type:** Redis Streams +**Purpose:** Control plane for running scan executions. `RedisSignalHandler` reads STOP/PAUSE/RESUME signals from `scan:control:{executionId}` and writes status updates to `scan:status:{executionId}`. Checkpoint/resume state is stored under `scan:state:{executionId}` with a 24-hour TTL. + +The connection URL is provided via the `REDIS_URL` environment variable. Both templates degrade gracefully if Redis is unavailable — stop/pause signals are simply not processed. + +## 5. External Integrations + +| Service | Purpose | Method | +|---------|---------|--------| +| `data-ingestion` | Receive batched scanned objects and write to ClickHouse | HTTP POST (async, 500 KB batches) | +| `app-update-execution` | Report scan progress and final status | HTTP POST | +| OTLP Collector | Receive traces, metrics, and logs | HTTP OTLP protobuf | +| Core API / connector-api | Orchestrate scans; send stop/pause/resume signals via Redis | Redis Streams | + +Service URLs default to Kubernetes service DNS names inside the `access-analyzer` namespace and can be overridden via environment variables (`SAVE_DATA_FUNCTION`, `APP_UPDATE_EXECUTION_FUNCTION`, `COMMON_FUNCTIONS_NAMESPACE`). + +## 6. Deployment & Infrastructure + +- **Build:** Multi-stage Docker builds. 
Python images use `uv` for reproducible dependency installation from `uv.lock`. C# images use a .NET SDK build stage publishing to an ASP.NET runtime stage. +- **Execution models:** HTTP mode for long-running servers; Job mode (`EXECUTION_MODE=job`) for Kubernetes Jobs invoked by the connector-api. +- **Non-root:** All Dockerfiles create a non-root `app` user and run all application code under that user. +- **CI/CD:** GitHub Actions — `.github/workflows/ruff.yml` lints and format-checks `template/netwrix-python` on every push/PR to `main`. +- **Registry:** Container images distributed via the Keygen OCI registry (`oci.pkg.keygen.sh`). +- **Debug mode:** Set `DEBUG_MODE=true` at image build time to include `debugpy` and expose port 5678 for remote debugging. + +## 7. Security Considerations + +- **Secrets:** Mounted as files at `/var/secrets/{name}` by the orchestrator. Never passed as environment variables or baked into images. `FunctionContext` rejects files outside the secrets directory (path traversal guard). +- **Non-root containers:** All Dockerfiles drop to a non-root user before the application starts. +- **Redis timeouts:** Clients use 2-second connection and socket timeouts to prevent indefinite blocking on an unresponsive Redis instance. +- **No credentials in source:** `NuGet.config` uses `%NUGET_TOKEN%` expansion; the token is supplied at build time via `--build-arg NUGET_TOKEN=...` and is not stored in the image. + +## 8. Development & Testing Environment + +See [CONTRIBUTING.md](CONTRIBUTING.md) for local setup steps. + +- **Python:** `uv sync` installs all dependencies; `uv run pytest` runs tests; `uv run ruff check .` and `uv run ruff format .` enforce code style. +- **C#:** `dotnet restore && dotnet build` builds the framework; `dotnet test` runs unit tests in `ConnectorFramework.Tests`. +- **Code quality tools:** ruff (Python lint + format), `dotnet format` (C#). 
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..be3d282 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,135 @@ +# Contributing to dspm-connector-templates + +## Table of Contents + +- [I Have a Question](#i-have-a-question) +- [I Want To Contribute](#i-want-to-contribute) + - [Reporting Bugs](#reporting-bugs) + - [Suggesting Enhancements](#suggesting-enhancements) + - [Your First Code Contribution](#your-first-code-contribution) + - [Improving The Documentation](#improving-the-documentation) +- [Styleguides](#styleguides) + +## I Have a Question + +For questions about connector development, template behaviour, or platform integration, raise an issue in this repository or reach out in the internal `#dspm-engineering` Slack channel. + +## I Want To Contribute + +### Reporting Bugs + +If a template produces unexpected runtime behaviour, incorrect OpenTelemetry output, or breaks the stop/pause/resume contract, please open an issue with: + +- Which template is affected (`netwrix-python`, `netwrix-csharp`, etc.) +- The execution mode in use (`http` or `job`) +- Minimal reproduction steps or a failing test +- Observed vs. expected behaviour, including any relevant log output + +### Suggesting Enhancements + +#### Before Submitting an Enhancement + +Before implementing a template change, open an issue to discuss it. Template changes affect every connector built from that template, so breaking changes to the `FunctionContext` API, `IConnectorHandler` interface, or `handler.py` / `Handler.cs` signatures require careful consideration. + +Check existing issues and pull requests to make sure the enhancement has not already been proposed or implemented. + +#### How Do I Submit a Good Enhancement Suggestion? 
+ +Good enhancement proposals include: + +- A clear description of the problem being solved +- The proposed API change and why it cannot be addressed at the connector level +- Any impact on existing connectors +- Example code or pseudocode showing the desired usage, if applicable + +### Your First Code Contribution + +#### Setup — Python templates + +```bash +# Install uv (https://github.com/astral-sh/uv) +curl -LsSf https://astral.sh/uv/install.sh | sh + +cd template/netwrix-python +uv sync # install all dependencies from uv.lock +uv run pytest # run tests +uv run ruff check . # lint +uv run ruff format . # format +``` + +The same steps apply to `template/netwrix-internal-python`. + +#### Setup — C# templates + +```bash +# Requires .NET 8 SDK +cd template/netwrix-csharp +dotnet restore ConnectorFramework/ConnectorFramework.csproj +dotnet build +dotnet test ConnectorFramework.Tests/ConnectorFramework.Tests.csproj +``` + +#### Workflow + +1. Create a feature branch from `main`. +2. Make your changes. +3. Ensure all tests pass and linting is clean (CI will enforce this on PR). +4. Open a pull request against `main` with a clear description of the change and its rationale. + +#### Updating Python dependencies + +Dependencies are managed with `uv`. To add or update a package: + +```bash +cd template/netwrix-python # or netwrix-internal-python +uv add <package> # adds to pyproject.toml and updates uv.lock +uv sync # reinstalls from the updated lockfile +``` + +Always commit both `pyproject.toml` and `uv.lock`. + +#### Updating C# dependencies + +Add packages to `function/Function.csproj` only. Do not modify `ConnectorFramework/ConnectorFramework.csproj` unless you are changing the framework itself. 
+ +### Improving The Documentation + +Documentation lives in: + +- `README.md` — overview, template list, quick-start +- `ARCHITECTURE.md` — structure, component descriptions, system diagram +- `CONTRIBUTING.md` — this file +- `GLOSSARY.md` — domain term definitions +- `docs/STOP_PAUSE_RESUME_GUIDE.md` — stop/pause/resume implementation guide + +Keep documentation accurate when the code changes. If a template API changes, update `ARCHITECTURE.md` and `GLOSSARY.md` in the same PR. + +## Styleguides + +### Python + +All Python code must pass `ruff check` and `ruff format` (configured in `template/netwrix-python/pyproject.toml`). CI enforces this on every push and PR to `main`. + +Key rules: +- Line length: 120 characters +- Imports sorted by `isort` rules (ruff `I`) +- No f-string debugging or `print()` in template code — use the `context.log` structured logger + +### C# + +Follow standard .NET conventions. The `ConnectorFramework` uses `SemaphoreSlim` (not `lock`) for async-safe state mutations; maintain this pattern for any new thread-shared state. + +Key conventions: +- `FunctionContext` remains `sealed` and `IAsyncDisposable` +- `BatchManager` is single-writer per table — document this contract if you extend it +- Connector-specific packages go in `function/Function.csproj`; framework packages go in `ConnectorFramework/ConnectorFramework.csproj` + +### Commit messages + +Use short, imperative-mood subject lines: + +``` +Add retry logic to BatchManager flush +Fix StateManager shutdown not cancelling token on pause +Update netwrix-python Flask to 3.1 +``` diff --git a/GLOSSARY.md b/GLOSSARY.md new file mode 100644 index 0000000..bcba813 --- /dev/null +++ b/GLOSSARY.md @@ -0,0 +1,97 @@ +# Glossary + +Domain terms as used in `dspm-connector-templates`. 
+ +## BatchManager + +Definition: A component in both the `netwrix-csharp` and `netwrix-python` templates that buffers scanned objects in memory and flushes them asynchronously to the `data-ingestion` service when the buffer exceeds 500 KB. Each ClickHouse table gets its own `BatchManager`. + +Supporting information: + +* In C#, obtain one via `context.GetTable("table_name")` and call `AddObject` (synchronous, never blocks on I/O; the flush happens on a background channel worker) +* In Python, `context.save_object(table, obj)` creates a per-table `BatchManager` internally and delegates to `add_object` + +## Connector + +Definition: A deployed function that integrates with an external data source (e.g., CIFS/SMB, SharePoint, Active Directory). Every connector implements three operations: `test_connection`, `access_scan`, and `get_object`. + +Supporting information: + +* In the C# templates this is expressed as an `IConnectorHandler` implementation +* In Python templates it is a `handler.py` module with a `handle(event, context)` function + +## ConnectorFramework + +Definition: The C# runtime library bundled inside the `netwrix-csharp` template (`ConnectorFramework/` directory). It provides `Program.cs` (bootstrap), `FunctionContext`, `BatchManager`, `StateManager`, and `RedisSignalHandler`. + +Supporting information: + +* Connector authors must not modify `ConnectorFramework.csproj` +* Connector-specific packages go in `function/Function.csproj` + +## FunctionContext + +Definition: The per-request object injected into every connector invocation. Provides access to secrets, per-table `BatchManager` instances, checkpoint state, execution progress reporting (`UpdateExecutionAsync`), OpenTelemetry spans, and caller headers. + +Supporting information: + +* In C# it is `Netwrix.ConnectorFramework.FunctionContext` +* In Python it is the `Context` class in `index.py` + +## Execution Mode + +Definition: Controls how a connector container is invoked. 
**HTTP mode** (default) starts a long-running HTTP server that serves repeated requests. **Job mode** (`EXECUTION_MODE=job`) runs the handler once and exits with a success/failure exit code. + +Supporting information: + +* Job mode is used for Kubernetes Jobs invoked by the connector-api +* Set via the `EXECUTION_MODE` environment variable + +## IConnectorHandler + +Definition: The C# interface that every `netwrix-csharp` connector must implement. Has three optional/required members: `MapServices` (register DI services), `MapEndpoints` (declare HTTP routes), and `HandleJobAsync` (job-mode invocation). + +Supporting information: + +* The framework discovers the single non-abstract implementation via reflection at startup + +## RedisSignalHandler + +Definition: A class in both the Python and C# templates that connects to Redis Streams to read control signals (`STOP`, `PAUSE`, `RESUME`) from `scan:control:{executionId}` and write status updates to `scan:status:{executionId}`. + +Supporting information: + +* If Redis is unavailable the handler degrades gracefully and scanning continues without stop capability + +## ScanExecutionId + +Definition: A unique identifier for a single invocation of a connector scan. It is passed in via the `Scan-Execution-Id` HTTP header (HTTP mode) or the `SCAN_EXECUTION_ID` environment variable (job mode). + +Supporting information: + +* The `StateManager` and `BatchManager` use it as the key for Redis Streams and for enriching ingested data records + +## StateManager + +Definition: A component that manages the lifecycle states of a running scan: `running → stopping → stopped`, `running → pausing → paused → resuming → running`, and `running → completed / failed`. It polls Redis at a configurable interval (default 5 seconds) for control signals. 
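The lifecycle paths named in the StateManager definition can be read as a transition table. The following is a minimal sketch of that reading only, not the template's implementation: `ScanState`, `ALLOWED_TRANSITIONS`, `can_transition`, and `follows_lifecycle` are hypothetical names introduced here for illustration, and the real `StateManager` may model its states differently.

```python
from enum import Enum


class ScanState(Enum):
    """Lifecycle states named in the StateManager definition (enum is hypothetical)."""

    RUNNING = "running"
    STOPPING = "stopping"
    STOPPED = "stopped"
    PAUSING = "pausing"
    PAUSED = "paused"
    RESUMING = "resuming"
    COMPLETED = "completed"
    FAILED = "failed"


# Transitions copied from the three paths in the definition:
#   running -> stopping -> stopped
#   running -> pausing -> paused -> resuming -> running
#   running -> completed / failed
ALLOWED_TRANSITIONS = {
    ScanState.RUNNING: {
        ScanState.STOPPING,
        ScanState.PAUSING,
        ScanState.COMPLETED,
        ScanState.FAILED,
    },
    ScanState.STOPPING: {ScanState.STOPPED},
    ScanState.PAUSING: {ScanState.PAUSED},
    ScanState.PAUSED: {ScanState.RESUMING},
    ScanState.RESUMING: {ScanState.RUNNING},
}


def can_transition(current: ScanState, target: ScanState) -> bool:
    """Return True if the definition lists current -> target as a single step."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def follows_lifecycle(path: list[ScanState]) -> bool:
    """Return True if every consecutive pair in path is an allowed step."""
    return all(can_transition(a, b) for a, b in zip(path, path[1:]))
```

Under this reading, `stopped`, `completed`, and `failed` are terminal because the definition lists no path out of them; whether the actual template permits other transitions (for example, a STOP while paused) is not stated in the glossary.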
+ +Supporting information: + +* Exposes `ShouldStopAsync` / `ShouldPauseAsync` methods (C#) or `should_stop()` / `should_pause()` methods (Python) for connectors to check +* In C# it also exposes a `CancellationToken` that is cancelled when a STOP signal arrives + +## Template + +Definition: A reusable scaffold that connector repositories pull at build time. A template defines the `Dockerfile`, the runtime entrypoint (`index.py` or `Program.cs`), and framework files (`StateManager`, `BatchManager`, etc.). + +Supporting information: + +* Connector authors only provide their `handler.py` or `Handler.cs` and a `requirements.txt` / `Function.csproj` + +## uv + +Definition: The Python package manager used by the Python templates. Dependencies are declared in `pyproject.toml` and pinned in `uv.lock` for reproducible builds. + +Supporting information: + +* Use `uv sync` to install, `uv add <package>` to add a dependency, and `uv run <command>` to run tools within the virtual environment diff --git a/README.md b/README.md new file mode 100644 index 0000000..78087a3 --- /dev/null +++ b/README.md @@ -0,0 +1,130 @@ +# dspm-connector-templates + +Function templates for building DSPM connectors. Each template provides the runtime scaffolding — HTTP server, job-mode runner, OpenTelemetry instrumentation, Redis-based stop/pause/resume signals, and batched data ingestion — so connector authors only need to implement their scanning logic. 
+ +## Templates + +| Template | Language | Purpose | +|----------|----------|---------| +| `netwrix-python` | Python 3.12 | External source and IAM connectors | +| `netwrix-internal-python` | Python 3.12 | Internal common platform functions | +| `netwrix-csharp` | C# / .NET 8 | External source and IAM connectors | +| `netwrix-internal-csharp` | C# / .NET 8 | Internal common platform functions | + +## Prerequisites + +- Docker (for building container images) +- .NET 8 SDK (for C# templates) +- Python 3.12 + [uv](https://github.com/astral-sh/uv) (for Python templates) + +## Using a Template + +Connector repositories reference these templates in their `stack.yml`. The templates are pulled automatically at build time. + +```yaml +functions: + my-connector: + lang: netwrix-python + handler: ./functions/my-connector + image: my-connector:latest +``` + +### Template selection + +- Use `netwrix-python` or `netwrix-csharp` for connectors that scan external data sources and ingest data into ClickHouse. +- Use `netwrix-internal-python` or `netwrix-internal-csharp` for internal platform functions that do not scan external sources. + +## Template Features + +### Dual Execution Modes + +All templates support two execution modes controlled by the `EXECUTION_MODE` environment variable: + +- **HTTP mode** (default): starts a long-running HTTP server (Flask/Waitress for Python, ASP.NET Core for C#). +- **Job mode** (`EXECUTION_MODE=job`): runs the handler once and exits. Used for Kubernetes jobs invoked by the connector-api. + +### Stop / Pause / Resume + +The `netwrix-python` and `netwrix-csharp` templates include a `StateManager` that monitors Redis Streams for control signals (`STOP`, `PAUSE`, `RESUME`) sent by the Core API during a running scan. + +See [docs/STOP_PAUSE_RESUME_GUIDE.md](docs/STOP_PAUSE_RESUME_GUIDE.md) for full implementation guidance. 
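To illustrate how a connector might cooperate with the stop/pause/resume signals described above, here is a rough sketch of a scan loop that checks for signals between work items. It assumes the Python `StateManager` exposes synchronous, argument-free `should_stop()` and `should_pause()` methods (the names appear in `GLOSSARY.md`; the exact signatures are an assumption). `FakeStateManager` and `scan` are hypothetical names used only for this example; the guide in `docs/STOP_PAUSE_RESUME_GUIDE.md` remains the authoritative reference.

```python
import time
from typing import Iterable, Protocol


class SupportsSignals(Protocol):
    """Assumed shape of the Python StateManager's signal checks."""

    def should_stop(self) -> bool: ...

    def should_pause(self) -> bool: ...


class FakeStateManager:
    """Hypothetical in-memory stand-in; the real StateManager reads Redis."""

    def __init__(self, stop_after: int) -> None:
        self._checks = 0
        self._stop_after = stop_after

    def should_stop(self) -> bool:
        # Report STOP once more than `stop_after` checks have been made.
        self._checks += 1
        return self._checks > self._stop_after

    def should_pause(self) -> bool:
        return False


def scan(items: Iterable[str], state: SupportsSignals, poll_seconds: float = 0.01) -> list[str]:
    """Process items one at a time, checking for STOP/PAUSE before each."""
    processed: list[str] = []
    for item in items:
        if state.should_stop():
            break  # stop promptly; items already processed are kept
        while state.should_pause():
            # While paused, keep polling so that a STOP is still honoured.
            if state.should_stop():
                return processed
            time.sleep(poll_seconds)
        processed.append(item)  # placeholder for the connector's real work
    return processed
```

Checking between items rather than inside them keeps the loop simple, at the cost of stop latency equal to the time spent on one item; connectors with slow per-item work may want finer-grained checks.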
+ +### Batched Data Ingestion + +Both the `netwrix-csharp` and `netwrix-python` templates include a `BatchManager` that buffers scanned objects in memory and flushes them to the `data-ingestion` service in batches (flush threshold: 500 KB). In C#, `BatchManager` instances are created per table via `context.GetTable("table_name")`. In Python, `context.save_object(table, obj)` creates a per-table `BatchManager` internally. + +### OpenTelemetry + +All templates export distributed traces, metrics, and logs to an OTLP-compatible collector. Configure the endpoint via `OTEL_EXPORTER_OTLP_ENDPOINT` (default: `http://otel-collector.access-analyzer.svc.cluster.local:4318`). Set `OTEL_ENABLED=false` to disable. + +### Secrets + +Secrets are loaded from files mounted at `/var/secrets/{name}` (connector-api) or `/var/openfaas/secrets/{name}` (fallback). Access them via `context.Secrets["name"]` (C#) or `context.secrets["name"]` (Python). + +## Build + +### Python templates + +Python templates do not require a separate build step — dependencies are installed at container build time via `uv sync` in the Dockerfile. + +```bash +docker build -t my-connector:latest -f template/netwrix-python/Dockerfile . +``` + +### C# templates + +```bash +cd template/netwrix-csharp +dotnet build ConnectorFramework/ConnectorFramework.csproj +``` + +Or build the container image directly: + +```bash +docker build -t my-connector:latest -f template/netwrix-csharp/Dockerfile . +``` + +## Develop + +### Python templates + +```bash +cd template/netwrix-python +uv sync +uv run ruff check . +uv run ruff format . +``` + +### C# templates + +```bash +cd template/netwrix-csharp +dotnet restore ConnectorFramework/ConnectorFramework.csproj +dotnet build +``` + +## Test + +### Python templates + +```bash +cd template/netwrix-python +uv run pytest +``` + +The CI pipeline runs `ruff check` and `ruff format --check` on every push/PR to `main` (see `.github/workflows/ruff.yml`). 
+ +### C# templates + +```bash +cd template/netwrix-csharp +dotnet test ConnectorFramework.Tests/ConnectorFramework.Tests.csproj +``` + +## Deploy + +Connector containers are built as multi-stage Docker images and distributed via the Keygen OCI registry (`oci.pkg.keygen.sh`). Connector repositories reference these templates in their `stack.yml`, and images are built and pushed by CI/CD pipelines. Set `EXECUTION_MODE=job` for Kubernetes Job deployments or leave unset for long-running HTTP server mode. + +## Contributing + +See [CONTRIBUTING.md](CONTRIBUTING.md).