Lab 5: Adding OpenTelemetry Metrics with Aspire Dashboard

Overview

In this lab, you'll enhance the Chain of Responsibility pipeline by adding OpenTelemetry (OTEL) metrics using Aspire 9.5.1. You'll create a custom pipeline behavior that tracks three key metrics: cache hits, cache misses, and successful role additions. These metrics will be visible in the Aspire Dashboard for observability.

Learning Objectives

By the end of this lab, you will be able to:

Configure OpenTelemetry metrics in an Aspire 9.5.1 application
Create custom OTEL metrics using System.Diagnostics.Metrics
Build a pipeline behavior for telemetry collection
Integrate metrics tracking with caching behavior
View metrics in the Aspire Dashboard
Understand the benefits of observability in distributed systems

Prerequisites

Completion of Lab 4 (Chain of Responsibility with Pipeline Behaviors)
Understanding of pipeline behaviors and the Mediator pattern
Basic knowledge of observability concepts
Aspire 9.5.1 project structure (already configured in this solution)
Running Aspire application (use dotnet run in the AppHost project)

Important Notes

⚠️ Meter Name Consistency: Ensure the meter name in ServiceDefaults matches exactly with the meter name in TelemetryService ("NimblePros.DAB.Web.Telemetry").

⚠️ Build Before Testing: Always run dotnet build before testing to ensure there are no compile errors.

⚠️ Behavior Registration Order: The order of behavior registration in Program.cs determines execution order. Register TelemetryBehavior last among the universal behaviors, immediately before AttributedBehaviorExecutor, so that it wraps the attributed behaviors (such as CachingBehavior) and can flush the events they record.

Part 1: Understanding OpenTelemetry Metrics

What are OpenTelemetry Metrics?

OpenTelemetry (OTEL) is an open-source observability framework that provides APIs, libraries, and instrumentation for collecting, processing, and exporting telemetry data (metrics, logs, and traces).

Metrics are numerical measurements that represent the state of your application over time:

Counters: Cumulative values that only increase (e.g., total requests, cache misses)
Gauges: Values that can go up or down (e.g., active connections, memory usage)
Histograms: Distributions of values (e.g., request duration)
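
The three instrument types can be tried out in a plain console app. The following is a minimal sketch, assuming .NET 6 or later; the meter name Demo.InstrumentTypes and the demo_* instrument names are hypothetical and are not part of the lab solution. A MeterListener stands in for the OTEL exporter so the measurements can be inspected in-process.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical meter and instrument names, used only for this illustration.
using var meter = new Meter("Demo.InstrumentTypes");

var requests = meter.CreateCounter<long>("demo_requests_total");           // only increases
var activeConnections = 0;
meter.CreateObservableGauge<int>("demo_active_connections",
    () => activeConnections);                                              // can go up or down
var duration = meter.CreateHistogram<double>("demo_request_duration_ms");  // distribution of values

// Collect every measurement reported by the demo meter.
var seen = new List<(string Name, double Value)>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Demo.InstrumentTypes")
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>((i, v, tags, state) => seen.Add((i.Name, v)));
listener.SetMeasurementEventCallback<int>((i, v, tags, state) => seen.Add((i.Name, v)));
listener.SetMeasurementEventCallback<double>((i, v, tags, state) => seen.Add((i.Name, v)));
listener.Start();

requests.Add(1);
requests.Add(1);                        // the listener sees two separate increments of 1
activeConnections = 3;
listener.RecordObservableInstruments(); // observable gauges are polled, not pushed
duration.Record(12.5);

foreach (var (name, value) in seen)
{
    Console.WriteLine($"{name}: {value}");
}
```

Note that the listener receives each counter increment individually; summing them into a cumulative total is the job of whatever consumes the measurements.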

The Three Metrics We'll Track

Cache Hits (role_cache_hits_total) - Counter
- Incremented when ListRoles finds data in the cache
- Helps measure cache effectiveness
Cache Misses (role_cache_misses_total) - Counter
- Incremented when ListRoles needs to query the database
- Indicates when cache is cold or data has expired
Roles Added (roles_added_total) - Counter
- Incremented when a role is successfully created
- Tracks business functionality usage
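
The two cache counters are usually read together as a hit ratio. As a rough sketch, assuming you have already read the current totals of role_cache_hits_total and role_cache_misses_total (the CacheStats helper below is hypothetical and not part of the lab solution):

```csharp
using System;

var ratio = CacheStats.HitRatio(hits: 3, misses: 1); // three hits out of four lookups
Console.WriteLine($"Cache hit ratio: {ratio:P0}");

public static class CacheStats
{
    // Fraction of lookups served from the cache; 0 when nothing has been looked up yet.
    public static double HitRatio(long hits, long misses)
    {
        var total = hits + misses;
        return total == 0 ? 0.0 : (double)hits / total;
    }
}
```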

Benefits of Metrics

✅ Performance Monitoring: Track application performance over time
✅ Capacity Planning: Understand usage patterns and resource needs
✅ SLA Monitoring: Measure against service level agreements
✅ Alerting: Set up alerts when metrics exceed thresholds
✅ Debugging: Identify performance bottlenecks and issues

Part 2: Reviewing Current Aspire Setup

Current Aspire Configuration

Our solution already includes Aspire 9.5.1 with basic OTEL configuration. Let's examine the current setup:

ServiceDefaults (src/NimblePros.DAB.ServiceDefaults/Extensions.cs):

public static IHostApplicationBuilder ConfigureOpenTelemetry(this IHostApplicationBuilder builder)
{
    builder.Logging.AddOpenTelemetry(logging =>
    {
        logging.IncludeFormattedMessage = true;
        logging.IncludeScopes = true;
    });

    builder.Services.AddOpenTelemetry()
        .WithMetrics(metrics =>
        {
            metrics.AddAspNetCoreInstrumentation()
                .AddHttpClientInstrumentation()
                .AddRuntimeInstrumentation();
        })
        .WithTracing(tracing =>
        {
            tracing.AddAspNetCoreInstrumentation()
                .AddHttpClientInstrumentation();
        });

    builder.AddOpenTelemetryExporters();

    return builder;
}

Program.cs already calls:

builder.AddServiceDefaults(); // This configures OTEL

This gives us:

✅ Basic OTEL setup with OTLP exporter
✅ ASP.NET Core instrumentation (HTTP requests, etc.)
✅ HttpClient instrumentation
✅ .NET Runtime instrumentation
✅ Aspire Dashboard integration

Part 3: Creating the Telemetry Pipeline Behavior

Step 1: Update ServiceDefaults for Custom Metrics

First, we need to configure our custom metrics source. Edit src/NimblePros.DAB.ServiceDefaults/Extensions.cs:

Add this line to the WithMetrics configuration:

public static IHostApplicationBuilder ConfigureOpenTelemetry(this IHostApplicationBuilder builder)
{
    builder.Logging.AddOpenTelemetry(logging =>
    {
        logging.IncludeFormattedMessage = true;
        logging.IncludeScopes = true;
    });

    builder.Services.AddOpenTelemetry()
        .WithMetrics(metrics =>
        {
            metrics.AddAspNetCoreInstrumentation()
                .AddHttpClientInstrumentation()
                .AddRuntimeInstrumentation()
                .AddMeter("NimblePros.DAB.Web.Telemetry"); // Add this line
        })
        .WithTracing(tracing =>
        {
            tracing.AddAspNetCoreInstrumentation()
                .AddHttpClientInstrumentation();
        });

    builder.AddOpenTelemetryExporters();

    return builder;
}
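
To see why the meter name must match, here is a small console sketch. It is an illustration only: a MeterListener with a simple string comparison stands in for the AddMeter registration (the real OpenTelemetry matching logic may differ), and the misspelled meter name is deliberate.

```csharp
using System;
using System.Diagnostics.Metrics;

const string ConfiguredMeterName = "NimblePros.DAB.Web.Telemetry";

long received = 0;
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    // Only instruments from the configured meter are subscribed to.
    if (instrument.Meter.Name == ConfiguredMeterName)
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) => received += value);
listener.Start();

using var matching = new Meter("NimblePros.DAB.Web.Telemetry");
using var mismatched = new Meter("NimblePros.DAB.Web.Telemetery"); // deliberate typo

matching.CreateCounter<long>("role_cache_hits_total").Add(1);
mismatched.CreateCounter<long>("role_cache_hits_total").Add(1);    // silently ignored

Console.WriteLine($"Measurements received: {received}");
```

Only the measurement from the correctly named meter reaches the listener; the other one is dropped without any error, which is why a mismatch is easy to miss.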

Step 2: Create the Communication Service

Create src/NimblePros.DAB.Web/04_Chain/Services/TelemetryContext.cs:

namespace NimblePros.DAB.Web._04_Chain.Services;

public interface ITelemetryContext
{
  void RecordCacheHit();
  void RecordCacheMiss();
  void RecordRoleAdded(string roleName);
  void FlushToTelemetry(ITelemetryService telemetryService);
}

public class TelemetryContext : ITelemetryContext
{
  private readonly List<Action<ITelemetryService>> _pendingActions = new();
  private readonly ILogger<TelemetryContext> _logger;

  public TelemetryContext(ILogger<TelemetryContext> logger)
  {
    _logger = logger;
  }

  public void RecordCacheHit()
  {
    _pendingActions.Add(ts => ts.TrackCacheHit());
  }

  public void RecordCacheMiss()
  {
    _pendingActions.Add(ts => ts.TrackCacheMiss());
  }

  public void RecordRoleAdded(string roleName)
  {
    _pendingActions.Add(ts => ts.TrackRoleAdded(roleName));
  }

  public void FlushToTelemetry(ITelemetryService telemetryService)
  {
    foreach (var action in _pendingActions)
    {
      try
      {
        action(telemetryService);
      }
      catch (Exception ex)
      {
        // Log but don't throw - telemetry shouldn't break the pipeline
        _logger.LogWarning(ex, "Failed to execute telemetry action");
      }
    }
    
    _pendingActions.Clear();
  }
}

Step 3: Create the Shared Telemetry Service

Create the telemetry service: src/NimblePros.DAB.Web/04_Chain/Services/TelemetryService.cs

using System.Diagnostics.Metrics;

namespace NimblePros.DAB.Web._04_Chain.Services;

public interface ITelemetryService
{
  void TrackCacheHit();
  void TrackCacheMiss();
  void TrackRoleAdded(string roleName);
}

public class TelemetryService : ITelemetryService, IDisposable
{
  private readonly ILogger<TelemetryService> _logger;
  private readonly Meter _meter;
  private readonly Counter<long> _cacheHitsCounter;
  private readonly Counter<long> _cacheMissesCounter;
  private readonly Counter<long> _rolesAddedCounter;

  public TelemetryService(ILogger<TelemetryService> logger)
  {
    _logger = logger;
    
    // Create a single meter for the entire application
    _meter = new Meter("NimblePros.DAB.Web.Telemetry");
    
    // Create counters once and reuse them
    _cacheHitsCounter = _meter.CreateCounter<long>(
      name: "role_cache_hits_total",
      description: "Total number of role cache hits");
      
    _cacheMissesCounter = _meter.CreateCounter<long>(
      name: "role_cache_misses_total", 
      description: "Total number of role cache misses");
      
    _rolesAddedCounter = _meter.CreateCounter<long>(
      name: "roles_added_total",
      description: "Total number of roles successfully added");
  }

  public void TrackCacheHit()
  {
    _cacheHitsCounter.Add(1);
    _logger.LogDebug("Tracked cache hit");
  }

  public void TrackCacheMiss()
  {
    _cacheMissesCounter.Add(1);
    _logger.LogDebug("Tracked cache miss");
  }

  public void TrackRoleAdded(string roleName)
  {
    _rolesAddedCounter.Add(1);
    _logger.LogDebug("Tracked role creation: {RoleName}", roleName);
  }

  public void Dispose()
  {
    _meter?.Dispose();
  }
}

Step 3: Create the Clean Telemetry Behavior

Now create: src/NimblePros.DAB.Web/04_Chain/PipelineBehaviors/TelemetryBehavior.cs

using Ardalis.GuardClauses;
using Ardalis.Result;
using Mediator;
using NimblePros.DAB.Web._04_Chain.Services;
using NimblePros.DAB.Web._04_Chain.UseCases.Create;
using NimblePros.DAB.Web.UseCases;

namespace NimblePros.DAB.Web._04_Chain.PipelineBehaviors;

public class TelemetryBehavior<TRequest, TResponse> : IPipelineBehavior<TRequest, TResponse>
  where TRequest : IMessage
{
  private readonly ITelemetryService _telemetryService;
  private readonly ITelemetryContext _telemetryContext;
  private readonly ILogger<TelemetryBehavior<TRequest, TResponse>> _logger;

  public TelemetryBehavior(
    ITelemetryService telemetryService,
    ITelemetryContext telemetryContext,
    ILogger<TelemetryBehavior<TRequest, TResponse>> logger)
  {
    _telemetryService = telemetryService;
    _telemetryContext = telemetryContext;
    _logger = logger;
  }

  public async ValueTask<TResponse> Handle(
    TRequest request,
    MessageHandlerDelegate<TRequest, TResponse> next,
    CancellationToken cancellationToken)
  {
    Guard.Against.Null(request, nameof(request));

    try
    {
      // Execute the rest of the pipeline - other behaviors can record events in ITelemetryContext
      var response = await next(request, cancellationToken);

      // Track business-level telemetry directly
      TrackBusinessEvents(request, response);

      return response;
    }
    finally
    {
      // Flush all recorded events to the telemetry service, even if a later step throws
      _telemetryContext.FlushToTelemetry(_telemetryService);
    }
  }

  private void TrackBusinessEvents(TRequest request, TResponse response)
  {
    try
    {
      // Track successful role creation directly (business logic)
      if (request is CreateRoleCommand createCommand && 
          response is Result<RoleDetails> createResult && 
          createResult.IsSuccess)
      {
        _telemetryContext.RecordRoleAdded(createCommand.RoleName);
      }
      
      // Infrastructure events (cache hits/misses) are recorded by other behaviors
    }
    catch (Exception ex)
    {
      _logger.LogWarning(ex, "Failed to track business telemetry for request {RequestType}", 
        typeof(TRequest).Name);
    }
  }
}

Key Features:

✅ Wraps the entire pipeline - ensures all events are flushed
✅ Handles both direct and indirect events - business logic + infrastructure signals
✅ Clean separation - telemetry logic isolated here
✅ Error handling - telemetry failures don't break the pipeline

Step 4: Enhanced Caching Behavior (Clean Separation)

Using the recommended scoped service approach, update the CachingBehavior:

using Ardalis.GuardClauses;
using Ardalis.Result;
using Mediator;
using Microsoft.AspNetCore.Identity;
using Microsoft.Extensions.Caching.Memory;
using NimblePros.DAB.Web._04_Chain.Services;
using NimblePros.DAB.Web._04_Chain.UseCases.Create;
using NimblePros.DAB.Web._04_Chain.UseCases.List;

namespace NimblePros.DAB.Web._04_Chain.PipelineBehaviors;

public class CachingBehavior<TRequest, TResponse> : IPipelineBehavior<TRequest, TResponse>
  where TRequest : IMessage
{
  private readonly IMemoryCache _cache;
  private readonly ITelemetryContext _telemetryContext;
  private readonly ILogger<CachingBehavior<TRequest, TResponse>> _logger;
  private readonly MemoryCacheEntryOptions _cacheOptions;

  public CachingBehavior(
    IMemoryCache cache,
    ITelemetryContext telemetryContext,
    ILogger<CachingBehavior<TRequest, TResponse>> logger)
  {
    _cache = cache;
    _telemetryContext = telemetryContext;
    _logger = logger;
    
    _cacheOptions = new MemoryCacheEntryOptions()
      .SetAbsoluteExpiration(relative: TimeSpan.FromSeconds(Constants.DEFAULT_CACHE_SECONDS));
  }

  public async ValueTask<TResponse> Handle(
    TRequest request,
    MessageHandlerDelegate<TRequest, TResponse> next,
    CancellationToken cancellationToken)
  {
    Guard.Against.Null(request, nameof(request));

    // Only cache ListRoles requests
    if (request is ListRolesRequest listRequest)
    {
      return await HandleListRolesWithCaching(listRequest, next, cancellationToken);
    }

    // For non-cacheable requests (like CreateRoleCommand), 
    // clear the cache and proceed normally
    var response = await next(request, cancellationToken);
    
    if (request is CreateRoleCommand createCommand)
    {
      ClearCache(createCommand);
    }

    return response;
  }

  private async ValueTask<TResponse> HandleListRolesWithCaching(
    ListRolesRequest request, 
    MessageHandlerDelegate<TRequest, TResponse> next,
    CancellationToken cancellationToken)
  {
    var cacheKey = GenerateCacheKey(request);
    
    // Try to get from cache first
    if (_cache.TryGetValue(cacheKey, out var cachedResult) && 
        cachedResult is TResponse cachedResponse)
    {
      // Record cache hit event for telemetry
      _telemetryContext.RecordCacheHit();
      _logger.LogDebug("Cache hit for ListRoles request");
      return cachedResponse;
    }

    // Record cache miss event for telemetry
    _telemetryContext.RecordCacheMiss();
    _logger.LogDebug("Cache miss for ListRoles request, fetching from database");
    
    var response = await next((TRequest)(object)request, cancellationToken);
    
    // Cache the response if it's successful
    if (response is Result<List<IdentityRole>> result && result.IsSuccess)
    {
      _cache.Set(cacheKey, response, _cacheOptions);
      _logger.LogDebug("Cached ListRoles response for {CacheSeconds} seconds", 
        Constants.DEFAULT_CACHE_SECONDS);
    }

    return response;
  }

  private void ClearCache(CreateRoleCommand command)
  {
    // Clear the list cache when a new role is added
    var listCacheKey = GenerateCacheKey(new ListRolesRequest());
    _cache.Remove(listCacheKey);
    _logger.LogDebug("Cleared ListRoles cache after creating role: {RoleName}", 
      command.RoleName);
  }

  private static string GenerateCacheKey(object request)
  {
    // Simple cache key generation based on request type
    return $"{request.GetType().Name}";
  }
}

Key Benefits:

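✅ No direct dependency on ITelemetryService or the metrics API in the caching behavior; it only records events on ITelemetryContext
✅ Clean separation of caching and telemetry concerns
✅ Easy to test - mock ITelemetryContext
✅ Type-safe events - typed method calls instead of magic strings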

Step 5: Update the CachingBehavior

If you are patching your existing CachingBehavior.cs rather than replacing it with the full listing in Step 4, these are the minimal changes needed to track cache hits and misses. Find the file in src/NimblePros.DAB.Web/04_Chain/PipelineBehaviors/CachingBehavior.cs and update it:

// Add ITelemetryContext to the constructor
private readonly ITelemetryContext _telemetryContext;

public CachingBehavior(
  IMemoryCache cache, 
  ILogger<CachingBehavior<TRequest, TResponse>> logger,
  ITelemetryContext telemetryContext) // Add this parameter
{
  _cache = cache;
  _logger = logger;
  _telemetryContext = telemetryContext; // Add this assignment
  // Keep any existing initialization (such as _cacheOptions) unchanged
}

Then update the caching methods to record telemetry events:

// In cache hit scenario:
_telemetryContext.RecordCacheHit();
_logger.LogDebug("Cache hit for {RequestType} request", typeof(TRequest).Name);

// In cache miss scenario:
_telemetryContext.RecordCacheMiss();
_logger.LogDebug("Cache miss for {RequestType} request, fetching from database", typeof(TRequest).Name);

Important: The CachingBehavior may need to handle both ListRolesRequest and ListRolesWithAttributesRequest. Ensure you have a generic method like HandleGenericListRolesCaching to handle both request types.

Step 6: Register Services and Behaviors

Edit src/NimblePros.DAB.Web/Program.cs to register all the services:

// Register telemetry services
builder.Services.AddSingleton<ITelemetryService, TelemetryService>();
builder.Services.AddScoped<ITelemetryContext, TelemetryContext>();

// Add pipeline behaviors (order matters!)
builder.Services.AddScoped(typeof(IPipelineBehavior<,>), typeof(LoggingBehavior<,>)); // universal
builder.Services.AddScoped(typeof(IPipelineBehavior<,>), typeof(AuthorizationBehavior<,>)); // universal  
builder.Services.AddScoped(typeof(IPipelineBehavior<,>), typeof(ValidationBehavior<,>)); // universal
builder.Services.AddScoped(typeof(IPipelineBehavior<,>), typeof(TelemetryBehavior<,>)); // universal - LAST
builder.Services.AddScoped(typeof(IPipelineBehavior<,>), typeof(AttributedBehaviorExecutor<,>)); // universal - handles attributed behaviors

Critical Notes:

TelemetryBehavior should be registered last among the universal behaviors, so that it wraps the attributed behaviors and can flush all accumulated telemetry events
AttributedBehaviorExecutor must be registered after TelemetryBehavior, as the final entry, so that attributed behaviors like CachingBehavior execute inside it
Order of registration determines execution order in the pipeline


Architecture Flow:

Request → TelemetryBehavior (wraps everything) → AttributedBehaviorExecutor → CachingBehavior (records events to ITelemetryContext) → Handler ← Response flows back ← TelemetryBehavior (flushes ITelemetryContext to metrics)


Benefits of This Architecture:

✅ Single Responsibility: Each behavior has one job
✅ Separation of Concerns: Telemetry logic isolated in TelemetryBehavior
✅ Communication: Behaviors can signal events without tight coupling
✅ Performance: Minimal overhead, shared services

Step 7: Build and Test the Implementation

Before testing, ensure your code compiles:

```bash
# Navigate to the Web project
cd src/NimblePros.DAB.Web

# Build the project to check for errors
dotnet build

# If successful, start the Aspire application
cd ../NimblePros.DAB.AppHost
dotnet run
```

Look for the dashboard URL in the console output (usually https://localhost:17xxx).

Part 4: Testing the Telemetry Implementation

Step 1: Access the Aspire Dashboard

Start the application using dotnet run in the AppHost project
Copy the dashboard URL from the console output
Open the dashboard in your browser
Navigate to the Metrics section

Step 2: Test Cache Metrics

Use the following endpoints to test cache behavior:

For Chain pattern endpoints:

GET https://localhost:7011/Chain/Roles (first call - cache miss)
GET https://localhost:7011/Chain/Roles (second call - cache hit)

For Pipeline Attributes pattern endpoints:

GET https://localhost:7011/PipelineAttributes/Roles (first call - cache miss)
GET https://localhost:7011/PipelineAttributes/Roles (second call - cache hit)

Test role creation:

POST https://localhost:7011/Chain/Roles or https://localhost:7011/PipelineAttributes/Roles
Body: {"roleName": "TestRole"}

Part 4: Testing the Metrics

Step 1: Run the Application

Start the solution:

# Run from the repository root (adjust the path to wherever you cloned the solution)
cd c:\dev\github-nimblepros\RefactorToPipelineArchitecture
dotnet run --project src/NimblePros.DAB.AppHost

The Aspire Dashboard should open automatically at https://localhost:17191 (or similar)
Navigate to the Metrics section in the Aspire Dashboard

Step 2: Generate Metrics Data

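Use the API endpoints to generate telemetry data. The host, port, and route shown here may differ in your solution; if they do, use the same role endpoints you called in the cache test above (for example /Chain/Roles):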

Generate Cache Misses (first request hits database):

GET https://localhost:7070/04_Chain/roles/pipeline
Authorization: Bearer {your-jwt-token}

Generate Cache Hits (subsequent requests within cache window):

GET https://localhost:7070/04_Chain/roles/pipeline
Authorization: Bearer {your-jwt-token}

Generate Role Creation Metrics:

POST https://localhost:7070/04_Chain/roles/pipeline
Authorization: Bearer {your-jwt-token}
Content-Type: application/json

{
  "roleName": "TestRole1"
}

Generate More Cache Misses (cache cleared after role creation):

GET https://localhost:7070/04_Chain/roles/pipeline
Authorization: Bearer {your-jwt-token}

Step 3: View Metrics in Aspire Dashboard

In the Aspire Dashboard:

Navigate to Metrics section
Look for your custom metrics:
- role_cache_hits_total
- role_cache_misses_total
- roles_added_total
Create visualizations:
- Add charts for each metric
- Set appropriate time ranges
- Watch metrics update in real-time

Expected Behavior

First GET request to /Chain/Roles or /PipelineAttributes/Roles:

role_cache_misses_total increments (cold cache)
No cache hit
Database query executes

Second GET request (within 30-second cache window):

role_cache_hits_total increments (warm cache)
No database query
Response served from cache

POST request to create role:

roles_added_total increments (successful creation)
Cache gets cleared automatically
New role appears in subsequent GET requests

Third GET request (after role creation):

role_cache_misses_total increments again (cache cleared)
Fresh data loaded from database
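
The sequence above can be rehearsed without the web application. This is a simplified simulation under stated assumptions: RoleCacheSimulation is a hypothetical stand-in that keeps plain counters instead of OTEL instruments and ignores cache expiration.

```csharp
using System;
using System.Collections.Generic;

var sim = new RoleCacheSimulation();
sim.ListRoles();            // first GET: cold cache, miss
sim.ListRoles();            // second GET: warm cache, hit
sim.CreateRole("TestRole"); // POST: role added, cache cleared
sim.ListRoles();            // third GET: miss again

Console.WriteLine($"hits={sim.Hits} misses={sim.Misses} added={sim.RolesAdded}");
// prints "hits=1 misses=2 added=1"

public sealed class RoleCacheSimulation
{
    private readonly List<string> _database = new() { "Admin" };
    private List<string>? _cachedRoles;

    public long Hits { get; private set; }
    public long Misses { get; private set; }
    public long RolesAdded { get; private set; }

    public IReadOnlyList<string> ListRoles()
    {
        if (_cachedRoles is not null)
        {
            Hits++;
            return _cachedRoles;
        }

        Misses++;
        _cachedRoles = new List<string>(_database); // simulated database query
        return _cachedRoles;
    }

    public void CreateRole(string roleName)
    {
        _database.Add(roleName);
        RolesAdded++;
        _cachedRoles = null; // invalidate the list cache
    }
}
```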

Troubleshooting

If metrics don't appear in Aspire Dashboard:

Check meter name consistency: Verify ServiceDefaults/Extensions.cs and TelemetryService.cs use the exact same meter name
Verify behavior registration: Ensure both ITelemetryContext and TelemetryBehavior are registered in Program.cs
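Check logs: Look for the telemetry debug messages (for example "Tracked cache hit" or "Cache miss for ListRoles request, fetching from database"). They are written with LogDebug, so the Debug log level must be enabled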
Build verification: Run dotnet build to ensure no compile errors
Endpoint testing: Test the correct endpoints (/Chain/Roles or /PipelineAttributes/Roles) that have caching behavior

Part 5: Understanding the Pipeline Flow with Telemetry

Enhanced Pipeline Execution Order

With our new TelemetryBehavior, the pipeline now executes in this order:

Request → LoggingBehavior → AuthorizationBehavior → ValidationBehavior → TelemetryBehavior → AttributedBehaviorExecutor → Handler

For ListRoles with CachingBehavior Attribute:

ListRolesRequest
├── LoggingBehavior (universal)
├── AuthorizationBehavior (universal) 
├── ValidationBehavior (universal)
├── TelemetryBehavior (universal)
├── AttributedBehaviorExecutor (universal)
│   └── CachingBehavior (attributed)
│       ├── Cache Check
│       ├── Cache Hit → Increment role_cache_hits_total
│       └── Cache Miss → Increment role_cache_misses_total → Call Handler
└── ListRolesHandler → Return roles

For CreateRoleCommand:

CreateRoleCommand  
├── LoggingBehavior (universal)
├── AuthorizationBehavior (universal)
├── ValidationBehavior (universal) 
├── TelemetryBehavior (universal)
│   └── [On Response] Increment roles_added_total (if successful)
└── CreateRoleHandler → Add role → Return result
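
How registration order turns into nesting can be illustrated with plain delegates. This is a generic sketch of behavior wrapping, not the Mediator library's actual composition code: each behavior is reduced to a name, and the handler to a lambda.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// Behavior names in the same order they are registered in Program.cs.
var behaviors = new[]
{
    "Logging", "Authorization", "Validation", "Telemetry", "AttributedBehaviorExecutor"
};

// Start with the handler and wrap it from the inside out,
// so the first registered behavior ends up outermost.
Func<string> pipeline = () =>
{
    log.Add("Handler");
    return "response";
};

for (var i = behaviors.Length - 1; i >= 0; i--)
{
    var name = behaviors[i];
    var next = pipeline;
    pipeline = () =>
    {
        log.Add($"{name} (before)");
        var response = next();
        log.Add($"{name} (after)");
        return response;
    };
}

pipeline();
Console.WriteLine(string.Join(" -> ", log));
```

In the output, "Telemetry (after)" appears only once AttributedBehaviorExecutor and the handler have finished, which is the point at which the recorded events are flushed.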

Part 6: Advanced Pattern - Behavior Communication

Better Separation of Concerns

Rather than having CachingBehavior call the telemetry service directly, the implementation in Part 3 uses a communication pattern in which behaviors pass information through the pipeline. This keeps telemetry concerns isolated in TelemetryBehavior and works in any context (web, CLI, background services, etc.).

The Clean Approach: Scoped Service Communication

Why This Approach?

✅ Context-Independent: Works in web, CLI, background services
✅ Clean DI: Uses dependency injection properly
✅ Type-Safe: No magic strings or casting
✅ Testable: Easy to mock dependencies
✅ Performance: Minimal overhead

The Scoped Telemetry Context Service

The pattern is implemented by the TelemetryContext class created in Part 3, Step 2 (src/NimblePros.DAB.Web/04_Chain/Services/TelemetryContext.cs): behaviors call RecordCacheHit, RecordCacheMiss, or RecordRoleAdded, and TelemetryBehavior calls FlushToTelemetry once the rest of the pipeline has completed.

Key Benefits:

Scoped per request: Each request gets its own context
Context-agnostic: No dependency on HttpContext or web infrastructure
Deferred execution: Actions are recorded and executed at the end
Error isolation: Telemetry failures don't break business logic
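
Deferred execution and error isolation can be demonstrated in isolation. The sketch below uses simplified, hypothetical stand-ins (ISimpleTelemetry, CountingTelemetry, FaultyTelemetry, DeferredTelemetryContext) that mirror the shape of the lab's types but drop the logger so the example runs as a console app.

```csharp
using System;
using System.Collections.Generic;

var service = new CountingTelemetry();
var context = new DeferredTelemetryContext();

context.RecordCacheMiss();
context.RecordCacheHit();
var beforeFlush = service.Hits + service.Misses;      // nothing tracked yet

context.Flush(service);
var afterFlush = service.Hits + service.Misses;       // both events delivered

context.Flush(service);                               // queue is already empty
var afterSecondFlush = service.Hits + service.Misses; // unchanged

context.RecordCacheHit();
context.Flush(new FaultyTelemetry());                 // exception is swallowed

Console.WriteLine($"{beforeFlush} {afterFlush} {afterSecondFlush}"); // prints "0 2 2"

public interface ISimpleTelemetry
{
    void TrackCacheHit();
    void TrackCacheMiss();
}

public sealed class CountingTelemetry : ISimpleTelemetry
{
    public int Hits { get; private set; }
    public int Misses { get; private set; }
    public void TrackCacheHit() => Hits++;
    public void TrackCacheMiss() => Misses++;
}

public sealed class FaultyTelemetry : ISimpleTelemetry
{
    public void TrackCacheHit() => throw new InvalidOperationException("exporter down");
    public void TrackCacheMiss() => throw new InvalidOperationException("exporter down");
}

public sealed class DeferredTelemetryContext
{
    private readonly List<Action<ISimpleTelemetry>> _pending = new();

    public void RecordCacheHit() => _pending.Add(t => t.TrackCacheHit());
    public void RecordCacheMiss() => _pending.Add(t => t.TrackCacheMiss());

    public void Flush(ISimpleTelemetry telemetry)
    {
        foreach (var action in _pending)
        {
            try { action(telemetry); }
            catch (Exception) { /* telemetry failures must not break the caller */ }
        }
        _pending.Clear();
    }
}
```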

Part 7: Performance Considerations and Analysis

The Problem with Naive Implementations

❌ Anti-Pattern: Creating Metrics in Every Behavior Constructor

// DON'T DO THIS - Performance Issues!
public class BadTelemetryBehavior<TRequest, TResponse> : IPipelineBehavior<TRequest, TResponse>
{
  private readonly Meter _meter;
  private readonly Counter<long> _counter;

  public BadTelemetryBehavior()
  {
    // ❌ Creates new meter for EVERY request type!
    _meter = new Meter("MyApp"); 
    _counter = _meter.CreateCounter<long>("my_counter");
  }
}

Problems:

Memory Waste: Creates separate meters/counters for every TRequest, TResponse combination
Registration Overhead: 100 request types = 100 meter instances
Metric Duplication: Same metric registered multiple times with OTEL
Unnecessary Allocation: Metrics created even for requests that never use them

✅ Our Optimized Solution

Shared Telemetry Service Pattern:

// ✅ Efficient: One meter, shared across all requests
[Singleton] ITelemetryService -> Creates meters once
     ↓
[Scoped] TelemetryBehavior<T,R> -> Lightweight, just calls service
     ↓  
[Scoped] CachingBehavior<T,R> -> Records events on ITelemetryContext; never creates a meter

Benefits:

Single Meter Instance: One meter for entire application
Minimal Memory: Behaviors only hold service reference
Fast Instantiation: No expensive meter creation in constructors
Selective Tracking: Only track metrics for requests that need them
Easy Testing: Mock ITelemetryService for unit tests

Part 7: Advanced Metrics Scenarios

Adding Custom Metrics Tags/Labels

You can enhance metrics with tags for better filtering and analysis:

// Enhanced counter with tags (inside TelemetryService.TrackRoleAdded)
// Only tag with role_name if the set of role names stays small
_rolesAddedCounter.Add(1, new[] 
{
  new KeyValuePair<string, object?>("role_name", roleName),
  new KeyValuePair<string, object?>("user", "current_user")
});
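
To check what a tagged measurement looks like when it is reported, a MeterListener can group values by tag. This is a sketch only: the meter name Demo.Tags, the counter name, and the "operation" tag values are hypothetical and chosen to keep cardinality low.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

using var meter = new Meter("Demo.Tags");
var rolesAdded = meter.CreateCounter<long>("demo_roles_added_total");

// Sum the counter per value of the "operation" tag.
var totalsByOperation = new Dictionary<string, long>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Demo.Tags")
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    foreach (var tag in tags)
    {
        if (tag.Key == "operation" && tag.Value is string operation)
        {
            totalsByOperation.TryGetValue(operation, out var current);
            totalsByOperation[operation] = current + value;
        }
    }
});
listener.Start();

rolesAdded.Add(1, new KeyValuePair<string, object?>("operation", "create_role"));
rolesAdded.Add(1, new KeyValuePair<string, object?>("operation", "create_role"));
rolesAdded.Add(1, new KeyValuePair<string, object?>("operation", "import_roles"));

foreach (var (operation, total) in totalsByOperation)
{
    Console.WriteLine($"{operation}: {total}");
}
```

Each distinct tag value produces its own series, so two operations yield two totals here; a tag with an unbounded set of values would yield an unbounded number of series.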

Creating Histograms for Response Times

// Add to TelemetryService so the histogram shares the existing meter
// (TrackRequestDuration is a new member; also declare it on ITelemetryService)
private readonly Histogram<double> _requestDurationHistogram;

public TelemetryService(ILogger<TelemetryService> logger)
{
  // ... existing code ...
  
  _requestDurationHistogram = _meter.CreateHistogram<double>(
    name: "request_duration_ms",
    description: "Request processing duration in milliseconds");
}

public void TrackRequestDuration(string requestType, double milliseconds)
{
  _requestDurationHistogram.Record(milliseconds,
    new KeyValuePair<string, object?>("request_type", requestType));
}

// In TelemetryBehavior (requires using System.Diagnostics;)
public async ValueTask<TResponse> Handle(/* ... */)
{
  var stopwatch = Stopwatch.StartNew();
  
  var response = await next(request, cancellationToken);
  
  stopwatch.Stop();
  _telemetryService.TrackRequestDuration(
    typeof(TRequest).Name, stopwatch.Elapsed.TotalMilliseconds);
  
  return response;
}

Part 7: Production Considerations

Metrics Best Practices

✅ Use appropriate metric types:

Counters for things that only increase
Gauges for values that fluctuate
Histograms for distributions

✅ Add meaningful labels but avoid high cardinality:

// Good: Low cardinality
new KeyValuePair<string, object?>("operation", "list_roles")

// Bad: High cardinality (unique per request)
new KeyValuePair<string, object?>("request_id", Guid.NewGuid().ToString())

✅ Use descriptive names and descriptions

✅ Monitor metric performance impact

✅ Set up alerting on key metrics

Deployment Considerations

For Production:

Configure OTLP endpoint to send to production observability stack
Set up proper metric retention policies
Configure alerting rules
Monitor metric collection overhead

Environment Variables for Production:

OTEL_EXPORTER_OTLP_ENDPOINT=https://your-otel-collector.company.com
OTEL_RESOURCE_ATTRIBUTES=service.name=NimblePros.DAB.Web,service.version=1.0.0
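
If you want to inspect these variables at startup, for example to log whether OTLP export is configured, a sketch along these lines may help. The OtelEnvironment helper is hypothetical; it only reads the two variables shown above and parses OTEL_RESOURCE_ATTRIBUTES as comma-separated key=value pairs.

```csharp
using System;
using System.Collections.Generic;

var endpoint = Environment.GetEnvironmentVariable("OTEL_EXPORTER_OTLP_ENDPOINT");
var attributes = OtelEnvironment.ParseResourceAttributes(
    Environment.GetEnvironmentVariable("OTEL_RESOURCE_ATTRIBUTES"));

Console.WriteLine($"OTLP export configured: {OtelEnvironment.HasOtlpEndpoint(endpoint)}");
foreach (var (key, value) in attributes)
{
    Console.WriteLine($"{key} = {value}");
}

public static class OtelEnvironment
{
    // Treats any non-blank endpoint value as "export is configured".
    public static bool HasOtlpEndpoint(string? endpoint) =>
        !string.IsNullOrWhiteSpace(endpoint);

    // Parses "key1=value1,key2=value2"; entries without a key are skipped.
    public static Dictionary<string, string> ParseResourceAttributes(string? raw)
    {
        var result = new Dictionary<string, string>(StringComparer.Ordinal);
        if (string.IsNullOrWhiteSpace(raw))
        {
            return result;
        }

        foreach (var pair in raw.Split(',', StringSplitOptions.RemoveEmptyEntries))
        {
            var separator = pair.IndexOf('=');
            if (separator <= 0)
            {
                continue;
            }

            result[pair[..separator].Trim()] = pair[(separator + 1)..].Trim();
        }

        return result;
    }
}
```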

Part 8: Testing the Implementation

Unit Testing Pipeline Behaviors

You can unit test the telemetry behavior:

[Test]
public async Task TelemetryBehavior_Should_Record_RoleAdded_And_Flush()
{
  // Arrange (Moq): the behavior takes the service, the scoped context, and a logger
  var telemetryService = Mock.Of<ITelemetryService>();
  var telemetryContext = new Mock<ITelemetryContext>();
  var logger = Mock.Of<ILogger<TelemetryBehavior<CreateRoleCommand, Result<RoleDetails>>>>();
  var behavior = new TelemetryBehavior<CreateRoleCommand, Result<RoleDetails>>(
    telemetryService, telemetryContext.Object, logger);
  
  var request = new CreateRoleCommand("TestRole");
  var expectedResponse = Result<RoleDetails>.Success(new RoleDetails("1", "TestRole"));
  
  var nextCalled = false;
  ValueTask<Result<RoleDetails>> Next(CreateRoleCommand req, CancellationToken ct)
  {
    nextCalled = true;
    return ValueTask.FromResult(expectedResponse);
  }
  
  // Act  
  var result = await behavior.Handle(request, Next, CancellationToken.None);
  
  // Assert
  Assert.That(nextCalled, Is.True);
  Assert.That(result.IsSuccess, Is.True);
  telemetryContext.Verify(c => c.RecordRoleAdded("TestRole"), Times.Once);
  telemetryContext.Verify(c => c.FlushToTelemetry(telemetryService), Times.Once);
  // Note: Testing actual metric increments requires more complex setup
}
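
One possible way to observe actual counter increments in a test is a small MeterListener wrapper. The CounterProbe class below is a hypothetical helper, and the Meter created at the top is a stand-in for TelemetryService that reuses the lab's meter and counter names. Because a listener is process-wide, tests that share a meter name and run in parallel would see each other's measurements.

```csharp
using System;
using System.Diagnostics.Metrics;
using System.Threading;

// Stand-in for TelemetryService: same meter and counter names as the lab.
using var meter = new Meter("NimblePros.DAB.Web.Telemetry");
var rolesAdded = meter.CreateCounter<long>("roles_added_total");

using var probe = new CounterProbe("NimblePros.DAB.Web.Telemetry", "roles_added_total");
rolesAdded.Add(1);
rolesAdded.Add(1);

Console.WriteLine($"roles_added_total observed: {probe.Total}");

public sealed class CounterProbe : IDisposable
{
    private readonly MeterListener _listener = new();
    private long _total;

    // Sum of all increments seen since the probe was created.
    public long Total => Interlocked.Read(ref _total);

    public CounterProbe(string meterName, string instrumentName)
    {
        _listener.InstrumentPublished = (instrument, listener) =>
        {
            if (instrument.Meter.Name == meterName && instrument.Name == instrumentName)
            {
                listener.EnableMeasurementEvents(instrument);
            }
        };
        _listener.SetMeasurementEventCallback<long>(
            (instrument, value, tags, state) => Interlocked.Add(ref _total, value));
        _listener.Start();
    }

    public void Dispose() => _listener.Dispose();
}
```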

Summary and Reflection

What You've Learned

✅ OpenTelemetry Metrics: Understanding of OTEL metrics types and usage
✅ Aspire Integration: How to configure custom metrics in Aspire 9.5.1
✅ Pipeline Telemetry: Creating behaviors that track business metrics
✅ Cache Metrics: Tracking cache effectiveness with hit/miss ratios
✅ Business Metrics: Tracking meaningful business operations
✅ Observability: Understanding the value of metrics for production systems

Key Takeaways

Benefits of Metrics in Pipeline Architecture:

Non-Intrusive: Metrics tracking doesn't affect business logic
Consistent: All requests automatically get telemetry tracking
Flexible: Easy to add new metrics by modifying behaviors
Testable: Behaviors can be unit tested independently
Observable: Real-time visibility into application performance

Production Value:

Performance Monitoring: Track cache hit ratios, request counts
Capacity Planning: Understand usage patterns
SLA Monitoring: Measure against service level agreements
Incident Response: Quickly identify performance issues
Business Intelligence: Track feature usage and adoption

Pattern Evolution Summary

From Labs 1-5, we've built a complete observability story:

Lab 1 (Spaghetti): No observability, everything mixed together
Lab 2 (Template Method): Basic logging in template methods
Lab 3 (Decorator): Logging decorators for specific services
Lab 4 (Chain of Responsibility): Universal logging behavior
Lab 5 (OTEL Metrics): Complete observability with metrics, logs, and traces

Final Architecture Benefits

Aspect	Lab 1	Lab 2	Lab 3	Lab 4	Lab 5
Business Logic Separation	None	Better	Good	Excellent	Excellent
Observability	None	Basic	Service-Level	Request-Level	Full OTEL
Metrics	None	None	None	None	Custom + System
Testability	Poor	Better	Good	Excellent	Excellent
Production-Ready	No	Partially	Yes	Yes	Production-Ready

Next Steps

In Your Own Projects:

Start with Aspire ServiceDefaults for instant OTEL setup
Add custom metrics for business-critical operations
Create dashboards for key metrics in your observability platform
Set up alerting on important thresholds
Use metrics to guide performance optimization efforts

Advanced Topics to Explore:

Distributed Tracing: Track requests across microservices
Custom Exporters: Send metrics to specific APM systems
Metric Aggregation: Create business dashboards from OTEL data
Alerting Rules: Set up monitoring alerts based on metrics
Correlation: Link metrics, logs, and traces together

Common Pitfalls and Best Practices

❌ Common Mistakes:

Meter Name Mismatch: Different names in ServiceDefaults and TelemetryService
Wrong Behavior Order: TelemetryBehavior not registered last among universal behaviors
Missing Dependencies: Forgetting to inject ITelemetryContext into CachingBehavior
Testing Wrong Endpoints: Using endpoints without caching behavior
Build Errors: Not running dotnet build before testing

✅ Best Practices:

Consistent Naming: Use the same meter name across all configurations
Descriptive Metrics: Use clear, business-meaningful metric names
Proper Logging: Add emoji markers (📊, 🔄) for easy log filtering
Unit Testing: Mock ITelemetryContext for behavior testing
Documentation: Document which endpoints support which behaviors

Additional Resources

OpenTelemetry .NET: Official Documentation
Aspire Observability: Aspire Dashboard and Telemetry
System.Diagnostics.Metrics: .NET Metrics API
OTLP Protocol: OpenTelemetry Protocol
Production Deployment: Azure Monitor Integration

Congratulations! You've successfully built a production-ready pipeline architecture with complete observability using OpenTelemetry and Aspire 9.5.1. Your application now provides real-time insights into cache performance, business operations, and system health.

Questions or Issues? Open an issue in the GitHub repository or ask your instructor for clarification.
