Backend / Logging / Logging

Logging

9 min read

Rapid overview

Logging in .NET 10 for High-Performance, Highly Available Systems
Principles and Goals
Recommended Stack
Patterns to Prefer
Configuration Example (Minimal API)
High-Performance Techniques
Code examples (proper + low overhead)
1) Message templates + `EventId` (cheap when disabled)
2) Avoid interpolated strings (allocates even if not logged)
3) Guard expensive work (serialization, LINQ, stack traces)
4) Correct exception logging (keep stack trace)
5) Use scopes for correlation (applies to all nested logs)
6) High-volume logging with `LoggerMessage` source generator
7) Simple rate limiting for noisy warnings (protects throughput)
NLog: Performance Playbook (Practical)
NLog.config example (bounded async file)
NLog.config example (JSON + scopes for correlation)
Program.cs sketch (NLog provider)
Operational Best Practices
Sample Diagnostic Pattern (Resilient HTTP Client)
Sample Interview Q&A

Logging in .NET 10 for High-Performance, Highly Available Systems

Use these notes to explain how you design resilient, low-overhead logging pipelines for .NET 10 services. Emphasize structured data, predictable latency, and operability under load.

Principles and Goals

Observability first: Capture structured events (no free-form strings) with fields for correlation IDs, tenant, region, and service version.
Consistency across tiers: Standardize ILogger scopes and message templates so API, workers, and background jobs emit comparable fields.
Backpressure-aware: Protect the app by sampling noisy events (e.g., debug traces) and using bounded queues for async sinks.
Fail-safe: Logging must never block request completion; prefer drop-and-alert over throttling callers.

Recommended Stack

Microsoft.Extensions.Logging as the facade for framework integrations.
Structured logger: Serilog or Seq sink for JSON; use message-template placeholders ({OrderId}) instead of string interpolation.
OpenTelemetry: Export traces/logs/metrics consistently; include W3C trace/context propagation headers.
Shipping: Centralize via OTLP/HTTP, gRPC, or Kafka; avoid writing to local disk in containers except for bootstrap/debug.
NLog context:
Where it shines: Mature ecosystem with rich target support (file, database, email, Azure Log Analytics), high-performance async targets, and flexible routing/layouts for multi-tenant or blue/green rollouts.
Advantages: Simple XML/JSON configuration, fast text file output, built-in filtering/rules, and battle-tested community extensions. Great when you need to fan out to multiple sinks without custom code.
Trade-offs: Configuration can become verbose, JSON structure requires explicit layouts, and mixing NLog APIs with ILogger can create duplicate pipelines. Prefer running NLog behind Microsoft.Extensions.Logging via the NLog provider to keep a single abstraction and leverage dependency injection.
Operational note: Keep targets async, set overflowAction="Discard" or sampling for noisy rules, and set per-environment config (dev = verbose file, prod = centralized structured logs) through NLog.config transforms.

Patterns to Prefer

Message templates over interpolation: logger.LogInformation("Processed {Count} items", count); avoids unnecessary string formatting when disabled.
Enriched scopes: Wrap requests with using var scope = logger.BeginScope(new { CorrelationId = traceId, Tenant = tenant }); so child logs inherit fields.
Categorized loggers: Request typed loggers per class (ILogger<CheckoutHandler>) for filters and metrics alignment.
Level hygiene:
Critical/Error: actionable failures only.
Warning: degradation/path to failure.
Information: business milestones.
Debug/Trace: development and short-term diagnostics; gate behind config.
Async/buffered sinks: Use non-blocking background queues with bounded capacity; drop oldest or sample when full.
Health/availability focus: Emit heartbeats and dependency outcome events (latency, status, retries) to support SLO dashboards.

Configuration Example (Minimal API)

var builder = WebApplication.CreateBuilder(args);

builder.Logging.ClearProviders();
builder.Logging.AddConsole(options =>
{
    options.IncludeScopes = true;
    options.TimestampFormat = "yyyy-MM-ddTHH:mm:ss.fffZ ";
});

builder.Services.AddOpenTelemetry()
    .WithTracing(trace => trace
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddSource("Orders"))
    .WithMetrics(metrics => metrics
        .AddRuntimeInstrumentation())
    .WithLogging(logging => logging
        .AddConsoleExporter());

var app = builder.Build();

app.MapPost("/orders", async (
    ILogger<Program> logger,
    HttpContext context) =>
{
    using var scope = logger.BeginScope(new
    {
        CorrelationId = context.TraceIdentifier,
        Tenant = context.Request.Headers["X-Tenant"].FirstOrDefault() ?? "unknown"
    });

    logger.LogInformation("Received order request");

    // Simulate work
    await Task.Delay(10);

    logger.LogInformation("Order accepted at {AcceptedAt}", DateTimeOffset.UtcNow);
    return Results.Accepted();
});

app.Run();

High-Performance Techniques

Avoid work when disabled: Use message templates (no interpolation) and guard expensive work with logger.IsEnabled(LogLevel.Debug).
Prefer compile-time logging in hot paths: Use LoggerMessage source generators for high-volume events (less boxing/allocations).
Keep request threads off I/O: Use async/buffered sinks with bounded queues; choose overflow behavior (Discard vs block) intentionally.
Avoid expensive enrichers/layouts by default: Callsite/stacktrace capture and deep object serialization can dominate CPU; enable only for short-lived debugging.
Batch and flush strategically: Batch OTLP/HTTP exports; tune batch sizes/flush intervals to keep p95/p99 latency stable.
Control volume: Use sampling/rate limiting on noisy categories and add time-bounded flags for temporary Debug/Trace.

Code examples (proper + low overhead)

1) Message templates + `EventId` (cheap when disabled)

private static readonly EventId OrderAccepted = new(2001, nameof(OrderAccepted));

logger.LogInformation(
    OrderAccepted,
    "Order {OrderId} accepted for {Tenant} in {ElapsedMs}ms",
    orderId,
    tenant,
    elapsedMs);

2) Avoid interpolated strings (allocates even if not logged)

// ❌ Avoid (allocates string regardless of level)
logger.LogDebug($"Calculated price for {sku}: {price}");

// ✅ Prefer (no formatting work when Debug is disabled)
logger.LogDebug("Calculated price for {Sku}: {Price}", sku, price);

3) Guard expensive work (serialization, LINQ, stack traces)

if (logger.IsEnabled(LogLevel.Debug))
{
    var payloadJson = JsonSerializer.Serialize(payload); // potentially expensive
    logger.LogDebug("Outgoing payload: {PayloadJson}", payloadJson);
}

4) Correct exception logging (keep stack trace)

try
{
    await processor.ProcessAsync(order, ct);
}
catch (Exception ex)
{
    logger.LogError(ex, "Order {OrderId} processing failed", order.Id);
    throw;
}

5) Use scopes for correlation (applies to all nested logs)

using var scope = logger.BeginScope(new Dictionary<string, object>
{
    ["CorrelationId"] = correlationId,
    ["Tenant"] = tenant,
    ["Region"] = region
});

logger.LogInformation("Checkout started");

6) High-volume logging with `LoggerMessage` source generator

internal static partial class CheckoutLog
{
    [LoggerMessage(
        EventId = 3001,
        Level = LogLevel.Information,
        Message = "Charged {Amount} {Currency} for order {OrderId}")]
    public static partial void Charged(
        this ILogger logger,
        decimal amount,
        string currency,
        Guid orderId);
}

// usage
CheckoutLog.Charged(logger, amount, currency, orderId);

7) Simple rate limiting for noisy warnings (protects throughput)

private static long _suppressed;
private static long _lastLogTicks;

// Allow one log per second (example)
var now = Environment.TickCount64;
var last = Interlocked.Read(ref _lastLogTicks);

if (now - last >= 1000 &&
    Interlocked.CompareExchange(ref _lastLogTicks, now, last) == last)
{
    var dropped = Interlocked.Exchange(ref _suppressed, 0);
    logger.LogWarning("Cache miss storm detected (suppressed {SuppressedCount} warnings)", dropped);
}
else
{
    Interlocked.Increment(ref _suppressed);
}

NLog: Performance Playbook (Practical)

Use NLog via Microsoft.Extensions.Logging: Keep one abstraction (ILogger) and route to NLog with NLog.Extensions.Logging.
Make targets non-blocking: Wrap file/network targets in AsyncWrapper with a bounded queueLimit; decide overflowAction (often Discard for high-RPS services).
Keep layouts cheap: Avoid ${callsite}, ${stacktrace}, and ${all-event-properties} in production rules; explicitly list JSON attributes you need.
Tune file target for throughput: For single-process file logging, prefer keepFileOpen="true" (lower overhead). If multiple processes must write the same file, use concurrentWrites="true" (slower, but safe).
Don’t pre-format strings: Prefer logger.LogDebug("Order {OrderId} validated in {ElapsedMs}ms", orderId, elapsedMs) and only compute heavy values when the level is enabled.

NLog.config example (bounded async file)

<targets>
  <target xsi:type="AsyncWrapper"
          name="asyncFile"
          queueLimit="10000"
          overflowAction="Discard">
    <target xsi:type="File"
            fileName="${basedir}/logs/app-${shortdate}.log"
            keepFileOpen="true"
            concurrentWrites="false"
            layout="${longdate}|${level:uppercase=true}|${logger}|${message} ${exception:format=ToString}" />
  </target>
</targets>

<rules>
  <logger name="Microsoft.*" maxlevel="Info" writeTo="asyncFile" final="true" />
  <logger name="System.*" maxlevel="Info" writeTo="asyncFile" final="true" />
  <logger name="*" minlevel="Info" writeTo="asyncFile" />
</rules>

NLog.config example (JSON + scopes for correlation)

<targets>
  <target xsi:type="AsyncWrapper"
          name="asyncJsonFile"
          queueLimit="20000"
          overflowAction="Discard">
    <target xsi:type="File"
            fileName="${basedir}/logs/app-${shortdate}.jsonl"
            keepFileOpen="true"
            concurrentWrites="false">
      <layout xsi:type="JsonLayout" includeAllProperties="false">
        <attribute name="ts" layout="${date:format=o}" />
        <attribute name="level" layout="${level:uppercase=true}" />
        <attribute name="logger" layout="${logger}" />
        <attribute name="message" layout="${message}" />
        <attribute name="eventId" layout="${event-properties:item=EventId_Id}" />
        <attribute name="correlationId" layout="${scopeproperty:CorrelationId}" />
        <attribute name="tenant" layout="${scopeproperty:Tenant}" />
        <attribute name="traceId" layout="${activity:property=TraceId}" />
        <attribute name="spanId" layout="${activity:property=SpanId}" />
        <attribute name="exception" layout="${exception:format=ToString}" />
      </layout>
    </target>
  </target>
</targets>

Program.cs sketch (NLog provider)

builder.Logging.ClearProviders();
builder.Host.UseNLog();

Operational Best Practices

Correlate everything: Propagate trace/span IDs across HTTP, messaging, and background jobs.
Schema discipline: Maintain a central log schema (event name, category, correlation, tenant, outcome, duration).
PII controls: Classify fields and redact at the edge; log tokens/credentials only in secure staging with strict retention.
Deployment safety: Use configuration-driven sinks/levels; favor hot-reloadable changes (appsettings, feature flags) over redeploys.
Alerting: Alert on error-rate deltas, burst of dropped events, and missing heartbeats rather than raw log volume.
What to log (and not):
Log: Request/response envelope metadata (not bodies) for public APIs, dependency outcomes (latency/status/retries), domain milestones (order accepted/settled), security-relevant events (authn/z decisions), and drop reason when sampling/discarding.
Avoid: Large payloads, secrets, personal data, high-cardinality values (raw GUID lists), and repetitive success spam in hot loops. Summarize counts instead.
Environments:
Local/dev: Enable Debug/Trace, file/console targets, and payload logging only with synthetic data.
Staging: Mirror production sinks/levels; allow short-term Debug with sampling for reproductions.
Production: Default to Information for business events and Warning/Error for issues; enable Debug/Trace only via time-bound flags with sampling to protect throughput and cost.
Performance guardrails: Keep hot-path logging off by default, prefer structured summaries, and set per-rule rate limits so troubleshooting toggles don't DOS the service.

Sample Diagnostic Pattern (Resilient HTTP Client)

public class InventoryClient
{
    private static readonly EventId FetchInventory = new(1001, nameof(FetchAsync));
    private readonly HttpClient _httpClient;
    private readonly ILogger<InventoryClient> _logger;

    public InventoryClient(HttpClient httpClient, ILogger<InventoryClient> logger)
    {
        _httpClient = httpClient;
        _logger = logger;
    }

    public async Task<InventoryResponse?> FetchAsync(string sku, CancellationToken ct)
    {
        using var scope = _logger.BeginScope(new { Sku = sku });

        _logger.LogInformation(FetchInventory, "Fetching inventory");

        try
        {
            var response = await _httpClient.GetAsync($"/inventory/{sku}", ct);
            var body = await response.Content.ReadAsStringAsync(ct);

            if (!response.IsSuccessStatusCode)
            {
                _logger.LogWarning("Inventory lookup failed with {StatusCode} and body length {Length}",
                    (int)response.StatusCode, body.Length);
                return null;
            }

            _logger.LogInformation("Inventory fetched successfully in {ElapsedMs} ms",
                response.Headers.GetValues("X-ElapsedMs").FirstOrDefault());

            return JsonSerializer.Deserialize<InventoryResponse>(body);
        }
        catch (OperationCanceledException) when (ct.IsCancellationRequested)
        {
            _logger.LogWarning("Inventory fetch canceled by caller");
            throw;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Unexpected failure during inventory fetch");
            throw;
        }
    }
}

Sample Interview Q&A

Q: How do you prevent logging from degrading throughput?
A: Use async/buffered sinks with bounded capacity, sample debug/trace logs, and rely on message templates to avoid string allocations when the level is disabled.
Q: What do you log to support SLOs in a multi-tenant service?
A: Log per-request correlation IDs, tenant, region, outcome, duration, and retry counts; emit heartbeats and dependency statuses for dashboards.
Q: How do you handle sensitive data in logs?
A: Apply a schema with field-level classification, default to redaction, and restrict sinks/retention for sensitive environments; validate with automated scanners.
Q: How do you integrate logging with tracing?
A: Propagate W3C trace context, attach span IDs to log scopes, export via OpenTelemetry so logs/metrics/traces share correlation IDs.
Q: What patterns help during incident response?
A: Toggle verbose levels via config, enable sampling to capture representative failures, and use structured events with consistent keys for quick querying.
Q: Why run Serilog or NLog behind Microsoft.Extensions.Logging instead of using them directly?
A: The facade lets ASP.NET Core, EF Core, and custom services share the same pipeline and DI story. Providers (Serilog/NLog) plug in via AddSerilog()/AddNLog(), so you avoid duplicate configuration, keep scopes consistent, and can swap sinks without touching app code.
Q: How do you keep correlation data consistent across APIs, queues, and workers?
A: Wrap each request/message in a logger scope containing CorrelationId, tenant, region, and trace/span IDs. Forward these fields on outbound HTTP/messaging headers so downstream services enrich their scopes too, creating an unbroken chain for querying dashboards.
Q: When would you favor synchronous logging and how do you mitigate its risk?
A: Only for small local dev tooling or early startup when async infrastructure isn’t ready. Even then, keep messages tiny and avoid blocking network calls. In production, always switch to async/buffered sinks with drop policies to shield request threads.
Q: How do you tune OpenTelemetry or OTLP exporters for logging?
A: Batch records (tens-hundreds per batch), tune FlushInterval to balance latency and throughput, and size the export queue to absorb bursts. Watch metrics for dropped spans/logs and adjust MaxConcurrency or sampling rates accordingly.
Q: How do you verify logging instrumentation in CI?
A: Use in-memory sinks/exporters during integration tests, execute key scenarios, and assert on emitted event names, scopes, and structured fields. This catches schema regressions or missing context before they hit observability platforms.
Q: How do you control log level explosions when temporarily enabling Debug or Trace?
A: Flip levels via config/feature flags with TTLs, scope the change to specific categories, and combine with sampling or rate limiting. Always capture the reason in runbooks so toggles are reverted quickly and storage cost stays predictable.