
Observability

FraiseQL provides comprehensive observability through metrics, distributed tracing, and structured logging.

FraiseQL exposes a /metrics endpoint in Prometheus format. The metric names and label schema are:

# No TOML config needed: the metrics endpoint is always available at /metrics
curl http://localhost:8080/metrics

HTTP metrics:

| Metric | Type | Description |
|---|---|---|
| fraiseql_http_requests_total | Counter | Total HTTP requests |
| fraiseql_http_request_duration_seconds | Histogram | Request latency |
| fraiseql_http_requests_in_flight | Gauge | Active requests |
| fraiseql_http_response_size_bytes | Histogram | Response sizes |

GraphQL metrics:

| Metric | Type | Description |
|---|---|---|
| fraiseql_graphql_queries_total | Counter | Total queries |
| fraiseql_graphql_mutations_total | Counter | Total mutations |
| fraiseql_graphql_subscriptions_active | Gauge | Active subscriptions |
| fraiseql_graphql_errors_total | Counter | GraphQL errors |
| fraiseql_graphql_query_duration_seconds | Histogram | Query execution time |
| fraiseql_graphql_query_complexity | Histogram | Query complexity scores |

Database metrics:

| Metric | Type | Description |
|---|---|---|
| fraiseql_db_connections_active | Gauge | Active connections |
| fraiseql_db_connections_idle | Gauge | Idle connections |
| fraiseql_db_query_duration_seconds | Histogram | SQL query latency |
| fraiseql_db_errors_total | Counter | Database errors |

Cache metrics:

| Metric | Type | Description |
|---|---|---|
| fraiseql_cache_hits_total | Counter | Cache hits |
| fraiseql_cache_misses_total | Counter | Cache misses |
| fraiseql_cache_size_bytes | Gauge | Cache memory usage |
| fraiseql_cache_evictions_total | Counter | Cache evictions |

Authentication and rate limiting metrics:

| Metric | Type | Description |
|---|---|---|
| fraiseql_auth_attempts_total | Counter | Auth attempts |
| fraiseql_auth_success_total | Counter | Successful auth |
| fraiseql_auth_failure_total | Counter | Failed auth |
| fraiseql_auth_sessions_active | Gauge | Active sessions |
| fraiseql_pkce_redis_errors_total | Counter | Redis errors in PKCE state store (fail-open; only present with redis-pkce feature) |
| fraiseql_rate_limit_redis_errors_total | Counter | Redis errors in rate limiter (fail-open; only present with redis-rate-limiting feature) |

Common labels across metrics:

| Label | Description |
|---|---|
| operation | GraphQL operation name |
| type | query, mutation, subscription |
| status | success, error |
| error_code | Error code if failed |

Example queries for dashboards:

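# Request rate
rate(fraiseql_http_requests_total[5m])
# P99 latency (aggregate buckets by le before taking the quantile)
histogram_quantile(0.99, sum by (le) (rate(fraiseql_http_request_duration_seconds_bucket[5m])))
# Error rate (sum() on both sides so series with different label sets still divide)
sum(rate(fraiseql_graphql_errors_total[5m])) / sum(rate(fraiseql_graphql_queries_total[5m]))
# Cache hit ratio
sum(rate(fraiseql_cache_hits_total[5m])) /
(sum(rate(fraiseql_cache_hits_total[5m])) + sum(rate(fraiseql_cache_misses_total[5m])))
# Connection pool utilization (fraiseql_db_connections_max is not listed in the metric tables; confirm it appears in your /metrics output)
fraiseql_db_connections_active / fraiseql_db_connections_max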
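
The P99 query relies on histogram_quantile, which estimates a quantile by linear interpolation inside the bucket that contains the target rank. The Python sketch below mirrors that idea for one set of cumulative buckets. The bucket bounds and counts are made-up illustration values, not FraiseQL's actual bucket layout, and the function skips several edge cases that Prometheus itself handles.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Simplified sketch of Prometheus-style interpolation. The last bucket
    is expected to have an upper bound of +Inf.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return bound
            # Interpolate linearly between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound

# Hypothetical cumulative bucket rates for request duration, in seconds.
example = [(0.1, 50.0), (0.5, 90.0), (1.0, 99.0), (math.inf, 100.0)]
p99 = histogram_quantile(0.99, example)  # approximately 1.0
p50 = histogram_quantile(0.50, example)  # approximately 0.1
```

Because the estimate is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies you care about.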

FraiseQL exports traces via OpenTelemetry using standard env vars:

OTEL_SERVICE_NAME=fraiseql-api
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc # or http/protobuf
OTEL_TRACES_SAMPLER=traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1 # sample 10% of requests

FraiseQL creates spans for each request with attributes including:

| Attribute | Description |
|---|---|
| graphql.operation.name | Operation name |
| graphql.operation.type | query/mutation/subscription |
| graphql.document | Query document (if enabled) |
| db.system | Database type |
| db.statement | SQL query (if enabled) |
| db.operation | SELECT/INSERT/UPDATE/DELETE |
| user.id | Authenticated user ID |
| tenant.id | Tenant ID |

FraiseQL propagates trace context via headers:

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
tracestate: fraiseql=user:123
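
The traceparent value follows the W3C Trace Context layout: version, trace ID, parent span ID, and flags, separated by hyphens. As a minimal sketch for debugging propagation, the Python function below splits a version 00 header and reports whether the sampled flag is set. The function name is hypothetical and is not part of FraiseQL.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)

def parse_traceparent(header):
    """Split a traceparent header into its fields, or return None if invalid."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version ff and all-zero trace or parent IDs.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }

parsed = parse_traceparent(
    "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
)
```

For the header shown above, the trailing 01 means the caller sampled the trace, so downstream spans are expected to be recorded as well.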

Logging is configured via environment variables:

# Log level
RUST_LOG=info # error | warn | info | debug | trace
RUST_LOG=fraiseql=debug,info # per-crate level (fraiseql at debug, everything else at info)
# Log format (JSON output for production log aggregators)
FRAISEQL_LOG_FORMAT=json # json | pretty (default: pretty in dev, json in prod)

Example JSON log entry:

{
  "timestamp": "2024-01-15T10:30:00.123Z",
  "level": "INFO",
  "target": "fraiseql_server::graphql",
  "message": "Query executed",
  "span": {
    "request_id": "abc-123",
    "user_id": "user-456"
  },
  "fields": {
    "operation": "getUser",
    "duration_ms": 45,
    "cache_hit": true
  }
}
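
Structured logs are easy to post-process. As a rough sketch, the Python function below scans log lines and reports operations slower than a threshold, using the fields.operation and fields.duration_ms keys from the entry above. It assumes one JSON object per line, which is typical for JSON loggers but should be confirmed against your own output; the sample lines are abbreviated.

```python
import json

def slow_operations(lines, threshold_ms):
    """Yield (operation, duration_ms) for entries slower than threshold_ms."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as startup banners
        if not isinstance(entry, dict):
            continue
        fields = entry.get("fields", {})
        duration = fields.get("duration_ms")
        if isinstance(duration, (int, float)) and duration > threshold_ms:
            yield fields.get("operation"), duration

# Abbreviated sample lines modeled on the log entry shown above.
sample = [
    '{"level": "INFO", "fields": {"operation": "getUser", "duration_ms": 45}}',
    '{"level": "INFO", "fields": {"operation": "me", "duration_ms": 12}}',
    "plain text line that is not JSON",
]
slow = list(slow_operations(sample, threshold_ms=20))
```

In practice the same filter is usually expressed in the log aggregator's query language; the script is only meant for quick local checks.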

Use the standard RUST_LOG directive syntax for per-crate levels:

RUST_LOG=fraiseql_server=info,fraiseql_core::cache=debug,fraiseql_core::db=warn,tower_http=debug,sqlx=warn
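
To reason about which level applies to a given log target, it can help to model how directives are resolved. The Python sketch below assumes the common behavior of Rust log filters: a directive matches when its name is a string prefix of the target, the longest matching directive wins, and a bare level acts as the default. This is an approximation for illustration, not the filter implementation FraiseQL uses, and it handles only the two directive forms shown on this page.

```python
LEVELS = {"off", "error", "warn", "info", "debug", "trace"}

def parse_directives(spec):
    """Parse 'target=level,level' into ({target: level}, default_level)."""
    targets, default = {}, None
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            target, level = part.split("=", 1)
            targets[target] = level.lower()
        elif part.lower() in LEVELS:
            default = part.lower()  # bare level applies to everything else
    return targets, default

def effective_level(spec, target):
    """Return the level for a log target: longest matching prefix wins."""
    targets, default = parse_directives(spec)
    matches = [name for name in targets if target.startswith(name)]
    if not matches:
        return default
    return targets[max(matches, key=len)]

spec = ("fraiseql_server=info,fraiseql_core::cache=debug,"
        "fraiseql_core::db=warn,tower_http=debug,sqlx=warn")
cache_level = effective_level(spec, "fraiseql_core::cache::lru")
server_level = effective_level(spec, "fraiseql_server::graphql")
```

Under these assumptions, fraiseql=debug also covers targets such as fraiseql_server::graphql, because fraiseql is a prefix of the target name.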

FraiseQL exposes health endpoints automatically — no configuration required:

Basic health:

curl http://localhost:8080/health
# {"status": "ok"}

Detailed health:

curl http://localhost:8080/health/detailed

Example response:

{
  "status": "ok",
  "checks": {
    "database": {
      "status": "ok",
      "latency_ms": 2
    },
    "cache": {
      "status": "ok",
      "size": 1500,
      "max_size": 10000
    },
    "schema": {
      "status": "ok",
      "version": "1.0.0",
      "loaded_at": "2024-01-15T10:00:00Z"
    }
  },
  "version": "2.0.0",
  "uptime_seconds": 3600
}

Kubernetes probes:

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/detailed
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
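
If you consume the detailed health response from a script or a custom probe, a small helper can turn the payload into a pass/fail decision. The Python sketch below works on an already-parsed response. The "error" status value in the sample is an assumption for illustration, since this page only shows "ok".

```python
def failing_checks(health):
    """Return the sorted names of checks whose status is not 'ok'."""
    checks = health.get("checks", {})
    return sorted(
        name for name, check in checks.items() if check.get("status") != "ok"
    )

def is_ready(health):
    """Ready only if the overall status and every individual check are 'ok'."""
    return health.get("status") == "ok" and not failing_checks(health)

healthy = {
    "status": "ok",
    "checks": {
        "database": {"status": "ok", "latency_ms": 2},
        "cache": {"status": "ok", "size": 1500, "max_size": 10000},
    },
}
# Hypothetical unhealthy payload; the "error" value is assumed.
unhealthy = {
    "status": "error",
    "checks": {
        "database": {"status": "error"},
        "cache": {"status": "ok"},
    },
}
```

Here is_ready(healthy) is true, while failing_checks(unhealthy) names the database check as the one to investigate.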

Example Prometheus alerting rules:

groups:
  - name: fraiseql
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(fraiseql_graphql_errors_total[5m])) /
          sum(rate(fraiseql_graphql_queries_total[5m])) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High GraphQL error rate"
      - alert: HighLatency
        expr: |
          histogram_quantile(0.99,
            sum by (le) (rate(fraiseql_http_request_duration_seconds_bucket[5m]))
          ) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "P99 latency above 1 second"
      - alert: LowCacheHitRate
        expr: |
          sum(rate(fraiseql_cache_hits_total[5m])) /
          (sum(rate(fraiseql_cache_hits_total[5m])) + sum(rate(fraiseql_cache_misses_total[5m]))) < 0.5
        for: 15m
        labels:
          severity: info
        annotations:
          summary: "Cache hit rate below 50%"
      - alert: DatabaseConnectionPoolExhausted
        expr: |
          fraiseql_db_connections_active / fraiseql_db_connections_max > 0.9
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Database connection pool nearly exhausted"

Trace sampling:

OTEL_TRACES_SAMPLER=traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1 # sample 10% of requests

For error-focused sampling, use a tail-based sampler (e.g., Grafana Tempo or the OpenTelemetry Collector's tail_sampling processor) to keep 100% of error traces while downsampling successful ones.
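
The idea behind tail-based sampling can be sketched in a few lines: the decision is made after the whole trace is known, so errors can always be kept. The Python function below is only an illustration of that policy, not the algorithm of any particular collector. The span status strings and the CRC-based hashing are assumptions chosen to keep the example self-contained.

```python
import zlib

def keep_trace(trace_id, span_statuses, success_ratio):
    """Decide, after a trace has completed, whether to keep it.

    Traces containing any error span are always kept; all other traces
    are kept at roughly success_ratio.
    """
    if any(status == "error" for status in span_statuses):
        return True
    # Hash the trace ID into [0, 1) so the decision is deterministic per trace.
    bucket = zlib.crc32(trace_id.encode("ascii")) / 2**32
    return bucket < success_ratio

trace_id = "0af7651916cd43dd8448eb211c80319c"
error_kept = keep_trace(trace_id, ["ok", "error"], success_ratio=0.0)
success_dropped = keep_trace(trace_id, ["ok", "ok"], success_ratio=0.0)
```

Head sampling with traceidratio cannot do this, because it decides at the start of the request, before the outcome is known.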

Log rotation is handled by your container runtime or log aggregator (Fluent Bit, Logstash, Loki, etc.) — not by FraiseQL. Write logs to stdout and let your infrastructure handle rotation and retention.

FraiseQL does not include user_id or request_id as metric labels by default — these high-cardinality values are kept in structured logs and traces instead, where cardinality is not a concern.
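
A quick calculation shows why this matters. The number of time series for a metric is bounded by the product of the distinct values of each label, so a single unbounded label multiplies everything else. The label value counts below are hypothetical.

```python
from math import prod

def series_count(distinct_values_per_label):
    """Upper bound on time series for one metric: product of label cardinalities."""
    return prod(distinct_values_per_label.values())

# Hypothetical cardinalities for the common labels.
bounded = series_count({"type": 3, "status": 2, "operation": 50})

# The same metric with a user_id label for 100,000 users.
with_user_id = series_count(
    {"type": 3, "status": 2, "operation": 50, "user_id": 100_000}
)
```

With these numbers, the bounded label set yields at most 300 series, while adding user_id raises the bound to 30 million.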

If metrics do not appear in Prometheus:

  1. Check /metrics endpoint is reachable from your Prometheus instance
  2. Verify Prometheus scrape config points to the correct port and path
  3. Ensure FraiseQL started successfully (fraiseql run with no errors)

If traces do not appear in your tracing backend:

  1. Verify OTEL_EXPORTER_OTLP_ENDPOINT is set and reachable
  2. Check OTEL_TRACES_SAMPLER_ARG: a low value (e.g., 0.001) may drop traces in low-traffic tests
  3. Check trace context propagation headers

If logs are too verbose:

  1. Adjust RUST_LOG level per component (e.g., RUST_LOG=warn,fraiseql=info)
  2. Filter in your log aggregator (Loki, Elasticsearch, etc.)

To verify the setup end to end:

  1. Start FraiseQL. Metrics are always available, no config needed:

    fraiseql run
  2. Test metrics endpoint:

    curl http://localhost:8080/metrics

    Expected output (partial):

    # HELP fraiseql_http_requests_total Total HTTP requests
    # TYPE fraiseql_http_requests_total counter
    fraiseql_http_requests_total{method="POST",status="200"} 42
    fraiseql_http_requests_total{method="POST",status="400"} 3
    # HELP fraiseql_graphql_queries_total Total GraphQL queries
    # TYPE fraiseql_graphql_queries_total counter
    fraiseql_graphql_queries_total{type="query"} 38
    fraiseql_graphql_queries_total{type="mutation"} 4
    # HELP fraiseql_db_connections_active Active database connections
    # TYPE fraiseql_db_connections_active gauge
    fraiseql_db_connections_active 5
  3. Execute some queries to generate metrics:

    # Run a few queries
    for i in {1..10}; do
      curl -s -X POST http://localhost:8080/graphql \
        -H "Content-Type: application/json" \
        -d '{"query": "{ __typename }"}' > /dev/null
    done
  4. Verify metrics updated:

    curl http://localhost:8080/metrics | grep fraiseql_graphql_queries_total

    Expected output:

    fraiseql_graphql_queries_total{type="query"} 48
  5. Test health endpoints:

    # Basic health
    curl http://localhost:8080/health

    Expected output:

    {"status": "ok"}

    # Detailed health
    curl http://localhost:8080/health/detailed

    Expected output:

    {
      "status": "ok",
      "checks": {
        "database": {
          "status": "ok",
          "latency_ms": 2
        },
        "schema": {
          "status": "ok",
          "version": "1.0.0"
        }
      },
      "version": "2.0.0",
      "uptime_seconds": 3600
    }
  6. Verify structured logging:

    # Make a request and check logs
    curl -s -X POST http://localhost:8080/graphql \
      -H "Content-Type: application/json" \
      -d '{"query": "{ me { id } }"}'

    Check FraiseQL stdout for JSON log lines:

    {
      "timestamp": "2024-01-15T10:30:00.123Z",
      "level": "INFO",
      "target": "fraiseql_server::graphql",
      "message": "Query executed",
      "span": {
        "request_id": "abc-123",
        "user_id": "user-456"
      },
      "fields": {
        "operation": "me",
        "duration_ms": 12,
        "cache_hit": false
      }
    }
  7. Test OpenTelemetry tracing (if enabled):

    OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
    OTEL_SERVICE_NAME=fraiseql-api \
    fraiseql run

    After making requests, check your tracing backend (Jaeger, Zipkin, etc.) for spans.
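
Steps 2 to 4 can also be checked programmatically by comparing a counter before and after the test queries. The Python sketch below reads one sample from Prometheus text output and computes the difference. It is deliberately naive: it matches the series string exactly, including label order, and ignores optional timestamps. The two snippets are abbreviated copies of the expected output shown in the steps above.

```python
def counter_value(text, name, labels=""):
    """Return the value of one sample line in Prometheus text format, or None."""
    series = f"{name}{{{labels}}}" if labels else name
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        sample, _, value = line.rpartition(" ")
        if sample == series:
            return float(value)
    return None

before = """
# TYPE fraiseql_graphql_queries_total counter
fraiseql_graphql_queries_total{type="query"} 38
fraiseql_graphql_queries_total{type="mutation"} 4
# TYPE fraiseql_db_connections_active gauge
fraiseql_db_connections_active 5
"""
after = """
# TYPE fraiseql_graphql_queries_total counter
fraiseql_graphql_queries_total{type="query"} 48
"""

name = "fraiseql_graphql_queries_total"
delta = (counter_value(after, name, 'type="query"')
         - counter_value(before, name, 'type="query"'))
```

A delta equal to the number of queries sent (10 in the loop above) suggests the counter is wired up correctly.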

If /metrics returns empty or connection refused:

  1. Check FraiseQL started successfully:

    fraiseql run 2>&1 | head -20
  2. Verify the metrics endpoint is on port 8080 (same port as GraphQL):

    curl http://localhost:8080/metrics

If database connection metrics are absent:

  1. Verify database connectivity:

    curl http://localhost:8080/health/detailed | jq .checks.database
  2. Check [database] pool configuration in fraiseql.toml

FraiseQL does not include user_id or request_id as metric labels by default — no configuration needed.

If traces don’t show up in your backend:

  1. Verify OTLP endpoint is reachable from the FraiseQL process:

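    # Port 4317 speaks gRPC, so curl will not get a valid HTTP response;
    # this only confirms the connection is not refused or timed out
    curl -v http://otel-collector:4317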
  2. Set OTEL_TRACES_SAMPLER_ARG=1.0 temporarily to sample 100% for testing

  3. Ensure trace context propagation:

    curl -H "traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01" \
      http://localhost:8080/graphql

Deployment — Production monitoring setup

Performance — Using metrics to optimize