Deployment
Deployment — Production monitoring setup
FraiseQL provides comprehensive observability through metrics, distributed tracing, and structured logging.
FraiseQL exposes a /metrics endpoint in Prometheus format. The metric names and label schema are:
```bash
# No TOML config needed: the metrics endpoint is always available at /metrics
curl http://localhost:8080/metrics
```

HTTP metrics:

| Metric | Type | Description |
|---|---|---|
| `fraiseql_http_requests_total` | Counter | Total HTTP requests |
| `fraiseql_http_request_duration_seconds` | Histogram | Request latency |
| `fraiseql_http_requests_in_flight` | Gauge | Active requests |
| `fraiseql_http_response_size_bytes` | Histogram | Response sizes |

GraphQL metrics:

| Metric | Type | Description |
|---|---|---|
| `fraiseql_graphql_queries_total` | Counter | Total queries |
| `fraiseql_graphql_mutations_total` | Counter | Total mutations |
| `fraiseql_graphql_subscriptions_active` | Gauge | Active subscriptions |
| `fraiseql_graphql_errors_total` | Counter | GraphQL errors |
| `fraiseql_graphql_query_duration_seconds` | Histogram | Query execution time |
| `fraiseql_graphql_query_complexity` | Histogram | Query complexity scores |

Database metrics:

| Metric | Type | Description |
|---|---|---|
| `fraiseql_db_connections_active` | Gauge | Active connections |
| `fraiseql_db_connections_idle` | Gauge | Idle connections |
| `fraiseql_db_query_duration_seconds` | Histogram | SQL query latency |
| `fraiseql_db_errors_total` | Counter | Database errors |

Cache metrics:

| Metric | Type | Description |
|---|---|---|
| `fraiseql_cache_hits_total` | Counter | Cache hits |
| `fraiseql_cache_misses_total` | Counter | Cache misses |
| `fraiseql_cache_size_bytes` | Gauge | Cache memory usage |
| `fraiseql_cache_evictions_total` | Counter | Cache evictions |
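
The hit and miss counters are cumulative, so a hit ratio for a time window comes from the increase between two scrapes rather than from the raw values. The sketch below illustrates that arithmetic in plain Python. The function names and the scrape values are hypothetical, and treating a decreasing counter as a restart is an assumption modeled on how Prometheus-style rates usually handle resets.

```python
def counter_increase(previous, current):
    """Increase of a cumulative counter between two scrapes.

    If the value went backwards, the process presumably restarted, so the
    current value is treated as the increase since the reset.
    """
    return current if current < previous else current - previous


def cache_hit_ratio(prev_hits, prev_misses, curr_hits, curr_misses):
    """Hit ratio over the interval between two scrapes, or None if idle."""
    hits = counter_increase(prev_hits, curr_hits)
    misses = counter_increase(prev_misses, curr_misses)
    total = hits + misses
    return hits / total if total > 0 else None


# Hypothetical scrape values: 90 new hits and 10 new misses in the window.
print(cache_hit_ratio(100, 50, 190, 60))
```

Returning `None` for an idle window avoids a division by zero and keeps "no traffic" distinct from "0% hit rate".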

Auth metrics:

| Metric | Type | Description |
|---|---|---|
| `fraiseql_auth_attempts_total` | Counter | Auth attempts |
| `fraiseql_auth_success_total` | Counter | Successful auth |
| `fraiseql_auth_failure_total` | Counter | Failed auth |
| `fraiseql_auth_sessions_active` | Gauge | Active sessions |
| `fraiseql_pkce_redis_errors_total` | Counter | Redis errors in PKCE state store (fail-open; only present with the `redis-pkce` feature) |
| `fraiseql_rate_limit_redis_errors_total` | Counter | Redis errors in rate limiter (fail-open; only present with the `redis-rate-limiting` feature) |

Common labels across metrics:

| Label | Description |
|---|---|
| `operation` | GraphQL operation name |
| `type` | `query`, `mutation`, `subscription` |
| `status` | `success`, `error` |
| `error_code` | Error code if failed |
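
To make the name and label schema concrete, here is a minimal sketch of reading samples from the Prometheus text exposition format with the Python standard library. The `parse_metrics` helper is hypothetical, the sample values are illustrative, and the regular expressions cover only the common `name{labels} value` shape (no unescaping of label values), so a real scraper or client library is the better choice in production.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:  # ignore anything that is not a plain sample line
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


sample = """\
# HELP fraiseql_graphql_queries_total Total GraphQL queries
# TYPE fraiseql_graphql_queries_total counter
fraiseql_graphql_queries_total{type="query"} 38
fraiseql_graphql_queries_total{type="mutation"} 4
fraiseql_db_connections_active 5
"""

for name, labels, value in parse_metrics(sample):
    print(name, labels, value)
```

Each sample is identified by its metric name plus its full label set, which is why every distinct label combination counts as a separate time series.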

Example queries for dashboards:

```promql
# Request rate
rate(fraiseql_http_requests_total[5m])

# P99 latency
histogram_quantile(0.99, rate(fraiseql_http_request_duration_seconds_bucket[5m]))

# Error rate
rate(fraiseql_graphql_errors_total[5m]) / rate(fraiseql_graphql_queries_total[5m])

# Cache hit ratio
rate(fraiseql_cache_hits_total[5m])
  / (rate(fraiseql_cache_hits_total[5m]) + rate(fraiseql_cache_misses_total[5m]))

# Active connections
fraiseql_db_connections_active / fraiseql_db_connections_max
```

FraiseQL exports traces via OpenTelemetry using standard env vars:

```bash
OTEL_SERVICE_NAME=fraiseql-api
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc  # or http/protobuf
OTEL_TRACES_SAMPLER=traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1  # sample 10% of requests
```

FraiseQL creates spans for each request with attributes including:

| Attribute | Description |
|---|---|
| `graphql.operation.name` | Operation name |
| `graphql.operation.type` | query/mutation/subscription |
| `graphql.document` | Query document (if enabled) |
| `db.system` | Database type |
| `db.statement` | SQL query (if enabled) |
| `db.operation` | SELECT/INSERT/UPDATE/DELETE |
| `user.id` | Authenticated user ID |
| `tenant.id` | Tenant ID |

FraiseQL propagates trace context via headers:

```http
traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
tracestate: fraiseql=user:123
```

To export to an OpenTelemetry Collector:

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
OTEL_SERVICE_NAME=fraiseql-api
```

Jaeger supports OTLP natively (v1.35+). Point the OTLP exporter at Jaeger's OTLP receiver:

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4317
OTEL_SERVICE_NAME=fraiseql-api
```

To export to Zipkin:

```bash
OTEL_TRACES_EXPORTER=zipkin
OTEL_EXPORTER_ZIPKIN_ENDPOINT=http://zipkin:9411/api/v2/spans
OTEL_SERVICE_NAME=fraiseql-api
```

Logging is configured via environment variables:
```bash
# Log level
RUST_LOG=info                 # error | warn | info | debug | trace
RUST_LOG=fraiseql=debug,info  # per-crate level (fraiseql at debug, everything else at info)

# Log format (JSON output for production log aggregators)
FRAISEQL_LOG_FORMAT=json      # json | pretty (default: pretty in dev, json in prod)
```

Example JSON log line:

```json
{
  "timestamp": "2024-01-15T10:30:00.123Z",
  "level": "INFO",
  "target": "fraiseql_server::graphql",
  "message": "Query executed",
  "span": {
    "request_id": "abc-123",
    "user_id": "user-456"
  },
  "fields": {
    "operation": "getUser",
    "duration_ms": 45,
    "cache_hit": true
  }
}
```

Use the standard `RUST_LOG` directive syntax for per-crate levels:

```bash
RUST_LOG=fraiseql_server=info,fraiseql_core::cache=debug,fraiseql_core::db=warn,tower_http=debug,sqlx=warn
```

FraiseQL exposes health endpoints automatically; no configuration is required:
Basic health:
```bash
curl http://localhost:8080/health
# {"status": "ok"}
```

Detailed health:

```bash
curl http://localhost:8080/health/detailed
```

```json
{
  "status": "ok",
  "checks": {
    "database": { "status": "ok", "latency_ms": 2 },
    "cache": { "status": "ok", "size": 1500, "max_size": 10000 },
    "schema": { "status": "ok", "version": "1.0.0", "loaded_at": "2024-01-15T10:00:00Z" }
  },
  "version": "2.0.0",
  "uptime_seconds": 3600
}
```

Kubernetes probes:

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /health/detailed
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
```

Example Prometheus alerting rules:
```yaml
groups:
  - name: fraiseql
    rules:
      - alert: HighErrorRate
        expr: |
          rate(fraiseql_graphql_errors_total[5m])
          / rate(fraiseql_graphql_queries_total[5m]) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High GraphQL error rate"

      - alert: HighLatency
        expr: |
          histogram_quantile(0.99,
            rate(fraiseql_http_request_duration_seconds_bucket[5m])
          ) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "P99 latency above 1 second"

      - alert: LowCacheHitRate
        expr: |
          rate(fraiseql_cache_hits_total[5m])
          / (rate(fraiseql_cache_hits_total[5m]) + rate(fraiseql_cache_misses_total[5m])) < 0.5
        for: 15m
        labels:
          severity: info
        annotations:
          summary: "Cache hit rate below 50%"

      - alert: DatabaseConnectionPoolExhausted
        expr: |
          fraiseql_db_connections_active / fraiseql_db_connections_max > 0.9
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Database connection pool nearly exhausted"
```

Head-based sampling with a fixed ratio:

```bash
OTEL_TRACES_SAMPLER=traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1  # sample 10% of requests
```

For error-focused sampling, configure your OTel collector or use a tail-based sampler (e.g., Grafana Tempo, the OpenTelemetry Collector's `tail_sampling` processor) to keep 100% of error traces while downsampling successful traces.
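
To illustrate what a `traceidratio` setting of 0.1 means in practice, the sketch below shows one common way a ratio-based sampler can decide from the trace ID alone, so every service that sees the same trace reaches the same decision. It is modeled on how ratio samplers in OpenTelemetry SDKs typically work. The exact algorithm is SDK-specific, and nothing here is taken from FraiseQL's implementation. `should_sample` is a hypothetical name.

```python
import secrets

TRACE_ID_LIMIT = (1 << 64) - 1  # only the low 64 bits of the ID are compared


def should_sample(trace_id, ratio):
    """Deterministic head-sampling decision for a 128-bit trace ID."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    bound = round(ratio * (TRACE_ID_LIMIT + 1))
    return (trace_id & TRACE_ID_LIMIT) < bound


# Simulate 10,000 random trace IDs at a 10% ratio.
trace_ids = [secrets.randbits(128) for _ in range(10_000)]
kept = sum(should_sample(t, 0.1) for t in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

Because the decision is a function of the trace ID, a low ratio combined with only a handful of test requests can easily keep none of them.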
Log rotation is handled by your container runtime or log aggregator (Fluent Bit, Logstash, Loki, etc.) — not by FraiseQL. Write logs to stdout and let your infrastructure handle rotation and retention.
FraiseQL does not include user_id or request_id as metric labels by default — these high-cardinality values are kept in structured logs and traces instead, where cardinality is not a concern.
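
A rough way to see why high-cardinality labels are kept out of metrics: the number of time series for one metric is bounded by the product of the distinct values of each label. The sketch below works through that arithmetic. The label names follow the common labels listed earlier, but the counts are made-up numbers chosen only for illustration.

```python
from math import prod


def series_upper_bound(label_cardinalities):
    """Upper bound on time series for one metric.

    label_cardinalities maps each label name to its number of distinct values.
    """
    return prod(label_cardinalities.values())


# Hypothetical cardinalities, for illustration only.
without_user = series_upper_bound({"operation": 50, "type": 3, "status": 2})
with_user = series_upper_bound(
    {"operation": 50, "type": 3, "status": 2, "user_id": 10_000}
)

print(without_user)  # 50 * 3 * 2
print(with_user)     # the same product multiplied by 10,000 users
```

Adding a single per-user label multiplies the series count by the number of users, which is why such identifiers belong in logs and traces.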
Before relying on the setup, check that:

- The `/metrics` endpoint is reachable from your Prometheus instance.
- FraiseQL starts cleanly (`fraiseql run` with no errors).
- `OTEL_EXPORTER_OTLP_ENDPOINT` is set and reachable.
- `OTEL_TRACES_SAMPLER_ARG` is not too low: a small value (e.g., 0.001) may drop traces in low-traffic tests.
- `RUST_LOG` is set per component (e.g., `RUST_LOG=warn,fraiseql=info`).

Start FraiseQL (metrics are always available, no config needed):
```bash
fraiseql run
```

Test metrics endpoint:

```bash
curl http://localhost:8080/metrics
```

Expected output (partial):

```text
# HELP fraiseql_http_requests_total Total HTTP requests
# TYPE fraiseql_http_requests_total counter
fraiseql_http_requests_total{method="POST",status="200"} 42
fraiseql_http_requests_total{method="POST",status="400"} 3

# HELP fraiseql_graphql_queries_total Total GraphQL queries
# TYPE fraiseql_graphql_queries_total counter
fraiseql_graphql_queries_total{type="query"} 38
fraiseql_graphql_queries_total{type="mutation"} 4

# HELP fraiseql_db_connections_active Active database connections
# TYPE fraiseql_db_connections_active gauge
fraiseql_db_connections_active 5
```

Execute some queries to generate metrics:
```bash
# Run a few queries
for i in {1..10}; do
  curl -s -X POST http://localhost:8080/graphql \
    -H "Content-Type: application/json" \
    -d '{"query": "{ __typename }"}' > /dev/null
done
```

Verify metrics updated:

```bash
curl http://localhost:8080/metrics | grep fraiseql_graphql_queries_total
```

Expected output:

```text
fraiseql_graphql_queries_total{type="query"} 48
```

Test health endpoints:

```bash
# Basic health
curl http://localhost:8080/health
```

Expected output:

```json
{"status": "ok"}
```

```bash
# Detailed health
curl http://localhost:8080/health/detailed
```

Expected output:

```json
{
  "status": "ok",
  "checks": {
    "database": { "status": "ok", "latency_ms": 2 },
    "schema": { "status": "ok", "version": "1.0.0" }
  },
  "version": "2.0.0",
  "uptime_seconds": 3600
}
```

Verify structured logging:
```bash
# Make a request and check logs
curl -s -X POST http://localhost:8080/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ me { id } }"}'
```

Check FraiseQL stdout for JSON log lines:

```json
{
  "timestamp": "2024-01-15T10:30:00.123Z",
  "level": "INFO",
  "target": "fraiseql_server::graphql",
  "message": "Query executed",
  "span": {
    "request_id": "abc-123",
    "user_id": "user-456"
  },
  "fields": {
    "operation": "me",
    "duration_ms": 12,
    "cache_hit": false
  }
}
```

Test OpenTelemetry tracing (if enabled):

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
OTEL_SERVICE_NAME=fraiseql-api \
fraiseql run
```

After making requests, check your tracing backend (Jaeger, Zipkin, etc.) for spans.
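
Once JSON logging works, the structured fields can be filtered programmatically. The sketch below picks out slow operations from captured log output. It assumes one JSON object per line and the `fields.operation` / `fields.duration_ms` layout shown in the log example. The `slow_operations` helper and the threshold are hypothetical.

```python
import json


def slow_operations(log_lines, threshold_ms):
    """Yield (operation, duration_ms) for records slower than threshold_ms."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON output such as startup banners
        fields = record.get("fields") if isinstance(record, dict) else None
        if not isinstance(fields, dict):
            continue
        duration = fields.get("duration_ms")
        if isinstance(duration, (int, float)) and duration > threshold_ms:
            yield fields.get("operation"), duration


# Illustrative log lines in the documented shape (other keys omitted).
captured = [
    '{"level": "INFO", "fields": {"operation": "getUser", "duration_ms": 45}}',
    "server listening",
    '{"level": "INFO", "fields": {"operation": "me", "duration_ms": 12}}',
]

for operation, duration in slow_operations(captured, threshold_ms=20):
    print(operation, duration)
```

In practice the same filter would usually live in the log aggregator; the point is that `duration_ms` and `operation` are machine-readable fields rather than text to be parsed out of a message.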
If /metrics returns empty or connection refused:
Check FraiseQL started successfully:
```bash
fraiseql run 2>&1 | head -20
```

Verify the metrics endpoint is on port 8080 (same port as GraphQL):

```bash
curl http://localhost:8080/metrics
```

If database connection metrics are absent:

Verify database connectivity:

```bash
curl http://localhost:8080/health/detailed | jq .checks.database
```

Check the `[database]` pool configuration in `fraiseql.toml`.
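
The same check can be scripted across all components instead of inspecting one with `jq`. The sketch below reads a `/health/detailed` response body and lists the checks that are not healthy. It assumes the response shape shown in the health examples and treats any status other than `"ok"` as failing. The `"error"` value in the sample is a hypothetical status, since only `"ok"` appears in the examples.

```python
import json


def failing_checks(health_body):
    """Return sorted names of checks whose status is not "ok"."""
    body = json.loads(health_body)
    checks = body.get("checks", {})
    return sorted(
        name for name, check in checks.items() if check.get("status") != "ok"
    )


# Sample body in the documented shape; the "error" status is an assumption.
sample_body = """
{
  "status": "error",
  "checks": {
    "database": { "status": "error", "latency_ms": 2 },
    "schema": { "status": "ok", "version": "1.0.0" }
  }
}
"""

print(failing_checks(sample_body))
```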
If metric cardinality is a concern: FraiseQL does not include `user_id` or `request_id` as metric labels by default, so no configuration is needed.
If traces don’t show up in your backend:
Verify OTLP endpoint is reachable from the FraiseQL process:
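```bash
curl http://otel-collector:4317
```

Port 4317 is the OTLP gRPC port, so expect a protocol error rather than a normal HTTP response; a refused connection or a timeout means the endpoint is unreachable.

Set `OTEL_TRACES_SAMPLER_ARG=1.0` temporarily to sample 100% of requests while testing.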
Ensure trace context propagation:
```bash
curl -H "traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01" \
  http://localhost:8080/graphql
```
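
When propagation seems broken, it helps to confirm that the header being sent is well formed. The sketch below validates and splits a version-00 `traceparent` value following the W3C Trace Context layout (`version-traceid-parentid-flags`). The `parse_traceparent` helper is hypothetical, and how FraiseQL itself parses the header is not covered here.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def parse_traceparent(header):
    """Parse a traceparent header value; return None if it is malformed."""
    match = TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the specification.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


print(parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"))
```

A trailing flags byte of `01` marks the trace as sampled by the caller, which is useful to know when checking whether downstream spans should appear at all.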

Related pages:

- Deployment: Production monitoring setup
- Performance: Using metrics to optimize
- Troubleshooting: Debugging with logs and traces