Skill Profile

Prometheus & Grafana

Prometheus: metric types, PromQL, Grafana dashboards, alerting rules

Roles: 69 (roles that include this skill)

Levels: 5 (structured growth path)

Required requirements: 96 (the remaining 241 are optional)

Domain: Observability & Monitoring

Group: Metrics & Monitoring

Last updated: 2026/3/17

Expectations by level

The tables show how the expected depth of this skill changes from junior to principal level.

Role Required Description
1C Developer Understands basic Prometheus & Grafana for 1C infrastructure: reading 1C server performance dashboards, understanding basic metric types for 1C platform monitoring, navigating pre-built Grafana panels for database and session metrics. Follows team conventions for monitoring 1C environments.
AI Product Engineer Studies monitoring fundamentals with Prometheus and metric visualization in Grafana for AI products. Understands concepts of metrics, alerts, and dashboards for tracking ML service health in production.
Analytics Engineer Studies monitoring basics with Prometheus and Grafana for tracking analytical pipeline health. Understands metric and alert concepts as applied to ETL processes and data warehouses.
Android Developer Understands basic Prometheus & Grafana for mobile backend monitoring: reading API performance dashboards, understanding latency and error rate metrics for mobile backends, navigating Grafana panels for app backend health. Follows team conventions for monitoring mobile API dependencies.
Application Security Engineer Understands basic Prometheus & Grafana for security monitoring: reading security-relevant dashboards (auth failures, rate limiting), understanding metric types for security event tracking, navigating Grafana panels for security alerting. Follows team conventions for security metric monitoring.
AR/VR Developer Gets started with Prometheus and Grafana for monitoring AR/VR application backends. Views dashboards with server-side metrics: connections, FPS, and session latency.
Backend Developer (C#/.NET) Understands .NET service monitoring. Reads Grafana dashboards: CPU, memory, GC, request rate. Knows ASP.NET Core metrics (/metrics via prometheus-net). Responds per runbook.
Backend Developer (Elixir) Sets up basic metrics for Elixir applications through :telemetry and PromEx. Exports standard Phoenix metrics (request duration, count) and Ecto metrics (query time) to Prometheus. Creates simple Grafana dashboards for monitoring HTTP endpoints and database.
Backend Developer (Go) Adds basic Prometheus metrics to Go services via client_golang: counters for requests, histograms for latency, gauges for active connections. Understands /metrics exposition format, creates simple Grafana dashboards for monitoring key indicators.
Backend Developer (Java/Kotlin) Understands why Java service monitoring is needed. Reads Grafana dashboards: JVM heap, GC pauses, request rate, error rate. Knows basic Spring Boot Actuator metrics. Responds to alerts following runbooks.
Backend Developer (Node.js) Understands Node.js monitoring: prom-client for custom metrics, default metrics (event loop lag, GC, memory). Reads Grafana dashboards. Responds to alerts.
Backend Developer (PHP) Understands why PHP application monitoring is needed. Reads Grafana dashboards: CPU, memory, request rate, error rate. Knows basic metrics: response time, throughput, error percentage. Responds to alerts following instructions.
Backend Developer (Python) Required Understands metrics concepts (counter, gauge, histogram). Views Grafana dashboards. Knows PromQL basics. Adds basic metrics to the application.
Backend Developer (Rust) Exports basic metrics from Rust services through the prometheus crate: HTTP request count/duration, active connections. Configures a /metrics endpoint in Axum/Actix-web and creates simple Grafana dashboards for monitoring.
Backend Developer (Scala) Understands basic metrics and monitoring concepts for Scala applications: JVM metrics (heap, GC, threads) through Micrometer/Prometheus client. Reads Grafana dashboards, understands main metric values and can find service status information in monitoring.
BI Analyst Understands basic Prometheus & Grafana for BI pipeline monitoring: reading ETL job performance dashboards, understanding data processing metrics for pipeline health, navigating Grafana panels for data freshness tracking. Follows team conventions for monitoring data pipeline infrastructure.
Blockchain Developer Monitors blockchain via Prometheus: node metrics, block height, gas prices. Views dashboards.
Cloud Engineer Views infrastructure metrics in Grafana: EC2 instance CPU/memory, pod utilization in Kubernetes, CloudWatch metrics. Creates simple dashboards, configures basic alerting rules. Understands pull vs push metric collection models and Prometheus role in cloud-native monitoring.
Compiler Engineer Understands the purpose of monitoring and metrics systems. Can view existing Grafana dashboards for tracking CI pipeline status and compiler build time.
Computer Vision Engineer Understands the purpose of Prometheus and Grafana for monitoring and can view ready-made dashboards with CV service metrics. Knows basic ML metric types.
Data Analyst Understands basic Prometheus & Grafana for data pipeline monitoring: reading transformation job dashboards, understanding processing metrics for analytical workload health, navigating Grafana panels for data quality indicators. Follows team conventions for monitoring analytical infrastructure.
Data Engineer Views dashboards in Grafana. Understands what metrics are (CPU, memory, latency). Can find the right dashboard and analyze a simple graph.
Data Scientist Monitors ML models via Prometheus/Grafana: model latency, prediction throughput. Views ML dashboards.
Database Engineer / DBA Uses Grafana for DB monitoring: reading dashboards with metrics (connections, QPS, replication lag). Understands basic MySQL/PostgreSQL exporter metrics. Sets up simple threshold-based alerts.
Desktop Developer (.NET WPF/WinUI/MAUI) Studies Prometheus and Grafana monitoring basics for tracking .NET desktop ecosystem server components. Understands metrics, alerts and dashboard concepts for backend services supporting desktop applications.
Desktop Developer (Electron/Tauri) Understands basic Prometheus & Grafana for desktop app backend monitoring: reading backend service dashboards supporting Electron apps, understanding API performance metrics, navigating Grafana panels for update service health. Follows team conventions for monitoring desktop app infrastructure.
Desktop Developer (Qt/C++) Studies monitoring basics with Prometheus and Grafana for tracking health of server components in the Qt ecosystem. Understands metric and alert concepts for backend services supporting desktop applications.
DevOps Engineer Understands basic Prometheus & Grafana for DevOps: setting up Prometheus exporters, creating basic Grafana dashboards for infrastructure metrics, understanding PromQL basics for metric queries. Follows team conventions for monitoring CI/CD pipelines and deployment infrastructure.
DevSecOps Engineer Installs Prometheus and Grafana for infrastructure and application monitoring. Configures exporters: node_exporter, kube-state-metrics. Creates basic Grafana dashboards for CPU, memory, disk. Configures Alertmanager alerts for critical metrics: high CPU, disk space, pod restarts.
Embedded Developer Understands metrics and monitoring concepts, can view Grafana dashboards for tracking embedded device status. Knows basic PromQL syntax for simple metric queries.
Flutter Developer Gets familiar with Prometheus and Grafana for monitoring the server side of Flutter apps. Views dashboards with API request metrics and backend response times.
Frontend Developer (Angular) Gets familiar with Prometheus and Grafana for monitoring Angular application backends. Views dashboards with API metrics — response time, request count, and errors.
Frontend Developer (React) Understands basic Prometheus & Grafana for frontend monitoring context: reading backend performance dashboards relevant to frontend API calls, understanding latency and error metrics for user-facing services, navigating Grafana panels for Core Web Vitals tracking. Follows team conventions for frontend performance monitoring.
Frontend Developer (Svelte) Understands the purpose of Prometheus and Grafana for monitoring and can view pre-built dashboards with frontend application metrics. Knows the main metric types.
Frontend Developer (Vue) Understands basic Prometheus & Grafana for Vue app monitoring context: reading backend dashboards for API dependencies, understanding performance metrics for SSR/Nuxt server monitoring, navigating Grafana panels for frontend service health. Follows team conventions for Vue application performance monitoring.
Fullstack Developer Works with Prometheus/Grafana: understands application metrics, views dashboards. Configures basic alerts for services.
Game Designer Understands basic Prometheus & Grafana for game metrics: reading player analytics dashboards, understanding gameplay metrics (session duration, retention), navigating Grafana panels for game economy and balance data. Follows team conventions for monitoring game KPIs.
Game QA Engineer Works with Prometheus/Grafana for QA: monitors game server metrics, tracks performance during tests.
Game Server Developer Understands basic Prometheus & Grafana for game servers: reading server performance dashboards (tick rate, player count, latency), understanding game-specific metric types, navigating Grafana panels for matchmaking and session health. Follows team conventions for game server monitoring.
Infrastructure Engineer Uses Prometheus and Grafana for basic infrastructure monitoring: viewing standard dashboards (node-exporter, kube-state-metrics), executing simple PromQL queries (rate, sum). Understands Prometheus data model (metrics, labels, timestamps) and can find issues through existing alerts.
iOS Developer Studies monitoring basics with Prometheus and Grafana for tracking iOS ecosystem server components. Understands metric and alert concepts for backend services serving mobile applications.
IoT Engineer Understands basic Prometheus & Grafana for IoT monitoring: reading device fleet dashboards, understanding telemetry metrics for device health tracking, navigating Grafana panels for connectivity and data ingestion rates. Follows team conventions for IoT infrastructure monitoring.
Language Tooling Engineer Understands basic Prometheus & Grafana for language tooling: reading compiler/LSP server performance dashboards, understanding build time and memory usage metrics, navigating Grafana panels for language service health. Follows team conventions for monitoring language tool infrastructure.
LLM Engineer Monitors LLM services via Prometheus: latency, token usage, error rate. Views dashboards.
ML Engineer Views dashboards in Grafana. Understands what metrics are (CPU, memory, latency). Can find the right dashboard and analyze a simple chart.
MLOps Engineer Understands basic Prometheus & Grafana for MLOps: reading model serving performance dashboards, understanding training job metrics (GPU utilization, loss curves), navigating Grafana panels for ML pipeline health. Follows team conventions for monitoring ML infrastructure.
Network Engineer Knows basic Prometheus and Grafana concepts for network engineering and can apply them in typical tasks. Uses standard tools and follows established team practices. Understands when and why this approach is used.
NLP Engineer Understands basic Prometheus & Grafana for NLP systems: reading model inference dashboards, understanding NLP service metrics (latency, throughput, token usage), navigating Grafana panels for language model serving health. Follows team conventions for monitoring NLP pipeline infrastructure.
Penetration Testing Engineer Understands the purpose of Prometheus and Grafana for monitoring and can view dashboards with security scanning metrics. Knows basic infrastructure metrics.
Performance Testing Engineer Monitors performance via Grafana: request rate, latency percentiles, error rate, resource utilization. Creates dashboards for load test results.
Platform Engineer Configures Prometheus scraping for platform services: ServiceMonitor, PodMonitor in Kubernetes. Creates basic Grafana dashboards for infrastructure monitoring. Understands counter, gauge, histogram metric types. Sets up simple alerting rules and routes their notifications through Alertmanager.
QA Automation Engineer Understands basic Prometheus & Grafana for test infrastructure: reading test environment dashboards, understanding test execution metrics (pass rate, duration), navigating Grafana panels for CI/CD pipeline health. Follows team conventions for monitoring test infrastructure.
QA Engineer (Manual) Studies monitoring fundamentals with Prometheus and metric visualization in Grafana. Understands concepts of metrics, alerts, and dashboards for tracking application health during testing.
QA Security Engineer Monitors security metrics: failed login attempts, authorization denials, scan findings count. Reads security dashboards. Responds to security alerts.
React Native Developer Understands basic Prometheus & Grafana for mobile backend monitoring: reading API performance dashboards for React Native backends, understanding latency and error metrics for mobile services, navigating Grafana panels for push notification and update services. Follows team conventions for mobile infrastructure monitoring.
Release Engineer Knows basic Prometheus and Grafana concepts for release engineering and can apply them in typical tasks. Uses standard tools and follows established team practices. Understands when and why this approach is applied.
Security Analyst Understands basic Prometheus & Grafana for security operations: reading security dashboards (auth events, anomaly rates), understanding security-relevant metric types, navigating Grafana panels for threat detection alerts. Follows team conventions for security metric monitoring and alert triage.
Site Reliability Engineer (SRE) Works with Prometheus/Grafana: reads metrics, creates basic dashboards. Understands metric types: counter, gauge, histogram, summary. Configures simple alerting rules.
Smart Contract Developer Monitors blockchain via Prometheus: node metrics, block height, peer count. Views Grafana dashboards.
Systems Programmer (C/C++) Monitors systems via Prometheus: CPU metrics, memory usage, I/O throughput. Exports custom metrics.
Technical Product Manager Understands basic Prometheus & Grafana for product management: reading product health dashboards (availability, latency), understanding SLI/SLO metrics for product reliability discussions, navigating Grafana panels for user-facing service metrics. Follows team conventions for product health monitoring.
Technical Writer Understands the purpose of Prometheus and Grafana and can document basic metrics and dashboards. Knows metric types and their format for creating application monitoring reference guides. Describes basic monitoring setup in project operational guides.
Telecom Developer Understands basic Prometheus & Grafana for telecom systems: reading call processing performance dashboards, understanding telecom-specific metrics (call completion rate, jitter, MOS), navigating Grafana panels for signaling service health. Follows team conventions for telecom service monitoring.
Unity Developer Understands Prometheus/Grafana for games: monitors server metrics, game performance. Views dashboards.
Unreal Engine Developer Monitors Unreal servers via Prometheus/Grafana: server FPS, player count, network metrics.
XR Unity Developer Understands basic Prometheus & Grafana for XR/Unity backends: reading multiplayer server dashboards, understanding frame timing and network latency metrics, navigating Grafana panels for XR session quality. Follows team conventions for monitoring XR service infrastructure.
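
The rows above repeatedly mention the counter, gauge, and histogram metric types and the /metrics exposition format. As a rough illustration, the following Python sketch models those three types by hand and renders them as exposition-style text. The classes, the render helper, and the demo_ metric names are hypothetical and exist only for this example; a real service would normally rely on an official client library rather than code like this.

```python
# Hand-rolled sketch of three Prometheus metric types and the text
# exposition format. Illustrative only: these classes are hypothetical
# and are not the API of any Prometheus client library.
from bisect import bisect_left


class Counter:
    """Monotonically increasing value, e.g. total requests served."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only go up")
        self.value += amount


class Gauge:
    """Value that can go up or down, e.g. requests currently in flight."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = float(value)


class Histogram:
    """Counts observations per bucket and tracks their running sum."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)  # last slot is +Inf
        self.total = 0.0

    def observe(self, value):
        # bisect_left finds the first bucket whose upper bound is >= value.
        self.counts[bisect_left(self.bounds, value)] += 1
        self.total += value

    def cumulative(self):
        """Return (upper_bound, cumulative_count) pairs, ending with +Inf."""
        running, out = 0, []
        for bound, count in zip(self.bounds + [float("inf")], self.counts):
            running += count
            out.append((bound, running))
        return out


def render(prefix, requests, in_flight, latency):
    """Render the three metrics as exposition-style text."""
    lines = [
        f"# TYPE {prefix}_requests_total counter",
        f"{prefix}_requests_total {requests.value}",
        f"# TYPE {prefix}_in_flight gauge",
        f"{prefix}_in_flight {in_flight.value}",
        f"# TYPE {prefix}_latency_seconds histogram",
    ]
    buckets = latency.cumulative()
    for bound, count in buckets:
        le = "+Inf" if bound == float("inf") else str(bound)
        lines.append(f'{prefix}_latency_seconds_bucket{{le="{le}"}} {count}')
    lines.append(f"{prefix}_latency_seconds_sum {latency.total}")
    lines.append(f"{prefix}_latency_seconds_count {buckets[-1][1]}")
    return "\n".join(lines) + "\n"


# Hypothetical usage: four requests with made-up latencies in seconds.
requests, in_flight, latency = Counter(), Gauge(), Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.3, 2.0):
    requests.inc()
    latency.observe(seconds)
in_flight.set(2)
text = render("demo", requests, in_flight, latency)
print(text)
```

The histogram buckets are cumulative: each le bucket counts every observation at or below its bound, which is why the +Inf bucket always equals the total observation count.
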
Role Required Description
1C Developer Configures 1C infrastructure monitoring: server cluster metrics, DBMS performance, operation duration. Creates dashboards for tracking key system indicators.
AI Product Engineer Configures AI product monitoring in Prometheus with inference, latency, and resource utilization metrics. Creates informative Grafana dashboards for tracking product KPIs and ML model health in real time.
Analytics Engineer Configures analytics pipeline monitoring in Prometheus — ETL duration metrics, processed data volumes, and load errors. Creates Grafana dashboards for tracking data freshness and analytics platform SLA.
Android Developer Configures monitoring for Android application backend infrastructure: API latency, error rate, throughput. Creates dashboards correlating server and client metrics.
Application Security Engineer Configures Prometheus and Grafana for security monitoring: implements custom metrics for security event tracking, creates security-focused dashboards with anomaly detection panels, sets up alerts for authentication failures and rate limit violations. Analyzes security incidents using metric correlation.
AR/VR Developer Configures AR/VR infrastructure monitoring via Prometheus and Grafana. Creates dashboards for tracking server simulation FPS, network delays, and rendering load.
Backend Developer (C#/.NET) Adds custom metrics via System.Diagnostics.Metrics / prometheus-net: Counter, Histogram, Gauge. Creates Grafana dashboards. Configures alerts. Monitors GC and thread pool.
Backend Developer (Elixir) Implements comprehensive monitoring of Elixir applications through PromEx with modules for Phoenix, Ecto, Broadway and BEAM VM. Creates custom :telemetry events for business metrics, configures Grafana dashboards with BEAM process, memory and scheduler visualization.
Backend Developer (Go) Configures comprehensive Go service monitoring: RED metrics (Rate, Errors, Duration) via promauto, custom business metrics, SLI/SLO. Creates Grafana dashboards with variables for multi-service monitoring, configures alerting via Alertmanager.
Backend Developer (Java/Kotlin) Adds custom metrics via Micrometer: business metrics (Counter, Gauge, Timer), JVM metrics, Hikari connection pool stats. Creates Grafana dashboards for the service. Configures Prometheus alerting rules and routes notifications via Alertmanager.
Backend Developer (Node.js) Configures monitoring: custom business metrics (Counter, Histogram), recording rules, alerting on SLI. Creates Grafana dashboards for Node.js: event loop, heap, connections.
Backend Developer (PHP) Adds custom metrics to PHP applications: business metrics (orders/min, registrations), technical (queue size, cache hit ratio). Configures PHP-FPM and OPcache metrics. Creates Grafana dashboards for services. Configures basic alerts.
Backend Developer (Python) Required Creates custom metrics with prometheus-client. Writes PromQL queries (rate, histogram_quantile). Creates Grafana dashboards. Configures basic alerting rules.
Backend Developer (Rust) Implements metrics for Rust services through metrics crate with Prometheus exporter: custom histograms for business operations, gauges for resources (connection pool, queue depth). Configures Grafana dashboards with variables and alerts on SLI metrics.
Backend Developer (Scala) Configures Prometheus metrics for Scala services: custom counters, histograms for latency, gauges for business metrics through Kamon or ZIO Metrics. Creates Grafana dashboards with RED metrics (Rate, Errors, Duration) for endpoints, configures basic alerts for performance anomalies.
BI Analyst Configures Prometheus and Grafana for data pipeline monitoring: implements metrics for ETL job tracking and data freshness, creates dashboards for pipeline throughput and error rates, sets up alerts for data processing SLA violations. Analyzes data incidents using pipeline metric analysis.
Blockchain Developer Creates blockchain dashboards: node health, contract metrics, transaction monitoring.
Cloud Engineer Configures Prometheus for cloud infrastructure monitoring: kube-state-metrics, node-exporter, custom exporters for cloud APIs. Designs Grafana dashboards for resource utilization, cost tracking, and SLI/SLO. Integrates with Alertmanager and configures notification channels.
Compiler Engineer Configures Prometheus metrics for compiler infrastructure: build time by phases, error counts, resource utilization. Creates Grafana dashboards for CI/CD monitoring.
Computer Vision Engineer Configures CV service monitoring in Prometheus — inference latency, throughput, GPU utilization, accuracy drift. Creates Grafana dashboards for tracking model quality.
Data Analyst Configures Prometheus and Grafana for analytical infrastructure: implements metrics for data transformation job monitoring, creates dashboards for query performance and data quality tracking, sets up alerts for analytical workload anomalies. Analyzes processing incidents using metric-based investigation.
Data Engineer Required Adds custom metrics to applications (counter, gauge, histogram). Writes PromQL queries for dashboards. Creates Grafana dashboards. Configures basic alerts (high error rate, high latency).
Data Scientist Creates ML monitoring: model performance dashboards, data drift metrics, feature importance tracking.
Database Engineer / DBA Configures Prometheus monitoring for databases: mysqld_exporter, postgres_exporter, custom queries for business metrics. Creates Grafana dashboards: query performance, buffer pool hit ratio, lock waits. Configures alerting rules.
Desktop Developer (.NET WPF/WinUI/MAUI) Configures monitoring for .NET desktop ecosystem server components through Prometheus with System.Diagnostics.Metrics integration. Creates Grafana dashboards for build pipeline, update and desktop application telemetry metrics.
Desktop Developer (Electron/Tauri) Configures metric collection from Electron application server components and creates Grafana dashboards. Sets up alerts for critical metrics: synchronization errors, API delays, server load.
Desktop Developer (Qt/C++) Configures monitoring for Qt ecosystem server components — update services, licensing and telemetry. Creates Grafana dashboards for tracking build pipeline metrics and desktop application usage statistics.
DevOps Engineer Deploys and configures the Prometheus stack: kube-prometheus-stack in Kubernetes, ServiceMonitor/PodMonitor, alerting rules. Creates Grafana dashboards for infrastructure and applications, configures Alertmanager with routing to Slack/PagerDuty.
DevSecOps Engineer Configures Prometheus for security monitoring: failed auth attempts, certificate expiration, secret access patterns. Creates Grafana dashboards for security operations: vulnerability trends, compliance scores, incident metrics. Introduces Thanos for long-term storage and multi-cluster querying. Configures recording rules.
Embedded Developer Configures metrics collection from embedded gateways through Prometheus exporters and creates Grafana dashboards for device fleet monitoring. Configures alerts on telemetry anomalies and firmware status.
Engineering Manager Establishes Prometheus and Grafana practices for engineering teams: sets dashboard and alerting standards, reviews monitoring coverage for team services, tracks reliability metrics (MTTD/MTTR) for incident management improvement. Participates in on-call rotation and post-mortem processes.
Flutter Developer Configures monitoring for Flutter app backends using Prometheus and Grafana. Creates dashboards for tracking API performance, errors, and user activity.
Frontend Developer (Angular) Configures Angular application performance monitoring through custom metrics in Grafana. Creates dashboards for tracking Core Web Vitals and client-side errors.
Frontend Developer (React) Configures frontend metrics collection from React applications: Web Vitals, load time, rendering errors. Creates Grafana dashboards for monitoring user experience and performance.
Frontend Developer (Svelte) Configures frontend metric collection in Prometheus — Core Web Vitals, load time, errors — and creates Grafana dashboards. Configures alerts for critical indicators.
Frontend Developer (Vue) Configures frontend metric monitoring through Prometheus — Core Web Vitals, bundle load time, client error count. Creates Grafana dashboards for key indicators.
Fullstack Developer Instruments the full stack: custom metrics for backend (latency, error rate) and frontend (Core Web Vitals). Creates Grafana dashboards.
Game Designer Configures Prometheus for collecting game server metrics: TPS, matchmaking time, player sessions. Creates Grafana dashboards for monitoring game subsystem performance and health. Sets up alerts for critical game metrics: crash rate, queue length, server lag.
Game QA Engineer Creates QA dashboards: game performance metrics, test execution monitoring, build quality tracking.
Game Server Developer Configures Prometheus and Grafana for game server monitoring: implements custom metrics for game state (player count, tick rate, match latency), creates dashboards for game server cluster health, sets up alerts for player experience degradation. Analyzes game incidents using server metric correlation.
Infrastructure Engineer Administers Prometheus stack for infrastructure monitoring: deployment through kube-prometheus-stack, configuring ServiceMonitor for target auto-discovery, creating Grafana dashboards. Writes intermediate PromQL queries (histogram_quantile, absent), configures alerting rules and Alertmanager receivers.
iOS Developer Configures monitoring for iOS ecosystem server components — API backend, push notifications, and update service. Creates Grafana dashboards for tracking mobile API metrics — latency, error rate, and active users.
IoT Engineer Configures Prometheus and Grafana for IoT fleet monitoring: implements custom metrics for device telemetry and connectivity, creates dashboards for fleet health and data ingestion rates, sets up alerts for device offline patterns and data loss. Analyzes device incidents using telemetry metric correlation.
Language Tooling Engineer Configures detailed monitoring of language tools: AST construction performance metrics, autocompletion latency, parser memory usage. Creates informative dashboards.
LLM Engineer Creates LLM monitoring: quality metrics, cost dashboards, throughput tracking.
ML Engineer Required Adds custom metrics to applications (counter, gauge, histogram). Writes PromQL queries for dashboards. Creates Grafana dashboards. Configures basic alerts (high error rate, high latency).
MLOps Engineer Configures ML service monitoring via Prometheus: exporting metrics from inference endpoints (prediction_latency_seconds, model_predictions_total), configuring ServiceMonitor in Kubernetes. Creates Grafana dashboards for MLOps — model metrics by version, GPU utilization, feature store latency — and configures alerting via Alertmanager.
Network Engineer Confidently applies Prometheus and Grafana for network engineering in non-standard tasks. Independently selects the optimal approach and tools. Analyzes trade-offs and proposes improvements to existing solutions.
NLP Engineer Configures detailed NLP infrastructure monitoring: real-time model quality metrics, data drift, GPU resource utilization. Creates dashboards for comparing model versions.
Penetration Testing Engineer Configures pentest infrastructure monitoring — scanner status, vulnerability count, scan duration. Creates Grafana dashboards for security metrics.
Performance Testing Engineer Configures performance monitoring: custom metrics (latency histograms, throughput gauges), recording rules for SLI, alerting on performance degradation. Builds dashboards for load test results.
Platform Engineer Administers Prometheus stack for the platform: Thanos/Cortex for long-term storage, federation for multi-cluster. Creates recording rules for query optimization. Develops standard dashboard templates (Jsonnet/Grafonnet). Configures Alertmanager routing and escalation.
QA Automation Engineer Configures Prometheus and Grafana for test infrastructure monitoring: implements metrics for test execution tracking and environment health, creates dashboards for test pass rates and CI pipeline performance, sets up alerts for test infrastructure degradation. Analyzes test failures using infrastructure metric correlation.
QA Engineer (Manual) Uses Prometheus metrics to analyze system behavior during testing and after releases. Creates custom Grafana dashboards for monitoring test runs and regressions. Configures alerts for detecting quality degradation in staging and production environments.
QA Security Engineer Creates security dashboards: vulnerability trends, scan coverage, remediation progress, SLA compliance. Configures alerting for security thresholds.
React Native Developer Configures monitoring for React Native application backend services: API latency, error rates, load. Creates dashboards for tracking server issue impact on the mobile client.
Release Engineer Confidently applies Prometheus and Grafana for release engineering in non-standard tasks. Independently selects the optimal approach and tools. Analyzes trade-offs and proposes improvements to existing solutions.
Security Analyst Configures Prometheus and Grafana for security operations: implements custom metrics for threat detection and incident tracking, creates security operations dashboards with IOC monitoring, sets up alerts for suspicious activity thresholds. Analyzes security events using metric-based threat hunting.
Site Reliability Engineer (SRE) Configures Prometheus monitoring: service discovery, recording rules for aggregates, alerting rules with proper severity. Creates comprehensive Grafana dashboards. Configures Alertmanager routing.
Smart Contract Developer Creates blockchain dashboards: node health, gas prices, contract metrics, chain synchronization.
Systems Programmer (C/C++) Creates system monitoring: custom exporters, kernel metrics, hardware monitoring. Builds Grafana dashboards.
Technical Lead Establishes Prometheus and Grafana practices for technical teams: sets monitoring standards for service components, reviews dashboard effectiveness and alert noise levels, tracks operational metrics for reliability improvements. Participates in incident management and drives observability practices.
Technical Product Manager Formulates product metric monitoring requirements — conversion, key operation latency, error rate. Creates Grafana dashboards for tracking business KPIs in real-time.
Technical Writer Documents project monitoring systems: metrics, dashboards, alerts with descriptions and thresholds. Creates runbook documentation for incident response based on Grafana dashboards. Describes Prometheus integration with other observability stack components in architectural guides.
Telecom Developer Configures Prometheus and Grafana for telecom service monitoring: implements custom metrics for call processing and signaling performance, creates dashboards for telecom service quality (MOS, jitter, packet loss), sets up alerts for call completion rate degradation. Analyzes telecom incidents using protocol metric correlation.
Unity Developer Creates game monitoring: custom game metrics, server performance dashboards, player activity tracking.
Unreal Engine Developer Creates game monitoring: server performance dashboards, matchmaking metrics, player activity.
XR Unity Developer Configures metric collection from XR platform server components in Prometheus and creates informative Grafana dashboards. Configures alerts on critical metrics: multiplayer latency, synchronization errors.
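
Several rows above refer to PromQL queries built on rate and histogram_quantile. The plain-Python sketch below approximates what those two functions compute, to make the arithmetic behind a latency-percentile panel easier to follow. It is a simplification under stated assumptions: the real rate() also extrapolates to the edges of the query window, and the real histogram_quantile handles more edge cases. The function names per_second_rate and estimate_quantile and all sample data are hypothetical.

```python
# Simplified, stdlib-only approximations of two PromQL functions.
# Not the Prometheus implementation: window extrapolation and several
# edge cases are intentionally left out.


def per_second_rate(samples):
    """Per-second increase of a counter over (timestamp, value) samples.

    Any drop in value is treated as a counter reset, as rate() does.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # After a reset the counter restarts from zero, so the whole
        # new value counts as increase.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs sorted by
    bound, with float("inf") as the last bound. Observations are assumed
    to be spread evenly inside each bucket (linear interpolation).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if upper_bound == float("inf"):
                # The quantile falls past the last finite bucket, so the
                # best available answer is the highest finite bound.
                return lower_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical data: a counter scraped every 15 s with one reset, and
# cumulative latency buckets in seconds.
counter_samples = [(0, 100), (15, 130), (30, 10), (45, 40)]
latency_buckets = [(0.1, 1), (0.5, 3), (1.0, 3), (float("inf"), 4)]

print(per_second_rate(counter_samples))
print(estimate_quantile(0.5, latency_buckets))
print(estimate_quantile(0.99, latency_buckets))
```

Because the estimate interpolates inside a bucket, its accuracy depends on how closely the bucket bounds bracket the latencies of interest, which is one reason bucket layout is usually chosen around SLO thresholds.
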
Role Required Description
1C Developer Designs comprehensive 1C platform monitoring system with alerting on critical indicators. Implements business operation metrics and SLI/SLO for key accounting processes.
AI Product Engineer Designs comprehensive AI product monitoring system with custom metrics for tracking drift, accuracy, and business KPIs. Implements multi-level alerts with escalation and automatic model degradation diagnosis through Grafana.
Analytics Engineer Architects a comprehensive analytics platform monitoring system with data quality, freshness, and query cost metrics. Implements alerts for early detection of pipeline issues and warehouse performance degradation.
Android Developer Designs comprehensive monitoring system for Android infrastructure. Links Prometheus server metrics with Firebase Analytics for end-to-end user experience analysis.
Application Security Engineer Required Designs security observability strategy with Prometheus & Grafana: implements security metric collection for threat detection, defines SLIs/SLOs for security monitoring effectiveness, conducts post-mortems for security incidents with metric forensics. Mentors team on security-focused observability practices.
AR/VR Developer Architects a comprehensive AR/VR platform monitoring system with XR-specific metrics. Configures alerts for motion-to-photon latency and spatial tracking degradation.
Backend Developer (C#/.NET) Required Designs .NET platform monitoring: RED metrics, .NET runtime dashboards, distributed tracing. SLI/SLO dashboards. Optimizes cardinality.
Backend Developer (Elixir) Required Designs monitoring system for the Elixir platform on Prometheus and Grafana. Implements custom PromEx plugins for domain metrics, configures alerting rules for SLO/SLI. Implements RED metrics for microservices, monitoring GenServer mailbox size and process reductions.
Backend Developer (Go) Required Designs monitoring system for Go microservices: standard metrics via middleware, cardinality management, Prometheus federation. Develops SLO-based alerting, custom Prometheus exporters in Go, optimizes PromQL queries for complex dashboards.
Backend Developer (Java/Kotlin) Required Designs Java platform monitoring: RED metrics for API, JVM dashboards (heap, GC, threads), distributed tracing. Configures SLI/SLO dashboards. Optimizes cardinality. Implements alerting runbooks.
Backend Developer (Node.js) Required Designs Node.js platform monitoring: RED metrics for API, V8/event loop dashboards, distributed tracing correlation. Optimizes metric cardinality.
Backend Developer (PHP) Required Designs PHP platform monitoring: RED metrics for API, USE metrics for infrastructure. Configures distributed tracing. Creates SLI/SLO dashboards. Optimizes metric cardinality. Implements alerting runbooks for on-call.
Backend Developer (Python) Required Designs monitoring strategy for services. Configures SLI/SLO monitoring. Creates RED/USE dashboards. Optimizes cardinality. Configures recording rules.
Backend Developer (Rust) Required Designs metrics system for Rust microservices: RED metrics through tower middleware, custom metric types for business KPIs, high-cardinality label management. Configures Prometheus federation, recording rules for aggregations and Grafana dashboards with SLO tracking.
Backend Developer (Scala) Required Designs metrics system for Scala microservices: SLI/SLO monitoring through Prometheus, USE metrics for infrastructure, custom business KPIs. Configures multi-dimensional metrics considering cardinality, builds Grafana dashboards for troubleshooting and capacity planning, integrates with alerting through PagerDuty.
BI Analyst Required Designs observability strategy for BI data platforms with Prometheus & Grafana: implements end-to-end pipeline tracing with custom metrics, defines SLI/SLO for data freshness and processing throughput, conducts post-mortems for data pipeline incidents. Mentors analysts on metric-driven data quality monitoring.
Blockchain Developer Designs monitoring: multi-chain monitoring, alerting, performance tracking.
Cloud Engineer Required Designs observability stack: Thanos/Cortex for long-term storage and multi-cluster aggregation, Grafana Mimir for scalable metrics storage. Optimizes cardinality, retention, recording rules. Introduces Prometheus Operator for GitOps-managed monitoring in Kubernetes.
Compiler Engineer Designs metrics system for the compiler: latency per pass, error type distribution, performance regressions. Configures alerting on compilation time anomalies.
Computer Vision Engineer Designs production CV model monitoring system with data drift detection and alerting on quality metric degradation. Implements A/B model testing through metrics.
Data Analyst Required Designs observability strategy for analytics infrastructure with Prometheus & Grafana: implements monitoring for data transformation pipelines, defines SLI/SLO for analytical data quality and query performance, conducts post-mortems for processing failures. Mentors team on metric-based analytical infrastructure monitoring.
Data Engineer Required Designs observability stack for services. Creates SLI/SLO dashboards. Writes complex PromQL (rate, histogram_quantile, recording rules). Configures Alertmanager with routing and silencing. Optimizes metric cardinality. Integrates with PagerDuty/OpsGenie.
Data Scientist Designs ML monitoring: automated drift detection, performance alerting, experiment dashboards.
Database Engineer / DBA Required Designs database monitoring platform on Prometheus/Grafana: multi-database monitoring, custom exporters for specific metrics, recording rules for aggregation. Implements SLI-based alerting for database services.
Desktop Developer (.NET WPF/WinUI/MAUI) Designs comprehensive .NET desktop infrastructure monitoring system with custom metrics for build time and test stability. Implements alerts for early detection of update delivery issues and backend service degradation.
Desktop Developer (Electron/Tauri) Designs monitoring system for Electron infrastructure with custom metrics: active users, sync success rate, crash reports. Creates correlation dashboards for diagnosing client-server issues.
Desktop Developer (Qt/C++) Designs comprehensive monitoring system for Qt development infrastructure with build time, test stability and update delivery metrics. Implements alerts for early detection of build and distribution pipeline issues.
DevOps Engineer Required Designs scalable monitoring system: Thanos/Mimir for long-term storage and multi-cluster, recording rules for optimization. Creates SLO dashboards with burn rate alerts, implements custom exporters. Configures federation and remote write.
DevSecOps Engineer Required Designs observability platform for security monitoring: custom metrics, SLO-based alerting, anomaly detection. Introduces PromQL for complex security queries: failed login rate, unusual API patterns. Configures Grafana OnCall for security incident alerting. Develops security-specific Grafana dashboard library.
Embedded Developer Designs comprehensive IoT infrastructure monitoring systems with Prometheus server hierarchy and federation. Creates advanced dashboards correlating hardware metrics with software indicators.
Engineering Manager Required Designs observability strategy for engineering teams with Prometheus & Grafana: implements organizational monitoring standards, defines SLI/SLO frameworks aligned with business objectives, drives post-mortem culture and reliability improvements. Mentors leads on building observability-driven engineering practices.
Flutter Developer Designs comprehensive monitoring system for the Flutter ecosystem — from client to server. Configures alerts on performance degradation and user behavior anomalies.
Frontend Developer (Angular) Designs Angular application monitoring system with client-side and server-side metrics. Configures frontend metrics correlation with backend Prometheus dashboards for the full picture.
Frontend Developer (React) Designs React application monitoring system with custom business metrics and server data correlation. Creates alerts for Core Web Vitals degradation and error rate increases.
Frontend Developer (Svelte) Designs Svelte application monitoring system with custom metrics, SLI/SLO dashboards, and backend metric correlation. Implements RUM and synthetic monitoring.
Frontend Developer (Vue) Designs monitoring system for Vue applications: custom metrics for business events, real user monitoring, performance budgets with alerts. Correlates frontend metrics with server-side metrics.
Fullstack Developer Designs monitoring for fullstack: end-to-end observability, SLI metrics, custom Grafana dashboards with business metrics.
Game Designer Designs monitoring system for the entire game infrastructure with custom metrics and SLOs. Integrates business metrics into monitoring: retention, engagement, in-game economy health. Optimizes metric collection and storage for high-throughput game servers.
Game QA Engineer Designs QA monitoring: automated performance regression detection, test quality dashboards, game server health.
Game Server Developer Required Designs observability strategy for game server infrastructure with Prometheus & Grafana: implements distributed monitoring across game service clusters, defines SLI/SLO for player experience metrics, conducts post-mortems for game outages. Mentors team on game-specific monitoring patterns.
Infrastructure Engineer Required Designs scalable Prometheus infrastructure: federation for multi-cluster monitoring, Thanos/Cortex for long-term storage and global query, highly available Alertmanager. Optimizes metric cardinality, configures recording rules for query performance and designs Grafana dashboards-as-code through Jsonnet.
iOS Developer Architects comprehensive iOS infrastructure monitoring with backend API metrics, push delivery rate, and A/B test tracking. Implements alerts for early mobile API degradation detection and correlates server metrics with client experience.
IoT Engineer Required Designs observability strategy for IoT platforms with Prometheus & Grafana: implements fleet-wide monitoring with device telemetry aggregation, defines SLI/SLO for device connectivity and data delivery, conducts post-mortems for fleet-wide incidents. Mentors team on IoT-specific monitoring patterns.
Language Tooling Engineer Designs comprehensive performance monitoring system for the entire language toolchain. Implements SLI/SLO for critical operations: parsing, type analysis, refactoring.
LLM Engineer Designs LLM monitoring: quality regression detection, cost alerting, usage analytics.
ML Engineer Required Designs observability stack for the service. Creates SLI/SLO dashboards. Writes complex PromQL (rate, histogram_quantile, recording rules). Configures Alertmanager with routing and silencing. Optimizes metric cardinality. Integrates with PagerDuty/OpsGenie.
MLOps Engineer Required Architects monitoring system for the ML platform: custom metrics for model quality (drift score, prediction confidence distribution), long-term storage via Thanos/Mimir. Implements recording rules for ML metric aggregation, configures multi-cluster monitoring for training and serving clusters, and creates SLO dashboards for inference services.
Network Engineer Applies Prometheus and Grafana at an expert level in network engineering when designing complex systems. Optimizes existing solutions and prevents architectural mistakes. Conducts code reviews and trains colleagues on best practices.
NLP Engineer Designs comprehensive NLP platform monitoring system with quality degradation alerting. Implements SLI/SLO for inference services and automatic concept drift detection.
Penetration Testing Engineer Designs security monitoring on Prometheus/Grafana — attack surface monitoring, vulnerability trends, remediation SLA. Integrates with SIEM and vulnerability management.
Performance Testing Engineer Required Designs performance metrics platform: real-time load test monitoring, historical comparison dashboards, automated regression detection. Optimizes high-cardinality metrics.
Platform Engineer Required Designs monitoring platform: Prometheus Operator + Thanos for scalable metrics, Grafana-as-code. Implements self-service monitoring for teams: automated dashboard generation, alert templates. Creates centralized metrics catalog and naming conventions for the organization.
QA Automation Engineer Required Designs observability strategy for test infrastructure with Prometheus & Grafana: implements monitoring for test execution environments and CI pipelines, defines SLI/SLO for test infrastructure reliability, conducts post-mortems for test pipeline failures. Mentors team on observability-driven test infrastructure management.
QA Engineer (Manual) Designs quality monitoring system based on Prometheus/Grafana for the entire QA process. Defines quality gates based on metrics: error budgets, SLI/SLO for release decisions. Integrates monitoring data into QA reports for evidence-based quality assessment.
QA Security Engineer Required Designs security metrics framework: security KPIs (MTTD, MTTR, vulnerability density), automated reporting, trend analysis. Integrates security metrics with business risk.
React Native Developer Architects comprehensive monitoring for the entire React Native project infrastructure. Correlates server metrics with client telemetry for end-to-end performance analysis.
Release Engineer Applies Prometheus and Grafana at an expert level in release engineering when designing complex systems. Optimizes existing solutions and prevents architectural mistakes. Conducts code reviews and trains colleagues on best practices.
Security Analyst Required Designs security observability strategy with Prometheus & Grafana: implements advanced security metric collection and correlation, defines SLI/SLO for security monitoring coverage and response times, conducts post-mortems for security incidents. Mentors team on metric-based threat hunting and detection engineering.
Site Reliability Engineer (SRE) Required Designs Prometheus architecture: federation, Thanos/Mimir for long-term storage and global view. Optimizes cardinality, retention. Implements multi-cluster monitoring.
Smart Contract Developer Designs monitoring: multi-chain monitoring, anomaly alerting, performance tracking.
Solutions Architect Required Designs observability architecture with Prometheus & Grafana for distributed systems: implements cross-service monitoring patterns and federation, defines SLI/SLO frameworks aligned with architecture tiers, conducts post-mortems driving architectural improvements. Mentors teams on observability patterns for microservices.
Systems Programmer (C/C++) Designs system monitoring: low-overhead metrics collection, eBPF-based monitoring.
Technical Lead Required Designs observability strategy with Prometheus & Grafana for technical teams: implements monitoring standards across service components, defines SLIs/SLOs aligned with product reliability targets, drives post-mortem culture and incident learning. Mentors team on observability-driven development practices.
Technical Product Manager Designs product monitoring system — SLI/SLO for key user scenarios, alerts on business metric degradation, correlation of technical and product indicators.
Technical Writer Designs operational documentation standards: runbooks, playbooks, monitoring guides for the organization. Creates in-depth observability stack configuration guides for various deployment scenarios. Implements auto-generation of documentation from Prometheus rules and Grafana dashboard JSON.
Telecom Developer Required Designs observability strategy for telecom platforms with Prometheus & Grafana: implements monitoring for signaling and media processing chains, defines SLI/SLO for telecom service quality metrics, conducts post-mortems for telecom outages. Mentors team on telecom-specific monitoring and alerting patterns.
Unity Developer Designs monitoring: comprehensive game telemetry, performance dashboards, alerting.
Unreal Engine Developer Designs monitoring: comprehensive game telemetry, automated alerts, performance tracking.
XR Unity Developer Designs monitoring system for XR infrastructure with custom metrics: concurrent users, latency by region, asset delivery time. Creates correlation dashboards for issue diagnosis.
Role Requirement Description
1C Developer Defines monitoring strategy for the entire 1C infrastructure of the organization. Designs a unified observability platform with correlation of server metrics and business indicators.
AI Product Engineer Defines monitoring strategy for the AI product portfolio, standardizes metrics and dashboards across teams. Fosters a data-driven decision-making culture based on monitoring product and technical ML system metrics.
Analytics Engineer Defines the analytics platform monitoring strategy, standardizes data quality and performance metrics. Implements a data-driven approach to ETL process optimization based on Prometheus and Grafana monitoring.
Android Developer Defines monitoring strategy for all mobile backends in the organization. Designs a unified observability system correlating server metrics with mobile telemetry.
Application Security Engineer Required Defines security observability strategy at the product level with Prometheus & Grafana: establishes SLO-based approach for security monitoring, coordinates security incident management, optimizes MTTD/MTTR for security events through improved detection and alerting pipelines.
AR/VR Developer Establishes monitoring standards for AR/VR projects with mandatory XR metrics. Develops reference dashboards and trains the team on immersive environment performance analysis.
Backend Developer (C#/.NET) Required Defines monitoring strategy: mandatory metrics, SLO, error budget. Observability culture. Post-mortem with metrics.
Backend Developer (Elixir) Required Defines monitoring standards for all Elixir services through Prometheus and Grafana. Designs dashboard hierarchy: platform, service, business metrics. Implements SLO-based alerting, configures Grafana templates for typical Elixir/Phoenix services with PromEx metrics.
Backend Developer (Go) Required Defines monitoring standards for the Go team: mandatory RED metrics, Grafana dashboard templates, SLO framework. Implements monitoring as part of Definition of Done, configures on-call rotation with automatic alerting and incident runbooks.
Backend Developer (Java/Kotlin) Required Defines monitoring strategy: mandatory metrics, SLO, error budget policy. Implements observability culture. Conducts post-mortems with metrics and traces.
Backend Developer (Node.js) Required Defines metrics standards: mandatory metrics per service, naming conventions, dashboard templates. Implements SLO-based alerting.
Backend Developer (PHP) Required Defines monitoring strategy for the product: mandatory metrics, SLOs, error budget policy. Implements observability culture in the team. Conducts post-mortem analysis of incidents using metrics and traces.
Backend Developer (Python) Required Defines monitoring standards for the organization. Implements unified dashboards. Configures multi-cluster monitoring. Optimizes Prometheus infrastructure.
Backend Developer (Rust) Required Defines metrics standards for Rust platform: mandatory metric labels (service, version, environment), naming conventions, shared metrics crate. Develops Grafana templates for standardized dashboards and alerting runbooks with automated escalation.
Backend Developer (Scala) Required Defines monitoring standards for Scala team: mandatory metrics per service, SLO budgets, on-call procedures. Reviews dashboard and alert quality, implements SRE practices, configures automated runbooks and defines metrics for error budget policy.
BI Analyst Required Defines observability strategy for BI data platforms: establishes SLO-based approach for data pipeline reliability and freshness, coordinates data incident management, optimizes MTTD/MTTR for data quality incidents through improved monitoring.
Blockchain Developer Defines monitoring standards: mandatory metrics, dashboard templates.
Cloud Engineer Required Defines metrics strategy for cloud platform: Prometheus vs CloudWatch vs Datadog, golden signals framework, standard dashboards for each workload type. Introduces SLO-based monitoring, budget alerts and automated capacity recommendations based on metrics.
Compiler Engineer Defines monitoring strategy for the compiler platform: key SLI/SLO, cascading alerts, metrics correlation with releases. Standardizes dashboards for the entire development team.
Computer Vision Engineer Defines CV system monitoring standards for the team, creates dashboard templates and SLI/SLO for inference services. Ensures unified ML production health view.
Data Analyst Required Defines observability strategy for analytics platforms: establishes SLO-based approach for analytical data quality and processing latency, coordinates data processing incident management, optimizes MTTD/MTTR for analytical pipeline failures.
Data Scientist Defines ML monitoring standards: mandatory metrics, model health dashboards, alerting policies.
Database Engineer / DBA Required Defines database monitoring standards: mandatory DBMS metrics, dashboard templates, alerting severity levels. Coordinates monitoring between DBA and SRE. Implements monitoring as part of database provisioning.
Desktop Developer (.NET WPF/WinUI/MAUI) Defines monitoring strategy for .NET desktop development platform, standardizes metrics and integrates OpenTelemetry. Introduces data-driven approach to infrastructure optimization based on server component monitoring.
Desktop Developer (Electron/Tauri) Defines monitoring and observability strategy for all of the organization's desktop products. Introduces SLI/SLO based on Prometheus metrics adapted to Electron application specifics.
Desktop Developer (Qt/C++) Defines monitoring strategy for the Qt development platform, standardizes quality and performance metrics. Introduces a data-driven approach to decision making based on build infrastructure monitoring and desktop application telemetry.
DevOps Engineer Required Defines organizational monitoring strategy: metric standards (RED, USE), mandatory dashboards for each service, SLO framework. Designs centralized Prometheus platform with multi-tenancy, cost-effective retention and self-service for teams.
DevSecOps Engineer Required Defines metrics and monitoring strategy for security operations. Manages observability platform (Prometheus + Thanos + Grafana). Builds security KPI dashboards for CISO: MTTD, MTTR, vulnerability trends, compliance posture. Introduces SLO-based approach to security: availability, data integrity, confidentiality.
Embedded Developer Defines observability standards for embedded products, including golden signals for device health assessment. Designs scalable monitoring architecture for tens of thousands of IoT devices.
Engineering Manager Required Defines observability strategy at the organizational level with Prometheus & Grafana: establishes SLO frameworks aligned with business metrics, coordinates cross-team incident management, optimizes MTTD/MTTR through organizational reliability programs.
Flutter Developer Establishes monitoring standards for Flutter projects with mandatory metrics. Develops reference dashboards and trains the team on Prometheus and Grafana.
Frontend Developer (Angular) Establishes monitoring standards for Angular projects with mandatory SLI metrics. Develops reference Grafana dashboards and trains the team on performance analysis.
Frontend Developer (React) Defines frontend application monitoring strategy for the organization with Prometheus and Grafana. Introduces performance budgets and SLI/SLO for React applications based on collected metrics.
Frontend Developer (Svelte) Defines frontend application monitoring standards for the team, creates dashboard templates and alerting policies. Ensures a unified frontend observability picture.
Frontend Developer (Vue) Standardizes frontend application monitoring for the team. Defines SLI/SLO for user experience, configures on-call alerts and dashboards for different stakeholder levels.
Fullstack Developer Defines monitoring standards: mandatory metrics, dashboard templates, alerting policies. Implements observability culture.
Game Designer Defines monitoring and observability strategy for the entire game project. Standardizes metrics and dashboards for all teams: gameplay, infra, analytics, QA. Builds a data-driven operations culture through game system monitoring and alerting.
Game QA Engineer Defines QA monitoring standards: performance tracking requirements, quality dashboards, alerting policies.
Game Server Developer Required Defines observability strategy for game server products with Prometheus & Grafana: establishes SLO-based approach for player experience metrics, coordinates game incident management, optimizes MTTD/MTTR for game server issues through real-time monitoring.
Infrastructure Engineer Required Defines monitoring standards for the organization: mandatory metrics (RED/USE methods), standard dashboards for each service type, alert creation process with runbooks. Implements Grafana-as-a-service for self-service monitoring, reviews team alerting rules and defines SLO for monitoring infrastructure.
iOS Developer Defines monitoring strategy for the iOS ecosystem server platform, standardizes metrics and dashboards for mobile-specific KPIs. Implements data-driven approach to mobile API optimization based on performance monitoring.
IoT Engineer Required Defines observability strategy for IoT products with Prometheus & Grafana: establishes SLO-based approach for device fleet reliability, coordinates IoT incident management, optimizes MTTD/MTTR for device connectivity and data processing incidents.
Language Tooling Engineer Defines monitoring strategy for all organizational language tools. Designs a unified observability platform with cross-service correlation of compiler and analyzer metrics.
LLM Engineer Defines monitoring standards: mandatory LLM metrics, dashboards, alerting.
MLOps Engineer Required Defines monitoring standards for the MLOps team: mandatory metrics for each inference service, standard dashboards, alerting policies. Implements SLI/SLO framework for ML services, configures on-call processes for model incidents, and ensures model metric visibility for data science and product teams.
Network Engineer Establishes Prometheus and Grafana usage standards for the network engineering team and makes architectural decisions. Defines the technical roadmap incorporating this skill. Mentors senior engineers and influences practices of adjacent teams.
NLP Engineer Defines monitoring strategy for all organizational ML/NLP systems. Designs unified observability platform correlating infrastructure and ML metrics for rapid diagnostics.
Penetration Testing Engineer Defines security monitoring standards for the team, creates executive dashboards and alerting for critical vulnerabilities. Coordinates with SOC on security metrics integration.
Performance Testing Engineer Required Defines performance metrics standards: mandatory metrics, dashboard templates, threshold management. Implements SLO-based performance monitoring.
Platform Engineer Required Defines metrics platform strategy: build vs buy, cardinality management, cost optimization. Leads observability-as-code adoption across all teams. Designs SLO-based monitoring with automated alerting. Creates observability maturity model for the organization.
QA Automation Engineer Required Defines observability strategy for QA infrastructure with Prometheus & Grafana: establishes SLO-based approach for test pipeline reliability, coordinates test infrastructure incident management, optimizes MTTD/MTTR for test environment failures.
QA Engineer (Manual) Defines monitoring strategy for test infrastructure using Prometheus and Grafana. Establishes SLO-based approach for quality metrics. Coordinates test observability.
QA Security Engineer Required Defines security metrics standards: mandatory KPIs, dashboard templates, reporting cadence. Implements data-driven security management.
React Native Developer Defines the monitoring strategy for all mobile projects. Architects a unified observability system correlating server and client metrics for fast problem diagnosis.
Release Engineer Establishes Prometheus and Grafana usage standards for the release engineering team and makes architectural decisions. Defines the technical roadmap incorporating this skill. Mentors senior engineers and influences practices of adjacent teams.
Security Analyst Required Defines security observability strategy at the product level with Prometheus & Grafana: establishes SLO-based approach for security monitoring effectiveness, coordinates security incident management across teams, optimizes MTTD/MTTR for threat detection.
Site Reliability Engineer (SRE) Required Defines metrics standards: mandatory metrics per service, naming conventions, dashboard templates. Manages Prometheus infrastructure costs. Coordinates monitoring adoption.
Smart Contract Developer Defines monitoring standards for blockchain systems with Prometheus & Grafana: establishes mandatory metrics for smart contract execution and gas usage, creates dashboard templates for on-chain activity monitoring, implements alerting policies for contract anomalies and security events.
Solutions Architect Required Defines observability strategy for distributed architectures with Prometheus & Grafana: establishes SLO-based approach aligned with architecture tiers, coordinates cross-team incident management through monitoring standardization, optimizes MTTD/MTTR through architectural observability patterns.
Systems Programmer (C/C++) Defines monitoring standards: system metrics requirements, dashboards, alerting.
Technical Lead Required Defines observability strategy for technical products with Prometheus & Grafana: establishes SLO-based approach for service reliability targets, coordinates incident management and on-call practices, optimizes MTTD/MTTR through improved monitoring and alerting standardization.
Technical Product Manager Defines monitoring strategy for the product team. Establishes SLOs for the product, ensures visibility for stakeholders and configures an escalation process for SLA violations.
Technical Writer Defines corporate operational documentation and monitoring guide standards for all projects. Coordinates creation of a unified runbook repository with automatic updates from alerting rules. Implements documentation-as-code approach to operational knowledge management.
Telecom Developer Required Defines observability strategy for telecom products with Prometheus & Grafana: establishes SLO-based approach for telecom service quality metrics, coordinates telecom incident management, optimizes MTTD/MTTR for signaling and media processing incidents.
Unity Developer Defines monitoring standards for Unity game infrastructure with Prometheus & Grafana: establishes mandatory metrics for game server performance and player experience, creates dashboard templates for multiplayer session monitoring, implements alerting policies for game service degradation.
Unreal Engine Developer Defines monitoring standards: mandatory metrics, dashboards, alerting policies.
XR Unity Developer Defines monitoring and observability strategy for all organizational XR services. Introduces SLI/SLO based on Prometheus metrics, adapted to XR application latency requirements.
Role Requirement Description
1C Developer Shapes organizational 1C system monitoring standards. Defines key performance and reliability metrics ensuring stable operation of all company 1C solutions.
AI Product Engineer Shapes corporate observability platform for AI products based on Prometheus and Grafana with metric federation. Defines SLI/SLO for AI services and integrates technical monitoring with product portfolio business analytics.
Analytics Engineer Shapes the corporate observability strategy for the analytics ecosystem integrating pipeline monitoring and business metrics. Defines SLI/SLO for the data platform and ensures data state transparency for the entire organization.
Android Developer Shapes mobile infrastructure monitoring standards at the organizational level. Defines key SLIs/SLOs for Android products and ensures service quality transparency.
Application Security Engineer Required Defines organizational security observability strategy with Prometheus & Grafana: implements enterprise security monitoring platforms and detection engineering, shapes security-aware reliability culture, defines enterprise SLO framework for security detection coverage and incident response.
AR/VR Developer Defines the observability strategy for the AR/VR platform across the entire organization. Creates platform dashboards for comparative performance analysis of XR products.
Backend Developer (C#/.NET) Required Shapes observability platform: standards for .NET services, SLO framework, cost management. Ensures end-to-end visibility.
Backend Developer (Elixir) Required Develops monitoring strategy for the organization's entire Elixir ecosystem. Designs unified observability platform with Prometheus federation, Thanos and Grafana. Defines SLO/SLI standards for Elixir services, implements automatic degradation detection through ML on metrics.
Backend Developer (Go) Required Shapes organizational Prometheus/Grafana monitoring strategy for Go platform: multi-cluster monitoring, Thanos/Mimir for long-term storage, metric standards. Develops platform libraries with auto-instrumentation and unified observability portal.
Backend Developer (Java/Kotlin) Required Shapes observability platform: metrics and dashboard standards for all Java services, SLO framework, cost management for metrics. Ensures end-to-end visibility.
Backend Developer (Node.js) Required Shapes metrics strategy: unified instrumentation for Node.js, Prometheus vs Datadog, cost management. Defines observability platform for organization.
Backend Developer (PHP) Required Shapes organizational observability platform: metric, dashboard, and alerting standards for all PHP services. Defines SLO framework. Selects and evolves monitoring tools. Ensures end-to-end visibility for business processes.
Backend Developer (Python) Required Shapes company monitoring strategy. Evaluates Prometheus vs Datadog vs alternatives. Designs observability platform.
Backend Developer (Rust) Required Shapes organizational metrics strategy: Prometheus architecture for multi-cluster monitoring, Thanos/Cortex for long-term storage. Defines SLO framework based on Rust metrics, automated capacity planning and metric storage cost optimization.
Backend Developer (Scala) Required Shapes monitoring strategy for Scala platform: Prometheus federation/Thanos architecture for scaling, unified SLO framework. Defines organizational observability culture, monitoring budgets, metrics integration with business intelligence and AIOps strategy for automated incident response.
BI Analyst Required Defines organizational observability strategy for data analytics with Prometheus & Grafana: implements enterprise data pipeline monitoring, builds data reliability culture across analytics teams, establishes enterprise SLO framework for data quality and freshness at scale.
Blockchain Developer Shapes blockchain monitoring strategy with Prometheus & Grafana: designs enterprise observability platform for blockchain infrastructure, establishes governance for on-chain and off-chain monitoring standards, implements organizational monitoring frameworks for DeFi and smart contract operations.
Cloud Engineer Required Shapes enterprise-level observability platform: unified metrics across clouds (Grafana Cloud/Datadog), custom metric taxonomies, FinOps dashboards. Designs architecture for processing millions of time series with cost-effective storage and sub-second query latency.
Compiler Engineer Shapes observability strategy for the compiler ecosystem: metrics federation across teams, predictive performance analytics, integration with capacity planning.
Computer Vision Engineer Shapes CV platform monitoring strategy for the organization, defines unified SLI/SLO for ML services. Coordinates ML metrics integration into the overall observability system.
Data Analyst Required Defines organizational observability strategy for data platforms with Prometheus & Grafana: implements enterprise analytical pipeline monitoring, builds data quality culture across data teams, establishes enterprise SLO framework for analytical data reliability.
Data Scientist Shapes ML observability strategy: platform model monitoring, drift detection governance.
Database Engineer / DBA Required Shapes database observability strategy: Prometheus federation for fleet monitoring, Thanos/Mimir for long-term storage, unified dashboards. Defines metrics strategy for the entire organizational data platform.
Desktop Developer (.NET WPF/WinUI/MAUI) Shapes corporate observability strategy for .NET desktop development ecosystem with server and client monitoring integration. Defines SLI/SLO for infrastructure services and desktop application delivery processes.
Desktop Developer (Electron/Tauri) Shapes monitoring architecture for the Electron product ecosystem at scale of thousands of installations. Defines observability standards for desktop applications with server infrastructure.
Desktop Developer (Qt/C++) Shapes corporate observability strategy for the Qt desktop development ecosystem with integrated server and client monitoring. Defines SLI/SLO for infrastructure services and desktop application delivery processes.
DevOps Engineer Required Develops organizational metrics platform architecture: multi-cluster Prometheus with Thanos/Cortex, scaling to millions of time series. Defines observability strategy: unified metrics, logs, traces with correlation and ML-driven insights.
DevSecOps Engineer Required Architects an enterprise observability platform with security as a first-class signal. Defines monitoring-as-code strategy. Develops reference architecture for security monitoring in multi-cloud environments. Influences industry security observability practices through publications.
Embedded Developer Shapes observability platform for IoT: unified monitoring across edge devices and cloud, custom metrics for device health, scalable architecture for millions of data points from device fleet.
Engineering Manager Required Defines organizational observability strategy with Prometheus & Grafana: implements enterprise reliability platforms and monitoring standards, builds reliability engineering culture, establishes enterprise SLO frameworks aligned with business outcomes and customer experience.
Flutter Developer Defines Flutter app observability strategy for the entire organization. Creates platform dashboards for comparative analysis of mobile product performance.
Frontend Developer (Angular) Defines frontend application observability strategy for the entire organization. Creates platform dashboards for comparative performance analysis of Angular products.
Frontend Developer (React) Shapes frontend monitoring architecture for the React product ecosystem. Defines observability standards and user experience quality metrics at organizational scale.
Frontend Developer (Svelte) Shapes frontend platform monitoring strategy for the organization, defines unified SLI/SLO. Coordinates frontend metric integration into the overall observability platform.
Frontend Developer (Vue) Shapes organizational frontend platform monitoring strategy. Creates unified observability stack for all client applications with automated degradation detection.
Fullstack Developer Shapes observability strategy: platform-wide monitoring, unified dashboards, SLO framework. Defines monitoring governance.
Game Designer Shapes corporate monitoring strategy for the studio's game product lineup. Defines unified observability platform architecture for all projects. Researches and implements advanced monitoring approaches: AIOps, predictive alerting for live-service games.
Game QA Engineer Shapes QA observability strategy: platform-wide quality monitoring, automated regression detection, governance.
Game Server Developer Required Defines organizational observability strategy for game platforms with Prometheus & Grafana: implements enterprise monitoring infrastructure for multi-title game services, builds reliability culture across game development studios, establishes enterprise SLO framework for player experience metrics at scale.
Infrastructure Engineer Required Shapes metrics and monitoring strategy at company level: architecture for millions of time series (Mimir, VictoriaMetrics), multi-region observability, monitoring stack FinOps. Defines unified observability platform roadmap, custom metrics standards for business observability and cost models for monitoring infrastructure.
iOS Developer Shapes the corporate observability strategy for mobile ecosystem server infrastructure with client telemetry integration. Defines SLIs/SLOs for mobile API and ensures mobile service health transparency.
IoT Engineer Required Defines organizational observability strategy. Implements platform solutions. Builds reliability culture. Defines enterprise SLO framework.
Language Tooling Engineer Shapes language tool performance monitoring standards at the industry level. Defines key quality and reliability metrics for the development tools ecosystem.
LLM Engineer Shapes LLM monitoring strategy: platform observability, cost governance.
MLOps Engineer Required Shapes the observability strategy for the organization's MLOps platform: unified Prometheus/Grafana stack for all ML teams, metric and dashboard standards. Designs scalable monitoring architecture for thousands of ML services, defines organization-level SLO framework, and integrates ML metrics with business KPIs for tracking model ROI.
Network Engineer Shapes Prometheus and Grafana strategy for network engineering at the organizational level. Defines best practices and influences technology choices beyond their own team. Is a recognized expert in this area.
NLP Engineer Shapes ML system monitoring standards at organizational level. Defines key NLP service quality and reliability metrics influencing all company ML products.
Penetration Testing Engineer Shapes security metrics strategy for the organization based on Prometheus/Grafana. Defines security KPIs and creates unified security dashboards for leadership.
Performance Testing Engineer Required Designs performance observability platform: unified metrics for testing and production, automated analysis, capacity forecasting. Defines metrics strategy.
Platform Engineer Required Shapes vision for unified metrics platform: Prometheus federation, cross-signal correlation, ML-based anomaly detection. Defines OpenMetrics and OTLP standardization strategy. Evaluates next-gen approaches: streaming metrics, real-time analytics for intelligent observability platform.
QA Automation Engineer Required Defines organizational observability strategy for quality engineering with Prometheus & Grafana: implements enterprise test infrastructure monitoring, builds quality-focused reliability culture, establishes enterprise SLO framework for test pipeline and environment reliability.
QA Engineer (Manual) Shapes observability-driven QA methodology at the company level. Defines industry-facing standards for using monitoring to assess software quality. Explores and implements AIOps approaches to automatic defect detection through monitoring.
QA Security Engineer Required Designs security metrics platform: unified security dashboards, executive reporting, risk quantification. Defines organizational security measurement strategy.
React Native Developer Shapes mobile infrastructure monitoring standards at the organizational level. Defines key SLIs/SLOs for mobile products and ensures service quality transparency.
Release Engineer Shapes Prometheus and Grafana strategy for release engineering at the organizational level. Defines best practices and influences technology choices beyond their own team. Is a recognized expert in this area.
Security Analyst Required Defines organizational security observability strategy with Prometheus & Grafana: implements enterprise security monitoring platforms, builds security-aware reliability culture, establishes enterprise SLO framework for security detection and response effectiveness.
Site Reliability Engineer (SRE) Required Designs organizational metrics platform: Prometheus/Mimir vs Datadog vs VictoriaMetrics, multi-tenant architecture, cost management. Defines metrics governance.
Smart Contract Developer Shapes blockchain monitoring strategy with Prometheus & Grafana: designs enterprise observability for smart contract platforms, establishes governance for contract execution and gas usage monitoring, implements organizational frameworks for on-chain security and performance observability.
Solutions Architect Required Defines organizational observability strategy for distributed architectures with Prometheus & Grafana: implements enterprise monitoring platform with federation and multi-cluster support, builds reliability culture through architectural patterns, establishes enterprise SLO framework for system reliability.
Systems Programmer (C/C++) Shapes system observability strategy: kernel-level monitoring, eBPF governance.
Technical Lead Required Defines organizational observability strategy for technical excellence with Prometheus & Grafana: implements enterprise reliability platforms and monitoring tooling standards, shapes reliability culture through engineering practices, defines enterprise SLO framework for service reliability and developer experience.
Technical Product Manager Shapes observability strategy for the product portfolio. Creates unified business monitoring platform linking technical metrics with business results for all products.
Technical Writer Shapes industry standards for documenting monitoring and observability practices. Publishes research on runbook documentation effectiveness for reducing MTTR. Influences the development of automatic operational documentation generation from monitoring.
Telecom Developer Required Defines organizational observability strategy for telecom platforms with Prometheus & Grafana: implements enterprise monitoring for multi-standard telecom services, builds reliability culture across telecom engineering, establishes enterprise SLO framework for telecom service quality at scale.
Unity Developer Shapes observability strategy: game telemetry platform, real-time analytics, governance.
Unreal Engine Developer Shapes monitoring strategy: game observability platform, analytics, governance.
XR Unity Developer Shapes monitoring architecture for XR product ecosystem at scale. Defines observability standards and service quality metrics for global XR platform with thousands of users.
