Domain: Observability & Monitoring
Skill Profile: New Relic, Elastic APM, Datadog APM, transaction tracing, bottleneck analysis
Roles: 2 (where this skill appears)
Levels: 5 (structured growth path)
Mandatory requirements: 6 (the other 4 optional)
3/17/2026
The tables below show how skill depth grows across the five levels, from Junior to Principal, with one table per level in ascending order. Find your current level and compare expectations: the items in each table show what to cover to advance to the next level.
| Role | Required | Description |
|---|---|---|
| Performance Testing Engineer | | Uses APM tools like New Relic or Datadog to monitor test environments. Reads pre-configured dashboards to identify performance regressions. Correlates APM metrics with load test results under guidance. |
| Site Reliability Engineer (SRE) | | Navigates APM dashboards to check service health and error rates. Understands basic metrics like latency, throughput, and error percentages. Escalates anomalies detected via APM alerts to senior engineers. |

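
The basic metrics named at this level (latency, throughput, error percentage) can be illustrated with a minimal sketch. It assumes request samples have already been exported from an APM tool into a simple record; the `RequestSample` shape and the `summarize` helper are hypothetical names used for illustration, not part of any vendor API.

```python
from dataclasses import dataclass


@dataclass
class RequestSample:
    """One request as it might be exported from an APM tool (hypothetical shape)."""
    duration_ms: float
    status: int


def summarize(samples, window_seconds):
    """Return throughput, error percentage, and p95 latency for one time window."""
    if not samples or window_seconds <= 0:
        raise ValueError("need at least one sample and a positive window")
    durations = sorted(s.duration_ms for s in samples)
    # Nearest-rank 95th percentile, computed in integer math to avoid float rounding.
    rank = max(1, (95 * len(durations) + 99) // 100)
    # Assumption: only HTTP 5xx responses count as errors.
    errors = sum(1 for s in samples if s.status >= 500)
    return {
        "throughput_rps": len(samples) / window_seconds,
        "error_pct": 100.0 * errors / len(samples),
        "p95_ms": durations[rank - 1],
    }
```

Comparing these three numbers between a baseline run and a load test run is one simple way to spot a regression before opening a dashboard.
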
| Role | Required | Description |
|---|---|---|
| Performance Testing Engineer | | Configures APM agents to instrument services during performance tests. Builds custom dashboards correlating load patterns with application metrics. Sets up alerting thresholds based on performance SLAs and baseline measurements. |
| Site Reliability Engineer (SRE) | | Configures APM instrumentation across microservices for production monitoring. Creates dashboards tracking golden signals and SLI compliance. Participates in on-call rotation using APM data to triage and resolve incidents. |

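
One possible way to derive an alerting threshold from baseline measurements and an SLA is sketched below. The rule used here (baseline mean plus a number of standard deviations, capped at the SLA limit) is an assumption chosen for illustration, and `alert_threshold` is a hypothetical helper rather than a feature of any APM product.

```python
import statistics


def alert_threshold(baseline_ms, sla_ms, sigmas=3.0):
    """Suggest a latency alert threshold in milliseconds.

    Assumption: alert when latency exceeds the baseline mean by `sigmas`
    sample standard deviations, but never set the threshold above the SLA.
    """
    if len(baseline_ms) < 2:
        raise ValueError("need at least two baseline measurements")
    mean = statistics.fmean(baseline_ms)
    stdev = statistics.stdev(baseline_ms)
    return min(mean + sigmas * stdev, sla_ms)
```

With a tight SLA the cap wins, so the alert fires no later than the contractual limit. With a loose SLA the baseline term wins, so the alert still catches drift well before the SLA is at risk.
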
| Role | Required | Description |
|---|---|---|
| Performance Testing Engineer | Required | Designs end-to-end APM strategy for performance testing pipelines. Implements distributed tracing to pinpoint bottlenecks across service boundaries. Defines performance SLI/SLO frameworks and leads post-test analysis reviews. |
| Site Reliability Engineer (SRE) | Required | Architects the observability platform integrating APM, logging, and tracing. Defines SLI/SLO for critical services and automates error-budget tracking. Leads post-mortem processes and drives reliability improvements based on APM insights. |
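
The error-budget tracking mentioned above reduces to a small calculation, sketched here under stated assumptions: the SLI is a request success ratio, the SLO target is expressed as a fraction such as 0.999, and `error_budget` is a hypothetical helper name.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Report how much of the error budget a window has consumed.

    Assumption: the SLI is the share of successful requests, so the budget
    is the number of failures the SLO tolerates in this window.
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    allowed = (1.0 - slo_target) * total_requests
    if allowed == 0:
        # A 100% target (or an empty window) leaves no budget to spend.
        consumed = 0.0 if failed_requests == 0 else float("inf")
    else:
        consumed = failed_requests / allowed
    return {
        "allowed_failures": allowed,
        "consumed_fraction": consumed,
        "remaining_fraction": 1.0 - consumed,
    }
```

Automating this means feeding the two request counts from the APM backend on a schedule and alerting when `remaining_fraction` falls below an agreed level.
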
| Role | Required | Description |
|---|---|---|
| Performance Testing Engineer | Required | Defines the APM strategy for performance testing: tool selection (Datadog, New Relic, or Dynatrace), integration with load testing, and automated bottleneck detection. Implements APM for continuous performance monitoring. |
| Site Reliability Engineer (SRE) | Required | Defines the APM strategy: compares Datadog, New Relic, and open-source options (OpenTelemetry plus backends) on features and cost. Implements APM for critical services and defines instrumentation requirements. |

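
Automated bottleneck detection can take many forms. One simple sketch is to rank the spans of a trace by self time, meaning a span's own duration minus the time spent in its direct children. The `Span` record and `slowest_self_time` helper are hypothetical, and the sketch assumes child spans run sequentially and do not overlap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """A simplified trace span (hypothetical shape, not a vendor schema)."""
    span_id: str
    parent_id: Optional[str]
    service: str
    duration_ms: float


def slowest_self_time(spans):
    """Return the span with the largest self time, together with that self time."""
    if not spans:
        raise ValueError("trace has no spans")
    # Sum the durations of direct children for every parent span.
    child_total = {}
    for s in spans:
        if s.parent_id is not None:
            child_total[s.parent_id] = child_total.get(s.parent_id, 0.0) + s.duration_ms

    def self_time(s):
        # Assumption: children do not overlap, so their durations can be subtracted.
        return max(0.0, s.duration_ms - child_total.get(s.span_id, 0.0))

    worst = max(spans, key=self_time)
    return worst, self_time(worst)
```

Traces with parallel child calls would need interval merging rather than a plain sum, so this is only a starting point.
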
| Role | Required | Description |
|---|---|---|
| Performance Testing Engineer | Required | Designs the APM platform for performance engineering: unified application monitoring, automated performance analysis, and capacity prediction. Defines tool evaluation criteria. |
| Site Reliability Engineer (SRE) | Required | Designs the APM platform: unified APM for all services, custom dashboards, and automated alerting. Defines vendor selection criteria, negotiation strategy, and a multi-year roadmap. |