Site Reliability Engineer (SRE)
Ensuring reliability, scalability, and performance of production systems
Site Reliability Engineer (SRE) is a role in the DevOps & SRE family. It covers 61 skills across 5 levels (from Junior to Principal). 139 skills are mandatory. Key areas: Programming Fundamentals, Backend Development, Database Management.
Technology Stack
Focus by Level
Junior: Monitoring SLI/SLO. Participating in on-call rotation. Writing runbooks. Automating routine operations. Incident analysis.
Mid: Defining SLI/SLO/SLA. Designing monitoring. Capacity planning. Automating incident response. Post-mortem analysis.
Senior: Designing highly available systems. Chaos engineering. Performance engineering. Error budgets. Coordination with development.
Lead: SRE strategy. Reliability culture. SLO standards. Incident management processes. Coordination with product.
Principal: Enterprise reliability strategy. Multi-region architecture. SRE culture at scale. Industry best practices.
Competency Matrix
61 skills × 5 levels.
AI-Assisted Development
4 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| GitHub Copilot | A | W | A | E | E |
| Cursor IDE | A | W | A | E | E |
| ChatGPT / Claude | A | W | A | E | E |
| Prompt Engineering for Code | A | W | A | E | E |
API & Integration
3 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| REST API Design | A | W | A | E | E |
| GraphQL Design | A | W | A | E | E |
| API Documentation | A | W | A | E | E |
Architecture & System Design
4 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| System Design Fundamentals | A | W | A | E | E |
| High Load Architecture | A | W | A | E | E |
| Capacity Planning | A | W | A | E | E |
| Disaster Recovery Design | A | W | A | E | E |
Backend Development
3 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Python Web Frameworks | A | W | A | E | E |
| Apache Kafka | A | W | A | E | E |
| Redis | A | W | A | E | E |
Cloud & Infrastructure
9 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Docker | A | W | A | E | E |
| Kubernetes Core | A | W | A | E | E |
| Kubernetes Advanced | A | W | A | E | E |
| Helm | A | W | A | E | E |
| Terraform | A | W | A | E | E |
| AWS | A | W | A | E | E |
| Network Fundamentals | A | W | A | E | E |
| Load Balancing | A | W | A | E | E |
| VPN & Network Isolation | A | W | A | E | E |
Database Management
3 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| PostgreSQL | A | W | A | E | E |
| Database Indexing | A | W | A | E | E |
| Query Optimization | A | W | A | E | E |
DevOps & CI/CD
3 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| GitHub Actions / GitLab CI | A | W | A | E | E |
| GitOps Practices | A | W | A | E | E |
| ArgoCD | A | W | A | E | E |
Observability & Monitoring
11 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Structured Logging | A | W | A | E | E |
| ELK Stack | A | W | A | E | E |
| Grafana Loki | A | W | A | E | E |
| Prometheus & Grafana | A | W | A | E | E |
| Custom Business Metrics | A | W | A | E | E |
| OpenTelemetry | A | W | A | E | E |
| Jaeger / Grafana Tempo | A | W | A | E | E |
| Continuous Profiling | A | W | A | E | E |
| APM Tools | A | W | A | E | E |
| SLI / SLO / SLA | A | W | A | E | E |
| On-Call Management | A | W | A | E | E |
Performance Engineering
1 skill

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Latency Optimization | A | W | A | E | E |
Programming Fundamentals
9 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Algorithms & Complexity | A | W | A | E | E |
| Data Structures | A | W | A | E | E |
| OOP & SOLID Principles | A | W | A | E | E |
| Design Patterns | A | W | A | E | E |
| Multithreading | A | W | A | E | E |
| Async Programming | A | W | A | E | E |
| Code Quality & Refactoring | A | W | A | E | E |
| Type Safety & Type Systems | A | W | A | E | E |
| Memory Management | A | W | A | E | E |
Security
5 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| OWASP & Application Security | A | W | A | E | E |
| Secure Coding Practices | A | W | A | E | E |
| Secrets Management | A | W | A | E | E |
| JWT / OAuth2 / OIDC | A | W | A | E | E |
| Incident Response Process | A | W | A | E | E |
Testing & QA
4 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Unit Testing | A | W | A | E | E |
| Integration Testing | A | W | A | E | E |
| E2E Testing | A | W | A | E | E |
| Chaos Engineering | A | W | A | E | E |
Version Control & Collaboration
2 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Git Advanced | A | W | A | E | E |
| Code Review | A | W | A | E | E |
Frequently Asked Questions
Which skills are required for the Site Reliability Engineer (SRE) role?
The Site Reliability Engineer (SRE) role requires 61 skills, 139 of them mandatory. The skills are distributed across 5 levels, from Junior to Principal.
How do you advance to the next level in the Site Reliability Engineer (SRE) role?
Use the grade calculator to assess your current level and receive personalized recommendations.
Which technology stack is used in the Site Reliability Engineer (SRE) role?
The stack spans 5 levels and includes Linux, Prometheus/Grafana, PagerDuty/OpsGenie, Bash/Python scripting, Docker, Kubernetes basics, Kubernetes, Prometheus/Thanos, Grafana/Loki, OpenTelemetry, Terraform, Go/Python, Chaos Monkey basics, Runbook automation, Kubernetes advanced, Chaos Engineering (Litmus/Gremlin), eBPF tools, OpenTelemetry advanced, Custom exporters, Load testing (k6/Gatling)...
How does the community define the requirements for the Site Reliability Engineer (SRE) role?
The requirements are shaped by the community through a proposal system. Any member can propose changes, which then go through voting and expert review.