Site Reliability Engineer (SRE)

Ensuring reliability, scalability, and performance of production systems

61 skills
5 levels
139 mandatory
305 requirements

Site Reliability Engineer (SRE) is a role in the DevOps & SRE family. It covers 61 skills across 5 levels (Junior to Principal), which gives 305 skill-level requirements; 139 of them are mandatory. Key domains: Programming Fundamentals, Backend Development, Database Management.

Technology stack

Junior: Linux, Prometheus/Grafana, PagerDuty/OpsGenie, Bash/Python scripting, Docker, Kubernetes basics
Middle: Kubernetes, Prometheus/Thanos, Grafana/Loki, OpenTelemetry, Terraform, Go/Python, Chaos Monkey basics, Runbook automation
Senior: Kubernetes advanced, Chaos Engineering (Litmus/Gremlin), eBPF tools, OpenTelemetry advanced, Custom exporters, Load testing (k6/Gatling)
Lead / Staff: SRE platform, Incident management automation, SLO automation, Multi-cluster monitoring, FinOps, Disaster Recovery testing
Principal: Enterprise SRE architecture, Multi-region, Global traffic management, Reliability at scale

Focus by level

Junior

Monitoring SLI/SLO. Participating in on-call rotation. Writing runbooks. Automating routine operations. Incident analysis.

Middle

Defining SLI/SLO/SLA. Designing monitoring. Capacity planning. Automating incident response. Post-mortem analysis.
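
As a rough illustration of what defining an SLI and an SLO can look like in practice, the sketch below computes an availability SLI as the fraction of good events and checks it against a target. The `Slo` class, the function names, the service name, and the request counts are all hypothetical and chosen only for the example; real definitions depend on the service and on how events are measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A hypothetical availability objective for one measurement window."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of events must be good


def availability_sli(good_events: int, total_events: int) -> float:
    """Return the fraction of events that were good.

    A window with no traffic is treated as fully available, which is
    one possible convention (an assumption of this sketch).
    """
    if good_events < 0 or total_events < 0 or good_events > total_events:
        raise ValueError("counts must satisfy 0 <= good_events <= total_events")
    if total_events == 0:
        return 1.0
    return good_events / total_events


def meets_slo(sli: float, slo: Slo) -> bool:
    """True when the measured SLI is at or above the SLO target."""
    return sli >= slo.target


# Illustrative numbers only, not taken from any real system.
slo = Slo(name="checkout-availability", target=0.999)
sli = availability_sli(good_events=999_500, total_events=1_000_000)
print(f"{slo.name}: SLI={sli:.4%}, target={slo.target:.1%}, met={meets_slo(sli, slo)}")
```

An SLA would typically sit on top of this as a contractual commitment, usually with a looser target than the internal SLO.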

Senior

Designing highly available systems. Chaos engineering. Performance engineering. Error budgets. Coordination with development.
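
Error budgets follow from simple arithmetic: the budget is the share of events that the SLO allows to fail. The sketch below is one possible way to express the remaining budget and the burn rate; the function names and the numbers are assumptions made for illustration, not a standard API.

```python
def error_budget(slo_target: float, total_events: int) -> float:
    """Number of events allowed to fail in the window: (1 - target) * total."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Fraction of the error budget still unspent; negative when overspent."""
    budget = error_budget(slo_target, total_events)
    if budget == 0:
        raise ValueError("no events in the window, so there is no budget to spend")
    return 1.0 - bad_events / budget


def burn_rate(slo_target: float, total_events: int, bad_events: int) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 means the budget would be spent exactly at the end of
    the window; values above 1.0 mean it runs out early.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 0.0
    return (bad_events / total_events) / (1.0 - slo_target)


# Illustrative numbers: a 99.9% target, one million requests, 250 failures.
print(f"budget: {error_budget(0.999, 1_000_000):.0f} failed requests allowed")
print(f"remaining: {budget_remaining(0.999, 1_000_000, 250):.0%}")
print(f"burn rate: {burn_rate(0.999, 1_000_000, 250):.2f}")
```

In this framing, a team might slow down risky releases once the remaining budget approaches zero, which is where coordination with development comes in.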

Lead / Staff

SRE strategy. Reliability culture. SLO standards. Incident management processes. Coordination with product.

Principal

Enterprise reliability strategy. Multi-region architecture. SRE culture at scale. Industry best practices.

Skill matrix

61 skills × 5 levels.

Proficiency levels: A = Awareness, W = Working, V = Advanced, E = Expert

AI-Assisted Development

4 skills
Skill                           Jun  Mid  Sen  Lead  Princ
GitHub Copilot                  A    W    A    E     E
Cursor IDE                      A    W    A    E     E
ChatGPT / Claude                A    W    A    E     E
Prompt Engineering for Code     A    W    A    E     E

API & Integration

3 skills
Skill                           Jun  Mid  Sen  Lead  Princ
REST API Design                 A    W    A    E     E
GraphQL Design                  A    W    A    E     E
API Documentation               A    W    A    E     E

Architecture & System Design

4 skills
Skill                           Jun  Mid  Sen  Lead  Princ
System Design Fundamentals      A    W    A    E     E
High Load Architecture          A    W    A    E     E
Capacity Planning               A    W    A    E     E
Disaster Recovery Design        A    W    A    E     E

Backend Development

3 skills
Skill                           Jun  Mid  Sen  Lead  Princ
Python Web Frameworks           A    W    A    E     E
Apache Kafka                    A    W    A    E     E
Redis                           A    W    A    E     E

Cloud & Infrastructure

9 skills
Skill                           Jun  Mid  Sen  Lead  Princ
Docker                          A    W    A    E     E
Kubernetes Core                 A    W    A    E     E
Kubernetes Advanced             A    W    A    E     E
Helm                            A    W    A    E     E
Terraform                       A    W    A    E     E
AWS                             A    W    A    E     E
Network Fundamentals            A    W    A    E     E
Load Balancing                  A    W    A    E     E
VPN & Network Isolation         A    W    A    E     E

Database Management

3 skills
Skill                           Jun  Mid  Sen  Lead  Princ
PostgreSQL                      A    W    A    E     E
Database Indexing               A    W    A    E     E
Query Optimization              A    W    A    E     E

DevOps & CI/CD

3 skills
Skill                           Jun  Mid  Sen  Lead  Princ
GitHub Actions / GitLab CI      A    W    A    E     E
GitOps Practices                A    W    A    E     E
ArgoCD                          A    W    A    E     E

Observability & Monitoring

11 skills
Skill                           Jun  Mid  Sen  Lead  Princ
Structured Logging              A    W    A    E     E
ELK Stack                       A    W    A    E     E
Grafana Loki                    A    W    A    E     E
Prometheus & Grafana            A    W    A    E     E
Custom Business Metrics         A    W    A    E     E
OpenTelemetry                   A    W    A    E     E
Jaeger / Grafana Tempo          A    W    A    E     E
Continuous Profiling            A    W    A    E     E
APM Tools                       A    W    A    E     E
SLI / SLO / SLA                 A    W    A    E     E
On-Call Management              A    W    A    E     E

Performance Engineering

1 skill
Skill                           Jun  Mid  Sen  Lead  Princ
Latency Optimization            A    W    A    E     E

Programming Fundamentals

9 skills
Skill                           Jun  Mid  Sen  Lead  Princ
Algorithms & Complexity         A    W    A    E     E
Data Structures                 A    W    A    E     E
OOP & SOLID Principles          A    W    A    E     E
Design Patterns                 A    W    A    E     E
Multithreading                  A    W    A    E     E
Async Programming               A    W    A    E     E
Code Quality & Refactoring      A    W    A    E     E
Type Safety & Type Systems      A    W    A    E     E
Memory Management               A    W    A    E     E

Security

5 skills
Skill                           Jun  Mid  Sen  Lead  Princ
OWASP & Application Security    A    W    A    E     E
Secure Coding Practices         A    W    A    E     E
Secrets Management              A    W    A    E     E
JWT / OAuth2 / OIDC             A    W    A    E     E
Incident Response Process       A    W    A    E     E

Testing & QA

4 skills
Skill                           Jun  Mid  Sen  Lead  Princ
Unit Testing                    A    W    A    E     E
Integration Testing             A    W    A    E     E
E2E Testing                     A    W    A    E     E
Chaos Engineering               A    W    A    E     E

Version Control & Collaboration

2 skills
Skill                           Jun  Mid  Sen  Lead  Princ
Git Advanced                    A    W    A    E     E
Code Review                     A    W    A    E     E

Frequently asked questions

What skills are needed for the Site Reliability Engineer (SRE) role?

The Site Reliability Engineer (SRE) role covers 61 skills across 5 levels, from Junior to Principal. Of the 305 skill-level requirements, 139 are mandatory.

How do I advance to the next level in the Site Reliability Engineer (SRE) role?

Use the Grade Calculator to assess your current level and get personalized recommendations.

What technology stack is used in the Site Reliability Engineer (SRE) role?

The stack is defined separately for each of the 5 levels. It includes Linux, Prometheus/Grafana, PagerDuty/OpsGenie, Bash/Python scripting, Docker, Kubernetes basics, Kubernetes, Prometheus/Thanos, Grafana/Loki, OpenTelemetry, Terraform, Go/Python, Chaos Monkey basics, Runbook automation, Kubernetes advanced, Chaos Engineering (Litmus/Gremlin), eBPF tools, OpenTelemetry advanced, Custom exporters, Load testing (k6/Gatling)...

How does the community define the requirements for the Site Reliability Engineer (SRE) role?

Requirements for the role are defined by the community through a proposal system. Any member can suggest changes, which then go through voting and expert review.
