Data Engineer
Building and maintaining data pipelines, data warehouses, ensuring data availability and quality
Data Engineer is a role in the Data Engineering family. It covers 64 skills across 5 levels (Junior to Principal), with 186 skill-level entries marked as required. Key areas: Programming Fundamentals, Backend Development, Database Management.
Tech stack
Focus by level
Junior: Writing ETL scripts (Python/SQL). Working with Airflow DAGs. Loading data into the warehouse. Monitoring pipelines. Writing SQL queries for analysts.
Middle: Designing data pipelines. Working with Spark/Flink. Optimizing SQL queries on large datasets. Data quality checks. Working with the data warehouse.
Senior: Data platform architecture. Designing a data lake/lakehouse. Storage cost optimization. Designing real-time pipelines. Mentoring.
Lead: Data platform strategy. DataOps practices. Governance and lineage. Coordination with ML and Analytics. Data quality standards.
Principal: Enterprise data strategy. Multi-cloud data architecture. Data mesh. Cost optimization at scale. Vendor evaluation.
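As an illustration of the ETL scripting, warehouse loading, and data quality items listed above, here is a minimal sketch in Python. It is not taken from the role definition: the sample data, the `orders` table, the function names, and the use of an in-memory SQLite database as a stand-in for a warehouse are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical sample input; a real pipeline would read from a file, API, or object store.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,carol,7.25
3,carol,7.25
"""


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with a missing amount and de-duplicate on order_id."""
    seen = set()
    clean = []
    for row in rows:
        if not row["amount"]:
            continue  # missing amount: skip the row
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # duplicate order: keep the first occurrence only
        seen.add(order_id)
        clean.append((order_id, row["customer"], float(row["amount"])))
    return clean


def load(conn, rows):
    """Create the (hypothetical) orders table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def quality_checks(conn):
    """Run simple post-load checks and return a mapping of check name to pass/fail."""
    (total,) = conn.execute("SELECT COUNT(*) FROM orders").fetchone()
    (negatives,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE amount < 0"
    ).fetchone()
    return {"non_empty": total > 0, "no_negative_amounts": negatives == 0}


def run_pipeline():
    """Extract, transform, load into an in-memory database, then run the checks."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse connection
    load(conn, transform(extract(RAW_CSV)))
    return conn, quality_checks(conn)
```

In an orchestrator such as Airflow, each of these functions would typically become its own task, so that a failed quality check stops downstream steps; that wiring is left out here to keep the sketch self-contained.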
Skills matrix
64 skills × 5 levels.
AI-Assisted Development
4 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| GitHub Copilot | A | W | A | E | E |
| Cursor IDE | A | W | A | A | — |
| ChatGPT / Claude | A | W | A | E | E |
| Prompt Engineering for Code | A | W | A | E | — |
API & Integration
4 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| REST API Design | A | W | A | E | E |
| GraphQL Design | A | W | A | E | E |
| gRPC & Protocol Buffers | A | W | A | E | E |
| API Documentation | A | W | A | E | E |
Architecture & System Design
1 skill

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| System Design Fundamentals | A | W | A | E | E |
Backend Development
6 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Python Web Frameworks | A | W | A | E | E |
| Apache Kafka | A | W | A | E | E |
| Redis | A | W | A | E | E |
| Task Queues | A | W | A | E | E |
| Elasticsearch / OpenSearch | A | W | A | E | E |
| S3 / Object Storage | A | W | A | E | E |
Cloud & Infrastructure
5 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Docker | A | W | A | E | E |
| Kubernetes Core | A | W | A | E | E |
| Terraform | A | W | A | E | E |
| AWS | A | W | A | E | E |
| Network Fundamentals | A | W | A | — | — |
Data Engineering
14 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Apache Spark | A | W | A | E | E |
| dbt | A | W | A | E | E |
| Pandas / Polars | A | W | A | E | E |
| SQL-based ETL | A | W | A | E | E |
| Stream Processing | A | W | A | E | E |
| Delta Lake / Apache Iceberg | A | W | A | E | E |
| Data Lake Architecture | A | W | A | E | E |
| Data Warehouse Design | A | W | A | E | E |
| Data Catalog | A | W | A | E | E |
| Data Lineage | A | W | A | E | E |
| Data Contracts | A | W | A | E | E |
| Data Quality | A | W | A | E | E |
| Apache Airflow | A | W | A | E | E |
| Dagster / Prefect | A | W | A | E | E |
Database Management
9 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| PostgreSQL | A | W | A | E | E |
| ClickHouse | A | W | A | E | E |
| Apache Cassandra | A | W | A | E | E |
| Database Indexing | A | W | A | E | E |
| Query Optimization | A | W | A | E | E |
| Replication & High Availability | A | W | A | E | E |
| Backup & Disaster Recovery | A | W | A | E | E |
| Data Modeling & Schema Design | A | W | A | E | E |
| Database Migrations | A | W | A | E | E |
DevOps & CI/CD
1 skill

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| GitHub Actions / GitLab CI | A | W | A | E | E |
Observability & Monitoring
4 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Structured Logging | A | W | A | E | E |
| Prometheus & Grafana | A | W | A | — | — |
| OpenTelemetry | A | W | A | E | E |
| SLI / SLO / SLA | A | W | A | E | E |
Programming Fundamentals
8 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Algorithms & Complexity | A | W | A | E | E |
| Data Structures | A | W | A | E | E |
| OOP & SOLID Principles | A | W | A | E | E |
| Design Patterns | A | W | A | E | E |
| Multithreading | A | W | A | E | E |
| Async Programming | A | W | A | E | E |
| Code Quality & Refactoring | A | W | A | E | E |
| Type Safety & Type Systems | A | W | A | E | E |
Security
3 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| OWASP & Application Security | A | W | A | E | E |
| Secure Coding Practices | A | W | A | E | E |
| JWT / OAuth2 / OIDC | A | W | A | E | E |
Testing & QA
3 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Unit Testing | A | W | A | E | E |
| Integration Testing | A | W | A | E | E |
| E2E Testing | A | W | A | E | E |
Version Control & Collaboration
2 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Git Advanced | A | W | A | E | E |
| Code Review | A | W | A | E | E |
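Frequently asked questions
What skills does the Data Engineer role require?
The Data Engineer role requires 64 skills, with 186 skill-level entries marked as required. The skills are spread across 5 levels, from Junior to Principal.
How do I advance to the next level in the Data Engineer role?
Use the level calculator to assess your current level and get personalized recommendations. The system shows which skills you need to develop for promotion.
What tech stack does the Data Engineer role use?
The tech stack lists technologies for each of the 5 levels: Python 3.11+, SQL, Apache Airflow, PostgreSQL/ClickHouse, pandas, Docker, Git, Python 3.12+, SQL advanced, Airflow, Spark/PySpark, ClickHouse/BigQuery, dbt, Kafka basics, S3/GCS, Python, Spark/Flink, ClickHouse/BigQuery/Snowflake, Kafka/Debezium, Delta Lake/Iceberg, Terraform, Kubernetes, Data lineage (OpenLineage)...
How does the community define the requirements for the Data Engineer role?
Role requirements are set by the community through a proposal system. Any member can propose changes, which take effect after a vote and expert review.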