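Domain
Data Engineering
Skill profile
Stored procedures, CTEs, window functions, bulk operations, SQL transformations
Number of roles
6
Roles that include this skill
Number of levels
5
Structured growth path
Mandatory requirements
28
The remaining 2 are optional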
Data Engineering
Batch Processing
2026/3/17
The tables below show how the depth of this skill changes from junior to principal, one table per level.
| Role | Requirement | Description |
|---|---|---|
| Analytics Engineer | Required | Writes basic SQL transformations in dbt: SELECT with column renaming, type casting, simple filters for staging models. Understands the ELT concept and the role of SQL as the primary language for analytical transformations. |
| BI Analyst | Required | Understands SQL-based ETL basics for BI warehouses. Writes simple extract-load queries for dimensional tables. Follows existing star schema load patterns and naming conventions for staging layers. |
| Data Analyst | Required | Understands SQL-based ETL fundamentals for analytical datasets. Writes basic data extraction and cleaning queries. Follows established pipeline patterns to prepare filtered datasets for ad-hoc analysis requests. |
| Data Engineer | Required | Writes SQL for ETL: INSERT INTO SELECT, MERGE for upserts, CTEs for readable transformations. Uses window functions (ROW_NUMBER, LAG, LEAD) for data processing. |
| Data Scientist | Optional | Understands SQL-based ETL for ML data preparation. Writes basic queries to extract and filter training datasets. Follows established patterns for feature extraction and handles simple data type conversions in ETL steps. |
| ML Engineer | Required | Writes SQL for extracting training data. Understands ETL for ML: extract features, transform, load into a training format. Uses pandas.read_sql for data loading. |
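
The entry-level Data Engineer row names INSERT INTO SELECT, upserts, CTEs, and window functions. The sketch below combines them in one load step, run through Python's built-in `sqlite3` so it is self-contained. The table names (`raw_orders`, `orders`) and sample rows are hypothetical. SQLite has no MERGE statement, so `INSERT ... ON CONFLICT DO UPDATE` stands in for it here; window functions and upserts assume SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount REAL, loaded_at TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, loaded_at TEXT);
    INSERT INTO raw_orders VALUES
        (1, 'ann', 10.0, '2024-01-01'),
        (1, 'ann', 12.0, '2024-01-02'),
        (2, 'bob', 20.0, '2024-01-01');
    INSERT INTO orders VALUES (2, 'bob', 15.0, '2023-12-31');
""")

# CTE + ROW_NUMBER keeps only the latest version of each order,
# then INSERT ... SELECT upserts it into the target table.
UPSERT_SQL = """
WITH ranked AS (
    SELECT order_id, customer, amount, loaded_at,
           ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY loaded_at DESC) AS rn
    FROM raw_orders
)
INSERT INTO orders (order_id, customer, amount, loaded_at)
SELECT order_id, customer, amount, loaded_at
FROM ranked
WHERE rn = 1
ON CONFLICT(order_id) DO UPDATE SET
    customer = excluded.customer,
    amount = excluded.amount,
    loaded_at = excluded.loaded_at
"""

with conn:  # commit on success, roll back on error
    conn.execute(UPSERT_SQL)

rows = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()

# LAG compares each row with the previous version of the same order.
deltas = conn.execute("""
    SELECT order_id, loaded_at,
           amount - LAG(amount) OVER (PARTITION BY order_id ORDER BY loaded_at) AS change
    FROM raw_orders
    ORDER BY order_id, loaded_at
""").fetchall()
```

After the load, `rows` holds one row per order with the most recent amount, and `deltas` holds the change in amount between consecutive versions (NULL for the first version of each order).
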
| Role | Requirement | Description |
|---|---|---|
| Analytics Engineer | Required | Develops complex SQL transformations in dbt: window functions for metric calculation, CTE chains for multi-step business logic, Jinja macros to keep logic DRY. Implements incremental models with a merge strategy to avoid full rebuilds. |
| BI Analyst | Required | Builds ETL pipelines that populate dimensional models for BI reporting. Implements SCD Type 1/2 loads, manages surrogate keys, and ensures referential integrity across fact and dimension tables in the warehouse. |
| Data Analyst | Required | Builds SQL ETL pipelines for cohort extraction and analytical dataset preparation. Implements data cleaning transformations, handles missing values and outliers, and creates reusable ad-hoc data transformation templates. |
| Data Engineer | Required | Designs SQL transformations: stored procedures for complex ETL, parameterized queries, temp tables for intermediate computations. Optimizes execution plans. Manages transaction control. |
| Data Scientist | Optional | Builds ETL pipelines for feature engineering and ML training data preparation. Implements SQL-based feature transforms, manages dataset versioning through snapshot tables, and ensures reproducibility of data extraction for model experiments. |
| ML Engineer | Required | Designs SQL ETL for feature computation. Uses dbt for ML feature transformation. Writes incremental ETL for updating training data. Automates these pipelines through Airflow. |
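
Several rows at this level mention SCD Type 2 loads, surrogate keys, parameterized queries, and transaction control. The following is a minimal sketch of how those pieces can fit together, again using `sqlite3`. The `dim_customer` schema, the `apply_scd2` helper, and the sample values are assumptions made for illustration, not a prescribed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_sk INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        customer_id TEXT NOT NULL,                      -- business key
        city TEXT NOT NULL,                             -- tracked attribute
        valid_from TEXT NOT NULL,
        valid_to TEXT,                                  -- NULL marks the current version
        is_current INTEGER NOT NULL DEFAULT 1
    )
""")

def apply_scd2(conn, customer_id, city, as_of):
    """Close the current row if the tracked attribute changed, then add a new version."""
    with conn:  # one transaction: both statements commit or neither does
        current = conn.execute(
            "SELECT customer_sk, city FROM dim_customer "
            "WHERE customer_id = ? AND is_current = 1",
            (customer_id,),
        ).fetchone()
        if current is not None and current[1] == city:
            return  # attribute unchanged, keep the existing version
        if current is not None:
            conn.execute(
                "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
                "WHERE customer_sk = ?",
                (as_of, current[0]),
            )
        conn.execute(
            "INSERT INTO dim_customer (customer_id, city, valid_from) VALUES (?, ?, ?)",
            (customer_id, city, as_of),
        )

apply_scd2(conn, "c1", "Oslo", "2024-01-01")    # first version
apply_scd2(conn, "c1", "Oslo", "2024-02-01")    # no change, no new row
apply_scd2(conn, "c1", "Bergen", "2024-03-01")  # change, new version

history = conn.execute(
    "SELECT city, valid_from, valid_to, is_current "
    "FROM dim_customer ORDER BY customer_sk"
).fetchall()
```

Wrapping the close-and-insert pair in a single transaction means a failure between the two statements cannot leave a customer without a current row.
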
| Role | Requirement | Description |
|---|---|---|
| Analytics Engineer | Required | Architects optimal SQL transformations for the analytical warehouse: decomposes complex logic into intermediate models and applies warehouse-specific optimizations (Snowflake QUALIFY, BigQuery STRUCT). Creates reusable dbt macros for common patterns. |
| BI Analyst | Required | Architects end-to-end ETL for enterprise BI warehouses. Designs incremental load strategies, optimizes star/snowflake schema refresh cycles, and implements data quality gates ensuring report-ready datasets across business domains. |
| Data Analyst | Required | Architects complex ETL workflows for cross-functional analytical datasets. Designs cohort extraction frameworks, builds self-service data cleaning pipelines, and optimizes transformation logic for large-scale ad-hoc analytical workloads. |
| Data Engineer | Required | Designs SQL-based ETL architecture: ELT pattern (load-then-transform), incremental processing through merge/upsert, materialized views for performance. Integrates with dbt for version-controlled SQL. |
| Data Scientist | Required | Architects ETL workflows for end-to-end ML pipelines including feature stores. Designs scalable feature engineering transforms, implements data versioning strategies, and builds automated training data validation gates within ETL orchestration. |
| ML Engineer | Required | Designs ETL architecture for the ML data pipeline. Optimizes ETL for large data volumes. Configures data quality checks in ETL. Integrates ETL with the feature store. |
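
The data quality gates and checks mentioned at this level can be sketched as SQL queries that must each return zero before staged rows are published. The check names, the `stg_sales` and `fact_sales` tables, and the `publish` helper below are hypothetical; a real gate would more likely live in dbt tests or an orchestrator task.

```python
import sqlite3

# Hypothetical checks: each query counts offending rows and must return 0 to pass.
CHECKS = {
    "null_customer_id": "SELECT COUNT(*) FROM stg_sales WHERE customer_id IS NULL",
    "negative_amount": "SELECT COUNT(*) FROM stg_sales WHERE amount < 0",
    "duplicate_sale_id": """
        SELECT COUNT(*) FROM (
            SELECT sale_id FROM stg_sales GROUP BY sale_id HAVING COUNT(*) > 1
        )
    """,
}

def run_quality_gate(conn):
    """Return a dict mapping each failed check name to its offending count."""
    failures = {}
    for name, sql in CHECKS.items():
        (bad_rows,) = conn.execute(sql).fetchone()
        if bad_rows:
            failures[name] = bad_rows
    return failures

def publish(conn):
    """Copy staging rows into the fact table only if every check passes."""
    failures = run_quality_gate(conn)
    if failures:
        raise ValueError(f"quality gate failed: {failures}")
    with conn:
        conn.execute("INSERT INTO fact_sales SELECT * FROM stg_sales")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (sale_id INTEGER, customer_id TEXT, amount REAL);
    CREATE TABLE fact_sales (sale_id INTEGER, customer_id TEXT, amount REAL);
    INSERT INTO stg_sales VALUES (1, 'c1', 10.0), (2, NULL, 5.0), (2, 'c3', -1.0);
""")

failures = run_quality_gate(conn)
```

With the sample rows above, all three checks fail, so calling `publish(conn)` raises and `fact_sales` stays empty.
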
| Role | Requirement | Description |
|---|---|---|
| Analytics Engineer | Required | Defines organizational SQL transformation standards: coding style guide, mandatory patterns (surrogate keys, audit columns), a library of dbt macros and packages. Implements automated SQL review and performance benchmarking for critical models. |
| BI Analyst | Required | Defines BI warehouse ETL strategy and standards across teams. Governs dimensional modeling conventions, orchestrates cross-domain data integration, and establishes SLA-driven refresh schedules for executive dashboards. |
| Data Analyst | Required | Defines ETL standards and data cleaning methodology for analytics teams. Establishes cohort definition governance, coordinates cross-team dataset preparation workflows, and drives adoption of reproducible analytical data pipelines. |
| Data Engineer | Required | Defines SQL standards for the data team: style guide, review checklist, performance budgets. Chooses between SQL-based ETL (dbt) and code-based ETL (PySpark) depending on the scenario. |
| Data Scientist | Required | Defines ML data platform ETL strategy and feature engineering standards. Governs training data preparation workflows across DS teams, establishes data versioning policies, and coordinates ETL infrastructure for model training at scale. |
| ML Engineer | Required | Defines ETL strategy for ML data. Coordinates with Data Engineering on ML data requirements. Designs data contracts for ML features. |
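
One way to make a data contract for ML features concrete is to declare the expected columns and compare them with the table that the ETL actually produces. The sketch below is an assumption-laden illustration: the `FEATURE_CONTRACT` mapping, the `contract_violations` helper, and the `user_features` table are all hypothetical, and it checks only column names, declared types, and nullability.

```python
import sqlite3

# Hypothetical contract: column name -> (declared SQL type, nullable)
FEATURE_CONTRACT = {
    "user_id": ("INTEGER", False),
    "orders_30d": ("INTEGER", False),
    "avg_basket": ("REAL", True),
}

def contract_violations(conn, table, contract):
    """Compare a table's schema with the contract and list every mismatch.

    The table name is interpolated directly, so it must come from trusted code.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    actual = {
        row[1]: (row[2].upper(), not row[3])
        for row in conn.execute(f"PRAGMA table_info({table})")
    }
    problems = []
    for column, expected in contract.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != expected:
            problems.append(f"{column}: expected {expected}, found {actual[column]}")
    return problems

conn = sqlite3.connect(":memory:")
# This producer table breaks the contract twice: orders_30d is nullable
# and avg_basket is absent.
conn.execute("CREATE TABLE user_features (user_id INTEGER NOT NULL, orders_30d INTEGER)")

problems = contract_violations(conn, "user_features", FEATURE_CONTRACT)
```

Running a check like this in CI or as the first task of a training pipeline lets the producing and consuming teams detect schema drift before it reaches model training.
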
| Role | Requirement | Description |
|---|---|---|
| Analytics Engineer | Required | Architects the enterprise transformation layer strategy: SQL dialect unification through dbt adapters, portable business logic between warehouses. Defines the architecture for supporting real-time and batch transformations on a unified platform. |
| BI Analyst | Required | Shapes the organization-wide BI data platform vision and ETL architecture. Drives adoption of modern ELT patterns, defines enterprise semantic layer standards, and aligns warehouse ETL strategy with the long-term business intelligence roadmap. |
| Data Analyst | Required | Shapes enterprise analytical data strategy and ETL architecture. Defines organization-wide data cleaning standards, designs scalable cohort analysis infrastructure, and aligns ETL capabilities with strategic analytical objectives across business units. |
| Data Engineer | Required | Designs transformation strategy: SQL for declarative ETL, Python for complex logic, hybrid approaches. Defines query engine selection (Trino, BigQuery, Redshift) by workload pattern. |
| Data Scientist | Required | Shapes the organization-wide ML data architecture and ETL vision. Drives feature store adoption, defines enterprise standards for training data lineage and versioning, and aligns ETL infrastructure with the long-term AI/ML platform strategy. |
| ML Engineer | Required | Defines data pipeline strategy for the ML platform. Evaluates ETL vs ELT vs streaming for ML. Designs data architecture for enterprise ML. |