Skill: Data Quality
Domain: Data Engineering
Skill profile: Great Expectations, dbt tests, data contracts, anomaly detection
Roles: 6 (where this skill appears)
Levels: 5 (structured growth path)
Requirements: 28 mandatory, the other 2 optional
3/17/2026
Compare the expectations for your current level with those for the next level to see what to cover in order to advance. The five tables below show how skill depth grows from Junior (first table) to Principal (last table), with one table per level and one row per role.

Level 1 (Junior)

| Role | Required | Description |
|---|---|---|
| Analytics Engineer | Required | Runs basic dbt tests: not_null, unique, accepted_values, relationships. Understands test results and fixes simple data quality issues. Monitors dbt test warnings in CI. |
| BI Analyst | Required | Uses basic data quality checks in Tableau and Power BI dashboards. Validates data sources in SQL before building reports. Follows team standards for data cleansing in Excel and Google Sheets. Identifies obvious anomalies in ClickHouse/PostgreSQL query results. |
| Data Analyst | Required | Applies data quality checks using pandas and SQL before analysis. Validates datasets for completeness in Jupyter notebooks. Uses profiling tools to detect missing values and outliers. Follows team conventions for data cleansing in Superset reports. |
| Data Engineer | Required | Writes basic data quality checks: NOT NULL, unique constraints, value ranges. Uses dbt tests or Great Expectations for validation. Understands metrics: completeness, accuracy, timeliness. |
| Data Scientist | Optional | Validates training datasets for completeness and label quality using pandas and numpy. Applies data profiling to detect missing values and distribution anomalies in Jupyter. Uses scikit-learn utilities for feature validation before model training. Documents data quality issues found during EDA. |
| ML Engineer | Required | Understands the importance of data quality for ML. Performs basic checks: null values, duplicates, distribution shifts. Uses pandas-profiling for EDA. |
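
The basic checks named in the rows above (not null, unique, accepted values, value ranges) can be sketched in plain Python. This is a minimal illustration, not the dbt or Great Expectations API: the sample rows, column names, and helper functions are all hypothetical.

```python
from collections import Counter

# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"order_id": 1, "status": "paid", "amount": 20.0},
    {"order_id": 2, "status": "refunded", "amount": 5.5},
    {"order_id": 2, "status": "paid", "amount": -3.0},
    {"order_id": 3, "status": None, "amount": 12.0},
]


def not_null(rows, column):
    """Return the indexes of rows where `column` is missing or None."""
    return [i for i, r in enumerate(rows) if r.get(column) is None]


def unique(rows, column):
    """Return the non-null values of `column` that occur more than once."""
    counts = Counter(r.get(column) for r in rows)
    return sorted(v for v, n in counts.items() if n > 1 and v is not None)


def accepted_values(rows, column, allowed):
    """Return the indexes of rows whose non-null value is not in `allowed`."""
    return [i for i, r in enumerate(rows)
            if r.get(column) is not None and r[column] not in allowed]


def in_range(rows, column, low, high):
    """Return the indexes of rows whose non-null value is outside [low, high]."""
    return [i for i, r in enumerate(rows)
            if r.get(column) is not None and not (low <= r[column] <= high)]


# Each entry lists the offending rows (or values) for one check.
failures = {
    "status not_null": not_null(rows, "status"),
    "order_id unique": unique(rows, "order_id"),
    "status accepted_values": accepted_values(rows, "status", {"paid", "refunded"}),
    "amount in_range": in_range(rows, "amount", 0, 10_000),
}
```

Returning the offending rows rather than a bare pass/fail flag makes it easier to trace a failed check back to the records that caused it.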

Level 2

| Role | Required | Description |
|---|---|---|
| Analytics Engineer | Required | Configures comprehensive dbt testing: custom generic tests, the dbt-expectations package for statistical checks, freshness tests for sources. Implements data quality dashboards for monitoring quality metrics. |
| BI Analyst | Required | Implements automated data quality checks in BI pipelines using SQL and dbt tests. Configures freshness and completeness monitors for Tableau/Power BI dashboards. Builds validation layers in ClickHouse and BigQuery to catch schema drift. Creates quality scorecards and alerting for key metrics. |
| Data Analyst | Required | Builds automated validation pipelines using Great Expectations and dbt tests. Implements statistical anomaly detection for A/B testing datasets. Configures quality monitors in Airflow DAGs to catch upstream issues. Designs profiling reports with pandas-profiling and custom SQL checks. |
| Data Engineer | Required | Configures data quality framework: Great Expectations/Soda for automated checks, custom expectations, alerting on failures. Monitors data freshness and volume anomalies. |
| Data Scientist | Optional | Independently implements data pipelines with data quality tools. Optimizes pipeline performance. Ensures data quality. |
| ML Engineer | Required | Uses Great Expectations/Soda for data validation. Configures automated data quality checks in ML pipeline. Monitors data drift before retraining. |
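
Freshness and volume monitoring, mentioned in several rows above, can be sketched as follows. The z-score rule and the threshold of 3 are assumptions chosen for illustration; tools such as Great Expectations or Soda have their own configuration, which this sketch does not reproduce. The function names and sample numbers are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev


def is_fresh(last_loaded_at, now, max_age):
    """True if the most recent load is no older than `max_age`."""
    return now - last_loaded_at <= max_age


def volume_is_anomalous(history, today, z_threshold=3.0):
    """Flag today's row count if it lies more than `z_threshold`
    sample standard deviations from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Flat history: any change at all counts as an anomaly.
        return today != mu
    return abs(today - mu) / sigma > z_threshold


# Hypothetical daily row counts for one table.
history = [1000, 1020, 980, 1010, 990]

now = datetime(2026, 3, 17, 12, 0, tzinfo=timezone.utc)
stale_load = now - timedelta(hours=30)

fresh = is_fresh(stale_load, now, max_age=timedelta(hours=24))
drop_flagged = volume_is_anomalous(history, today=400)
normal_flagged = volume_is_anomalous(history, today=1005)
```

A simple z-score assumes roughly stable volumes; tables with strong weekly seasonality would likely need a per-weekday baseline instead.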

Level 3

| Role | Required | Description |
|---|---|---|
| Analytics Engineer | Required | Architects the data quality strategy for the analytics platform: multi-layer testing (source → staging → marts), anomaly detection via dbt + Elementary, automated alerting. Integrates quality checks into the CI/CD pipeline. |
| BI Analyst | Required | Architects data quality frameworks across Tableau, Power BI, and Superset. Designs end-to-end validation strategies for BigQuery and ClickHouse warehouses. Implements automated lineage tracking and quality scoring for business KPIs. Mentors team on data governance and quality-first culture. |
| Data Analyst | Required | Designs data quality architecture using Great Expectations, dbt, and custom frameworks. Implements observability with anomaly detection and root-cause analysis. Establishes data contracts between producer and consumer teams. Drives governance and SLA definitions for analytical datasets. |
| Data Engineer | Required | Designs data quality system: multi-layer validation (source → staging → mart), anomaly detection (statistical), automated remediation. Integrates quality metrics into data catalog. |
| Data Scientist | Required | Designs data architecture with data quality tools. Optimizes for big data. Implements data governance and quality frameworks. |
| ML Engineer | Required | Designs data quality framework for ML. Integrates data validation into ML pipeline. Configures alerting on data anomalies. Defines data quality SLAs for ML. |
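
A data contract between producer and consumer teams, as mentioned in the Data Analyst row above, can be pictured as a declared schema that every record is checked against. The sketch below is a minimal illustration under assumed names: the `Field` class, the `CONTRACT` fields, and the `violations` helper are hypothetical and do not correspond to any specific contract tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One field in a hypothetical data contract."""
    name: str
    type: type
    nullable: bool = False


# Illustrative contract agreed between a producer and a consumer team.
CONTRACT = [
    Field("user_id", int),
    Field("email", str),
    Field("signup_source", str, nullable=True),
]


def violations(record, contract):
    """Return a list of human-readable contract violations for one record."""
    problems = []
    for field in contract:
        if field.name not in record:
            problems.append(f"{field.name}: missing field")
        elif record[field.name] is None:
            if not field.nullable:
                problems.append(f"{field.name}: null not allowed")
        elif not isinstance(record[field.name], field.type):
            actual = type(record[field.name]).__name__
            problems.append(
                f"{field.name}: expected {field.type.__name__}, got {actual}"
            )
    return problems


good = violations({"user_id": 1, "email": "a@example.com", "signup_source": None}, CONTRACT)
bad = violations({"user_id": "1", "email": None}, CONTRACT)
```

Keeping the contract as data rather than scattered assertions lets both teams review changes to it in one place.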

Level 4

| Role | Required | Description |
|---|---|---|
| Analytics Engineer | Required | Defines the organization's data quality standards: quality SLA per layer, mandatory tests for production models, processes for responding to quality incidents. Implements a data observability platform (Monte Carlo, Elementary). |
| BI Analyst | Required | Defines data quality strategy across BI and analytics teams. Coordinates data mesh principles with embedded quality gates. Shapes platform roadmap prioritizing observability, lineage, and quality automation. Drives cross-team alignment on data contracts and SLAs. |
| Data Analyst | Required | Leads data quality strategy across analytical teams and data domains. Shapes platform architecture to embed quality checks at every pipeline stage. Coordinates data mesh adoption with domain-owned quality accountability. Establishes org-wide quality KPIs and improvement processes. |
| Data Engineer | Required | Defines data quality standards: SLA per dataset, quality dimensions (accuracy, completeness, consistency, timeliness), ownership model. Implements data quality scorecard. |
| Data Scientist | Required | Defines data quality strategy for ML teams ensuring high-quality training datasets. Coordinates validation standards across feature stores and model pipelines. Drives adoption of observability tools and quality gates before training. Aligns quality practices with MLOps and data mesh. |
| ML Engineer | Required | Defines data quality strategy for the ML organization. Introduces data quality culture in the ML team. Coordinates with Data Engineering on data quality. |
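
The Data Engineer row above combines a per-dataset SLA, quality dimensions, and a scorecard. One way these pieces could fit together is sketched below. The dataset name, the measured pass rates, the 0.95 threshold, and the unweighted average are all assumptions made for illustration.

```python
# Hypothetical pass rates (fraction of checks passing) per quality dimension.
measurements = {
    "orders": {
        "accuracy": 0.995,
        "completeness": 0.97,
        "consistency": 1.0,
        "timeliness": 0.90,
    },
}

# Hypothetical SLA: the minimum pass rate every dimension must reach.
SLA = {"orders": 0.95}


def scorecard(measurements, sla):
    """Summarize each dataset: overall score, breached dimensions, SLA status."""
    card = {}
    for dataset, dims in measurements.items():
        threshold = sla[dataset]
        breached = sorted(d for d, score in dims.items() if score < threshold)
        card[dataset] = {
            # Unweighted mean across dimensions, rounded for display.
            "score": round(sum(dims.values()) / len(dims), 3),
            "breached": breached,
            "meets_sla": not breached,
        }
    return card


card = scorecard(measurements, SLA)
```

Reporting the breached dimensions alongside the overall score matters here: a dataset can average above the threshold while still failing on one dimension, as the timeliness figure in this example does.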

Level 5 (Principal)

| Role | Required | Description |
|---|---|---|
| Analytics Engineer | Required | Architects the enterprise data quality strategy: unified quality framework for all sources and transformations, ML-driven anomaly detection, automated root cause analysis. Defines a quality-as-code approach with version-controlled rules. |
| BI Analyst | Required | Defines enterprise data quality vision across BI, analytics, and reporting. Architects governance frameworks with automated quality enforcement for Tableau, Power BI, and cloud warehouses. Establishes standards for data certification, lineage, and trust scoring. Influences industry practices. |
| Data Analyst | Required | Shapes enterprise data quality strategy across analytical and operational domains. Designs governance integrating quality, lineage, and cataloging into a unified platform. Establishes data certification programs and quality maturity models. Drives data-driven culture with measurable standards. |
| Data Engineer | Required | Designs data quality platform: centralized quality monitoring, ML-based anomaly detection, quality-driven data trust scoring. Defines organization-wide data quality framework. |
| Data Scientist | Required | Defines organizational data quality strategy ensuring reliable ML training and production data. Architects governance integrating validation into feature stores and model registries. Establishes certification and quality maturity models for ML workflows. Drives data-centric AI best practices. |
| ML Engineer | Required | Defines enterprise data quality strategy for ML. Designs data governance for the ML platform. Evaluates data quality tools and frameworks. |
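
The quality-as-code approach with version-controlled rules, named in the Analytics Engineer row above, can be illustrated by keeping rules as declarative data and evaluating them with a small engine. This is a sketch under assumptions: the JSON rule format, the check names, and the `evaluate` function are hypothetical, and the rules are inlined here although they would normally live in a file under version control.

```python
import json

# Hypothetical rule file content, inlined so the example is self-contained.
RULES_JSON = """
[
  {"column": "country", "check": "not_null"},
  {"column": "country", "check": "accepted_values", "values": ["DE", "FR", "US"]},
  {"column": "age", "check": "between", "min": 0, "max": 120}
]
"""

# Each check returns True when the value passes. Null values are left to
# the not_null rule, so the other checks let them through.
CHECKS = {
    "not_null": lambda value, rule: value is not None,
    "accepted_values": lambda value, rule: value is None or value in rule["values"],
    "between": lambda value, rule: value is None or rule["min"] <= value <= rule["max"],
}


def evaluate(rows, rules):
    """Apply every rule to every row and count the failing rows per rule."""
    results = []
    for rule in rules:
        check = CHECKS[rule["check"]]
        failed = sum(1 for row in rows if not check(row.get(rule["column"]), rule))
        results.append({
            "rule": f'{rule["column"]}.{rule["check"]}',
            "failed_rows": failed,
        })
    return results


rules = json.loads(RULES_JSON)
rows = [
    {"country": "DE", "age": 34},
    {"country": None, "age": 150},
    {"country": "BR", "age": 28},
]
report = evaluate(rows, rules)
```

Because the rules are plain data, a change to a threshold or an accepted value shows up as a reviewable diff rather than a code change.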