Data Scientist is a role in the ML & AI Engineering family. It covers 78 skills across 5 levels (Junior to Principal), with 101 entries marked as required. Key areas: Programming Fundamentals, Backend Development, Database Management.
Tech stack
Focus by level
Junior: Exploratory Data Analysis (EDA). Building baseline models. Feature engineering. Data visualization. Preparing reports.
Mid: Formalizing business problems as ML tasks. Building and validating models. A/B testing. Presenting results to stakeholders.
Senior: Researching new approaches (NLP, CV, RecSys). Designing experiments. Publishing results. Mentoring. Cross-functional collaboration.
Lead: Data Science strategy. Prioritizing ML projects by business impact. Coordinating DS and Engineering. Experimentation standards.
Principal: AI research strategy. Conference publications. Building DS culture. LLM/GenAI adoption strategy.
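As an illustration of the A/B testing item in the level focus above, here is a minimal sketch of a two-sided two-proportion z-test using only the Python standard library. The function name `two_proportion_z_test` and the sample counts are hypothetical, chosen for illustration; the page itself does not prescribe a method or library for A/B analysis.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a, conv_b: number of conversions in groups A and B
    n_a, n_b: number of users in groups A and B
    Returns (z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    # Standard error of the difference under the null hypothesis.
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 100/1000 conversions in A, 150/1000 in B.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.3f}, p = {p:.5f}")
```

The normal approximation used here is reasonable only for fairly large groups; with small samples or very low conversion rates an exact test is usually the safer choice.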
Skill matrix
78 skills × 5 levels.
AI-Assisted Development
4 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| GitHub Copilot | A | W | A | E | E |
| Cursor IDE | A | W | A | A | — |
| ChatGPT / Claude | A | W | A | E | E |
| Prompt Engineering for Code | A | W | A | E | E |
API & Integration
3 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| REST API Design | A | W | A | E | E |
| GraphQL Design | A | W | A | E | E |
| API Documentation | A | W | A | E | E |
Architecture & System Design
1 skill

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| System Design Fundamentals | A | W | A | E | E |
Backend Development
2 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Python Web Frameworks | A | W | A | E | E |
| Redis | A | W | A | E | E |
Cloud & Infrastructure
4 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Docker | A | W | A | E | E |
| Kubernetes Core | A | W | A | E | E |
| AWS | A | W | A | E | E |
| Network Fundamentals | A | W | A | E | E |
Data Engineering
6 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Apache Spark | A | W | A | E | E |
| Pandas / Polars | A | W | A | E | E |
| SQL-based ETL | A | W | A | E | E |
| Data Quality | A | W | A | E | E |
| BI Dashboards | A | W | A | E | E |
| Data Visualization | A | W | A | E | E |
Database Management
5 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| PostgreSQL | A | W | A | E | E |
| Advanced SQL | A | W | A | E | E |
| ClickHouse | A | W | A | E | E |
| Database Indexing | A | W | A | E | E |
| Query Optimization | A | W | A | E | E |
DevOps & CI/CD
1 skill

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| GitHub Actions / GitLab CI | A | W | A | E | E |
Machine Learning & AI
35 skills
Observability & Monitoring
3 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Structured Logging | A | W | A | E | E |
| Prometheus & Grafana | A | W | A | E | E |
| OpenTelemetry | A | W | A | E | E |
Programming Fundamentals
7 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Algorithms & Complexity | A | W | A | E | E |
| Data Structures | A | W | A | E | E |
| OOP & SOLID Principles | A | W | A | E | E |
| Design Patterns | A | W | A | E | E |
| Multithreading | A | W | A | E | E |
| Async Programming | A | W | A | E | E |
| Code Quality & Refactoring | A | W | A | E | E |
Security
2 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| OWASP & Application Security | A | W | A | E | E |
| Secure Coding Practices | A | W | A | E | E |
Testing & QA
3 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Unit Testing | A | W | A | E | E |
| Unit Testing | A | W | A | E | E |
| Integration Testing | A | W | A | E | E |
Version Control & Collaboration
2 skills

| Skill | Jun | Mid | Sen | Lead | Princ |
|---|---|---|---|---|---|
| Git Advanced | A | W | A | E | E |
| Code Review | A | W | A | E | E |
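Frequently asked questions
What skills does the Data Scientist role require?
The Data Scientist role covers 78 skills, with 101 entries marked as required. The skills are spread across 5 levels, from Junior to Principal.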
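How do I advance to the next level in the Data Scientist role?
Use the level calculator to assess your current level and get personalized recommendations. The system shows which skills you need to develop for promotion.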
What tech stack does the Data Scientist role use?
The tech stack lists technologies for each of the 5 levels: Python 3.11+, pandas, numpy, matplotlib/seaborn, scikit-learn, Jupyter, SQL, Python, scikit-learn, XGBoost/CatBoost/LightGBM, PyTorch basics, Optuna/Hyperopt, MLflow, SQL advanced, Spark basics, PyTorch/JAX, Transformers (HuggingFace), Deep Learning advanced, Causal Inference, Bayesian methods, Spark, LLM fine-tuning...
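How does the community define the requirements for the Data Scientist role?
Role requirements are defined by the community through a proposal system. Any member can propose changes, which take effect after voting and expert review.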