Domain
Machine Learning & AI
Skill profile
This skill defines expectations across roles and levels.
Roles: 1 (where this skill appears)
Levels: 5 (structured growth path)
Mandatory requirements: 0 (the other 5 are optional)
LLM & Generative AI
22/2/2026
The table shows how depth grows from Junior to Principal.
| Level | Role | Required | Description |
|---|---|---|---|
| 1 (Junior) | LLM Engineer | No | Knows distributed training basics: DataParallel, model parallelism. Understands gradient synchronization concepts and runs simple multi-GPU training in PyTorch under mentor guidance. |
| 2 | LLM Engineer | No | Independently configures distributed training with DeepSpeed ZeRO and FSDP. Configures data, pipeline, and tensor parallelism for models up to 7B parameters on GPU clusters. |
| 3 | LLM Engineer | No | Designs distributed training strategies for large LLMs: 3D parallelism, ZeRO-3 offloading, activation checkpointing. Optimizes communication overhead and GPU utilization on 100+ GPUs. |
| 4 | LLM Engineer | No | Defines distributed training infrastructure for the LLM team. Establishes best practices for configuring multi-node training and for monitoring and debugging distributed jobs on GPU clusters. |
| 5 (Principal) | LLM Engineer | No | Shapes enterprise distributed training strategy. Defines approaches to scaling to 1000+ GPUs, cost optimization, and GPU resource planning for pre-training and fine-tuning. |
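
To make the ZeRO and GPU resource planning expectations above more concrete, the following is a minimal sketch of the usual back-of-the-envelope estimate for per-GPU model-state memory. It assumes mixed-precision training with Adam (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer state) and ignores activations, buffers, and fragmentation. The names `ModelStateBytes` and `zero_model_state_bytes` are hypothetical helpers written for this illustration, not part of DeepSpeed or PyTorch.

```python
from dataclasses import dataclass

BYTES_PER_GB = 1e9  # decimal gigabytes, to keep the arithmetic easy to follow


@dataclass(frozen=True)
class ModelStateBytes:
    """Per-GPU memory for the three model-state components, in bytes."""
    params: float
    grads: float
    optimizer: float

    @property
    def total(self) -> float:
        return self.params + self.grads + self.optimizer


def zero_model_state_bytes(num_params: int, num_gpus: int, stage: int) -> ModelStateBytes:
    """Estimate per-GPU model-state memory for a given ZeRO stage.

    Assumption: mixed-precision Adam, i.e. 2 + 2 + 12 = 16 bytes per parameter
    before any sharding. Stage 0 means plain data parallelism (no sharding).
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    params = 2.0 * num_params      # fp16 weights
    grads = 2.0 * num_params       # fp16 gradients
    optimizer = 12.0 * num_params  # fp32 master weights, momentum, variance

    if stage >= 1:                 # stage 1 shards optimizer state
        optimizer /= num_gpus
    if stage >= 2:                 # stage 2 also shards gradients
        grads /= num_gpus
    if stage >= 3:                 # stage 3 also shards the weights
        params /= num_gpus
    return ModelStateBytes(params, grads, optimizer)


if __name__ == "__main__":
    # Example: a 7B-parameter model on 8 GPUs.
    for stage in range(4):
        est = zero_model_state_bytes(7_000_000_000, 8, stage)
        print(f"ZeRO stage {stage}: {est.total / BYTES_PER_GB:.2f} GB per GPU")
```

Under these assumptions, a 7B-parameter model needs about 112 GB of model state per GPU with plain data parallelism and about 14 GB per GPU with stage 3 on 8 GPUs, which is why sharding is a precondition for training at this size on common accelerators. Real jobs need additional headroom for activations and communication buffers.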