Domain: Machine Learning & AI
Skill profile: LLM Evaluation (benchmarks, BLEU/ROUGE metrics, human evaluation, LLM-as-judge, generation quality assessment)
Roles: 2 (roles where this skill appears)
Levels: 5 (a structured growth path)
Mandatory requirements: 8 (the other 2 are optional)
Categories: Machine Learning & AI, LLM & Generative AI
Date: 17/3/2026
The tables below show how the expected depth grows from Junior to Principal.
Level 1 (Junior)

| Role | Mandatory | Description |
|---|---|---|
| AI Product Engineer | Optional | Understands the fundamentals of LLM Evaluation. Applies basic practices in daily work. Follows recommendations from the team and documentation. |
| LLM Engineer | Mandatory | Knows basic LLM evaluation metrics: perplexity, BLEU, ROUGE. Runs standard benchmarks (MMLU, HellaSwag) under mentor guidance and interprets basic results. |
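BLEU and ROUGE score a candidate text by n-gram overlap with a reference. As a rough illustration only (a minimal sketch of the core idea, not the reference implementations; real work should use a maintained library such as sacreBLEU or rouge-score):

```python
from collections import Counter
import math

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    """Simplified single-sentence BLEU: geometric mean of clipped
    n-gram precisions times a brevity penalty (no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: fraction of reference unigrams recovered
    by the candidate (clipped by candidate counts)."""
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    overlap = sum(min(c, cand_counts[w]) for w, c in ref_counts.items())
    return overlap / max(sum(ref_counts.values()), 1)
```

Note that real BLEU is usually reported corpus-level with smoothing and standardized tokenization, and ROUGE comes in 1/2/L variants with precision, recall, and F1.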
Level 2

| Role | Mandatory | Description |
|---|---|---|
| AI Product Engineer | Optional | Independently applies LLM Evaluation in practice. Understands the trade-offs of different approaches. Solves typical tasks independently. |
| LLM Engineer | Mandatory | Independently designs evaluation pipelines: custom benchmarks, domain-specific eval sets, human evaluation protocols. Compares models across multiple metrics for production decision-making. |
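At its core, an evaluation pipeline at this level is a loop: generate outputs over a fixed eval set, score them with several metrics, and aggregate per metric. The sketch below uses hypothetical names (`model_fn`, the eval-set dict shape, metric callables) and is illustrative, not any specific framework's API:

```python
from statistics import mean

def run_eval(model_fn, eval_set, metrics):
    """Run a model over an eval set and aggregate per-metric scores.

    model_fn:  callable prompt -> output string (assumption)
    eval_set:  list of {"prompt": ..., "reference": ...} dicts (assumption)
    metrics:   dict of name -> callable (output, reference) -> float
    Returns a dict of mean score per metric.
    """
    scores = {name: [] for name in metrics}
    for example in eval_set:
        output = model_fn(example["prompt"])
        for name, metric in metrics.items():
            scores[name].append(metric(output, example["reference"]))
    return {name: mean(vals) for name, vals in scores.items()}
```

Comparing two models is then two calls to `run_eval` over the same eval set, which keeps the comparison paired per example.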
Level 3

| Role | Mandatory | Description |
|---|---|---|
| AI Product Engineer | Mandatory | Has deep expertise in LLM Evaluation. Designs solutions for production systems. Optimizes and scales. Mentors the team. |
| LLM Engineer | Mandatory | Designs comprehensive evaluation frameworks: automated eval with LLM-as-judge, contamination detection, statistical significance testing. Develops domain-specific benchmarks for production tasks. |
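For statistical significance testing between two models, one common choice is the paired bootstrap over per-example scores. A minimal sketch (the resample count and win-rate readout are illustrative choices, assuming both score lists are aligned to the same eval examples):

```python
import random

def paired_bootstrap(scores_a, scores_b, n_resamples=2000, seed=0):
    """Paired bootstrap: resample eval examples with replacement and
    report the fraction of resamples where model A's total score
    beats model B's. Values near 1.0 (or 0.0) suggest a reliable
    difference; values near 0.5 suggest noise."""
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_a[i] for i in idx) > sum(scores_b[i] for i in idx):
            wins += 1
    return wins / n_resamples
```

The same pairing idea applies to LLM-as-judge outputs: keep judgments per example so that model comparisons are made on identical inputs.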
Level 4

| Role | Mandatory | Description |
|---|---|---|
| AI Product Engineer | Mandatory | Defines LLM Evaluation strategy at the team/product level. Establishes standards and best practices. Conducts reviews. |
| LLM Engineer | Mandatory | Defines evaluation standards for the LLM team. Establishes model evaluation guidelines, regression testing, and benchmark management. Coordinates human evaluation processes and quality assurance. |
Level 5 (Principal)

| Role | Mandatory | Description |
|---|---|---|
| AI Product Engineer | Mandatory | Defines LLM Evaluation strategy at the organizational level. Establishes enterprise approaches. Mentors leads and architects. |
| LLM Engineer | Mandatory | Shapes enterprise evaluation strategy. Defines approaches to continuous evaluation, model quality governance, and benchmark development. Ensures alignment between evaluation metrics and business objectives. |