Domain
Machine Learning & AI
Skill profile
This skill defines expectations across roles and levels.
Roles: 1 (where this skill appears)
Levels: 5 (structured growth path)
Mandatory requirements: 0 (the other 5 are optional)
Natural Language Processing
22/2/2026
The tables show how the expected depth grows across the five levels, from Junior to Principal.
| Role | Mandatory | Description |
|---|---|---|
| LLM Engineer | No | Knows tokenization basics: BPE, WordPiece, SentencePiece. Understands how the tokenizer affects LLM quality and cost. Uses pre-trained tokenizers from Hugging Face for basic tasks. |
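
To make the BPE idea concrete, here is a minimal sketch of how merge rules can be learned from word frequencies. It is a toy, character-level illustration in plain Python, not the implementation used by any Hugging Face tokenizer; the function names (`train_bpe`, `encode`) and the sample corpus are assumptions made for this example.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(words, pair):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(words)
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        words = merge_pair(words, best)
        merges.append(best)
    return merges


def encode(word, merges):
    """Apply the learned merges, in training order, to a single word."""
    symbols = list(word)
    for pair in merges:
        i = 0
        while i + 1 < len(symbols):
            if (symbols[i], symbols[i + 1]) == pair:
                symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
            i += 1
    return symbols


# Hypothetical sample corpus, chosen only for illustration.
corpus = ["low", "low", "lower", "lowest", "newest", "newest"]
merges = train_bpe(corpus, num_merges=4)
print(merges)
print(encode("lowest", merges))
```

Frequent character sequences collapse into single tokens, which is why a tokenizer that fits the target text produces fewer tokens per input and therefore lowers cost.
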
| Role | Mandatory | Description |
|---|---|---|
| LLM Engineer | No | Works independently with LLM tokenization: analyzes token distribution, optimizes input length, handles special tokens. Trains custom tokenizers on domain-specific corpora. |
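
The analysis tasks at this level can be sketched as follows: length statistics over a corpus, token frequencies, and truncation that leaves room for special tokens. The whitespace tokenizer `toy_tokenize` is a hypothetical stand-in for a real tokenizer, and the `<s>` and `</s>` markers are assumed names for the begin and end tokens.

```python
from collections import Counter
from statistics import mean, median


def toy_tokenize(text):
    """Hypothetical stand-in tokenizer: lowercase whitespace split."""
    return text.lower().split()


def length_stats(texts, tokenize=toy_tokenize):
    """Summarize token counts per text to help choose a maximum input length."""
    lengths = sorted(len(tokenize(t)) for t in texts)
    p95_index = min(len(lengths) - 1, int(0.95 * len(lengths)))
    return {
        "mean": mean(lengths),
        "median": median(lengths),
        "p95": lengths[p95_index],
        "max": lengths[-1],
    }


def token_frequencies(texts, tokenize=toy_tokenize):
    """Count how often each token appears across the corpus."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def build_input(text, max_len, tokenize=toy_tokenize, bos="<s>", eos="</s>"):
    """Truncate so that BOS + tokens + EOS fits within `max_len` tokens."""
    if max_len < 2:
        raise ValueError("max_len must leave room for the two special tokens")
    tokens = tokenize(text)[: max_len - 2]
    return [bos] + tokens + [eos]


# Hypothetical sample texts, chosen only for illustration.
texts = ["The model reads tokens", "Short text", "A somewhat longer example sentence here"]
print(length_stats(texts))
print(token_frequencies(texts).most_common(3))
print(build_input("one two three four five", max_len=4))
```

Reserving the special-token slots before truncating avoids the common mistake of cutting off the end marker when an input hits the length limit.
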
| Role | Mandatory | Description |
|---|---|---|
| LLM Engineer | No | Designs tokenization strategies for LLMs: multi-language tokenizer training, vocabulary extension, tokenizer-aware data preprocessing. Optimizes fertility rate and coverage for target domains. |
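
Fertility and coverage can be measured with a few lines of code. The sketch below assumes fertility means average tokens per whitespace-separated word and coverage means the share of tokens that are not the unknown token; other definitions exist. The toy tokenizer built by `make_toy_tokenizer` and the `<unk>` marker are hypothetical and stand in for a real tokenizer.

```python
def make_toy_tokenizer(vocab, unk="<unk>"):
    """Hypothetical tokenizer: whole word if known, else per-character fallback."""
    def tokenize(text):
        tokens = []
        for word in text.split():
            if word in vocab:
                tokens.append(word)
            else:
                # Fall back to characters; anything outside the vocab becomes <unk>.
                tokens.extend(ch if ch in vocab else unk for ch in word)
        return tokens
    return tokenize


def fertility(texts, tokenize):
    """Average number of tokens per whitespace-separated word (lower is better)."""
    n_words = sum(len(t.split()) for t in texts)
    n_tokens = sum(len(tokenize(t)) for t in texts)
    return n_tokens / n_words if n_words else 0.0


def coverage(texts, tokenize, unk="<unk>"):
    """Fraction of produced tokens that are not the unknown token."""
    tokens = [tok for t in texts for tok in tokenize(t)]
    if not tokens:
        return 1.0
    return 1.0 - tokens.count(unk) / len(tokens)


# Hypothetical vocabulary and texts, chosen only for illustration.
tokenize = make_toy_tokenizer({"the", "cat", "s", "a", "t"})
in_domain = ["the cat sat"]
out_of_domain = ["the dog"]
print(fertility(in_domain, tokenize), coverage(in_domain, tokenize))
print(fertility(out_of_domain, tokenize), coverage(out_of_domain, tokenize))
```

Comparing these two numbers across languages or domains before and after a vocabulary extension gives a quick signal of whether the extension helped the target text.
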
| Role | Mandatory | Description |
|---|---|---|
| LLM Engineer | No | Defines tokenization standards for the LLM team. Establishes guidelines for tokenizer selection and training, tokenization quality evaluation, and integration with training and inference pipelines. |
| Role | Mandatory | Description |
|---|---|---|
| LLM Engineer | No | Shapes enterprise tokenization strategy. Defines approaches to unified tokenizer management, multi-language coverage, tokenizer versioning, and evaluation at organizational scale. |
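
One possible building block for tokenizer versioning is a content fingerprint: a hash over the vocabulary, merge rules, and special tokens that training and inference pipelines can compare before running. This is only a sketch of one approach; the functions `tokenizer_fingerprint` and `check_compatible` and the payload layout are assumptions for this example, not part of any library.

```python
import hashlib
import json


def tokenizer_fingerprint(vocab, merges, special_tokens):
    """Return a short, stable hash of a tokenizer's defining contents.

    vocab: dict mapping token -> id
    merges: ordered list of (left, right) pairs; order matters, so it is kept
    special_tokens: list of special token strings; order is ignored
    """
    payload = json.dumps(
        {
            "vocab": sorted(vocab.items()),  # independent of dict insertion order
            "merges": [list(pair) for pair in merges],
            "special_tokens": sorted(special_tokens),
        },
        ensure_ascii=False,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def check_compatible(expected_fingerprint, vocab, merges, special_tokens):
    """Raise if the loaded tokenizer differs from the one a model was trained with."""
    actual = tokenizer_fingerprint(vocab, merges, special_tokens)
    if actual != expected_fingerprint:
        raise ValueError(
            f"tokenizer mismatch: expected {expected_fingerprint}, got {actual}"
        )


# Hypothetical tokenizer contents, chosen only for illustration.
vocab = {"a": 0, "b": 1, "ab": 2}
merges = [("a", "b")]
fingerprint = tokenizer_fingerprint(vocab, merges, ["<s>", "</s>"])
check_compatible(fingerprint, vocab, merges, ["</s>", "<s>"])
print(fingerprint)
```

Storing the fingerprint next to each model checkpoint lets a pipeline fail fast when a model is paired with a tokenizer it was not trained with.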