Skill Profile

vLLM Inference

This skill defines expectations across roles and levels.

Machine Learning & AI / LLM & Generative AI

Roles

where this skill appears

Levels

structured development path

Mandatory requirements

the other 5 optional

22.2.2026

Select your current level and compare the expectations.

What is expected at each level

The table shows how depth grows from Junior to Principal.

Level	Role	Required	Description
Junior	LLM Engineer		Knows vLLM basics: what PagedAttention, continuous batching, and inference serving are. Launches a vLLM server for pre-trained model inference with a basic configuration under mentor guidance.
Middle	LLM Engineer		Independently configures vLLM for production: tensor parallelism, quantization (AWQ/GPTQ), GPU memory management. Optimizes throughput by tuning batch size and scheduling parameters.
Senior	LLM Engineer		Designs production vLLM infrastructure: multi-model serving, speculative decoding, custom sampling strategies. Optimizes latency and throughput through advanced configuration and hardware-specific tuning.
Lead / Staff	LLM Engineer		Defines vLLM deployment standards for the LLM team. Establishes guidelines for configuration, monitoring, and capacity planning. Coordinates upgrades and migrations between vLLM versions.
Principal	LLM Engineer		Shapes enterprise vLLM inference strategy. Defines approaches to multi-cluster inference, hardware planning (A100/H100/H200), and cost optimization. Ensures SLAs are met for critical inference workloads.

Junior (1 requirement)

LLM Engineer

Knows vLLM basics: what PagedAttention, continuous batching, and inference serving are. Launches a vLLM server for pre-trained model inference with a basic configuration under mentor guidance.
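To make the PagedAttention idea concrete, here is a minimal toy sketch of block-based KV-cache bookkeeping. It is not vLLM's implementation: `ToyBlockAllocator`, its methods, and the block size of 16 tokens are assumptions chosen for illustration. The point it shows is that a sequence's cache is stored in fixed-size blocks that need not be contiguous, so memory is claimed only as the sequence grows.

```python
from __future__ import annotations

from dataclasses import dataclass, field

BLOCK_SIZE = 16  # tokens per KV-cache block (assumed, illustrative value)


def blocks_needed(num_tokens: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of fixed-size blocks required to hold num_tokens tokens."""
    return -(-num_tokens // block_size)  # ceiling division


@dataclass
class ToyBlockAllocator:
    """Hypothetical allocator that hands out physical block ids from a free pool."""

    num_blocks: int
    free: list[int] = field(init=False)
    tables: dict[str, list[int]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        self.free = list(range(self.num_blocks))

    def grow(self, seq_id: str, total_tokens: int) -> list[int]:
        """Ensure seq_id owns enough blocks for total_tokens; return its block table."""
        table = self.tables.setdefault(seq_id, [])
        while len(table) < blocks_needed(total_tokens):
            if not self.free:
                raise MemoryError("no free KV-cache blocks")
            table.append(self.free.pop(0))
        return table

    def release(self, seq_id: str) -> None:
        """Return a finished sequence's blocks to the free pool."""
        self.free.extend(self.tables.pop(seq_id, []))


alloc = ToyBlockAllocator(num_blocks=8)
table_a = alloc.grow("a", 20)  # 20 tokens need 2 blocks
table_b = alloc.grow("b", 5)   # 5 tokens need 1 block
table_a = alloc.grow("a", 33)  # "a" grows to 33 tokens and gets a third block
print(table_a, table_b)
```

After the second call to `grow`, sequence "a" holds blocks that are not adjacent, because sequence "b" claimed a block in between. Releasing a finished sequence makes its blocks immediately reusable, which is what lets a scheduler admit new requests mid-flight in continuous batching.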

Middle (1 requirement)

LLM Engineer

Independently configures vLLM for production: tensor parallelism, quantization (AWQ/GPTQ), GPU memory management. Optimizes throughput by tuning batch size and scheduling parameters.
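As a rough illustration of the settings named above, the sketch below assembles a `vllm serve` command line from a small config object. `ServeConfig` and the model name are hypothetical. The flag names reflect the vLLM CLI as commonly documented, but they should be checked against `vllm serve --help` for the installed version before use.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServeConfig:
    """Hypothetical container for a few vLLM server settings."""

    model: str
    tensor_parallel_size: int = 1          # number of GPUs the model is sharded over
    quantization: Optional[str] = None     # e.g. "awq" or "gptq"
    gpu_memory_utilization: float = 0.90   # fraction of GPU memory vLLM may use
    max_num_seqs: int = 256                # cap on sequences batched per step
    max_model_len: Optional[int] = None    # context length limit

    def to_argv(self) -> List[str]:
        """Build the argument list; reject an out-of-range memory fraction."""
        if not 0.0 < self.gpu_memory_utilization <= 1.0:
            raise ValueError("gpu_memory_utilization must be in (0, 1]")
        argv = [
            "vllm", "serve", self.model,
            "--tensor-parallel-size", str(self.tensor_parallel_size),
            "--gpu-memory-utilization", str(self.gpu_memory_utilization),
            "--max-num-seqs", str(self.max_num_seqs),
        ]
        if self.quantization:
            argv += ["--quantization", self.quantization]
        if self.max_model_len:
            argv += ["--max-model-len", str(self.max_model_len)]
        return argv


# Placeholder model name; substitute a real checkpoint.
cfg = ServeConfig(
    model="my-org/my-model-awq",
    tensor_parallel_size=2,
    quantization="awq",
    gpu_memory_utilization=0.85,
)
command = shlex.join(cfg.to_argv())
print(command)
```

Keeping the settings in one typed object makes it easier to review a throughput change (for example, raising `max_num_seqs`) as a one-line diff rather than an edit to a long shell command.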

Senior (1 requirement)

LLM Engineer

Designs production vLLM infrastructure: multi-model serving, speculative decoding, custom sampling strategies. Optimizes latency and throughput through advanced configuration and hardware-specific tuning.
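Speculative decoding is easier to reason about with the verification step written out. The sketch below covers only the greedy case and is not taken from vLLM: `verify_draft` is a hypothetical helper that assumes the target model's greedy choices for every draft position are already available from a single forward pass.

```python
from typing import List


def verify_draft(draft: List[int], target: List[int]) -> List[int]:
    """Greedy verification for one speculative decoding step.

    draft:  k tokens proposed by the small draft model.
    target: k + 1 tokens the target model picks greedily at the same
            positions; the last one follows the full draft.
    Returns the tokens actually emitted in this step.
    """
    if len(target) != len(draft) + 1:
        raise ValueError("target must have exactly one more token than draft")
    accepted: List[int] = []
    for d, t in zip(draft, target):
        if d != t:
            # First mismatch: emit the target's token instead and stop.
            return accepted + [t]
        accepted.append(d)
    # Every draft token matched, so the extra target token comes for free.
    return accepted + [target[-1]]


print(verify_draft([5, 7, 9], [5, 7, 2, 4]))  # third draft token is rejected
print(verify_draft([5, 7], [5, 7, 1]))        # all draft tokens accepted
```

Each step emits at least one token and at most k + 1, and the output is identical to what greedy decoding with the target model alone would produce. With sampling instead of greedy decoding, the comparison is replaced by a probabilistic accept/reject rule, which this sketch does not cover.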

Lead / Staff (1 requirement)

LLM Engineer

Defines vLLM deployment standards for the LLM team. Establishes guidelines for configuration, monitoring, and capacity planning. Coordinates upgrades and migrations between vLLM versions.
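One way a monitoring guideline could be turned into a check is sketched below. It parses Prometheus-style text, such as a vLLM server's `/metrics` endpoint returns, and flags a long request queue. The sample payload, the threshold of 32, and both helper functions are assumptions for illustration; metric names vary between vLLM versions, so the ones used here should be confirmed against the deployed release.

```python
from typing import Dict, List

# Illustrative sample only; a real payload contains many more series.
SAMPLE = """\
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="demo"} 12
# TYPE vllm:num_requests_waiting gauge
vllm:num_requests_waiting{model_name="demo"} 40
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition lines into {metric_name: value}.

    Labels are dropped and samples sharing a name are summed, which is only
    reasonable for gauges and counters, not for histogram buckets. Lines are
    assumed to carry no trailing timestamp.
    """
    values: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, raw_value = line.rpartition(" ")
        name = series.split("{", 1)[0]
        values[name] = values.get(name, 0.0) + float(raw_value)
    return values


def queue_alerts(metrics: Dict[str, float], max_waiting: int = 32) -> List[str]:
    """Return human-readable alerts; max_waiting is a hypothetical team threshold."""
    alerts: List[str] = []
    waiting = metrics.get("vllm:num_requests_waiting", 0.0)
    if waiting > max_waiting:
        alerts.append(f"{waiting:.0f} requests waiting (limit {max_waiting})")
    return alerts


metrics = parse_metrics(SAMPLE)
alerts = queue_alerts(metrics)
print(alerts)
```

In practice a team would express such a rule in its alerting system rather than in ad hoc code. The sketch is mainly useful for agreeing on which signals a standard should name and what thresholds mean.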

Principal (1 requirement)

LLM Engineer

Shapes enterprise vLLM inference strategy. Defines approaches to multi-cluster inference, hardware planning (A100/H100/H200), and cost optimization. Ensures SLAs are met for critical inference workloads.
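Hardware planning usually starts from a back-of-the-envelope estimate of how many tokens of KV cache fit on one GPU. The sketch below uses the standard per-token formula (keys plus values, per layer and per KV head). The model shape, the 16 GB weight footprint, the 0.90 memory fraction, and the nominal GPU capacities are assumptions for illustration. The estimate ignores activation memory and framework overhead, so real capacity will be lower.

```python
GB = 10**9  # nominal (decimal) gigabyte, as used in GPU marketing figures


def kv_bytes_per_token(num_layers: int, num_kv_heads: int, head_dim: int,
                       dtype_bytes: int = 2) -> int:
    """Bytes of KV cache one token occupies across all layers."""
    # Factor 2: one key vector and one value vector per layer and KV head.
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes


def max_cached_tokens(gpu_mem_gb: float, weights_gb: float,
                      per_token_bytes: int, utilization: float = 0.90) -> int:
    """Upper bound on tokens that fit in the memory left after the weights."""
    budget = gpu_mem_gb * GB * utilization - weights_gb * GB
    return max(0, int(budget // per_token_bytes))


# Assumed model shape: 32 layers, 8 KV heads, head dimension 128, 16-bit cache.
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)

# Nominal memory per GPU in GB (assumed variants: 80 GB A100/H100, 141 GB H200).
gpus = {"A100": 80, "H100": 80, "H200": 141}
for name, mem_gb in gpus.items():
    tokens = max_cached_tokens(mem_gb, weights_gb=16, per_token_bytes=per_token)
    print(f"{name}: roughly {tokens} cacheable tokens")
```

Dividing the token budget by the expected context length per request gives a first estimate of concurrent sequences per GPU, which can then feed cost and SLA calculations. Measured numbers from a load test should replace the estimate before any purchasing decision.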
