Skills you'll gain
- Model selection & routing (Production)
Pick frontier vs mini vs reasoning vs local models against latency, cost, and quality budgets — and route requests between them in one service.
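A routing policy like this can be sketched as a budget filter over model profiles; the model names, prices, and latencies below are illustrative assumptions, not measured figures:

```python
# Minimal routing sketch: pick the cheapest model that satisfies the
# quality, latency, and cost budgets. All numbers are assumed for illustration.
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    usd_per_1k_tokens: float   # blended price (assumed)
    p50_latency_ms: int        # typical latency (assumed)
    quality_tier: int          # 1 = mini/local, 2 = frontier, 3 = reasoning

MODELS = [
    ModelProfile("local-small", 0.0, 120, 1),
    ModelProfile("hosted-mini", 0.15, 400, 1),
    ModelProfile("hosted-frontier", 2.50, 900, 2),
    ModelProfile("hosted-reasoning", 10.00, 4000, 3),
]

def route(task_complexity: int, latency_budget_ms: int,
          cost_budget_usd_per_1k: float) -> ModelProfile:
    """Return the cheapest model meeting all three budgets."""
    candidates = [
        m for m in MODELS
        if m.quality_tier >= task_complexity
        and m.p50_latency_ms <= latency_budget_ms
        and m.usd_per_1k_tokens <= cost_budget_usd_per_1k
    ]
    if not candidates:
        raise ValueError("no model satisfies the budgets; relax one constraint")
    return min(candidates, key=lambda m: (m.usd_per_1k_tokens, m.p50_latency_ms))
```

Because the policy is plain data plus a filter, it can live in one service and be unit-tested like any other code path.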
- Cost-bounded LLM features (Production)
Ship features with hard token budgets, max_tokens caps, and per-request $ tracking — defensible in a finance review.
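The budget guard pattern can be sketched as a wrapper that refuses calls whose worst-case cost exceeds the budget, then records actual spend; the prices, the 4-chars-per-token heuristic, and the `complete` client are assumptions for illustration:

```python
# Per-request cost guard sketch; prices and token heuristic are assumed.
PRICE_PER_1K_INPUT = 0.50   # USD, illustrative
PRICE_PER_1K_OUTPUT = 1.50  # USD, illustrative

class BudgetExceeded(RuntimeError):
    pass

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def guarded_call(prompt: str, max_tokens: int, budget_usd: float,
                 complete) -> tuple[str, float]:
    """Refuse calls whose worst-case cost exceeds the budget; return (reply, actual cost)."""
    worst_case = (estimate_tokens(prompt) * PRICE_PER_1K_INPUT
                  + max_tokens * PRICE_PER_1K_OUTPUT) / 1000
    if worst_case > budget_usd:
        raise BudgetExceeded(f"worst case ${worst_case:.4f} > budget ${budget_usd:.4f}")
    reply = complete(prompt, max_tokens=max_tokens)  # hypothetical LLM client
    actual = (estimate_tokens(prompt) * PRICE_PER_1K_INPUT
              + estimate_tokens(reply) * PRICE_PER_1K_OUTPUT) / 1000
    return reply, actual
```

Logging the returned `actual` figure per request is what makes the per-request $ tracking auditable.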
- Prompt engineering (Working)
Author and maintain prompts that survive 3+ revisions: zero-shot, few-shot, CoT, structured-output, role design, anti-drift patterns.
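One way prompts survive repeated revision is to keep them as versioned data rather than inline strings; the classifier task, labels, and examples below are illustrative, not from any particular course exercise:

```python
# Sketch of a versioned few-shot prompt builder. Keeping examples as data
# makes revisions diffable and testable; all contents are illustrative.
FEW_SHOT_EXAMPLES = [
    ("I love this phone", "positive"),
    ("Battery died in a day", "negative"),
]

def build_prompt(text: str, version: str = "v3") -> str:
    """Assemble role instruction + few-shot examples + the new input."""
    shots = "\n".join(f"Review: {r}\nLabel: {l}" for r, l in FEW_SHOT_EXAMPLES)
    return (f"# sentiment-classifier {version}\n"
            "You label product reviews as positive or negative.\n"
            f"{shots}\nReview: {text}\nLabel:")
```

The version marker in the header is what lets an eval suite pin results to a specific prompt revision.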
- Structured output (Production)
Build JSON-mode + Pydantic + Instructor services that validate on every turn and retry on schema failure.
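The validate-and-retry loop behind Instructor-style structured output can be sketched without the libraries; here `validate_invoice` stands in for a Pydantic model and `complete` is a hypothetical LLM call:

```python
# Sketch of schema-validated extraction with retry-on-failure.
# `validate_invoice` stands in for a Pydantic model; `complete` is hypothetical.
import json

def validate_invoice(data: dict) -> dict:
    if not isinstance(data.get("invoice_id"), str):
        raise ValueError("invoice_id must be a string")
    if not isinstance(data.get("total"), (int, float)):
        raise ValueError("total must be a number")
    return data

def structured_call(prompt: str, complete, max_retries: int = 2) -> dict:
    """Parse the model's JSON reply; on failure, retry with the error fed back."""
    last_error = ""
    for _ in range(max_retries + 1):
        raw = complete(prompt + last_error)
        try:
            return validate_invoice(json.loads(raw))
        except (json.JSONDecodeError, ValueError) as exc:
            last_error = f"\nYour last reply was invalid ({exc}); return valid JSON only."
    raise RuntimeError("schema validation failed after retries")
```

Feeding the validation error back into the retry prompt is the same loop Instructor automates with `response_model`.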
- Function calling & tool use (Working)
Wire single and parallel tool-use, design idempotent tool contracts, and decide when an agent is the wrong answer.
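A tool contract can be sketched as a registry plus a dispatcher over model-requested calls; the tool name, return shape, and call format below are illustrative assumptions, not a specific provider's API:

```python
# Sketch of a tool registry with an idempotent tool contract.
# The dispatch format is illustrative, not a specific provider's schema.
import json

TOOLS = {}

def tool(name):
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("get_weather")
def get_weather(city: str) -> dict:
    # Idempotent contract: same input, same output, no side effects.
    return {"city": city, "temp_c": 21}

def dispatch(tool_calls: list[dict]) -> list[dict]:
    """Execute the model's requested tool calls and collect results."""
    results = []
    for call in tool_calls:
        fn = TOOLS[call["name"]]
        results.append({"name": call["name"],
                        "result": fn(**json.loads(call["arguments"]))})
    return results
```

Keeping tools idempotent is what makes parallel tool-use and retries safe: re-running a call cannot corrupt state.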
- Eval-driven LLM development (Production)
Write Promptfoo / DeepEval suites and gate releases on regression — turn prompts into testable, versioned artifacts.
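The release gate can be sketched in the spirit of Promptfoo/DeepEval: run a prompt variant against a fixed case set and block the deploy if the score drops below the baseline; the cases and the contains-substring scorer are illustrative:

```python
# Sketch of an eval-gated release: score a prompt against fixed cases
# and refuse any regression below the baseline. Scorer is illustrative.
def run_suite(predict, cases: list[dict]) -> float:
    """Fraction of cases whose prediction contains the expected substring."""
    passed = sum(1 for c in cases if c["expect"] in predict(c["input"]))
    return passed / len(cases)

def gate_release(predict, cases: list[dict], baseline: float) -> bool:
    """True only if the new variant scores at or above the recorded baseline."""
    return run_suite(predict, cases) >= baseline
```

Checking the case file and baseline score into version control is what turns the prompt into a testable, versioned artifact.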
- Streaming & latency engineering (Working)
Implement chunked SSE, partial-JSON streaming, and cut perceived chat latency from 3s+ to under 500ms.
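The metric that matters for perceived latency is time-to-first-token; a minimal sketch of measuring it over a chunked stream (the generator stands in for an SSE response):

```python
# Sketch: consume a chunked stream and measure time-to-first-token,
# the number that "perceived latency" work optimizes.
import time

def first_token_latency(stream):
    """Return (seconds until first chunk, full concatenated text)."""
    start = time.monotonic()
    first = None
    parts = []
    for chunk in stream:
        if first is None:
            first = time.monotonic() - start  # perceived latency ends here
        parts.append(chunk)
    return first, "".join(parts)
```

A UI that renders each chunk as it arrives makes the first number the one users feel, regardless of total generation time.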
- Caching (prompt + semantic) (Production)
Design cacheable prefixes, set up prompt caching and Redis-backed semantic cache — verified 40-70% spend reduction.
- LLM observability (Production)
Stand up LiteLLM + Prometheus + Grafana + Loki to trace every call with prompt hash, tokens, cost, and provider id.
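The trace record a gateway like LiteLLM exports can be sketched as a wrapper; the field names, the 4-chars-per-token estimate, and the `complete` client are illustrative assumptions:

```python
# Sketch of a tracing wrapper recording the fields an LLM gateway would
# export to Prometheus/Loki. Field names and token estimate are assumed.
import hashlib
import time

TRACES: list[dict] = []

def traced_call(prompt: str, provider: str, complete,
                usd_per_1k_tokens: float) -> str:
    start = time.monotonic()
    reply = complete(prompt)  # hypothetical LLM client
    tokens = (len(prompt) + len(reply)) // 4  # crude token estimate
    TRACES.append({
        "prompt_hash": hashlib.sha256(prompt.encode()).hexdigest()[:12],
        "provider": provider,
        "tokens": tokens,
        "cost_usd": tokens * usd_per_1k_tokens / 1000,
        "latency_ms": (time.monotonic() - start) * 1000,
    })
    return reply
```

Hashing the prompt instead of logging it verbatim keeps traces joinable across calls without leaking user content into the log store.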
- Safety & prompt-injection defence (Working)
Apply NeMo Guardrails / Llama Guard, run a red-team drill on your own service, and ship a hardened input/output filter chain.
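The filter-chain shape can be sketched with plain heuristics; the patterns below are illustrative and no substitute for NeMo Guardrails or Llama Guard, but they show the input-check / output-redact layering:

```python
# Sketch of a layered input/output filter chain. The patterns are
# illustrative heuristics, not a replacement for a real guardrail model.
import re

INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"reveal your system prompt",
]
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{8,}")  # example credential shape

def check_input(user_text: str) -> str:
    """Block inputs matching known injection phrasings before they reach the model."""
    for pat in INJECTION_PATTERNS:
        if re.search(pat, user_text, re.IGNORECASE):
            raise ValueError("possible prompt injection blocked")
    return user_text

def check_output(model_text: str) -> str:
    """Redact anything in the reply that looks like a leaked credential."""
    return SECRET_PATTERN.sub("[REDACTED]", model_text)
```

A red-team drill then amounts to throwing known attack strings at `check_input` and leaked-secret shapes at `check_output` and counting what slips through.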
- Local-first deployment (Working)
Run Ollama / vLLM with small models (Phi-4, Llama-3.2) and route to hosted APIs only on overflow — works offline, beats compliance reviews.
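Overflow routing can be sketched as a capacity check in front of the local model; the `local` and `hosted` clients and the in-flight limit are hypothetical stand-ins for an Ollama/vLLM backend and a hosted API:

```python
# Sketch of local-first routing: serve from the local model while it has
# capacity, overflow to a hosted API otherwise. Clients are hypothetical.
class LocalFirstRouter:
    def __init__(self, local, hosted, max_local_inflight: int = 4):
        self.local, self.hosted = local, hosted
        self.max_local_inflight = max_local_inflight
        self.inflight = 0

    def complete(self, prompt: str) -> tuple[str, str]:
        """Return (backend used, reply); hosted only when local is saturated."""
        if self.inflight < self.max_local_inflight:
            self.inflight += 1
            try:
                return "local", self.local(prompt)
            finally:
                self.inflight -= 1
        return "hosted", self.hosted(prompt)
```

With the hosted path behind a capacity gate, the default data flow never leaves the machine, which is the property compliance reviewers care about.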
- When NOT to generate (Production)
Replace LLM calls with regex / SQL / classifiers / embeddings where deterministic — shown to cut spend 30%+ on real audits.