QAMOD.QA-06 · v1.0

Eval harnesses
that bite.
Regressions that don't ship.

6 micro-lessons · ~48 min · Real Docker images

TEST BENCH · LATEST RUN

BENCH.A · LLM-AS-JUDGE

PASSING

TESTS 184/187 PASS · 2 FAIL · 1 SKIP · DUR 4m 12s

PASS-RATE · LAST 7 DAYS [chart: MON to SUN]

QA · ROLE TRACK

AI for QA / Test Engineers

LLM-as-judge, eval harnesses, adversarial probes.

WHY THIS MATTERS · SNAP INTERNAL

Most-shared track on social — QA engineers use the lessons as team primers.

WHAT YOU'LL LEARN

01 LLM-as-judge

02 Eval harnesses

03 Adversarial probes

04 Regression bench

YOU'LL BE ABLE TO

Build an LLM-as-judge that doesn't hallucinate

Set up regression benches that bite

Run adversarial probes in CI

SKILLS YOU'LL GAIN

Real skills, real career delta.

Build an LLM-as-judge that doesn't hallucinate · Working
Outcome from completing the course: build an LLM-as-judge that doesn't hallucinate.
Set up regression benches that bite · Working
Outcome from completing the course: set up regression benches that bite.
Run adversarial probes in CI · Working
Outcome from completing the course: run adversarial probes in CI.
Eval harnesses · Working
Covered in the lesson sequence; drop-in ready.

RUNNABLE ON YOUR MACHINE

$ docker pull snap/ai-qa:lesson-01

$ docker run --rm -it snap/ai-qa:lesson-01

QUICK PREVIEW · 7 MIN

VERIFIED ENGINEER REVIEWS

Adversarial probe lesson caught 3 prod regressions.

@qa_quin · VERIFY ON GITHUB

LLM-as-judge with controls — the lesson is gold.

@devops_jules · VERIFY ON GITHUB

LESSONS 6

HOURS ~0.8

LEARNERS 620

THIS WEEK +19%
