QAMOD.QA-06 · v1.0

Eval harnesses
that bite.
Regressions that don't ship.

6 micro-lessons · ~48 min · Real Docker images

TEST BENCH · LATEST RUN
BENCH.A · LLM-AS-JUDGE
PASSING
TESTS 184/187 PASS · 2 FAIL · 1 SKIP · DUR 4m 12s
184 pass 3 fail 3 amber 2 skip
PASS-RATE · LAST 7 DAYS (chart, MON to SUN)
QA · ROLE TRACK

AI for QA / Test Engineers

LLM-as-judge, eval harnesses, adversarial probes.

WHY THIS MATTERS · SNAP INTERNAL
Most-shared track on social — QA engineers use the lessons as team primers.
WHAT YOU'LL LEARN
01 LLM-as-judge
02 Eval harnesses
03 Adversarial probes
04 Regression bench
YOU'LL BE ABLE TO
Build an LLM-as-judge that doesn't hallucinate
Set up regression benches that bite
Run adversarial probes in CI
SKILLS YOU'LL GAIN

Real skills, real career delta.

Skills you'll gain · 04
  • Build an LLM-as-judge that doesn't hallucinate · Working

    Outcome from completing the course: build an LLM-as-judge that doesn't hallucinate.

  • Set up regression benches that bite · Working

    Outcome from completing the course: set up regression benches that bite.

  • Run adversarial probes in CI · Working

    Outcome from completing the course: run adversarial probes in CI.

  • Eval harnesses · Working

    Covered in the lesson sequence; drop-in ready.
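
To make "regression benches that bite" concrete, here is a minimal sketch of a pass-rate gate. It is an illustration only, not the course's own harness: `RunSummary`, `gate`, and the 1% `tolerance` are hypothetical names and values, and the sample counts are borrowed from the latest-run line above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSummary:
    """Counts from one bench run (hypothetical structure, not the course's API)."""
    passed: int
    failed: int
    skipped: int

    @property
    def pass_rate(self) -> float:
        # Skipped tests count against the rate so they cannot hide failures.
        total = self.passed + self.failed + self.skipped
        return self.passed / total if total else 0.0


def gate(current: RunSummary, baseline: RunSummary, tolerance: float = 0.01) -> bool:
    """Return True when the current run is within `tolerance` of the baseline pass rate."""
    return current.pass_rate >= baseline.pass_rate - tolerance


# Baseline taken from the run shown above: 184 pass, 2 fail, 1 skip.
baseline = RunSummary(passed=184, failed=2, skipped=1)
candidate = RunSummary(passed=170, failed=16, skipped=1)

if __name__ == "__main__":
    # A CI job could exit non-zero here to block the release.
    print("ship" if gate(candidate, baseline) else "block")
```

The gate "bites" because a drop beyond the tolerance returns False, which a CI step can turn into a failed build instead of a warning.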

RUNNABLE ON YOUR MACHINE
$ docker pull snap/ai-qa:lesson-01
$ docker run --rm -it snap/ai-qa:lesson-01
QUICK PREVIEW · 7 MIN
VERIFIED ENGINEER REVIEWS
Adversarial probe lesson caught 3 prod regressions.
@qa_quin · VERIFY ON GITHUB
LLM-as-judge with controls — the lesson is gold.
@devops_jules · VERIFY ON GITHUB
LESSONS 6
HOURS ~0.8
LEARNERS 620
THIS WEEK +19%