PAI Course

Physical AI · robotics, world models & VLA

Lessons: 8 modules

Total: 83m full study

Quick: 7m trailer

Projects: 8 docker labs

CHEATSHEET · 01 Physical AI · master cheatsheet

Where to start by team shape

·No robot, no sim experience: LeRobot 0.5 + Pi0.5 inference on Aloha demo data
·Have a laptop GPU: Isaac Lab 2.3 Lift task; export to ONNX; run policy server
·Have a desk arm (Franka FR3 / UR5e): + ROS2 Jazzy runtime + Foxglove 2.0
·Have a humanoid (Unitree G1 / Apollo / Stretch 3): + GR00T N1.7 + safety guardian + Jetson Thor

The 8 layers of a 2026 stack

·1. World model (Cosmos Predict/Transfer/Reason · Genesis · DreamerV3)
·2. Sim (Isaac Lab 2.3 · MuJoCo MJX 3.8 · Genesis 0.4 · Habitat 3.0)
·3. Dataset (LeRobotDataset v3 · Open-X-Embodiment · DROID · BridgeData v2)
·4. Policy (Pi0.5 · Pi0-FAST · OpenVLA-OFT · GR00T N1.7 · RDT · Diffusion Policy)
·5. Trainer (PPO · DAgger · diffusion · LoRA/PEFT · flow matching)
·6. Eval (LIBERO · SIMPLER · RoboCasa · CALVIN · Isaac Lab-Arena · VLA-Arena)
·7. Runtime (ROS2 Jazzy → Lyrical · Fast DDS / Cyclone DDS · ONNX/TensorRT · Foxglove 2.0)
·8. Safety guardian (rate limit · joint clamp · e-stop · ISO 13482 / ISO 10218 / ISO/TS 15066)
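
Layer 8 can be pictured as a small filter sitting between the policy and the actuators. The block below is a minimal sketch only: the class name `SafetyGuardian`, the joint limits, and the per-tick step cap are all hypothetical values chosen for illustration, not taken from any robot SDK or from the ISO standards listed above.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SafetyGuardian:
    """Hypothetical command filter: e-stop first, then rate limit, then joint clamp."""
    lower: Sequence[float]   # joint lower limits in rad (assumed values)
    upper: Sequence[float]   # joint upper limits in rad (assumed values)
    max_step: float          # largest allowed change per control tick in rad (assumed)
    estop: bool = False      # latched emergency-stop flag

    def filter(self, current: Sequence[float], target: Sequence[float]) -> List[float]:
        # E-stop: ignore the policy and hold the current joint positions.
        if self.estop:
            return list(current)
        safe = []
        for q, q_cmd, lo, hi in zip(current, target, self.lower, self.upper):
            # Rate limit: cap how far the command may move in one tick.
            delta = max(-self.max_step, min(self.max_step, q_cmd - q))
            # Joint clamp: keep the resulting position inside the limits.
            safe.append(max(lo, min(hi, q + delta)))
        return safe


guardian = SafetyGuardian(lower=[-1.0, -1.0], upper=[1.0, 1.0], max_step=0.1)
print(guardian.filter([0.0, 0.95], [0.5, 2.0]))  # → [0.1, 1.0]
```

Joint 0 is held to a 0.1 rad step; joint 1 is rate-limited and then clamped at its upper limit. A real guardian would run in its own process with a watchdog, which this sketch does not attempt.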

Hardware short-list (2026 prices)

·Manipulators: Franka FR3 (~$15-20K) · UR5e/UR10e ($25-50K) · ABB GoFa · KUKA LBR iisy
·Mobile: Hello Robot Stretch 3 ($24,950) · Boston Dynamics Spot ($74.5K base) · Unitree Go2 ($1.6-13K)
·Humanoids: Unitree G1 ($13.5-18K dev) · Apptronik Apollo · Figure 03 (~$20K target) · 1X Neo ($20K or $499/mo)
·Compute: Jetson Orin (deploy) · Jetson Thor T5000 ($3,499 devkit · 2070 FP4 TFLOPS · 128GB) · RTX 6000 Ada (train)

Eval suites by task class

·LIBERO — long-horizon manipulation (130 tasks: Spatial, Object and Goal suites of 10 tasks each, plus LIBERO-100, which splits into LIBERO-90 and LIBERO-10)
·SIMPLER — sim-to-real proxy (Visual Matching + Variant Aggregation, MMRV + Pearson)
·RoboCasa — 100 tasks (25 atomic + 75 composite) in 120 kitchens, 100K+ trajectories
·CALVIN — long-horizon language-conditioned
·Isaac Lab-Arena (CES 2026) — parallelised GPU eval; integrates LIBERO + RoboCasa
·Open-X-Embodiment — pretraining (22 embodiments, 21 institutions, 527 skills, 160K+ tasks, 1M+ trajectories)
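
The two SIMPLER metrics named above compare per-policy success rates in simulation against the same policies on real hardware. The sketch below is one reading of those definitions, with MMRV taken as Mean Maximum Rank Violation; the function names and the example numbers are illustrative assumptions, so check the SIMPLER paper before relying on it for reported results.

```python
from math import sqrt
from typing import Sequence


def pearson(sim: Sequence[float], real: Sequence[float]) -> float:
    """Pearson correlation between sim and real success rates.

    Assumes both sequences have the same length and are not constant.
    """
    n = len(sim)
    mean_s, mean_r = sum(sim) / n, sum(real) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(sim, real))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_r = sum((r - mean_r) ** 2 for r in real)
    return cov / sqrt(var_s * var_r)


def mmrv(sim: Sequence[float], real: Sequence[float]) -> float:
    """Mean Maximum Rank Violation (lower is better, 0 means rankings agree).

    For each policy i, take the largest real-world gap |real_i - real_j|
    over all policies j that sim ranks the opposite way, then average over i.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        worst = 0.0
        for j in range(n):
            # Violation: sim and real disagree on the ordering of the pair.
            if (sim[i] < sim[j]) != (real[i] < real[j]):
                worst = max(worst, abs(real[i] - real[j]))
        total += worst
    return total / n


# Made-up success rates for three policies; sim swaps the first two.
sim_rates = [0.5, 0.2, 0.9]
real_rates = [0.3, 0.4, 0.8]
print(round(mmrv(sim_rates, real_rates), 4), round(pearson(sim_rates, real_rates), 4))
```

A high Pearson score with a non-zero MMRV means the simulator tracks overall trends but still mis-ranks some policy pairs, which is why the two are reported together.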

Safety standards by deployment

CHEATSHEET · 02 Framework picks · 2026

Sim — pick by task class

·Isaac Lab 2.3 — production default; whole-body control, SkillGen+Mimic, Lab-Arena benchmarking
·MuJoCo 3.8 + MJX-Warp — JAX-native, fast contact, MPC-friendly (April 2026 release fixed contact perf)
·Genesis 0.4 — multi-physics (rigid+MPM+SPH+FEM+PBD+fluid); 43M FPS Franka claim
·Habitat 3.0 — embodied navigation + HRI; not for fine manipulation
·RoboCasa — kitchen / household tasks (uses RoboSuite+MuJoCo+Omniverse)
·PyBullet — legacy; only if a paper requires it

Policy — pick by task + scale

·Pi0.5 — VLA, 3.3B, Apache 2.0 + Gemma; LeRobot 0.5 default; generalist with action chunks
·Pi0-FAST — autoregressive VLA with FAST tokenizer + Gemma 300M (faster but lower ceiling)
·OpenVLA-7B / OpenVLA-OFT — MIT-licensed code; OFT reports 26x faster action generation via parallel decoding and action chunking
·GR00T N1.7 — NVIDIA humanoid foundation policy (3B, Cosmos-Reason2-2B backbone, 20K hours EgoScale data, 1 GPU 16GB+)
·RDT-1B — Robotics Diffusion Transformer; long-horizon bimanual
·Wall-X / X-VLA — Qwen2.5-VL + flow matching / Florence2 (LeRobot 0.5)
·Diffusion Policy — single-task; the strong baseline you compare against
·Gemini Robotics-ER 1.6 — closed; on-device variant adapts with 50-100 demos
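
Several of the policies above emit action chunks: one forward pass returns a short sequence of future actions, and the runtime executes part of that sequence before querying the model again. The loop below is a minimal sketch of that pattern. `run_chunked`, `dummy_policy`, the chunk length of 4, and the choice to execute 2 actions per chunk are all hypothetical and do not reflect the API of LeRobot, openpi, or any other library.

```python
from collections import deque
from typing import Callable, Deque, List

Action = List[float]


def run_chunked(policy: Callable[[int], List[Action]],
                steps: int,
                execute_per_chunk: int) -> List[Action]:
    """Run `steps` control ticks, re-querying the policy whenever the queue empties.

    `execute_per_chunk` must be at least 1; only that many actions from each
    predicted chunk are executed before the policy is asked to replan.
    """
    queue: Deque[Action] = deque()
    executed: List[Action] = []
    for t in range(steps):
        if not queue:
            # One (expensive) policy call yields a whole chunk of actions.
            chunk = policy(t)
            queue.extend(chunk[:execute_per_chunk])
        executed.append(queue.popleft())
    return executed


calls: List[int] = []


def dummy_policy(t: int) -> List[Action]:
    """Stand-in for a VLA: returns a chunk of 4 one-dimensional actions."""
    calls.append(t)
    return [[float(t + k)] for k in range(4)]


trajectory = run_chunked(dummy_policy, steps=6, execute_per_chunk=2)
print(calls)       # → [0, 2, 4]
print(trajectory)  # → [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
```

Six control ticks cost three policy calls here. Executing more of each chunk lowers inference load further but makes the robot slower to react to new observations.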

Dataset — pick by embodiment

·LeRobotDataset v3 (HF) — 2026 default (Parquet + video chunks + episode metadata)
·Open-X-Embodiment — 22 embodiments, 527 skills, 160K+ tasks, 1M+ trajectories — for generalisation training
·DROID — 76K episodes Franka teleop — for manipulation pretraining
·BridgeData V2 — large-scale WidowX manipulation dataset
·RoboCasa-Atomic — 100K simulated episodes for sim pretraining

Runtime — production

·ROS2 Jazzy Jalisco (current LTS until 2029) → Lyrical Luth (May 2026, 5y LTS)
·Fast DDS (default) · Cyclone DDS (low-jitter on Jetson) · RTI Connext (regulated)
·ONNX Runtime / TensorRT for policy inference (NVFP4/FP8 on Jetson Thor)
·Foxglove 2.0 — Series B Nov 2025; Data Search & Curation + BYOS launched April 2026
·Isaac ROS GEMs — NITROS-accelerated CUDA pipelines (Visual SLAM, FoundationPose, cuMotion)

World models & synthetic data

·Cosmos Predict 2.5-2B (Dec 2025) — text/image/video → world rollouts; uses Cosmos-Reason1 text encoder
·Cosmos Transfer 2.5 (Feb 2026) — multi-control video gen with depth/seg/LiDAR/HDMap conditioning
·Cosmos Reason 2 (Feb 2026) — VLM that scores rollouts; 256K context, #1 Physical AI Bench
·DreamerV3 — model-based RL; classic baseline
·RoboGen — automated task gen (LLM proposes; runs on Genesis)

Avoid / migrate