Hello VLM — image to typed JSON
A 30-line VLM playground that turns any image into a Pydantic-typed answer. Onboarding-grade.
# docker-compose.yml — hello-vlm
services:
vlm:
image: python:3.12-slim
working_dir: /app
volumes:
- ./src:/app/src:ro
- ./samples:/app/samples:ro
- ./requirements.txt:/app/requirements.txt:ro
environment:
OPENAI_API_KEY: ${OPENAI_API_KEY:?set OPENAI_API_KEY in your shell}
VLM_MODEL: ${VLM_MODEL:-gpt-5}
command: >-
bash -c "pip install -q -r requirements.txt &&
python -m src.run --image samples/cat.jpg"
Drop in front of any image-input feature: a CMS that auto-tags uploads, a Slack bot that summarises pasted screenshots, a pre-moderation filter, or an admin tool that reads error screenshots. Change the Pydantic schema in src/schemas.py to match your domain — that's your wire contract.
Hello VLM — image to typed JSON
A 30-line VLM playground that turns any image into a Pydantic-typed answer. Onboarding-grade.
# docker-compose.yml — hello-vlm
services:
vlm:
image: python:3.12-slim
working_dir: /app
volumes:
- ./src:/app/src:ro
- ./samples:/app/samples:ro
- ./requirements.txt:/app/requirements.txt:ro
environment:
OPENAI_API_KEY: ${OPENAI_API_KEY:?set OPENAI_API_KEY in your shell}
VLM_MODEL: ${VLM_MODEL:-gpt-5}
command: >-
bash -c "pip install -q -r requirements.txt &&
python -m src.run --image samples/cat.jpg"
Drop in front of any image-input feature: a CMS that auto-tags uploads, a Slack bot that summarises pasted screenshots, a pre-moderation filter, or an admin tool that reads error screenshots. Change the Pydantic schema in src/schemas.py to match your domain — that's your wire contract.