Hello GenAI
First chat completion with per-call token + cost telemetry printed to stdout.
snap/genai-foundation:hello · Repo: genai-foundation-hello
$ git clone https://github.com/snap-dev/genai-foundation-hello.git
docker-compose.yml
services:
  hello:
    image: python:3.12-slim
    container_name: hello-genai
    working_dir: /app
    volumes:
      - ./app:/app
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY:-sk-demo-replace-me}
      OPENAI_MODEL: gpt-4o-mini
      LOG_LEVEL: INFO
      PRICE_INPUT_PER_1K: "0.00015"
      PRICE_OUTPUT_PER_1K: "0.00060"
    command: >
      sh -c "pip install --no-cache-dir openai==1.54.4 tiktoken==0.8.0 &&
             python /app/main.py"
    restart: "no"
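The compose file mounts ./app and runs /app/main.py, which comes with the repo and isn't reproduced here. A minimal sketch of a script that would produce the log lines shown under Run could look like the following; the prompt text and log format are copied from that output, everything else is an assumption rather than the repo's actual code.
app/main.py (illustrative sketch)
# One chat completion with token + cost telemetry printed to stdout.
import os
import sys
import time

from openai import OpenAI

MODEL = os.environ.get("OPENAI_MODEL", "gpt-4o-mini")
PRICE_IN = float(os.environ.get("PRICE_INPUT_PER_1K", "0.00015"))
PRICE_OUT = float(os.environ.get("PRICE_OUTPUT_PER_1K", "0.00060"))
PROMPT = "Explain transformers in 1 sentence."


def log(msg: str) -> None:
    print(f"[hello-genai] {msg}", flush=True)


def main() -> int:
    log(f"booting model={MODEL}")
    log(f'prompt="{PROMPT}"')

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": PROMPT}],
    )
    latency_ms = int((time.perf_counter() - start) * 1000)

    reply = resp.choices[0].message.content
    usage = resp.usage
    # Prices are per 1K tokens, so scale each count down before multiplying.
    cost = (usage.prompt_tokens / 1000) * PRICE_IN + (usage.completion_tokens / 1000) * PRICE_OUT

    log(f'reply="{reply}"')
    log(f"usage prompt_tokens={usage.prompt_tokens} "
        f"completion_tokens={usage.completion_tokens} total={usage.total_tokens}")
    log(f"cost_usd={cost:.6f} latency_ms={latency_ms}")
    log("done exit=0")
    return 0


if __name__ == "__main__":
    sys.exit(main())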
Run
~/genai-foundation-hello · zsh
$ docker compose up --build
[hello-genai] booting model=gpt-4o-mini
[hello-genai] prompt="Explain transformers in 1 sentence."
[hello-genai] reply="Transformers are neural nets that use self-attention to weigh every token against every other token in parallel."
[hello-genai] usage prompt_tokens=18 completion_tokens=29 total=47
[hello-genai] cost_usd=0.000020 latency_ms=842
[hello-genai] done exit=0
What you'll observe
Container exits 0 after a single completion call
stdout includes a non-empty assistant reply between 10 and 200 tokens
Token usage line shows prompt_tokens, completion_tokens and total parsed from the API response
Per-call cost in USD is computed from the PRICE_INPUT_PER_1K and PRICE_OUTPUT_PER_1K env vars (worked out below)
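With the prices from the compose file and the usage line in the run above, the arithmetic reproduces the logged figure:
18/1000 * 0.00015 + 29/1000 * 0.00060
= 0.0000027 + 0.0000174
= 0.0000201, which rounds to cost_usd=0.000020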
Lift this to your work
Drop the cost-logging wrapper into any internal service that calls an LLM and you immediately get per-request dollar amounts in your logs. That unlocks a Grafana panel for spend-per-feature in an afternoon, instead of waiting weeks for finance to reconcile your provider invoice. A minimal version of such a wrapper is sketched below.
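This sketch assumes the same OpenAI 1.x client and price env vars as the demo; the module name, the chat_with_cost function, and the feature tag are illustrative, not part of the repo.
llm_cost.py (illustrative sketch)
# Wrap a chat completion call and log per-request cost, tokens, and latency.
import logging
import os
import time

from openai import OpenAI

logger = logging.getLogger("llm_cost")

PRICE_IN = float(os.environ.get("PRICE_INPUT_PER_1K", "0.00015"))
PRICE_OUT = float(os.environ.get("PRICE_OUTPUT_PER_1K", "0.00060"))

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def chat_with_cost(model: str, messages: list[dict], feature: str, **kwargs):
    """Call chat.completions.create and log tokens, USD cost, and latency."""
    start = time.perf_counter()
    resp = client.chat.completions.create(model=model, messages=messages, **kwargs)
    latency_ms = int((time.perf_counter() - start) * 1000)

    u = resp.usage
    cost = (u.prompt_tokens / 1000) * PRICE_IN + (u.completion_tokens / 1000) * PRICE_OUT
    logger.info(
        "llm_call feature=%s model=%s prompt_tokens=%d completion_tokens=%d "
        "cost_usd=%.6f latency_ms=%d",
        feature, model, u.prompt_tokens, u.completion_tokens, cost, latency_ms,
    )
    return resp
Tagging each call with a feature name is the piece that makes spend-per-feature possible: the log line already carries the dimension Grafana needs to group by.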