OpenAI SDK Adapter
Use this adapter when your application calls the OpenAI SDK directly.
This is the best fit when your runtime owns direct calls such as:
- responses.create (including streaming responses)
- chat.completions.create
Integration shape
The core pattern is direct SDK wrapping:
- build one shared service client at process startup
- build or wrap one shared OpenAI client
- bind request-scoped gate context around each model call
- let the wrapper enforce authorize -> execute -> commit/cancel
This is a good fit because usage is returned by the SDK response itself.
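The pattern above can be sketched as follows. This is a minimal illustration, not the adapter's actual API: the `GateClient` class and its `authorize`/`commit`/`cancel` methods are hypothetical stand-ins for the shared service client.

```python
class GateClient:
    """Stand-in for the shared service client built once at process startup."""

    def authorize(self, ctx):
        # Reserve budget for this request; a real client would raise on denial.
        return "auth-123"

    def commit(self, authorization_id, usage):
        # Record actual usage against the reservation.
        self.last_committed = usage

    def cancel(self, authorization_id):
        # Release the reservation when the call fails.
        self.cancelled = True


def gated_call(gate, ctx, execute):
    """Enforce authorize -> execute -> commit/cancel around one model call."""
    auth = gate.authorize(ctx)
    try:
        response = execute()
    except Exception:
        gate.cancel(auth)
        raise
    # Usage is carried by the SDK response itself, so commit can happen here.
    gate.commit(auth, response["usage"])
    return response


gate = GateClient()
ctx = {"request_id": "req-1", "principal_id": "user-42"}
result = gated_call(
    gate, ctx, lambda: {"output": "hi", "usage": {"total_tokens": 12}}
)
```

In app code, `execute` would close over a real `client.chat.completions.create(...)` or `client.responses.create(...)` call; the wrapper stays unchanged.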
What to carry per request
For each call, provide:
- request_id
- principal_id or billing_account_id
- vendor or model metadata
- feature_family_code (optional)
- budget_id (optional)
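One way to carry these fields is a small request-scoped context object. This is a sketch only; field names follow this page, and the real adapter may shape its context differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GateContext:
    """Per-request context bound around each model call."""

    request_id: str
    principal_id: Optional[str] = None         # or billing_account_id
    billing_account_id: Optional[str] = None
    vendor: Optional[str] = None               # vendor/model metadata
    model: Optional[str] = None
    feature_family_code: Optional[str] = None  # optional
    budget_id: Optional[str] = None            # optional


ctx = GateContext(request_id="req-1", principal_id="user-42", model="gpt-4o")
```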
Streaming note
For streaming:
- authorize before opening the stream
- consume the stream normally
- commit only when the final usage-bearing response is available
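The streaming order of operations can be sketched with a stubbed stream. The gate client here is hypothetical; note that with the real OpenAI SDK, chat.completions streams only include a usage-bearing final chunk when `stream_options={"include_usage": True}` is set.

```python
class StubGate:
    """Hypothetical gate client; stands in for the shared service client."""

    def authorize(self, ctx):
        return "auth-1"

    def commit(self, auth, usage):
        self.committed = usage

    def cancel(self, auth):
        self.cancelled = True


def stream_with_gate(gate, ctx, open_stream):
    auth = gate.authorize(ctx)           # authorize BEFORE opening the stream
    usage = None
    pieces = []
    try:
        for chunk in open_stream():      # consume the stream normally
            pieces.append(chunk.get("delta", ""))
            if chunk.get("usage") is not None:
                usage = chunk["usage"]   # the final usage-bearing chunk
    except Exception:
        gate.cancel(auth)
        raise
    gate.commit(auth, usage)             # commit only once usage is known
    return "".join(pieces), usage


def fake_stream():
    # Stand-in for an SDK stream; the last chunk carries usage.
    yield {"delta": "Hel"}
    yield {"delta": "lo"}
    yield {"delta": "", "usage": {"total_tokens": 7}}


gate = StubGate()
text, usage = stream_with_gate(gate, {"request_id": "r1"}, fake_stream)
```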
Typical usage model
In most cases:
- top-level quantity_minor should be total tokens
- meters[] should separate input, output, and cached tokens when pricing differs across them
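A sketch of that mapping, assuming the usage fields of the OpenAI chat.completions usage object; the gate-side shape (`quantity_minor`, `meters`) follows this page, and exact meter names are illustrative.

```python
def usage_to_measurement(usage: dict) -> dict:
    """Map an SDK usage payload onto quantity_minor plus per-kind meters."""
    cached = usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
    input_tokens = usage["prompt_tokens"] - cached   # uncached input only
    output_tokens = usage["completion_tokens"]
    return {
        "quantity_minor": usage["total_tokens"],     # top level: total tokens
        "meters": [                                  # split when pricing differs
            {"meter": "input_tokens", "quantity": input_tokens},
            {"meter": "output_tokens", "quantity": output_tokens},
            {"meter": "cached_input_tokens", "quantity": cached},
        ],
    }


m = usage_to_measurement({
    "prompt_tokens": 100,
    "completion_tokens": 40,
    "total_tokens": 140,
    "prompt_tokens_details": {"cached_tokens": 25},
})
```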
Artifact roles
vluna_adapter.* - This is the file to copy into your codebase if you already use the direct OpenAI SDK pattern.
- Expect to make small, targeted changes for feature-code mapping, identity wiring, logging, and local conventions.
example.* - This is only a demo of how the adapter is invoked from app code.
- Use it to understand startup wiring, request context binding, streaming shape, and tool-call patterns.
Downloadable artifacts
Python:
TypeScript:
When to choose something else
- If your app uses the OpenAI Agents runtime, use: OpenAI Agents SDK
- If your framework owns the model execution graph, use: LangGraph or Google ADK