OpenAI-compatible APIs.

Example request:

curl --request POST \
  --url https://api.example.com/inference/v1{path} \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "<string>",
    "messages": [],
    "timeout": 123,
    "temperature": 123,
    "top_p": 123,
    "n": 123,
    "stop": "<string>",
    "max_completion_tokens": 123,
    "max_tokens": 123,
    "modalities": ["<unknown>"],
    "presence_penalty": 123,
    "frequency_penalty": 123,
    "stream": true,
    "logit_bias": {},
    "user": "<string>",
    "response_format": {},
    "seed": 123,
    "tools": ["<unknown>"],
    "tool_choice": "<string>",
    "logprobs": true,
    "top_logprobs": 123,
    "parallel_tool_calls": true,
    "extra_headers": {},
    "functions": ["<unknown>"],
    "function_call": "<string>",
    "api_version": "<string>",
    "prompt": "<string>",
    "template_vars": {},
    "vertex_credentials": "<string>"
  }'

Validation error response (422):

{
  "detail": [
    {
      "loc": ["<string>"],
      "msg": "<string>",
      "type": "<string>"
    }
  ]
}
Authorization: Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
prompt: Reference to a Weave Prompt object (e.g., 'weave:///entity/project/object/prompt_name:version'). If provided, the messages from this prompt will be prepended to the messages in this request. Template variables in the prompt messages can be substituted using the template_vars parameter.
template_vars: Dictionary of template variables to substitute in prompt messages. Variables in messages like '{variable_name}' will be replaced with the corresponding values. Applied to both prompt messages (if prompt is provided) and regular messages.
vertex_credentials: JSON string of Vertex AI service account credentials. When provided for vertex_ai models (e.g. vertex_ai/gemini-2.5-pro), used for authentication instead of api_key. Not persisted in trace storage.
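The template_vars substitution described above can be illustrated with a small sketch. This mirrors the documented '{variable_name}' replacement semantics, not the server's actual implementation; the message contents are invented for the example:

```python
def substitute_template_vars(messages, template_vars):
    """Replace '{name}' placeholders in message content with template_vars values."""
    rendered = []
    for msg in messages:
        content = msg.get("content", "")
        for name, value in template_vars.items():
            content = content.replace("{" + name + "}", str(value))
        rendered.append({**msg, "content": content})
    return rendered

msgs = [
    {"role": "system", "content": "You answer in {language}."},
    {"role": "user", "content": "Summarize {topic}."},
]
rendered = substitute_template_vars(
    msgs, {"language": "French", "topic": "the report"}
)
# rendered[0]["content"] == "You answer in French."
# rendered[1]["content"] == "Summarize the report."
```

Because substitution applies to both prompt-derived and regular messages, the same replacement pass runs over the combined message list after any prompt messages are prepended.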