LM Studio REST API

Streaming events

When you call POST /api/v1/chat with stream set to true, the response is delivered incrementally as a series of named events over Server-Sent Events (SSE). Events arrive in order and may include multiple deltas (for reasoning and message content), tool call boundaries and payloads, and any errors encountered. The stream always begins with chat.start and concludes with chat.end, which contains the aggregated result, equivalent to a non-streaming response.

List of event types that can be sent in an /api/v1/chat response stream:

  • chat.start
  • model_load.start
  • model_load.progress
  • model_load.end
  • prompt_processing.start
  • prompt_processing.progress
  • prompt_processing.end
  • reasoning.start
  • reasoning.delta
  • reasoning.end
  • tool_call.start
  • tool_call.arguments
  • tool_call.success
  • tool_call.failure
  • message.start
  • message.delta
  • message.end
  • error
  • chat.end
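
For typed clients, these names can be captured as a union. A minimal TypeScript sketch (the type name is illustrative), listing exactly the event types above:

type ChatStreamEventType =
  | "chat.start"
  | "model_load.start"
  | "model_load.progress"
  | "model_load.end"
  | "prompt_processing.start"
  | "prompt_processing.progress"
  | "prompt_processing.end"
  | "reasoning.start"
  | "reasoning.delta"
  | "reasoning.end"
  | "tool_call.start"
  | "tool_call.arguments"
  | "tool_call.success"
  | "tool_call.failure"
  | "message.start"
  | "message.delta"
  | "message.end"
  | "error"
  | "chat.end";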

Events are streamed in the following raw format:

event: <event type>
data: <JSON event data>
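
As a concrete example, here is a minimal TypeScript sketch that reads the stream with fetch and parses the data line of each event. The base URL and port, and the messages field in the request body, are assumptions for illustration; only model and stream appear on this page:

// Sketch: consume the /api/v1/chat SSE stream with fetch.
const res = await fetch("http://localhost:1234/api/v1/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "openai/gpt-oss-20b",
    messages: [{ role: "user", content: "Hello" }], // assumed request shape
    stream: true,
  }),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // SSE events are separated by a blank line.
  let sep: number;
  while ((sep = buffer.indexOf("\n\n")) !== -1) {
    const rawEvent = buffer.slice(0, sep);
    buffer = buffer.slice(sep + 2);
    const dataLine = rawEvent
      .split("\n")
      .find((line) => line.startsWith("data: "));
    if (dataLine) {
      // Every payload carries a "type" field matching the event name.
      const event = JSON.parse(dataLine.slice("data: ".length));
      console.log(event.type, event);
    }
  }
}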

chat.start

Emitted at the start of a chat response stream.

model_instance_id : string

Unique identifier for the loaded model instance that will generate the response.

type : "chat.start"

The type of the event. Always chat.start.

{
  "type": "chat.start",
  "model_instance_id": "openai/gpt-oss-20b"
}

model_load.start

Signals the start of a model being loaded to fulfill the chat request. Will not be emitted if the requested model is already loaded.

model_instance_id : string

Unique identifier for the model instance being loaded.

type : "model_load.start"

The type of the event. Always model_load.start.

{
  "type": "model_load.start",
  "model_instance_id": "openai/gpt-oss-20b"
}

model_load.progress

Progress of the model load.

model_instance_id : string

Unique identifier for the model instance being loaded.

progress : number

Progress of the model load as a float between 0 and 1.

type : "model_load.progress"

The type of the event. Always model_load.progress.

{
  "type": "model_load.progress",
  "model_instance_id": "openai/gpt-oss-20b",
  "progress": 0.65
}

model_load.end

Signals a successfully completed model load.

model_instance_id : string

Unique identifier for the model instance that was loaded.

load_time_seconds : number

Time taken to load the model in seconds.

type : "model_load.end"

The type of the event. Always model_load.end.

{
  "type": "model_load.end",
  "model_instance_id": "openai/gpt-oss-20b",
  "load_time_seconds": 12.34
}

prompt_processing.start

Signals the start of the model processing a prompt.

type : "prompt_processing.start"

The type of the event. Always prompt_processing.start.

{
  "type": "prompt_processing.start"
}

prompt_processing.progress

Progress of the model processing a prompt.

progress : number

Progress of the prompt processing as a float between 0 and 1.

type : "prompt_processing.progress"

The type of the event. Always prompt_processing.progress.

{
  "type": "prompt_processing.progress",
  "progress": 0.5
}
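
Both model_load.progress and prompt_processing.progress carry the same 0-to-1 float, so one handler can drive a progress indicator for either phase. A minimal sketch (Node-style stdout assumed):

function renderProgress(label: string, progress: number): void {
  // Convert the 0-1 float to a percentage; \r overwrites the line in place.
  const pct = Math.round(progress * 100);
  process.stdout.write(`\r${label}: ${pct}%`);
}

// e.g. renderProgress("Loading model", event.progress) on model_load.progress,
// and renderProgress("Processing prompt", event.progress) on prompt_processing.progress.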

prompt_processing.end

Signals the end of the model processing a prompt.

type : "prompt_processing.end"

The type of the event. Always prompt_processing.end.

{
  "type": "prompt_processing.end"
}

reasoning.start

Signals the model is starting to stream reasoning content.

type : "reasoning.start"

The type of the event. Always reasoning.start.

{
  "type": "reasoning.start"
}

reasoning.delta

A chunk of reasoning content. Multiple deltas may arrive.

content : string

Reasoning text fragment.

type : "reasoning.delta"

The type of the event. Always reasoning.delta.

{
  "type": "reasoning.delta",
  "content": "Need to"
}

reasoning.end

Signals the end of the reasoning stream.

type : "reasoning.end"

The type of the event. Always reasoning.end.

{
  "type": "reasoning.end"
}

tool_call.start

Emitted when the model starts a tool call.

tool : string

Name of the tool being called.

provider_info : object

Information about the tool provider. A discriminated union over the possible provider types.

Plugin provider info : object

Present when the tool is provided by a plugin.

type : "plugin"

Provider type.

plugin_id : string

Identifier of the plugin.

Ephemeral MCP provider info : object

Present when the tool is provided by an ephemeral MCP server.

type : "ephemeral_mcp"

Provider type.

server_label : string

Label of the MCP server.

type : "tool_call.start"

The type of the event. Always tool_call.start.

{
  "type": "tool_call.start",
  "tool": "model_search",
  "provider_info": {
    "type": "ephemeral_mcp",
    "server_label": "huggingface"
  }
}
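
The same provider_info shape recurs on the other tool_call events, so it is convenient to model it once. A sketch as a TypeScript discriminated union (the helper is illustrative):

type ProviderInfo =
  | { type: "plugin"; plugin_id: string }
  | { type: "ephemeral_mcp"; server_label: string };

function describeProvider(info: ProviderInfo): string {
  // Narrowing on the "type" discriminant selects the right fields.
  switch (info.type) {
    case "plugin":
      return `plugin ${info.plugin_id}`;
    case "ephemeral_mcp":
      return `MCP server ${info.server_label}`;
  }
}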

tool_call.arguments

Arguments streamed for the current tool call.

tool : string

Name of the tool being called.

arguments : object

Arguments passed to the tool. Can have any keys/values depending on the tool definition.

provider_info : object

Information about the tool provider. A discriminated union over the possible provider types.

Plugin provider info : object

Present when the tool is provided by a plugin.

type : "plugin"

Provider type.

plugin_id : string

Identifier of the plugin.

Ephemeral MCP provider info : object

Present when the tool is provided by an ephemeral MCP server.

type : "ephemeral_mcp"

Provider type.

server_label : string

Label of the MCP server.

type : "tool_call.arguments"

The type of the event. Always tool_call.arguments.

{
  "type": "tool_call.arguments",
  "tool": "model_search",
  "arguments": {
    "sort": "trendingScore",
    "limit": 1
  },
  "provider_info": {
    "type": "ephemeral_mcp",
    "server_label": "huggingface"
  }
}

tool_call.success

Emitted when a tool call succeeds. Carries the tool's output along with the arguments used.

tool : string

Name of the tool that was called.

arguments : object

Arguments that were passed to the tool.

output : string

Raw tool output string.

provider_info : object

Information about the tool provider. A discriminated union over the possible provider types.

Plugin provider info : object

Present when the tool is provided by a plugin.

type : "plugin"

Provider type.

plugin_id : string

Identifier of the plugin.

Ephemeral MCP provider info : object

Present when the tool is provided by an ephemeral MCP server.

type : "ephemeral_mcp"

Provider type.

server_label : string

Label of the MCP server.

type : "tool_call.success"

The type of the event. Always tool_call.success.

{
  "type": "tool_call.success",
  "tool": "model_search",
  "arguments": {
    "sort": "trendingScore",
    "limit": 1
  },
  "output": "[{\"type\":\"text\",\"text\":\"Showing first 1 models...\"}]",
  "provider_info": {
    "type": "ephemeral_mcp",
    "server_label": "huggingface"
  }
}

tool_call.failure

Indicates that the tool call failed.

reason : string

Reason for the tool call failure.

metadata : object

Metadata about the invalid tool call.

type : "invalid_name" | "invalid_arguments"

Type of error that occurred.

tool_name : string

Name of the tool that was attempted to be called.

arguments (optional) : object

Arguments that were passed to the tool (only present for invalid_arguments errors).

provider_info (optional) : object

Information about the tool provider (only present for invalid_arguments errors).

type : "plugin" | "ephemeral_mcp"

Provider type.

plugin_id (optional) : string

Identifier of the plugin (when type is "plugin").

server_label (optional) : string

Label of the MCP server (when type is "ephemeral_mcp").

type : "tool_call.failure"

The type of the event. Always tool_call.failure.

{
  "type": "tool_call.failure",
  "reason": "Cannot find tool with name open_browser.",
  "metadata": {
    "type": "invalid_name",
    "tool_name": "open_browser"
  }
}

message.start

Signals the model is about to stream a message.

type : "message.start"

The type of the event. Always message.start.

{
  "type": "message.start"
}

message.delta

A chunk of message content. Multiple deltas may arrive.

content : string

Message text fragment.

type : "message.delta"

The type of the event. Always message.delta.

{
  "type": "message.delta",
  "content": "The current"
}
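
Since reasoning.delta and message.delta both stream text fragments, a typical client concatenates them into running buffers. A minimal sketch:

// Sketch: accumulate reasoning and message deltas into full strings.
// handleEvent would be called once per parsed SSE event.
let reasoning = "";
let message = "";

function handleEvent(event: { type: string; content?: string }): void {
  switch (event.type) {
    case "reasoning.delta":
      reasoning += event.content ?? "";
      break;
    case "message.delta":
      message += event.content ?? "";
      break;
    case "message.end":
      console.log("Final message:", message);
      break;
  }
}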

message.end

Signals the end of the message stream.

type : "message.end"

The type of the event. Always message.end.

{
  "type": "message.end"
}

error

Emitted when an error occurs during streaming. A chat.end event will still follow, containing whatever was generated before the error.

error : object

Error information.

type : "invalid_request" | "unknown" | "mcp_connection_error" | "plugin_connection_error" | "not_implemented" | "model_not_found" | "job_not_found" | "internal_error"

High-level error type.

message : string

Human-readable error message.

code (optional) : string

More detailed error code (e.g., validation issue code).

param (optional) : string

Parameter associated with the error, if applicable.

type : "error"

The type of the event. Always error.

{
  "type": "error",
  "error": {
    "type": "invalid_request",
    "message": "\"model\" is required",
    "code": "missing_required_parameter",
    "param": "model"
  }
}
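
Because a chat.end event still follows an error, a client should usually log the error and keep reading rather than abort. A sketch of that handling (payload typing abbreviated):

function onEvent(event: { type: string; [key: string]: unknown }): void {
  if (event.type === "error") {
    const err = event.error as { type: string; message: string };
    console.error(`[${err.type}] ${err.message}`);
    return; // keep consuming; chat.end still carries the partial result
  }
  if (event.type === "chat.end") {
    console.log("Aggregated result:", event.result);
  }
}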

chat.end

Final event containing the full aggregated response, equivalent to the non-streaming POST /api/v1/chat response body.

result : object

Final response with model_instance_id, output, stats, and optional response_id. See non-streaming chat docs for more details.

type : "chat.end"

The type of the event. Always chat.end.

{
  "type": "chat.end",
  "result": {
    "model_instance_id": "openai/gpt-oss-20b",
    "output": [
      { "type": "reasoning", "content": "Need to call function." },
      {
        "type": "tool_call",
        "tool": "model_search",
        "arguments": { "sort": "trendingScore", "limit": 1 },
        "output": "[{\"type\":\"text\",\"text\":\"Showing first 1 models...\"}]",
        "provider_info": { "type": "ephemeral_mcp", "server_label": "huggingface" }
      },
      { "type": "message", "content": "The current top‑trending model is..." }
    ],
    "stats": {
      "input_tokens": 329,
      "total_output_tokens": 268,
      "reasoning_output_tokens": 5,
      "tokens_per_second": 43.73,
      "time_to_first_token_seconds": 0.781
    },
    "response_id": "resp_02b2017dbc06c12bfc353a2ed6c2b802f8cc682884bb5716"
  }
}
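
The aggregated result can be post-processed like a non-streaming response; for example, extracting the final assistant text means collecting the message items from output. A sketch, with output item shapes abbreviated from the example above:

type OutputItem =
  | { type: "reasoning"; content: string }
  | { type: "message"; content: string }
  | { type: "tool_call"; tool: string; arguments: Record<string, unknown>; output: string };

function finalMessage(result: { output: OutputItem[] }): string {
  // Join all message items; most responses contain exactly one.
  return result.output
    .filter((item): item is { type: "message"; content: string } => item.type === "message")
    .map((item) => item.content)
    .join("");
}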