OpenAI Compatible Chat

Connect your AI agents to any LLM that exposes an OpenAI-compatible chat API, including self-hosted models like Ollama and vLLM.

Overview

The OpenAI Compatible connector allows you to use any LLM provider that implements the OpenAI Chat Completions API format. This includes self-hosted solutions like Ollama and vLLM, as well as cloud services that offer OpenAI-compatible endpoints.

The connector supports:

  • Chat completions with streaming
  • Function calling (tool use)
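
Both capabilities map directly onto the standard Chat Completions request body. As an illustrative sketch (field names follow the public OpenAI Chat Completions format; `buildChatRequest` and the `get_weather` tool are hypothetical, not part of the Squid SDK), a request enabling streaming and one tool might look like this:

```typescript
// Shape of a chat message in the OpenAI Chat Completions format.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
}

// Illustrative helper: builds a Chat Completions request body.
function buildChatRequest(model: string, messages: ChatMessage[]) {
  return {
    model, // e.g. 'llama3', as configured in the connector
    messages,
    stream: true, // ask the server for incremental streamed chunks
    tools: [
      // function calling ("tool use") in the standard format
      {
        type: 'function',
        function: {
          name: 'get_weather', // hypothetical tool for illustration
          description: 'Look up the weather for a city',
          parameters: {
            type: 'object',
            properties: { city: { type: 'string' } },
            required: ['city'],
          },
        },
      },
    ],
  };
}

const body = buildChatRequest('llama3', [{ role: 'user', content: 'Hello!' }]);
```

Any provider that accepts a body like this for `POST /v1/chat/completions` should work behind the connector.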

Setting up the connector

To add an OpenAI Compatible connector, complete the following steps:

  1. Navigate to the Squid Console and select your application.
  2. Click the Connectors tab.
  3. Click Available Connectors and find the OpenAI Compatible Chat connector. Then click Add Connector.
  4. Provide the following details:
  • Connector ID: A unique ID of your choice (e.g., my-ollama). This is the integrationId you will reference in code.
  • Base URL: The publicly accessible URL of the OpenAI-compatible API. The Squid backend must be able to reach this URL, so it cannot be a localhost address unless you are developing locally.
  • API Key (optional): An API key for authentication. Some providers, such as local Ollama instances, do not require an API key.
  • Models: A JSON array defining the models available through this connector. Each model requires the following fields:
| Field | Type | Description |
| --- | --- | --- |
| modelName | string | The model identifier used in API calls |
| displayName | string | A human-readable name for the model |
| maxOutputTokens | number | Maximum number of tokens the model can generate in a response |
| contextWindowTokens | number | Total context window size in tokens |

Example:

[
  {
    "modelName": "llama3",
    "displayName": "Llama 3",
    "maxOutputTokens": 4096,
    "contextWindowTokens": 8192
  }
]
  5. Click Add Connector.
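
The four model fields can be captured in a small type, which makes it easy to sanity-check a models array before pasting it into the console. This is a sketch only; the `ConnectorModel` interface and `validateModels` helper are illustrative names, not part of the Squid SDK:

```typescript
// Illustrative type matching the connector's per-model configuration.
interface ConnectorModel {
  modelName: string; // model identifier used in API calls
  displayName: string; // human-readable name
  maxOutputTokens: number; // max tokens per response
  contextWindowTokens: number; // total context window size
}

// Basic validation: every field has the right type, and the context
// window is at least as large as the output-token limit.
function validateModels(models: ConnectorModel[]): boolean {
  return models.every(
    (m) =>
      typeof m.modelName === 'string' &&
      typeof m.displayName === 'string' &&
      Number.isInteger(m.maxOutputTokens) &&
      Number.isInteger(m.contextWindowTokens) &&
      m.maxOutputTokens <= m.contextWindowTokens
  );
}

const models: ConnectorModel[] = [
  {
    modelName: 'llama3',
    displayName: 'Llama 3',
    maxOutputTokens: 4096,
    contextWindowTokens: 8192,
  },
];
```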

Using the connector

Once configured, use the connector with an AI agent by specifying the connector ID and model name:

Client code
await squid.ai().agent('my-agent').updateModel({
  integrationId: 'my-ollama',
  model: 'llama3',
});

You can also override the model on a per-request basis:

Client code
const response = await squid
  .ai()
  .agent('my-agent')
  .ask('Hello!', {
    model: {
      integrationId: 'my-ollama',
      model: 'llama3',
    },
  });

Common configurations

Ollama

Ollama runs open-source models locally.

  • Base URL: The publicly accessible URL of your Ollama instance (e.g., https://ollama.your-domain.com)
  • API Key: Not required
  • Models: Depends on which models you have pulled locally (e.g., llama3, mistral, codellama)
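
Before wiring Ollama into the connector, it can help to confirm the instance is reachable over its OpenAI-compatible surface. Ollama serves those routes under `/v1`, including `GET /v1/models`, which lists the models you have pulled. A sketch (the `buildListModelsRequest` helper is illustrative, not a Squid or Ollama API; substitute your own base URL):

```typescript
// Illustrative helper: builds the request details for the
// OpenAI-compatible model-listing endpoint.
function buildListModelsRequest(baseUrl: string, apiKey?: string) {
  // Strip a trailing slash so the path joins cleanly.
  const url = `${baseUrl.replace(/\/$/, '')}/v1/models`;
  const headers: Record<string, string> = {};
  // Local Ollama instances typically need no key; include one only if set.
  if (apiKey) headers['Authorization'] = `Bearer ${apiKey}`;
  return { url, headers };
}

const req = buildListModelsRequest('https://ollama.your-domain.com/');
// Usage (requires a running, reachable instance):
// const res = await fetch(req.url, { headers: req.headers });
// const { data } = await res.json(); // each entry's `id` is a model name
```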

vLLM

vLLM is a high-throughput inference engine with an OpenAI-compatible server.

  • Base URL: The publicly accessible URL of your vLLM server (e.g., https://vllm.your-domain.com)
  • API Key: Depends on your vLLM configuration
  • Models: The model(s) you started vLLM with