OpenAI Compatible Chat
Connect your AI agents to any LLM that exposes an OpenAI-compatible chat API, including self-hosted models like Ollama and vLLM.
Overview
The OpenAI Compatible connector allows you to use any LLM provider that implements the OpenAI Chat Completions API format. This includes self-hosted solutions like Ollama and vLLM, as well as cloud services that offer OpenAI-compatible endpoints.
The connector supports:
- Chat completions with streaming
- Function calling (tool use)
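Every provider behind this connector accepts the same JSON request body at `POST /v1/chat/completions`. A minimal sketch of that shape (the `llama3` model name is just an example; `buildChatRequest` is an illustrative helper, not part of any SDK):

```typescript
// The message roles defined by the Chat Completions format.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
}

// Build the request body an OpenAI-compatible server expects.
function buildChatRequest(model: string, messages: ChatMessage[], stream = false) {
  return {
    model,    // must match a modelName from the connector's Models array
    messages, // the conversation so far
    stream,   // set true to receive the response as server-sent events
  };
}

const exampleBody = buildChatRequest('llama3', [{ role: 'user', content: 'Hello!' }]);
console.log(JSON.stringify(exampleBody));
```

Because every provider speaks this one format, Squid can route the same agent traffic to Ollama, vLLM, or a cloud endpoint without code changes.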
Setting up the connector
To add an OpenAI Compatible connector, complete the following steps:
- Navigate to the Squid Console and select your application.
- Click the Connectors tab.
- Click Available Connectors and find the OpenAI Compatible Chat connector. Then click Add Connector.
- Provide the following details:
- Connector ID: A unique ID of your choice (e.g., `my-ollama`). This is the `integrationId` you will reference in code.
- Base URL: The publicly accessible URL of the OpenAI-compatible API. The Squid backend must be able to reach this URL, so it cannot be a `localhost` address unless you are developing locally.
- API Key (optional): An API key for authentication. Some providers, such as local Ollama instances, do not require an API key.
- Models: A JSON array defining the models available through this connector. Each model requires the following fields:
| Field | Type | Description |
|---|---|---|
| `modelName` | string | The model identifier used in API calls |
| `displayName` | string | A human-readable name for the model |
| `maxOutputTokens` | number | The maximum number of tokens the model can generate in a response |
| `contextWindowTokens` | number | The total context window size, in tokens |
Example:
```json
[
  {
    "modelName": "llama3",
    "displayName": "Llama 3",
    "maxOutputTokens": 4096,
    "contextWindowTokens": 8192
  }
]
```
- Click Add Connector.
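Before pasting a Models array into the console, it can help to sanity-check each entry against the required fields. A small sketch of such a check (the `isValidModelConfig` helper is hypothetical, not part of the Squid SDK):

```typescript
// Shape of one entry in the connector's Models array.
interface ModelConfig {
  modelName: string;
  displayName: string;
  maxOutputTokens: number;
  contextWindowTokens: number;
}

// Hypothetical validator: checks that an entry has every required
// field with the expected type before you submit the configuration.
function isValidModelConfig(m: any): m is ModelConfig {
  return (
    m != null &&
    typeof m.modelName === 'string' &&
    typeof m.displayName === 'string' &&
    Number.isInteger(m.maxOutputTokens) &&
    Number.isInteger(m.contextWindowTokens)
  );
}

console.log(
  isValidModelConfig({
    modelName: 'llama3',
    displayName: 'Llama 3',
    maxOutputTokens: 4096,
    contextWindowTokens: 8192,
  }),
);
```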
Using the connector
Once configured, use the connector with an AI agent by specifying the connector ID and model name:
Client code
```typescript
await squid.ai().agent('my-agent').updateModel({
  integrationId: 'my-ollama',
  model: 'llama3',
});
```
You can also override the model on a per-request basis:
Client code
```typescript
const response = await squid
  .ai()
  .agent('my-agent')
  .ask('Hello!', {
    model: {
      integrationId: 'my-ollama',
      model: 'llama3',
    },
  });
```
Common configurations
Ollama
Ollama runs open-source models locally.
- Base URL: The publicly accessible URL of your Ollama instance (e.g., `https://ollama.your-domain.com`)
- API Key: Not required
- Models: Depends on which models you have pulled locally (e.g., `llama3`, `mistral`, `codellama`)
vLLM
vLLM is a high-throughput inference engine with an OpenAI-compatible server.
- Base URL: The publicly accessible URL of your vLLM server (e.g., `https://vllm.your-domain.com`)
- API Key: Depends on your vLLM configuration
- Models: The model(s) you started vLLM with
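Both Ollama and vLLM stream chat completions as server-sent events when `stream` is true, with each event carrying a small JSON delta. A minimal sketch of extracting the text from one event line, assuming the standard OpenAI streaming shape:

```typescript
// Parse one server-sent-event line from an OpenAI-compatible streaming
// response. A typical line looks like:
//   data: {"choices":[{"delta":{"content":"Hi"}}]}
// and the stream ends with the sentinel line:
//   data: [DONE]
function parseStreamLine(line: string): string | null {
  if (!line.startsWith('data: ')) return null; // skip blank/comment lines
  const payload = line.slice('data: '.length);
  if (payload === '[DONE]') return null;       // end-of-stream sentinel
  const chunk = JSON.parse(payload);
  return chunk.choices?.[0]?.delta?.content ?? null;
}

console.log(parseStreamLine('data: {"choices":[{"delta":{"content":"Hello"}}]}'));
```

Squid handles this parsing for you when an agent streams a response; the sketch is only to show what travels over the wire between the Squid backend and your self-hosted server.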