
Guardrails

Define constraints on AI agent responses to enforce content policies, prevent data leaks, and maintain professional communication standards.

Why Use Guardrails

Your AI agent has access to broad knowledge and can generate a wide range of responses. Without constraints, it might:

  • Reveal sensitive user data like Social Security numbers or email addresses
  • Respond to questions outside its intended scope
  • Use unprofessional or offensive language

Guardrails let you define boundaries so the agent stays on-topic, professional, and compliant with your policies. Enable a preset with a single flag, or write your own custom policy:

Backend code
await this.squid.ai().agent('support-agent').updateGuardrails({
  disablePii: true, // Blocks personally identifiable information
  professionalTone: true, // Enforces formal language
});

Overview

Guardrails are configurable constraints injected into the AI agent's system prompt with the highest priority. When a guardrail is enabled, the agent receives specific instructions to filter its responses accordingly.

There are two types of guardrails:

  • Preset guardrails - Four built-in policies that can be toggled on or off
  • Custom guardrails - Free-form text instructions for policies not covered by the presets

Guardrails can be configured three ways:

| Method | Best for |
| --- | --- |
| Squid Console (Agent Studio) | Quick setup and non-developer users |
| Backend SDK | Programmatic control from your application code |
| REST API | External systems or automation pipelines |

How guardrails work

When an agent processes a message, every enabled guardrail is embedded in the system prompt sent to the underlying LLM. The agent gives these instructions the highest priority and shapes its response to comply with all active guardrail policies before returning it to the user.
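To make the mechanism concrete, here is a minimal sketch of how enabled guardrails could be turned into high-priority prompt instructions. It is not Squid's implementation: `buildSystemPrompt`, the `PRESET_INSTRUCTIONS` wording, and the header text are all hypothetical, and the instruction strings are paraphrased from the preset descriptions on this page.

```typescript
// Hypothetical model of prompt assembly; the real prompt text is not documented here.
interface PresetGuardrails {
  disableProfanity?: boolean;
  disablePii?: boolean;
  offTopicAnswers?: boolean;
  professionalTone?: boolean;
}

// Illustrative instruction text per preset key (assumed wording).
const PRESET_INSTRUCTIONS: Record<keyof PresetGuardrails, string> = {
  disableProfanity: 'Do not use profanity, vulgar language, or offensive terms.',
  disablePii: 'Do not output personally identifiable information.',
  offTopicAnswers: 'Politely decline questions outside your defined scope.',
  professionalTone: 'Maintain a professional, courteous, and neutral tone.',
};

function buildSystemPrompt(basePrompt: string, presets: PresetGuardrails, custom?: string): string {
  // Collect the instruction for every preset that is explicitly enabled.
  const rules = (Object.keys(PRESET_INSTRUCTIONS) as Array<keyof PresetGuardrails>)
    .filter((key) => presets[key] === true)
    .map((key) => PRESET_INSTRUCTIONS[key]);

  // Append the free-form custom policy, if one is set.
  if (custom && custom.trim() !== '') {
    rules.push(custom.trim());
  }

  // With no active guardrails the prompt is left unchanged.
  if (rules.length === 0) {
    return basePrompt;
  }

  // Place the guardrails ahead of the base prompt so they read as top-priority rules.
  return ['HIGHEST PRIORITY RULES:', ...rules.map((rule) => `- ${rule}`), '', basePrompt].join('\n');
}

const prompt = buildSystemPrompt(
  'You are a support agent.',
  { disablePii: true, professionalTone: false },
  'Always cite sources when making factual claims.',
);
console.log(prompt);
```

Only presets set to true contribute an instruction, which mirrors the boolean on/off behavior described for the presets.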

Core Concepts

Preset guardrails

| Guardrail | Key | What it does |
| --- | --- | --- |
| Profanity filter | disableProfanity | Filters out profanity, vulgar language, and offensive terms. Enforces respectful and neutral language. |
| PII protection | disablePii | Prevents output of personally identifiable information including SSNs, bank account numbers, addresses, names, dates of birth, emails, phone numbers, passport/driver license numbers, and credit card numbers. |
| Topic restriction | offTopicAnswers | Limits the agent to respond only to questions within its defined scope. The agent politely declines out-of-scope questions. |
| Professional tone | professionalTone | Maintains a professional, courteous, and neutral tone with clear and precise language, avoiding slang and casual expressions. |

Each preset is a boolean flag. Set it to true to enable or false to disable.

Custom guardrails

Custom guardrails accept free-form text describing additional policies. Use them for domain-specific rules not covered by the presets, such as:

  • "Always cite sources when making factual claims"
  • "Do not discuss competitor products"
  • "Respond only in Spanish"

Merge behavior

The updateGuardrails method merges new settings with existing ones rather than replacing them. If an agent already has disablePii: true and you call updateGuardrails({ professionalTone: true }), both guardrails remain active. To disable a guardrail, explicitly set it to false.
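The merge rule can be modeled locally with an object spread. This is only a sketch of the documented semantics, not the SDK's code; `PresetGuardrails` and `mergeGuardrails` are hypothetical names introduced for illustration.

```typescript
// Hypothetical local model of the preset flags; not an SDK type.
interface PresetGuardrails {
  disableProfanity?: boolean;
  disablePii?: boolean;
  offTopicAnswers?: boolean;
  professionalTone?: boolean;
}

// Keys present in the update override the stored value; all other keys persist.
function mergeGuardrails(existing: PresetGuardrails, update: PresetGuardrails): PresetGuardrails {
  return { ...existing, ...update };
}

const afterFirst = mergeGuardrails({}, { disablePii: true });
const afterSecond = mergeGuardrails(afterFirst, { professionalTone: true });
const afterThird = mergeGuardrails(afterSecond, { disablePii: false });

console.log(afterSecond); // { disablePii: true, professionalTone: true }
console.log(afterThird); // { disablePii: false, professionalTone: true }
```

Omitting a key leaves its stored value alone, which is why turning a guardrail off requires an explicit false.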

Configuring Guardrails in the Studio

To configure guardrails for an agent via the Squid Console:

  1. Navigate to the Agent Studio tab in the left sidebar
  2. Select the agent you want to configure
  3. Click on the Settings tab
  4. Scroll to the Agent Guardrails section
  5. Toggle any of the four preset guardrails on or off
  6. (Optional) Enter a custom guardrail policy in the text field

Configuring Guardrails in the Backend SDK

Update preset guardrails

Use updateGuardrails to enable or disable preset guardrails. Settings are merged with existing values:

Backend code
// Enable PII protection and professional tone
await this.squid.ai().agent('AGENT_ID').updateGuardrails({
  disablePii: true,
  professionalTone: true,
});

To enable all four presets at once:

Backend code
await this.squid.ai().agent('AGENT_ID').updateGuardrails({
  disableProfanity: true,
  professionalTone: true,
  disablePii: true,
  offTopicAnswers: true,
});

To disable a specific guardrail while keeping others:

Backend code
// Disable profanity filter; other guardrails remain unchanged
await this.squid.ai().agent('AGENT_ID').updateGuardrails({
  disableProfanity: false,
});

Add a custom guardrail

Use updateCustomGuardrails to set a custom guardrail policy:

Backend code
await this.squid.ai().agent('AGENT_ID').updateCustomGuardrails('Always cite sources when making factual claims.');

Calling this method again replaces the previous custom guardrail text.

Remove a custom guardrail

Use deleteCustomGuardrail to remove the custom guardrail policy:

Backend code
await this.squid.ai().agent('AGENT_ID').deleteCustomGuardrail();

This only removes the custom guardrail. Preset guardrails are not affected.
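The lifecycle of the custom guardrail (set, replace, delete) can be summarized with a small in-memory model. `GuardrailState` is a hypothetical stand-in written for this example, not part of the Squid SDK; it mirrors the behavior described on this page, including the no-op on an empty string listed under Behavior notes.

```typescript
// Hypothetical in-memory stand-in for an agent's guardrail state; not the Squid SDK.
class GuardrailState {
  presets: Record<string, boolean> = {};
  custom: string | undefined = undefined;

  updateCustomGuardrails(text: string): void {
    if (text === '') {
      return; // Documented behavior: an empty string changes nothing.
    }
    this.custom = text; // Replaces the previous text rather than appending to it.
  }

  deleteCustomGuardrail(): void {
    this.custom = undefined; // Preset flags are left untouched.
  }
}

const state = new GuardrailState();
state.presets.disablePii = true;

state.updateCustomGuardrails('Always cite sources when making factual claims.');
state.updateCustomGuardrails('Do not discuss competitor products.');
state.updateCustomGuardrails(''); // No-op: the previous text stays in place.
const customBeforeDelete = state.custom;

state.deleteCustomGuardrail();
console.log(customBeforeDelete); // Do not discuss competitor products.
console.log(state.custom, state.presets);
```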

Configuring Guardrails via the REST API

All API endpoints use the base URL for your Squid Cloud application. For more on constructing API URLs, see the API documentation.

Update preset guardrails

Send a POST request to update preset guardrails:

POST /squid-api/v1/ai/agent/updateGuardrails

{
  "agentId": "your-agent-id",
  "guardrails": {
    "disablePii": true,
    "professionalTone": true,
    "offTopicAnswers": true,
    "disableProfanity": true
  }
}

Update custom guardrail

POST /squid-api/v1/ai/agent/updateCustomGuardrails

{
  "agentId": "your-agent-id",
  "customGuardrail": "Always cite sources when making factual claims."
}

Delete custom guardrail

POST /squid-api/v1/ai/agent/deleteCustomGuardrail

{
  "agentId": "your-agent-id"
}
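As a rough sketch, the preset update request could be assembled from TypeScript like this. `buildUpdateGuardrailsRequest` is a hypothetical helper, the `Content-Type` header is an assumption based on the JSON body, and the authentication headers are deliberately left to the caller because their names are not specified on this page (see the API documentation).

```typescript
// Shape of a request that can be passed to fetch(url, init).
interface GuardrailRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the request without sending it.
function buildUpdateGuardrailsRequest(
  baseUrl: string, // Your application's base URL (placeholder in the usage below)
  agentId: string,
  guardrails: Record<string, boolean>,
  authHeaders: Record<string, string>, // Supplied by the caller; header names are not documented here
): GuardrailRequest {
  // Strip trailing slashes so the documented path is appended exactly once.
  const url = `${baseUrl.replace(/\/+$/, '')}/squid-api/v1/ai/agent/updateGuardrails`;
  return {
    url,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', ...authHeaders }, // Content-Type is an assumption
      body: JSON.stringify({ agentId, guardrails }),
    },
  };
}

// Usage (not executed here; replace the placeholder values):
// const { url, init } = buildUpdateGuardrailsRequest('https://YOUR_BASE_URL', 'your-agent-id', { disablePii: true }, authHeaders);
// const response = await fetch(url, init);
```

Keeping request construction separate from the network call makes the payload easy to inspect and unit test before it reaches an automation pipeline.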

Error Handling

Common errors

| Error | Cause | Solution |
| --- | --- | --- |
| Agent not found | The specified agentId does not exist | Verify the agent ID in the Squid Console |
| Unauthorized | Invalid or missing API key | Ensure you are using a valid app or agent API key |
| Cannot perform operation on built-in agent | Attempted to modify the built-in agent's guardrails | Create a custom agent instead of modifying the built-in agent |
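One possible way to handle these errors in application code is to map them to stable categories. The sketch below assumes that a failed call rejects with an error whose message contains the text from the Error column; that is an assumption, not documented behavior, and `classifyGuardrailError` is a hypothetical helper.

```typescript
type GuardrailErrorKind = 'agent-not-found' | 'unauthorized' | 'built-in-agent' | 'unknown';

// Assumption: the error message contains the documented error text.
function classifyGuardrailError(error: unknown): GuardrailErrorKind {
  const message = (error instanceof Error ? error.message : String(error)).toLowerCase();
  if (message.includes('agent not found')) return 'agent-not-found';
  if (message.includes('unauthorized')) return 'unauthorized';
  if (message.includes('built-in agent')) return 'built-in-agent';
  return 'unknown';
}

// Usage inside backend code (not executed here):
// try {
//   await this.squid.ai().agent('AGENT_ID').updateGuardrails({ disablePii: true });
// } catch (error) {
//   if (classifyGuardrailError(error) === 'agent-not-found') {
//     // Verify the agent ID in the Squid Console.
//   }
// }
```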

Behavior notes

  • updateCustomGuardrails silently returns without changes if the provided custom guardrail string is empty.
  • deleteCustomGuardrail silently returns without error if the agent does not exist or has no custom guardrail set.
  • Preset guardrail values are simple booleans and require no content validation.

Best Practices

  1. Start with presets before writing custom rules. The four built-in guardrails cover the most common compliance needs. Only add custom guardrails for domain-specific policies.
  2. Be specific in custom guardrails. Vague instructions like "be safe" are less effective than explicit rules like "Do not reveal internal pricing formulas."
  3. Combine guardrails for layered protection. Enable disablePii alongside a custom guardrail that specifies additional data handling rules for your domain.
  4. Test guardrail behavior. After enabling guardrails, test the agent with prompts designed to trigger each policy to verify the constraints are working as expected.
  5. Remember merge semantics. Calling updateGuardrails merges with existing settings. To disable a guardrail, you must explicitly set it to false.
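
For practice 4, part of the check can be automated by scanning agent responses for obvious PII after sending probing prompts. The sketch below is illustrative only: `findPiiLeaks` and its two regular expressions are hypothetical, cover just US-style SSNs and email addresses, and are no substitute for a real PII detector or manual review.

```typescript
// Illustrative patterns only; real PII detection needs far broader coverage.
const PII_PATTERNS: Record<string, RegExp> = {
  ssn: /\b\d{3}-\d{2}-\d{4}\b/,
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/,
};

// Returns the labels of every pattern that matches the response text.
function findPiiLeaks(responseText: string): string[] {
  return Object.entries(PII_PATTERNS)
    .filter(([, pattern]) => pattern.test(responseText))
    .map(([label]) => label);
}

// Example responses an agent might return to a probing prompt.
console.log(findPiiLeaks('I am unable to share personal details.')); // []
console.log(findPiiLeaks('Her SSN is 123-45-6789.')); // [ 'ssn' ]
```

A test run would send prompts such as "What is the customer's email address?" to the agent with disablePii enabled and fail if `findPiiLeaks` returns anything for the response.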