Guardrails
Define constraints on AI agent responses to enforce content policies, prevent data leaks, and maintain professional communication standards.
Why Use Guardrails
Your AI agent has access to broad knowledge and can generate a wide range of responses. Without constraints, it might:
- Reveal sensitive user data like Social Security numbers or email addresses
- Respond to questions outside its intended scope
- Use unprofessional or offensive language
Guardrails let you define boundaries so the agent stays on-topic, professional, and compliant with your policies. Enable a preset with a single flag, or write your own custom policy:
await this.squid.ai().agent('support-agent').updateGuardrails({
  disablePii: true, // Blocks personally identifiable information
  professionalTone: true, // Enforces formal language
});
Overview
Guardrails are configurable constraints injected into the AI agent's system prompt with the highest priority. When a guardrail is enabled, the agent receives specific instructions to filter its responses accordingly.
There are two types of guardrails:
- Preset guardrails - Four built-in policies that can be toggled on or off
- Custom guardrails - Free-form text instructions for policies not covered by the presets
Guardrails can be configured in three ways:
| Method | Best for |
|---|---|
| Squid Console (Agent Studio) | Quick setup and non-developer users |
| Backend SDK | Programmatic control from your application code |
| REST API | External systems or automation pipelines |
How guardrails work
When an agent processes a message, the enabled guardrails are embedded as instructions in the system prompt sent to the underlying LLM. The agent gives these instructions the highest priority and shapes its response to comply with every active guardrail policy before returning it to the user.
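To make the idea concrete, here is a minimal sketch of how enabled guardrails could be turned into prompt instructions. It is an illustration only, not Squid's actual implementation: the function name `buildSystemPrompt`, the header text, and the wording of each instruction are all assumptions.

```typescript
// Illustrative only: Squid's real prompt assembly is internal and may differ.
interface PresetGuardrails {
  disableProfanity?: boolean;
  disablePii?: boolean;
  offTopicAnswers?: boolean;
  professionalTone?: boolean;
}

// Hypothetical instruction text for each preset key.
const PRESET_INSTRUCTIONS: Record<keyof PresetGuardrails, string> = {
  disableProfanity: 'Do not use profanity, vulgar language, or offensive terms.',
  disablePii: 'Do not output personally identifiable information.',
  offTopicAnswers: 'Politely decline questions outside your defined scope.',
  professionalTone: 'Keep a professional, courteous, and neutral tone.',
};

function buildSystemPrompt(
  basePrompt: string,
  presets: PresetGuardrails,
  customGuardrail?: string,
): string {
  const keys = Object.keys(PRESET_INSTRUCTIONS) as (keyof PresetGuardrails)[];
  // Only presets explicitly set to true contribute an instruction.
  const rules = keys
    .filter((key) => presets[key] === true)
    .map((key) => PRESET_INSTRUCTIONS[key]);
  if (customGuardrail !== undefined && customGuardrail.trim() !== '') {
    rules.push(customGuardrail.trim());
  }
  if (rules.length === 0) {
    return basePrompt;
  }
  // Guardrail rules are placed ahead of the base prompt and marked as overriding.
  const header = 'HIGHEST PRIORITY RULES (these override any conflicting instruction):';
  return [header, ...rules.map((rule) => `- ${rule}`), '', basePrompt].join('\n');
}
```

In this sketch, a disabled or omitted preset contributes nothing to the prompt, which mirrors the on/off behavior described above.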
Core Concepts
Preset guardrails
| Guardrail | Key | What it does |
|---|---|---|
| Profanity filter | disableProfanity | Filters out profanity, vulgar language, and offensive terms. Enforces respectful and neutral language. |
| PII protection | disablePii | Prevents output of personally identifiable information including SSNs, bank account numbers, addresses, names, dates of birth, emails, phone numbers, passport/driver license numbers, and credit card numbers. |
| Topic restriction | offTopicAnswers | Restricts the agent to answering only questions within its defined scope. The agent politely declines out-of-scope questions. |
| Professional tone | professionalTone | Maintains a professional, courteous, and neutral tone with clear and precise language, avoiding slang and casual expressions. |
Each preset is a boolean flag. Set it to true to enable or false to disable.
Custom guardrails
Custom guardrails accept free-form text describing additional policies. Use them for domain-specific rules not covered by the presets, such as:
- "Always cite sources when making factual claims"
- "Do not discuss competitor products"
- "Respond only in Spanish"
Merge behavior
The updateGuardrails method merges new settings with existing ones rather than replacing them. If an agent already has disablePii: true and you call updateGuardrails({ professionalTone: true }), both guardrails remain active. To disable a guardrail, explicitly set it to false.
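The merge rule can be modeled locally with a small pure function. This is a sketch of the semantics described above, not the SDK's code: `mergeGuardrails` and the `PresetGuardrails` interface are hypothetical names used for illustration.

```typescript
// Hypothetical local model of the merge rule; the real merge happens inside Squid.
interface PresetGuardrails {
  disableProfanity?: boolean;
  disablePii?: boolean;
  offTopicAnswers?: boolean;
  professionalTone?: boolean;
}

function mergeGuardrails(
  existing: PresetGuardrails,
  update: PresetGuardrails,
): PresetGuardrails {
  const merged: PresetGuardrails = { ...existing };
  for (const key of Object.keys(update) as (keyof PresetGuardrails)[]) {
    const value = update[key];
    // Keys that are omitted (or undefined) leave the existing value untouched.
    if (value !== undefined) {
      merged[key] = value;
    }
  }
  return merged;
}

// The agent starts with PII protection, then professional tone is added.
const afterFirstCall = mergeGuardrails({ disablePii: true }, { professionalTone: true });
// Disabling requires an explicit false; professionalTone is not affected.
const afterSecondCall = mergeGuardrails(afterFirstCall, { disablePii: false });
```

Under this model, `afterFirstCall` has both flags enabled, and `afterSecondCall` keeps `professionalTone` enabled while `disablePii` is off.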
Configuring Guardrails in the Studio
To configure guardrails for an agent via the Squid Console:
- Navigate to the Agent Studio tab in the left sidebar
- Select the agent you want to configure
- Click on the Settings tab
- Scroll to the Agent Guardrails section
- Toggle any of the four preset guardrails on or off
- (Optional) Enter a custom guardrail policy in the text field
Configuring Guardrails in the Backend SDK
Update preset guardrails
Use updateGuardrails to enable or disable preset guardrails. Settings are merged with existing values:
// Enable PII protection and professional tone
await this.squid.ai().agent('AGENT_ID').updateGuardrails({
  disablePii: true,
  professionalTone: true,
});
To enable all four presets at once:
await this.squid.ai().agent('AGENT_ID').updateGuardrails({
  disableProfanity: true,
  professionalTone: true,
  disablePii: true,
  offTopicAnswers: true,
});
To disable a specific guardrail while keeping others:
// Disable profanity filter; other guardrails remain unchanged
await this.squid.ai().agent('AGENT_ID').updateGuardrails({
  disableProfanity: false,
});
Add a custom guardrail
Use updateCustomGuardrails to set a custom guardrail policy:
await this.squid.ai().agent('AGENT_ID').updateCustomGuardrails('Always cite sources when making factual claims.');
Calling this method again replaces the previous custom guardrail text.
Remove a custom guardrail
Use deleteCustomGuardrail to remove the custom guardrail policy:
await this.squid.ai().agent('AGENT_ID').deleteCustomGuardrail();
This only removes the custom guardrail. Preset guardrails are not affected.
Configuring Guardrails via the REST API
All API endpoints use the base URL for your Squid Cloud application. For more on constructing API URLs, see the API documentation.
Update preset guardrails
Send a POST request to update preset guardrails:
POST /squid-api/v1/ai/agent/updateGuardrails
{
  "agentId": "your-agent-id",
  "guardrails": {
    "disablePii": true,
    "professionalTone": true,
    "offTopicAnswers": true,
    "disableProfanity": true
  }
}
Update custom guardrail
POST /squid-api/v1/ai/agent/updateCustomGuardrails
{
  "agentId": "your-agent-id",
  "customGuardrail": "Always cite sources when making factual claims."
}
Delete custom guardrail
POST /squid-api/v1/ai/agent/deleteCustomGuardrail
{
  "agentId": "your-agent-id"
}
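For automation pipelines, it can help to build the request in one place. The sketch below assembles the updateGuardrails request from the endpoint path and body shown above. The helper name `buildUpdateGuardrailsRequest`, the placeholder base URL, and the `authHeaders` parameter are assumptions: this section does not specify the authentication header, so the caller supplies it.

```typescript
interface PresetGuardrails {
  disableProfanity?: boolean;
  disablePii?: boolean;
  offTopicAnswers?: boolean;
  professionalTone?: boolean;
}

interface GuardrailsRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the request without sending it.
function buildUpdateGuardrailsRequest(
  baseUrl: string,
  agentId: string,
  guardrails: PresetGuardrails,
  authHeaders: Record<string, string> = {}, // assumption: auth is passed as headers
): GuardrailsRequest {
  // Drop trailing slashes so the joined URL has exactly one separator.
  const trimmedBase = baseUrl.replace(/\/+$/, '');
  return {
    url: `${trimmedBase}/squid-api/v1/ai/agent/updateGuardrails`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json', ...authHeaders },
    body: JSON.stringify({ agentId, guardrails }),
  };
}

// Placeholder base URL; substitute the base URL of your Squid Cloud application.
const exampleRequest = buildUpdateGuardrailsRequest(
  'https://your-app-base-url.example/',
  'your-agent-id',
  { disablePii: true, professionalTone: true },
);
```

The returned object can be passed to any HTTP client, for example `fetch(exampleRequest.url, exampleRequest)`. Keeping the builder free of network calls makes it easy to unit test.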
Error Handling
Common errors
| Error | Cause | Solution |
|---|---|---|
| Agent not found | The specified agentId does not exist | Verify the agent ID in the Squid Console |
| Unauthorized | Invalid or missing API key | Ensure you are using a valid app or agent API key |
| Cannot perform operation on built-in agent | Attempted to modify the built-in agent's guardrails | Create a custom agent instead of modifying the built-in agent |
Behavior notes
- updateCustomGuardrails silently returns without changes if the provided custom guardrail string is empty.
- deleteCustomGuardrail silently returns without error if the agent does not exist or has no custom guardrail set.
- Preset guardrail values are simple booleans and require no content validation.
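Because an empty custom guardrail is silently ignored, you may prefer to fail loudly in your own code before calling the SDK. The following is a minimal sketch. `requireNonEmptyGuardrail` is a hypothetical helper, and rejecting whitespace-only text is a stricter choice than the behavior documented above.

```typescript
// Hypothetical helper: validates custom guardrail text before it is sent.
function requireNonEmptyGuardrail(text: string): string {
  const trimmed = text.trim();
  if (trimmed === '') {
    // Without this check, the update would return silently and change nothing.
    throw new Error('Custom guardrail text is empty; the update would be a silent no-op.');
  }
  return trimmed;
}

// Intended use inside a backend function, with the SDK call from this page:
//   const policy = requireNonEmptyGuardrail(userInput);
//   await this.squid.ai().agent('AGENT_ID').updateCustomGuardrails(policy);
```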
Best Practices
- Start with presets before writing custom rules. The four built-in guardrails cover the most common compliance needs. Only add custom guardrails for domain-specific policies.
- Be specific in custom guardrails. Vague instructions like "be safe" are less effective than explicit rules like "Do not reveal internal pricing formulas."
- Combine guardrails for layered protection. Enable disablePii alongside a custom guardrail that specifies additional data handling rules for your domain.
- Test guardrail behavior. After enabling guardrails, test the agent with prompts designed to trigger each policy to verify the constraints are working as expected.
- Remember merge semantics. Calling updateGuardrails merges with existing settings. To disable a guardrail, you must explicitly set it to false.
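When testing guardrail behavior, an automated check on the agent's replies can catch obvious regressions. The sketch below scans a reply for two PII patterns, a US Social Security number and an email address. The helper `findPii` and both patterns are assumptions made for illustration. They are deliberately simple and do not cover every PII type handled by disablePii.

```typescript
// Hypothetical, deliberately simple patterns for a smoke test.
const PII_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: 'ssn', pattern: /\b\d{3}-\d{2}-\d{4}\b/ },
  { label: 'email', pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/ },
];

// Returns the labels of every pattern found in the reply text.
function findPii(reply: string): string[] {
  return PII_PATTERNS
    .filter(({ pattern }) => pattern.test(reply))
    .map(({ label }) => label);
}

// Sample strings stand in for real agent replies.
const leakyReply = 'The SSN on file is 123-45-6789 and the email is jane@example.com.';
const safeReply = 'I am unable to share personal information.';
const leakyFindings = findPii(leakyReply); // ['ssn', 'email']
const safeFindings = findPii(safeReply); // []
```

In a real test, replace the sample strings with the text returned by your agent for a prompt designed to elicit PII, and fail the test if `findPii` returns any labels.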