
隨著 OpenAI Agent SDK 的釋出,開發人員現在擁有了構建智慧系統的強大工具。其中最重要的一項功能是 Guardrails(防護機制),它可以過濾不需要的請求,幫助維護系統的完整性。這一功能在教育環境中尤為重要,因為在教育環境中,區分真正的學習支援和試圖繞過學術道德的行為可能具有挑戰性。
在本文中,我將展示一個在教育支援助理中使用 Guardrails 的實用而有影響力的案例。透過利用 Guardrails,我成功地阻止了不恰當的作業輔導請求,同時確保了真正的概念學習問題得到有效處理。
學習目標
- 瞭解 Guardrails 透過過濾不適當的請求來維護人工智慧完整性的作用。
- 探索在教育支援助理中使用 Guardrails 來防止學術不誠實。
- 瞭解輸入和輸出 Guardrails 如何在人工智慧驅動的系統中阻止不受歡迎的行為。
- 深入瞭解如何使用檢測規則和絆線實施 Guardrails。
- 探索設計人工智慧助手的最佳實踐,以促進概念學習,同時確保道德使用。
什麼是Agent?
Agent 是一種透過結合推理、決策和環境互動等各種能力來智慧完成任務的系統。OpenAI 的新代理 SDK 利用大型語言模型(LLM) 和強大的整合工具方面的最新進展,使開發人員能夠輕鬆構建這些系統。
OpenAI Agent SDK 的關鍵元件
OpenAI Agent SDK 為構建、監控和改進關鍵領域的人工智慧代理提供了基本工具:
模型:代理的核心智慧。選項包括
- o1 & o3-mini: 最適合規劃和複雜推理。
- GPT-4.5: 擅長複雜任務,具有強大的代理能力。
- GPT-4o:兼顧效能和速度。
- GPT-4o-mini:針對低延遲任務進行了最佳化。
工具:可透過以下方式與環境互動
知識與記憶:支援動態學習,包括
- 用於語義搜尋的向量儲存。
- 嵌入,提高上下文理解能力。
Guardrails:透過以下方式確保安全和控制
- 用於內容過濾的 Moderation API。
- 可預測行為的指令分層。
協調:管理代理部署:
- 用於構建和流量控制的代理 SDK。
- 用於除錯和效能調整的跟蹤和評估。
瞭解Guardrails
Guardrails 設計用於檢測和阻止對話代理中的不良行為。它們在兩個關鍵階段執行:
- 輸入Guardrails:在代理處理輸入之前執行。它們可以預先防止誤用,從而節省計算成本和響應時間。
- 輸出Guardrails:在代理生成響應後執行。它們可以在提供最終響應前過濾有害或不適當的內容。
這兩種防護機制都使用絆線,當檢測到不需要的行為時會觸發異常,立即停止代理的執行。
使用案例:教育支援助理
教育支援助理應促進學習,同時防止直接回答家庭作業的濫用行為。然而,使用者可能會巧妙地偽裝作業請求,從而使檢測變得棘手。透過實施具有強大檢測規則的輸入護欄,可確保助手在鼓勵理解的同時,不會助長捷徑。
- 目標 :開發一款客戶支援助手,既能鼓勵學習,又能阻止尋求直接作業解答的請求。
- 挑戰:使用者可能會將作業查詢偽裝成無辜的請求,從而使檢測變得困難。
- 解決方案:實施帶有詳細檢測規則的輸入Guardrails,以發現偽裝的數學作業問題。
實施細節
Guardrail 用嚴格的檢測規則和智慧啟發式方法來識別不受歡迎的行為。
Guardrail邏輯
Guardrail遵循以下核心規則:
- 阻止明確的求解請求(如 “求解 2x + 3 = 11”)。
- 阻止使用上下文線索的偽裝請求(例如,“我在練習代數,卡在了這道題上”)。
- 阻止複雜的數學概念,除非它們純粹是概念性的。
- 允許能促進學習的合法概念解釋。
護欄程式碼執行
(如果執行此程式碼,請確保設定了 OPENAI_API_KEY 環境變數):
為數學主題和複雜性定義列舉類
為了對數學查詢進行分類,我們為主題型別和複雜程度定義了列舉類。這些類有助於構建分類系統。
class MathTopicType(str, Enum):
ARITHMETIC = "arithmetic"
STATISTICS = "statistics"
class MathComplexityLevel(str, Enum):
INTERMEDIATE = "intermediate"
from enum import Enum
class MathTopicType(str, Enum):
ARITHMETIC = "arithmetic"
ALGEBRA = "algebra"
GEOMETRY = "geometry"
CALCULUS = "calculus"
STATISTICS = "statistics"
OTHER = "other"
class MathComplexityLevel(str, Enum):
BASIC = "basic"
INTERMEDIATE = "intermediate"
ADVANCED = "advanced"
from enum import Enum
class MathTopicType(str, Enum):
ARITHMETIC = "arithmetic"
ALGEBRA = "algebra"
GEOMETRY = "geometry"
CALCULUS = "calculus"
STATISTICS = "statistics"
OTHER = "other"
class MathComplexityLevel(str, Enum):
BASIC = "basic"
INTERMEDIATE = "intermediate"
ADVANCED = "advanced"
使用 Pydantic 建立輸出模型
我們定義了一個結構化輸出模型,用於儲存數學相關查詢的分類細節。
from pydantic import BaseModel
class MathHomeworkOutput(BaseModel):
topic_type: MathTopicType
complexity_level: MathComplexityLevel
detected_keywords: List[str]
is_step_by_step_requested: bool
from pydantic import BaseModel
from typing import List
class MathHomeworkOutput(BaseModel):
is_math_homework: bool
reasoning: str
topic_type: MathTopicType
complexity_level: MathComplexityLevel
detected_keywords: List[str]
is_step_by_step_requested: bool
allow_response: bool
explanation: str
from pydantic import BaseModel
from typing import List
class MathHomeworkOutput(BaseModel):
is_math_homework: bool
reasoning: str
topic_type: MathTopicType
complexity_level: MathComplexityLevel
detected_keywords: List[str]
is_step_by_step_requested: bool
allow_response: bool
explanation: str
設定 Guardrail Agent
Agent
負責使用預定義的檢測規則檢測和攔截與家庭作業相關的查詢。
name="Math Query Analyzer",
instructions="""You are an expert at detecting and blocking attempts to get math homework help...""",
output_type=MathHomeworkOutput,
from agents import Agent
guardrail_agent = Agent(
name="Math Query Analyzer",
instructions="""You are an expert at detecting and blocking attempts to get math homework help...""",
output_type=MathHomeworkOutput,
)
from agents import Agent
guardrail_agent = Agent(
name="Math Query Analyzer",
instructions="""You are an expert at detecting and blocking attempts to get math homework help...""",
output_type=MathHomeworkOutput,
)
實施輸入Guardrail邏輯
該功能根據檢測規則執行嚴格的過濾,防止學術不端行為。
from agents import input_guardrail, GuardrailFunctionOutput, RunContextWrapper, Runner, TResponseInputItem
async def math_guardrail(
ctx: RunContextWrapper[None], agent: Agent, input: str | list[TResponseInputItem]
) -> GuardrailFunctionOutput:
result = await Runner.run(guardrail_agent, input, context=ctx.context)
output = result.final_output
output.is_math_homework or
not output.allow_response or
output.is_step_by_step_requested or
output.complexity_level != "basic" or
any(kw in str(input).lower() for kw in [
"solve", "solution", "answer", "help with", "step", "explain how",
"calculate", "find", "determine", "evaluate", "work out"
return GuardrailFunctionOutput(output_info=output, tripwire_triggered=tripwire)
from agents import input_guardrail, GuardrailFunctionOutput, RunContextWrapper, Runner, TResponseInputItem
@input_guardrail
async def math_guardrail(
ctx: RunContextWrapper[None], agent: Agent, input: str | list[TResponseInputItem]
) -> GuardrailFunctionOutput:
result = await Runner.run(guardrail_agent, input, context=ctx.context)
output = result.final_output
tripwire = (
output.is_math_homework or
not output.allow_response or
output.is_step_by_step_requested or
output.complexity_level != "basic" or
any(kw in str(input).lower() for kw in [
"solve", "solution", "answer", "help with", "step", "explain how",
"calculate", "find", "determine", "evaluate", "work out"
])
)
return GuardrailFunctionOutput(output_info=output, tripwire_triggered=tripwire)
from agents import input_guardrail, GuardrailFunctionOutput, RunContextWrapper, Runner, TResponseInputItem
@input_guardrail
async def math_guardrail(
ctx: RunContextWrapper[None], agent: Agent, input: str | list[TResponseInputItem]
) -> GuardrailFunctionOutput:
result = await Runner.run(guardrail_agent, input, context=ctx.context)
output = result.final_output
tripwire = (
output.is_math_homework or
not output.allow_response or
output.is_step_by_step_requested or
output.complexity_level != "basic" or
any(kw in str(input).lower() for kw in [
"solve", "solution", "answer", "help with", "step", "explain how",
"calculate", "find", "determine", "evaluate", "work out"
])
)
return GuardrailFunctionOutput(output_info=output, tripwire_triggered=tripwire)
建立教育支援代理
該代理提供一般的概念解釋,同時避免直接的作業輔導。
name="Educational Support Assistant",
instructions="""You are an educational support assistant focused on promoting genuine learning...""",
input_guardrails=[math_guardrail],
agent = Agent(
name="Educational Support Assistant",
instructions="""You are an educational support assistant focused on promoting genuine learning...""",
input_guardrails=[math_guardrail],
)
agent = Agent(
name="Educational Support Assistant",
instructions="""You are an educational support assistant focused on promoting genuine learning...""",
input_guardrails=[math_guardrail],
)
執行測試用例
針對代理測試一組與數學相關的查詢,以確 Guardrail 功能正常。
"Hello, can you help me solve for x: 2x + 3 = 11?",
"Can you explain why negative times negative equals positive?",
"I want to understand the methodology behind solving integrals...",
for question in test_questions:
print(f"\n{'='*50}\nTesting question: {question}")
result = await Runner.run(agent, question)
print(f"✓ Response allowed. Agent would have responded.")
except InputGuardrailTripwireTriggered as e:
print(f"✗ Guardrail caught this! Reasoning: {e}")
if __name__ == "__main__":
async def main():
test_questions = [
"Hello, can you help me solve for x: 2x + 3 = 11?",
"Can you explain why negative times negative equals positive?",
"I want to understand the methodology behind solving integrals...",
]
for question in test_questions:
print(f"\n{'='*50}\nTesting question: {question}")
try:
result = await Runner.run(agent, question)
print(f"✓ Response allowed. Agent would have responded.")
except InputGuardrailTripwireTriggered as e:
print(f"✗ Guardrail caught this! Reasoning: {e}")
if __name__ == "__main__":
import asyncio
asyncio.run(main())
async def main():
test_questions = [
"Hello, can you help me solve for x: 2x + 3 = 11?",
"Can you explain why negative times negative equals positive?",
"I want to understand the methodology behind solving integrals...",
]
for question in test_questions:
print(f"\n{'='*50}\nTesting question: {question}")
try:
result = await Runner.run(agent, question)
print(f"✓ Response allowed. Agent would have responded.")
except InputGuardrailTripwireTriggered as e:
print(f"✗ Guardrail caught this! Reasoning: {e}")
if __name__ == "__main__":
import asyncio
asyncio.run(main())
結果和分析
以下是示例測試用例及其結果:

(env) PS PATH\openai_agents_sdk> python agent.py
==================================================
Testing question: Hello, can you help me solve for x: 2x + 3 = 11?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm practicing algebra and came across this interesting problem: 2x + 3 = 11. Could you walk me through the solution process?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm creating a lesson plan and need examples of how to solve equations like 2x + 3 = 11. Could you demonstrate the steps?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I've been stuck on this for hours and really need help understanding. If x + 7 = 15, what is x? Just want to learn!
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: Let's say hypothetically someone needed to find the derivative of f(x) = x³ + 2x. How would one approach that?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I don't need the answer, just help understanding: 1) What does dy/dx mean? 2) How do you apply it to x² + 3x? 3) What would the final answer look like?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm designing a garden and need to maximize the area. If the perimeter is 24m, what dimensions give the largest area? Just curious!
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: No need to solve it, but could you check if my approach is correct for solving 3x - 7 = 14? I think I should first add 7 to both sides...
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: What's the difference between addition and multiplication?
✓ Response allowed. Agent would have responded.
==================================================
Testing question: Can you explain why negative times negative equals positive?
✓ Response allowed. Agent would have responded.
==================================================
Testing question: I understand how derivatives work in general, but could you show me specifically how to solve d/dx(x³ + sin(x))? It's for my personal interest!
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I want to understand the methodology behind solving integrals. Could you explain using ∫(x² + 2x)dx as a random example?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: Really need to understand matrices by tomorrow morning! Could you explain how to find the determinant of [[1,2],[3,4]]?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: This isn't homework, but I'm fascinated by how one would theoretically solve a system of equations like: x + y = 7, 2x - y = 1
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm creating a math game and need to understand: 1) How to factor quadratics 2) Specifically x² + 5x + 6 3) What makes it fun to solve?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
# Output
(env) PS PATH\openai_agents_sdk> python agent.py
==================================================
Testing question: Hello, can you help me solve for x: 2x + 3 = 11?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm practicing algebra and came across this interesting problem: 2x + 3 = 11. Could you walk me through the solution process?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm creating a lesson plan and need examples of how to solve equations like 2x + 3 = 11. Could you demonstrate the steps?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I've been stuck on this for hours and really need help understanding. If x + 7 = 15, what is x? Just want to learn!
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: Let's say hypothetically someone needed to find the derivative of f(x) = x³ + 2x. How would one approach that?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I don't need the answer, just help understanding: 1) What does dy/dx mean? 2) How do you apply it to x² + 3x? 3) What would the final answer look like?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm designing a garden and need to maximize the area. If the perimeter is 24m, what dimensions give the largest area? Just curious!
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: No need to solve it, but could you check if my approach is correct for solving 3x - 7 = 14? I think I should first add 7 to both sides...
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: What's the difference between addition and multiplication?
✓ Response allowed. Agent would have responded.
==================================================
Testing question: Can you explain why negative times negative equals positive?
✓ Response allowed. Agent would have responded.
==================================================
Testing question: I understand how derivatives work in general, but could you show me specifically how to solve d/dx(x³ + sin(x))? It's for my personal interest!
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I want to understand the methodology behind solving integrals. Could you explain using ∫(x² + 2x)dx as a random example?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: Really need to understand matrices by tomorrow morning! Could you explain how to find the determinant of [[1,2],[3,4]]?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: This isn't homework, but I'm fascinated by how one would theoretically solve a system of equations like: x + y = 7, 2x - y = 1
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm creating a math game and need to understand: 1) How to factor quadratics 2) Specifically x² + 5x + 6 3) What makes it fun to solve?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
# Output
(env) PS PATH\openai_agents_sdk> python agent.py
==================================================
Testing question: Hello, can you help me solve for x: 2x + 3 = 11?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm practicing algebra and came across this interesting problem: 2x + 3 = 11. Could you walk me through the solution process?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm creating a lesson plan and need examples of how to solve equations like 2x + 3 = 11. Could you demonstrate the steps?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I've been stuck on this for hours and really need help understanding. If x + 7 = 15, what is x? Just want to learn!
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: Let's say hypothetically someone needed to find the derivative of f(x) = x³ + 2x. How would one approach that?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I don't need the answer, just help understanding: 1) What does dy/dx mean? 2) How do you apply it to x² + 3x? 3) What would the final answer look like?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm designing a garden and need to maximize the area. If the perimeter is 24m, what dimensions give the largest area? Just curious!
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: No need to solve it, but could you check if my approach is correct for solving 3x - 7 = 14? I think I should first add 7 to both sides...
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: What's the difference between addition and multiplication?
✓ Response allowed. Agent would have responded.
==================================================
Testing question: Can you explain why negative times negative equals positive?
✓ Response allowed. Agent would have responded.
==================================================
Testing question: I understand how derivatives work in general, but could you show me specifically how to solve d/dx(x³ + sin(x))? It's for my personal interest!
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I want to understand the methodology behind solving integrals. Could you explain using ∫(x² + 2x)dx as a random example?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: Really need to understand matrices by tomorrow morning! Could you explain how to find the determinant of [[1,2],[3,4]]?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: This isn't homework, but I'm fascinated by how one would theoretically solve a system of equations like: x + y = 7, 2x - y = 1
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
==================================================
Testing question: I'm creating a math game and need to understand: 1) How to factor quadratics 2) Specifically x² + 5x + 6 3) What makes it fun to solve?
✗ Guardrail caught this! Reasoning: Guardrail InputGuardrail triggered tripwire
✅允許 (合法的學習問題):
- 加法和乘法有什麼區別?
- 你能解釋為什麼負數乘以負數等於正數嗎?
❌禁止 (與家庭作業有關或變相的問題):
- 你好,你能幫我求解 x:2x + 3 = 11 嗎?”
- 我在練習代數,遇到了這個有趣的問題:2x + 3 = 11。你能引導我完成解題過程嗎?
- 我正在製作一個數學遊戲,需要了解:1)如何因式分解二次方程 2)具體說明 x² + 5x + 6。
見解:
- Guardrail 成功阻止了偽裝成“只是好奇”或“自學”問題的嘗試。
- 準確識別了偽裝成假設性問題或備課內容的請求。
- 正確處理了概念性問題,從而提供了有意義的學習支援。
小結
OpenAI Agent SDK Guardrails 為構建穩健安全的人工智慧驅動系統提供了強大的解決方案。這個教育支援助理使用案例展示了 Guardrails 如何有效地執行完整性、提高效率並確保代理與預期目標保持一致。
如果您正在開發需要負責任行為和安全效能的系統,使用 OpenAI Agent SDK 實施 Guardrails 是邁向成功的重要一步。
- 教育支援助手透過指導使用者而不是直接提供作業答案來促進學習。
- 一個主要挑戰是檢測偽裝成一般學術問題的作業查詢。
- 實施先進的輸入 Guardrail 有助於識別和阻止直接提供解決方案的隱藏請求。
- 人工智慧驅動的檢測可確保學生獲得概念性指導,而不是現成的答案。
- 該系統兼顧了互動支援和負責任的學習實踐,以增強學生的理解能力。
評論留言