Building a Multi-Agent System That Automatically Detects Coding Errors from Screenshots

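Can AI detect and fix coding errors just by analyzing a screenshot? With a multi-agent system, it can. The approach combines AI and reasoning to identify coding errors in an image, propose accurate solutions, and explain the logic behind them. At its core is a decentralized multi-agent system in which autonomous agents (such as AI models, tools, or services) work together. Each agent gathers data, makes localized decisions, and contributes to solving a complex debugging task. Automating this process lets developers save time, improve accuracy, and skip the manual online search for solutions.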

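  • Understand what a multi-agent system with reasoning is and how it automatically detects errors and generates solutions from screenshots.
  • Explore the role AI plays in making software debugging more efficient within a multi-agent system with reasoning.
  • See how Griptape simplifies multi-agent system development through modular workflows.
  • Implement a multi-agent system that uses AI models to detect coding errors in screenshots.
  • Use vision language models and reasoning-based LLMs to detect and explain errors automatically.
  • Build and deploy AI agents that specialize in web search, reasoning, and image analysis.
  • Develop structured workflows to extract, analyze, and resolve coding errors efficiently.
  • Optimize multi-agent system implementations for security, scalability, and reliability.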

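Introduction to Multi-Agent Systems

A multi-agent system (MAS) is a framework made up of many interacting intelligent agents, each with its own skills and goals. These agents can take many forms, including software applications, robots, drones, sensors, and even humans, or a mix of these. The main purpose of a MAS is to tackle problems that a single agent would struggle to handle alone, by drawing on collective intelligence, collaboration, and coordinated effort among agents.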

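Key Characteristics of Multi-Agent Systems

  • Autonomy: each agent operates with a degree of self-governance and makes choices based on its local view of the environment.
  • Decentralization: control is distributed across the agents, so the system keeps running even if some parts fail.
  • Self-organization: agents can adapt and arrange themselves through emergent behavior, which allows effective task allocation and conflict management.
  • Real-time operation: a MAS can react quickly to changing conditions without human supervision, which makes it a good fit for scenarios such as emergency response and traffic control.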

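Examples of Multi-Agent Systems

Multi-agent systems are changing many industries by enabling intelligent collaboration among autonomous agents. The following examples show how they are applied in practice.

  • Agents that resolve queries dynamically: a multi-agent system designed to resolve customer inquiries efficiently. It first draws on its knowledge base and, when necessary, retrieves relevant information from integrated tools to give an accurate answer.
  • Dynamic ticket assignment: a multi-agent system that automatically routes incoming support tickets to the most suitable agent, streamlining the ticket management workflow of a customer support team. It uses generative AI to evaluate each ticket against predefined criteria such as category, severity, and agent specialization.
  • Agents that analyze knowledge base gaps: a specialized multi-agent system that improves a knowledge base by pinpointing recurring support issues that existing articles do not cover well. Using generative AI, the agent examines trends in support tickets and customer inquiries to identify areas that need improvement.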
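
To make the ticket assignment example more concrete, here is a minimal rule-based sketch. It is a stand-in for illustration only: the system described above uses generative AI to evaluate tickets, whereas this sketch uses fixed rules, and the class names, the severity scale, and the least-loaded tie-breaker are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class SupportAgent:
    name: str
    specialties: set[str]
    open_tickets: int = 0


@dataclass
class Ticket:
    ticket_id: int
    category: str
    severity: int  # assumed scale: 1 = low, 3 = critical


def assign_ticket(ticket: Ticket, agents: list[SupportAgent]) -> SupportAgent:
    """Pick the least-loaded agent whose specialties cover the ticket category.

    Falls back to the least-loaded agent overall when nobody specializes.
    """
    specialists = [a for a in agents if ticket.category in a.specialties]
    pool = specialists or agents
    chosen = min(pool, key=lambda a: a.open_tickets)
    chosen.open_tickets += 1
    return chosen


agents = [
    SupportAgent("billing-bot", {"billing"}),
    SupportAgent("tech-bot", {"login", "api"}),
]
queue = [Ticket(1, "api", 3), Ticket(2, "billing", 1), Ticket(3, "shipping", 2)]

# Handle the most severe tickets first.
for ticket in sorted(queue, key=lambda t: t.severity, reverse=True):
    print(ticket.ticket_id, "->", assign_ticket(ticket, agents).name)
```

In an LLM-backed version, the category and severity fields would plausibly come from a model's classification of the ticket text rather than being supplied by hand.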

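Building Multi-Agent Systems with Griptape

The Griptape framework simplifies the development of collaborative AI agents, balancing predictability and creativity through modular design and secure workflows.

Agent Specialization and Coordination

Griptape lets developers define agents with distinct roles, for example:

  • Research agents: gather data with tools such as web search and web scraping
  • Writing agents: turn insights into narratives tailored to a specific audience
  • Analysis agents: validate outputs against predefined schemas or business rules

Agents interact through workflows, which run tasks in parallel while respecting their dependencies. For example, the findings of one research agent can trigger several writing agents to generate content at the same time.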

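Workflow Design

The framework supports two approaches:

  • Sequential pipelines: for linear task execution (for example, data ingestion → analysis → reporting)
  • DAG-based workflows: for complex branching logic, where agents can adapt dynamically based on intermediate outputs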
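
The difference between the two approaches can be sketched with Python's standard `graphlib` module. This is not Griptape code; it only illustrates how a dependency graph determines which tasks may run together. The task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on (hypothetical task names).
dependencies = {
    "research": set(),
    "write_blog_post": {"research"},
    "write_newsletter": {"research"},
    "review": {"write_blog_post", "write_newsletter"},
}


def execution_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; tasks within one batch could run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all tasks whose parents are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


for step, batch in enumerate(execution_batches(dependencies), start=1):
    print(step, batch)
```

A sequential pipeline is the special case in which every batch holds exactly one task, while the DAG above lets the two writing tasks share a batch once the research task has finished.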

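Security and Scalability

Key safeguards include:

  • Off-prompt data handling: minimizes the exposure of sensitive information during LLM interactions
  • Permission controls: restrict tool usage according to agent role
  • Cloud integration: deploy agents independently through services such as Griptape Cloud to scale horizontally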
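
Role-based permission control can be pictured as a per-role allowlist that is checked before an agent receives its tools. The sketch below is plain Python and does not reflect Griptape's own permission mechanism; the role names, tool names, and the `authorize_tools` helper are hypothetical.

```python
# Hypothetical mapping of agent roles to the tools they may use.
ROLE_TOOL_ALLOWLIST = {
    "researcher": {"web_search", "web_scraper"},
    "writer": {"prompt_summary"},
    "analyst": {"file_manager"},
}


class ToolNotPermittedError(PermissionError):
    """Raised when a role requests a tool outside its allowlist."""


def authorize_tools(role: str, requested: list[str]) -> list[str]:
    """Return the requested tools if the role may use all of them."""
    allowed = ROLE_TOOL_ALLOWLIST.get(role, set())  # unknown roles get nothing
    denied = [tool for tool in requested if tool not in allowed]
    if denied:
        raise ToolNotPermittedError(
            f"role {role!r} may not use: {', '.join(denied)}"
        )
    return requested


print(authorize_tools("researcher", ["web_search"]))
try:
    authorize_tools("writer", ["web_search"])
except ToolNotPermittedError as error:
    print("denied:", error)
```

Denying by default for unknown roles keeps a misconfigured agent from silently gaining access to every tool.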

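Implementation Best Practices

  • Use rulesets to enforce agent behavior (for example, formatting standards or ethical guidelines)
  • Make good use of memory types: short-term memory for creative tasks, long-term memory for structured processes
  • Test workflows locally before deploying them to a distributed environment

Griptape's modular architecture reduces reliance on prompt engineering by favoring Python code for logic definition, which makes it well suited to enterprise-grade applications such as customer support automation and real-time data analysis pipelines.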

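Hands-On Implementation: A Multi-Agent System for Resolving Coding Errors

In this tutorial we build a multi-agent system that automatically detects errors in screenshots of code, using Python as the example language. The system not only identifies the error but also gives the user a clear explanation of how to fix it. Throughout, we combine a vision language model with a reasoning-based large language model to extend what the multi-agent framework can do.

Step 1: Install and Import the Required Libraries

First, install the required libraries and import the modules used in the rest of the tutorial: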

# Install the dependencies (notebook shell commands)
!pip install griptape
!sudo apt update
!sudo apt install -y pciutils
!pip install langchain-ollama
!curl -fsSL https://ollama.com/install.sh | sh
!pip install ollama==0.4.2
!pip install "duckduckgo-search>=7.0.1"

import os

import requests
from griptape.drivers.file_manager.local import LocalFileManagerDriver
from griptape.drivers.prompt.ollama import OllamaPromptDriver
from griptape.drivers.prompt.openai import OpenAiChatPromptDriver
from griptape.drivers.structure_run.local import LocalStructureRunDriver
from griptape.drivers.web_search.duck_duck_go import DuckDuckGoWebSearchDriver
from griptape.loaders import ImageLoader
from griptape.structures import Agent, Workflow
from griptape.tasks import PromptTask, StructureRunTask
from griptape.tools import FileManagerTool, ImageQueryTool, PromptSummaryTool, WebSearchTool
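
After installation it can be useful to confirm which package versions actually ended up in the environment. The following is a small sketch using only the standard library; the `installed_version` helper is a hypothetical name, and the package list simply mirrors the `pip install` commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for package in ("griptape", "ollama", "duckduckgo-search"):
    print(f"{package}: {installed_version(package) or 'not installed'}")
```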

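Step 2: Run the Ollama Server and Pull the Models

The following code starts the Ollama server. We also pull the "minicpm-v" model from Ollama so that this vision model can be used to extract text from the screenshots.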

import subprocess
import threading
import time

def run_ollama_serve():
    # Launch the Ollama server as a background process
    subprocess.Popen(["ollama", "serve"])

thread = threading.Thread(target=run_ollama_serve)
thread.start()
time.sleep(5)  # Give the server a few seconds to start

!ollama pull minicpm-v
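
A fixed five-second sleep may be too short on a slow machine. As an alternative sketch, the server can be polled until it answers. This assumes Ollama is listening on its default address, http://localhost:11434; the helper names `http_probe` and `wait_until_ready` are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_probe(url: str) -> bool:
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url="http://localhost:11434", timeout=30.0, interval=0.5,
                     probe=http_probe) -> bool:
    """Poll until probe(url) succeeds or the timeout (in seconds) elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe(url):
            return True
        time.sleep(interval)
    return False


# Possible usage in place of time.sleep(5):
# if not wait_until_ready():
#     raise RuntimeError("Ollama server did not start in time")
```

The `probe` parameter is injectable so that the polling logic can be exercised without a running server.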

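Next, set your OpenAI API key. The agents below that do not specify a prompt driver use Griptape's default driver, which is OpenAI-based, so the key is needed alongside the Ollama models.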

import os
os.environ["OPENAI_API_KEY"] = ""

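We also use a capable LLM to support the reasoning step, in which the coding error and the proposed solution are explained with sufficient context. For this we pull the phi4-mini model.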

!ollama pull phi4-mini

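Step 3: Create the Agent That Analyzes Screenshots

We start by creating an agent that analyzes screenshots of Python coding errors. Behind the scenes, this agent uses the vision language model (minicpm-v).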

images_dir = os.getcwd()

def analyze_screenshots():
    # File access is scoped to the directory that holds the screenshots
    driver = LocalFileManagerDriver(workdir=images_dir)
    return Agent(
        tools=[
            FileManagerTool(file_manager_driver=driver),
            # Vision language model that answers queries about the image
            ImageQueryTool(
                prompt_driver=OllamaPromptDriver(model="minicpm-v"),
                image_loader=ImageLoader(file_manager_driver=driver),
            ),
        ]
    )

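Step 4: Create the Web Search and Reasoning Agents

Next, we create two more agents: one that searches the web for possible solutions to the coding error, and one that reasons about the error and its solution.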

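def websearching_agent():
    # Searches the web with DuckDuckGo and summarizes what it finds
    return Agent(
        tools=[
            WebSearchTool(web_search_driver=DuckDuckGoWebSearchDriver()),
            PromptSummaryTool(off_prompt=False),
        ],
    )

def reasoning_agent():
    # Uses the phi4-mini model served by Ollama for the reasoning step
    return Agent(
        prompt_driver=OllamaPromptDriver(
            model="phi4-mini",
        )
    )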

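Step 5: Define the Tasks for Analyzing the Screenshot, Finding a Solution, and Providing Reasoning

We use a screenshot of a code snippet (shown in Step 6) for the automated evaluation. It is saved in the current working directory as "pythonerror1.jpg". The agentic system first extracts the error from the screenshot and finds possible solutions. It then provides thorough reasoning about the error and its solution.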

image_file_name = "pythonerror1.jpg"

# Task 1: extract the text of the screenshot with the vision agent
screenshotanalysis_task = StructureRunTask(
    (
        """Extract IN TEXT FORMAT ALL THE LINES FROM THE GIVEN SCREEN SHOT %s""" % (image_file_name),
    ),
    id="research",
    structure_run_driver=LocalStructureRunDriver(
        create_structure=analyze_screenshots,
    ),
)

# Task 2: search the web for a solution to the errors found by task 1
findingsolution_task = StructureRunTask(
    (
        """FIND SOLUTION TO ONLY THE CODING ERRORS FOUND in the TEXT {{ parent_outputs["research"] }}. DO NOT INCLUDE ANY ADDITIONAL JUNK NON CODING LINES WHILE FINDING THE SOLUTION.
        """,
    ),
    id="evaluate",
    structure_run_driver=LocalStructureRunDriver(
        create_structure=websearching_agent,
    ),
)

# Task 3: expand on the solution from task 2 with detailed reasoning
reasoningsolution_task = StructureRunTask(
    (
        """ADD TO THE PREVIOUS OUTPUT, EXPANDED VERSION OF REASONING ON HOW TO SOLVE THE ERROR BASED ON {{ parent_outputs["evaluate"] }}.
        DO INCLUDE THE WHOLE OUTPUT FROM THE PREVIOUS AGENT {{ parent_outputs["evaluate"] }} AS WELL IN THE FINAL OUTPUT.
        """,
    ),
    structure_run_driver=LocalStructureRunDriver(
        create_structure=reasoning_agent,
    ),
)
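
Because the first task refers to the screenshot only by file name, a misspelled or missing file would surface late, inside the agent run. As a hedged sketch, the file can be checked up front. The `resolve_screenshot` helper and the list of accepted extensions are assumptions, not part of Griptape.

```python
from pathlib import Path

# Assumption: the image formats we expect screenshots to use.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def resolve_screenshot(file_name: str, directory=None) -> Path:
    """Return the full path of the screenshot, or raise if it is unusable."""
    base = Path(directory) if directory is not None else Path.cwd()
    path = base / file_name
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported image type: {path.suffix or '(none)'}")
    if not path.is_file():
        raise FileNotFoundError(f"screenshot not found: {path}")
    return path


try:
    print("Using screenshot:", resolve_screenshot("pythonerror1.jpg"))
except (ValueError, FileNotFoundError) as error:
    print("Cannot run the workflow:", error)
```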

Step 6: Run the Workflow

Now we wire the tasks together and run the workflow.

# Wire the dependencies: screenshot analysis -> solution search -> reasoning
screenshotanalysis_task.add_child(findingsolution_task)
findingsolution_task.add_child(reasoningsolution_task)
screenshotanalysis_task.add_child(reasoningsolution_task)

team = Workflow(
    tasks=[screenshotanalysis_task, findingsolution_task, reasoningsolution_task],
)

answer = team.run()
print(answer.output)

Input Screenshot

Screenshot of the input code snippet

Output from the Agentic System

Certainly! Here is an expanded explanation of how you can solve this error in Python:

When working with strings and integers together, it's important that both elements are either numbers (integers or floats) for numerical operations like addition. In your case, you're trying to concatenate a string ("hello world")

The error occurs because Python does not allow direct concatenation of strings and integers without explicitly handling them as separate types first (i.e., by conversion). The solution is straightforward: convert both elements to com

Here's an expanded explanation along with your corrected code:

```python
try:
    # Initialize variable 'a' as 1234 (an integer)
    a = 1234
    # Convert 'a' from int to str and then concatenate " hello world"
    print(str(a) + "hello world")
except Exception as error:  # Catch any exceptions that might occur
    print("Oops! An exception has occured: ", error)
    # Print the type of the caught exception for debugging purposes.
    print("Exception TYPE:", type(error))
    # Explicitly stating what class TypeError is expected in this context,
    # though it's redundant since we've already captured and printed it above.
    print("Exception TYPE: <class 'TypeError'>")
```

In summary, converting an integer to a string before concatenation solves the issue by ensuring both elements are strings. This allows for seamless addition (concatenation) of these two pieces into one coherent output.

Remember that this approach is not limited just to adding integers and strings; it's applicable whenever you need to concatenate different data types in Python, provided they can be converted or handled as compatible formats first.

As the output shows, the error is explained correctly and with thorough reasoning, and the proposed solution is justified as well.
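
The error class discussed in this output can be reproduced in a few lines. The screenshot's exact code is not shown here, so the snippet below is an assumed reconstruction based on the values mentioned in the output (`a = 1234` and the string "hello world").

```python
a = 1234

# Adding an int to a str is not defined, so Python raises TypeError.
try:
    print(a + "hello world")
except TypeError as error:
    print("TypeError:", error)

# Two common fixes: convert explicitly, or let an f-string do the formatting.
fixed_concat = str(a) + "hello world"
fixed_fstring = f"{a}hello world"
print(fixed_concat)
```

Both fixes produce the same string; the f-string form tends to read better once more than one value is involved.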

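Analyzing Other Queries

Let's try a different Python coding problem with another query screenshot.

Screenshot of the Python coding problem

Output from the Agentic System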

Certainly! Here is an expanded explanation of how we can solve any potential issues with this corrected version:

1. **Syntax Correction**: The original error was due to improper indentation, which caused a `IndentationError`. In Python, proper block structure and consistent indentations are crucial for the interpreter.
2. **Handling ZeroDivisionError Properly**:
   - A division by zero is an exceptional case that needs special handling.
   - Using try-except blocks allows us gracefully handle this exception without crashing our program unexpectedly (i.e., `ZeroDivisionError`).
3. **Logging and Debugging Information**:
   - The use of the logging module (`log.debug`) helps in tracking what values are being processed, which is useful for debugging.

Here's a complete example with proper indentation:

```python
import logging

# Configure logger to display debug level messages.
logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)

def division(a, b):
    log.debug(f"Dividing {a} by {b}.")
    try:
        return a / b  # Attempt the division operation here
    except ZeroDivisionError as e:  # Catch and handle zero-division error specifically.
        log.exception("Oh noes!")  # Log an exception message with traceback

# Example usage of function to demonstrate logging output
if __name__ == "__main__":
    print(division(10, 2))  # Should work fine
    try:
        division(5, 0)  # This should raise a ZeroDivisionError and log it.
    except Exception as e:  # Catch any unexpected exceptions (though unlikely here)
        pass
```

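As the output shows, the error is explained correctly and in sufficient detail. The system also offers an alternative way of writing the code that avoids the error, along with the reasoning behind it.

Let's try one more query screenshot with a different Python coding problem.

Query screenshot of a different Python coding problem

Output from the Agentic System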

Certainly! Here's an expanded version of how to solve this error based on the previous output:

1. **Identify Missing Dependency:** The error message indicates that there is a missing dependency, specifically for the `packaging` module which setuptools requires.

2. **Install 'packaging' Package Manually (if not already installed):**

```bash
pip install packaging
```

3. **Upgrade Setuptools to Ensure Compatibility:** Upgrading setuptools can help resolve any compatibility issues and ensure that all dependencies are correctly managed:

```bash
pip install --upgrade setuptools
```

4. **Re-run the Installation Command:** After installing `packaging` manually (if it wasn't installed previously) or upgrading setuptools, re-execute your original installation command to see if this resolves any issues.

5. **Verify Dependency Resolution and Reinstallation Attempts:** If you encounter further errors related specifically to missing dependencies after following the above steps:
   - Check for additional required packages by reviewing error messages.
   - Install those specific requirements using pip, e.g., `pip install <missing-package-name>`.

6. **Check Environment Consistency:** Ensure that your Python environment is consistent and not conflicting with other installations or virtual environments:

```bash
# List installed packages to verify consistency across different setups (if applicable)
pip list
# If using a specific version of setuptools, ensure it's correctly configured:
```

7. **Consult Documentation:** Refer to the official documentation for both `packaging` and `setuptools` if you encounter persistent issues or need more detailed guidance on resolving complex dependency problems.

8. **Seek Community Help (if needed):** If after following these steps, you're still facing difficulties:
   - Post a question with specific error messages in relevant forums like Stack Overflow.
   - Provide details about your environment setup and the commands you've run for better assistance from community members or experts.

By carefully addressing each step above based on what you encounter during installation attempts (as indicated by any new errors), you'll be able to resolve missing dependencies effectively. This systematic approach ensures that all required packages are correctly installed
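
Before rerunning an installation, it can help to check from Python whether the modules named in such an error are importable at all. The sketch below uses the standard library's `importlib.util.find_spec`; the `missing_modules` helper is a hypothetical name, and the module list is taken from the output above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(["packaging", "setuptools"])
if missing:
    print("Install these first:", ", ".join(missing))
else:
    print("All required modules are importable.")
```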

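Summary

Using a multi-agent system (MAS) to detect coding errors automatically from screenshots can noticeably improve developer productivity. By combining AI with tools such as Griptape, the approach delivers timely, accurate solutions together with detailed reasoning, which saves developers valuable time. The flexibility and scalability of MAS also make the pattern applicable across industries, supporting smoother task management and higher productivity.

  • An automated system that identifies coding errors from screenshots and returns accurate solutions with detailed reasoning reduces the need for manual error searches and saves developers considerable time.
  • A MAS is a decentralized architecture in which autonomous agents collaborate to solve complex problems, improving task management and scalability across industries.
  • The Griptape framework simplifies multi-agent system development with modular design, agent specialization, secure workflows, and scalability, which makes it a good fit for enterprise-grade AI solutions.
  • A MAS adapts dynamically to changing conditions, which suits real-time applications such as coding error detection, customer support automation, and data analysis.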
