AI Agent: Working Principles and Architecture

qq_40126397 · Published 2026-02-25 09:38:13

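How does an AI Agent work, and what does its architecture look like?

A front-end developer's look at the core mechanisms and design patterns behind AI Agents

Introduction

While digging into AI Agents recently, I noticed, as a front-end developer, that the concept has a lot in common with the front-end architectures we already know. When I first ran into it, my head was full of questions: What exactly is an Agent? How does it differ from an ordinary AI model call? Why is everyone talking about Agents now?

After some time studying and experimenting, and after deploying several Agent applications on the 英博云 platform, I finally have a clear picture of how Agents work and how they are structured. In this article I want to explain how an AI Agent works from a front-end developer's point of view, using concepts we already know.

What will this article help you with?

Understand the core working principles of an AI Agent
Learn the basic architecture of an Agent
Build a simple Agent from scratch
Know the key points of deploying an Agent in practice

Background: what exactly is an AI Agent?

From "passive calls" to "active execution"

Before Agents, this is how we used AI models: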

// The traditional way to call an AI model
const response = await fetch('https://api.openai.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'What is the weather like today?' }]
  })
});

// Typical reply: "I can't look up the weather; I can only answer from my training data..."

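This is like calling a pure function on the front end: you give it input, it gives you output, and that is all. It never takes action on its own.

AI Agents turn AI from "passively answering" into "actively executing":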

// The AI Agent way
const agent = new AIAgent({
  tools: [weatherTool, searchTool, calculatorTool]
});

const result = await agent.execute('What is the weather like today?');

// The Agent will automatically:
// 1. Understand your request
// 2. Decide to call weatherTool
// 3. Fetch the result
// 4. Return a formatted answer: "Beijing is sunny today, 25°C"

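What is this like? If a traditional AI call is a pure function, an Agent is a complete application with state management and side-effect handling.

Core principles: how does an Agent work?

Basic workflow

The core workflow of an AI Agent can be summarized as a loop: Perception → Reasoning → Action → Feedback

Let me draw an analogy with front-end concepts: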

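// Loosely analogous to an event loop: read state, decide, act, update state, repeat
class AIAgent {
  constructor(config) {
    this.llm = config.llm;           // large language model (the brain)
    this.tools = config.tools;       // available tools (the hands and feet)
    this.memory = config.memory;     // memory (state management)
    this.maxIterations = config.maxIterations || 10;
  }

  async execute(userInput) {
    // Seed memory with the user's request
    this.memory.add({
      role: 'user',
      content: userInput
    });

    let iteration = 0;

    // Each pass is one perceive → reason → act → feedback cycle
    while (iteration < this.maxIterations) {
      // 1. Perception: read the current state
      const context = this.memory.getContext();

      // 2. Reasoning: the LLM decides the next action
      //    (pass both name and description, because think() lists both)
      const decision = await this.llm.think({
        context,
        tools: this.tools.map(({ name, description }) => ({ name, description }))
      });

      // 3. Check: is the task finished?
      if (decision.action === 'FINISH') {
        return decision.finalAnswer;
      }

      // 4. Action: run the chosen tool
      const tool = this.tools.find(t => t.name === decision.tool);
      if (!tool) {
        throw new Error(`Unknown tool: ${decision.tool}`);
      }
      const result = await tool.execute(decision.params);

      // 5. Feedback: update memory (similar to setState);
      //    stored under `content` so Memory.getContext() can render it
      this.memory.add({
        role: 'tool',
        tool: decision.tool,
        content: result
      });

      iteration++;
    }

    throw new Error('Agent reached the maximum number of iterations');
  }
}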

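Key components in detail

1. LLM (Large Language Model) - the brain

The LLM is the core of the Agent, responsible for understanding, reasoning, and decision-making. It plays the role of the business logic layer in a front-end application.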

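// The LLM's reasoning step (a method on the LLM wrapper; each tool needs a name and a description)
async think({ context, tools }) {
  const prompt = `
You are an AI Agent that helps the user complete tasks.

Conversation history:
${context}

Available tools:
${tools.map(t => `- ${t.name}: ${t.description}`).join('\n')}

Decide:
1. If the task is complete, return {"action": "FINISH", "finalAnswer": "..."}
2. If a tool is needed, return {"action": "USE_TOOL", "tool": "tool name", "params": {...}}
`;

  const response = await this.callLLM(prompt);
  // JSON.parse throws if the model wraps the JSON in extra text
  return JSON.parse(response);
}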

2. Tools - the hands and feet

Tools let the Agent interact with the outside world, much like API calls, browser APIs, and third-party libraries on the front end.

// Example weather lookup tool
const weatherTool = {
  name: 'get_weather',
  description: 'Look up the weather for a given city',
  parameters: {
    type: 'object',
    properties: {
      city: {
        type: 'string',
        description: 'City name, e.g. "Beijing" or "Shanghai"'
      }
    },
    required: ['city']
  },

  async execute(params) {
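    // Call the weather API (encode the city so spaces and non-ASCII names are URL-safe)
    const response = await fetch(
      `https://api.weather.com/v1/current?city=${encodeURIComponent(params.city)}`
    );
    if (!response.ok) {
      throw new Error(`Weather API returned ${response.status}`);
    }
    const data = await response.json();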

    return {
      temperature: data.temp,
      condition: data.condition,
      humidity: data.humidity
    };
  }
};
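
The `parameters` field above describes what the tool expects, but nothing in the article enforces it. As a minimal sketch, an Agent could check the LLM's `params` against that description before calling `execute`. The helper name `validateParams` is hypothetical, and it only checks `required` keys and the primitive types `string`, `number`, and `boolean`; it is not a full JSON Schema validator.

```javascript
// Hypothetical helper: checks `params` against a tool's `parameters` description.
// Only `required` keys and primitive types are checked.
const PRIMITIVE_TYPES = ['string', 'number', 'boolean'];

function validateParams(tool, params) {
  const schema = tool.parameters ?? {};
  const input = params ?? {};
  const errors = [];

  // Every required key must be present
  for (const key of schema.required ?? []) {
    if (input[key] === undefined) {
      errors.push(`missing required parameter "${key}"`);
    }
  }

  // Present keys with a primitive type must match it
  for (const [key, spec] of Object.entries(schema.properties ?? {})) {
    const value = input[key];
    if (value === undefined || !PRIMITIVE_TYPES.includes(spec.type)) continue;
    if (typeof value !== spec.type) {
      errors.push(`parameter "${key}" should be ${spec.type}, got ${typeof value}`);
    }
  }

  return { valid: errors.length === 0, errors };
}

// Example tool description, shaped like weatherTool above
const exampleTool = {
  name: 'get_weather',
  parameters: {
    type: 'object',
    properties: { city: { type: 'string' } },
    required: ['city']
  }
};

console.log(validateParams(exampleTool, { city: 'Beijing' }));
console.log(validateParams(exampleTool, {}));
console.log(validateParams(exampleTool, { city: 42 }));
```

Feeding the `errors` array back to the LLM as the tool result gives it a chance to correct its own call instead of crashing the loop.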

3. Memory - state management

Memory stores the conversation history and intermediate results, similar to front-end state management (Redux, Zustand, and so on).

// A simple memory implementation
class Memory {
  constructor() {
    this.messages = [];
  }

  add(message) {
    this.messages.push({
      ...message,
      timestamp: Date.now()
    });
  }

  getContext() {
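    // Similar to a Redux selector: derive a view from the stored state
    return this.messages
      .map(m => `[${m.role}]: ${JSON.stringify(m.content)}`)
      .join('\n');
  }

  // Persistence via localStorage (browser only; a Node.js service needs a file or database instead)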
  save() {
    localStorage.setItem('agent_memory', JSON.stringify(this.messages));
  }

  load() {
    const saved = localStorage.getItem('agent_memory');
    if (saved) {
      this.messages = JSON.parse(saved);
    }
  }
}

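Hands-on: building a complete Agent

Let's build a working Agent from scratch. This Agent can check the weather, search for information, and do calculations.

Step 1: define the toolset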

// tools.js
export const tools = [
  {
    name: 'get_weather',
    description: 'Look up the current weather for a given city',
    parameters: {
      type: 'object',
      properties: {
        city: { type: 'string', description: 'City name' }
      },
      required: ['city']
    },
    async execute({ city }) {
      // A real project would call an actual weather API;
      // mock data is used here
      return {
        city,
        temperature: 25,
        condition: 'sunny',
        humidity: 60
      };
    }
  },

  {
    name: 'calculator',
    description: 'Evaluate a mathematical expression',
    parameters: {
      type: 'object',
      properties: {
        expression: { type: 'string', description: 'Math expression, e.g. "2 + 3 * 4"' }
      },
      required: ['expression']
    },
    async execute({ expression }) {
      // Only digits, whitespace, and arithmetic symbols may reach the evaluator.
      // Note: production code should still use a proper expression parser.
      if (!/^[\d\s+\-*\/%.()]+$/.test(expression)) {
        return { error: 'Invalid expression' };
      }
      try {
        const result = Function(`"use strict"; return (${expression});`)();
        return { result };
      } catch (error) {
        return { error: 'Invalid expression' };
      }
    }
  },

  {
    name: 'web_search',
    description: 'Search the internet for information',
    parameters: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search keywords' }
      },
      required: ['query']
    },
    async execute({ query }) {
      // A real project would call a search API (Google, Bing, etc.)
      return {
        results: [
          { title: 'Search result 1', snippet: 'Related content...' },
          { title: 'Search result 2', snippet: 'Related content...' }
        ]
      };
    }
  }
];
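
The calculator tool's comment points out that production code needs safe expression parsing. One way to do that without evaluating arbitrary code is a small recursive-descent parser. The sketch below is an illustration rather than part of the original toolset: the function name `evaluateExpression` is hypothetical, and it assumes only `+`, `-`, `*`, `/`, parentheses, and decimal numbers need to be supported.

```javascript
// Hypothetical replacement for the dynamic evaluation in the calculator tool.
// Supports + - * /, unary minus, parentheses, and decimal numbers.
function evaluateExpression(source) {
  const tokens = source.match(/\d+(?:\.\d+)?|[-+*\/()]/g) ?? [];
  // Reject input containing anything the tokenizer did not recognize
  if (tokens.join('') !== source.replace(/\s+/g, '')) {
    throw new Error('Unsupported characters in expression');
  }
  let pos = 0;

  // expression := term (('+' | '-') term)*
  function parseExpression() {
    let value = parseTerm();
    while (tokens[pos] === '+' || tokens[pos] === '-') {
      const op = tokens[pos++];
      const right = parseTerm();
      value = op === '+' ? value + right : value - right;
    }
    return value;
  }

  // term := factor (('*' | '/') factor)*
  function parseTerm() {
    let value = parseFactor();
    while (tokens[pos] === '*' || tokens[pos] === '/') {
      const op = tokens[pos++];
      const right = parseFactor();
      value = op === '*' ? value * right : value / right;
    }
    return value;
  }

  // factor := number | '(' expression ')' | '-' factor
  function parseFactor() {
    const token = tokens[pos++];
    if (token === '(') {
      const value = parseExpression();
      if (tokens[pos++] !== ')') throw new Error('Missing closing parenthesis');
      return value;
    }
    if (token === '-') return -parseFactor();
    if (token === undefined || !/^\d/.test(token)) {
      throw new Error(`Unexpected token: ${token}`);
    }
    return Number(token);
  }

  const result = parseExpression();
  if (pos !== tokens.length) throw new Error('Unexpected trailing input');
  return result;
}

console.log(evaluateExpression('2 + 3 * 4'));   // 14
console.log(evaluateExpression('(2 + 3) * 4')); // 20
```

Because the parser only ever produces numbers from number tokens, input such as `alert(1)` is rejected at the tokenizer check rather than executed.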

Step 2: implement the Agent's core logic

// agent.js
import { tools } from './tools.js';

class SimpleAgent {
  constructor({ apiKey, model = 'gpt-4' }) {
    this.apiKey = apiKey;
    this.model = model;
    this.tools = tools;
    this.memory = [];
    this.maxIterations = 10;
  }

  async callLLM(messages) {
    const response = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        model: this.model,
        messages,
        temperature: 0.7
      })
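    });

    // Surface HTTP errors (bad key, rate limit) instead of failing on data.choices below
    if (!response.ok) {
      throw new Error(`LLM request failed with status ${response.status}`);
    }

    const data = await response.json();
    return data.choices[0].message.content;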
  }

  buildPrompt() {
    const toolsDesc = this.tools.map(t =>
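      `- ${t.name}: ${t.description}\n  Parameters: ${JSON.stringify(t.parameters)}`
    ).join('\n');

    return `You are an intelligent assistant that can use the following tools to help the user:

${toolsDesc}

Respond in one of these formats:
1. If you need to use a tool:
   {"action": "use_tool", "tool": "tool name", "params": {...}, "reasoning": "why this tool is needed"}

2. If the task is complete:
   {"action": "finish", "answer": "final answer", "reasoning": "why the task is considered complete"}

Conversation history:
${this.memory.map(m => `${m.role}: ${m.content}`).join('\n')}

Decide the next action:`;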
  }

  async execute(userInput) {
    console.log(`\n👤 User: ${userInput}\n`);

    // Add the user input to memory
    this.memory.push({
      role: 'user',
      content: userInput
    });

    let iteration = 0;

    while (iteration < this.maxIterations) {
      iteration++;
      console.log(`🔄 Iteration ${iteration}:`);

      // Build the prompt
      const systemMessage = {
        role: 'system',
        content: this.buildPrompt()
      };

      // Let the LLM decide
      const decision = await this.callLLM([systemMessage]);

      let parsedDecision;
      try {
        // Extract the JSON (it may be embedded in other text)
        const jsonMatch = decision.match(/\{[\s\S]*\}/);
        parsedDecision = JSON.parse(jsonMatch[0]);
      } catch (error) {
        console.error('❌ Failed to parse LLM response:', decision);
        // Fail with a specific error instead of falling through to the max-iterations error
        throw new Error('Failed to parse LLM response');
      }

      console.log(`💭 Reasoning: ${parsedDecision.reasoning}`);

      // Branch on the action type
      if (parsedDecision.action === 'finish') {
        console.log(`\n✅ Task complete!\n`);
        return parsedDecision.answer;
      }

      if (parsedDecision.action === 'use_tool') {
        const tool = this.tools.find(t => t.name === parsedDecision.tool);

        if (!tool) {
          console.error(`❌ Tool ${parsedDecision.tool} does not exist`);
          throw new Error(`Unknown tool: ${parsedDecision.tool}`);
        }

        console.log(`🔧 Using tool: ${tool.name}`);
        console.log(`📝 Params: ${JSON.stringify(parsedDecision.params)}`);

        // Run the tool
        try {
          const result = await tool.execute(parsedDecision.params);
          console.log(`📊 Result: ${JSON.stringify(result)}\n`);

          // Add the tool result to memory
          this.memory.push({
            role: 'tool',
            tool: parsedDecision.tool,
            content: JSON.stringify(result)
          });
        } catch (error) {
          console.error(`❌ Tool execution failed: ${error.message}`);
          // Record the failure so the LLM can react to it on the next iteration
          this.memory.push({
            role: 'tool',
            tool: parsedDecision.tool,
            content: `Error: ${error.message}`
          });
        }
      }
    }

    throw new Error('Maximum iterations reached; task not completed');
  }
}

export default SimpleAgent;

Step 3: use the Agent

// main.js
import SimpleAgent from './agent.js';

async function main() {
  const agent = new SimpleAgent({
    apiKey: process.env.OPENAI_API_KEY
  });

  try {
    // Test case 1: requires a tool
    const answer1 = await agent.execute('What is the weather like in Beijing today?');
    console.log(`🤖 Agent: ${answer1}`);

    // Test case 2: requires combining several tools
    const answer2 = await agent.execute('Calculate 15 * 23, then search for the meaning of that number');
    console.log(`🤖 Agent: ${answer2}`);

  } catch (error) {
    console.error('❌ Error:', error.message);
  }
}

main();

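Sample output:

👤 User: What is the weather like in Beijing today?

🔄 Iteration 1:
💭 Reasoning: Need to look up the weather for Beijing
🔧 Using tool: get_weather
📝 Params: {"city":"Beijing"}
📊 Result: {"city":"Beijing","temperature":25,"condition":"sunny","humidity":60}

🔄 Iteration 2:
💭 Reasoning: The weather information has been retrieved, so the user can be answered
✅ Task complete!

🤖 Agent: The weather in Beijing is nice today: sunny, 25°C, with 60% humidity.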
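
To watch the same loop run without an API key, the LLM can be swapped for a scripted stand-in that returns canned decisions in order. This is only a sketch for experimenting with the control flow: `createScriptedLLM`, `runAgent`, and `demoTools` are hypothetical names, and the decisions are hard-coded rather than produced by a model.

```javascript
// Scripted stand-in for the LLM: returns canned decisions in order (not a real model).
function createScriptedLLM(decisions) {
  let index = 0;
  return async () => decisions[index++];
}

const demoTools = [
  {
    name: 'get_weather',
    // Mock data, in the same spirit as tools.js
    async execute({ city }) {
      return { city, temperature: 25, condition: 'sunny' };
    }
  }
];

// A stripped-down version of the reason → act → feedback loop
async function runAgent(llm, tools, userInput, maxIterations = 5) {
  const memory = [{ role: 'user', content: userInput }];

  for (let i = 0; i < maxIterations; i++) {
    const decision = await llm(memory); // reason
    if (decision.action === 'finish') {
      return { answer: decision.answer, memory };
    }
    const tool = tools.find(t => t.name === decision.tool);
    if (!tool) throw new Error(`Unknown tool: ${decision.tool}`);
    const result = await tool.execute(decision.params); // act
    memory.push({ role: 'tool', tool: tool.name, content: JSON.stringify(result) }); // feedback
  }
  throw new Error('Maximum iterations reached');
}

const scriptedLLM = createScriptedLLM([
  { action: 'use_tool', tool: 'get_weather', params: { city: 'Beijing' } },
  { action: 'finish', answer: 'Beijing is sunny, 25°C.' }
]);

runAgent(scriptedLLM, demoTools, 'What is the weather like in Beijing today?')
  .then(({ answer, memory }) => {
    console.log(answer);
    console.log(memory.length); // user message plus one tool result
  });
```

A scripted LLM like this is also handy in unit tests, since the loop's behavior becomes deterministic.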

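Agent architecture patterns

In practice, I have found that Agent architectures tend to fall into a few common patterns:

1. ReAct pattern (Reasoning + Acting)

This is the most common Agent architecture, and it is the pattern we implemented above:

Reasoning → Acting → Observation → Reasoning → ...

Characteristics:

Reasons before every decision
Suited to tasks that need multi-step reasoning
Highly explainable

Front-end analogy: like a Redux middleware chain, where every action passes through a series of handlers

2. Plan-and-Execute pattern

Draw up a complete plan first, then execute it step by step: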

class PlanExecuteAgent {
  async execute(userInput) {
    // Step 1: make a plan
    const plan = await this.llm.generatePlan(userInput);
    // e.g. { steps: ["Look up the weather", "Summarize the weather", "Suggest what to wear"] }

    // Step 2: execute the plan step by step
    const results = [];
    for (const step of plan.steps) {
      const result = await this.executeStep(step);
      results.push(result);
    }

    // Step 3: combine the results
    return this.llm.summarize(results);
  }
}

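Front-end analogy: like the saga pattern, where the whole flow is defined first and then executed step by step

3. Reflexion pattern (self-reflection)

The Agent evaluates the results of its own actions and improves on them: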

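class ReflexionAgent {
  async execute(userInput) {
    let attempts = 0;
    let lastResult = null;

    while (attempts < 3) {
      // Run the task, passing along feedback from the previous attempt
      const result = await this.tryExecute(userInput, lastResult);

      // Self-evaluate
      const evaluation = await this.llm.evaluate(result);

      if (evaluation.isGood) {
        return result;
      }

      // Reflect and improve
      lastResult = {
        attempt: result,
        feedback: evaluation.feedback
      };
      attempts++;
    }

    // Every attempt was judged unsatisfactory: return the last one rather than undefined
    return lastResult.attempt;
  }
}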

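Front-end analogy: like unit tests plus continuous integration, where you keep testing and improving

Deploying an Agent on the 英博云 platform

I ran into a few pitfalls while deploying Agent applications on the 英博云 platform. Here is what I learned:

Deployment architecture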

┌─────────────────────────────────────────┐
│         Frontend app (Next.js)          │
│   - User interface                      │
│   - Conversation management             │
└─────────────────────────────────────────┘
                   ↓ HTTPS
┌─────────────────────────────────────────┐
│    Agent service (Node.js + Express)    │
│   - Agent logic                         │
│   - Tool integration                    │
│   - Session management                  │
└─────────────────────────────────────────┘
                   ↓
┌─────────────────────────────────────────┐
│        英博云 model deployment          │
│   - LLM API (Claude/GPT)                │
│   - Vector database                     │
│   - Logging and monitoring              │
└─────────────────────────────────────────┘

Key configuration

// agent-service/config.js
export const config = {
  // Model endpoint deployed on 英博云
  llm: {
    endpoint: process.env.EBCLOUD_LLM_ENDPOINT,
    apiKey: process.env.EBCLOUD_API_KEY,
    model: 'claude-sonnet-4.5' // the model deployed on 英博云
  },

  // Agent settings
  agent: {
    maxIterations: 10,
    timeout: 60000, // 60-second timeout
    retryAttempts: 3
  },

  // Tool settings
  tools: {
    weatherAPI: process.env.WEATHER_API_KEY,
    searchAPI: process.env.SEARCH_API_KEY
  }
};

Performance optimization tips

Parallel tool calls

If the Agent decides to call several independent tools, they can run in parallel:

// Before: sequential execution
const weather = await weatherTool.execute({ city: 'Beijing' });
const news = await newsTool.execute({ city: 'Beijing' });

// After: parallel execution (use this instead of the two lines above)
const [weather, news] = await Promise.all([
  weatherTool.execute({ city: 'Beijing' }),
  newsTool.execute({ city: 'Beijing' })
]);
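
One thing to keep in mind is that `Promise.all` rejects as soon as any one call fails, which discards the results of the tools that succeeded. When partial results are still useful, `Promise.allSettled` may be a better fit. The sketch below uses two hypothetical tools, `okTool` and `failingTool`, purely for illustration.

```javascript
// Hypothetical tools for illustration: one succeeds, one fails.
const okTool = { name: 'weather', execute: async () => ({ temperature: 25 }) };
const failingTool = {
  name: 'news',
  execute: async () => { throw new Error('upstream unavailable'); }
};

// Run independent tool calls in parallel and keep whatever succeeded.
async function runToolsInParallel(calls) {
  const settled = await Promise.allSettled(
    calls.map(({ tool, params }) => tool.execute(params))
  );
  return settled.map((outcome, i) =>
    outcome.status === 'fulfilled'
      ? { tool: calls[i].tool.name, success: true, result: outcome.value }
      : { tool: calls[i].tool.name, success: false, error: outcome.reason.message }
  );
}

runToolsInParallel([
  { tool: okTool, params: { city: 'Beijing' } },
  { tool: failingTool, params: { city: 'Beijing' } }
]).then(results => console.log(results));
```

Each entry carries a `success` flag, so the Agent can pass both the successful result and the failure message back to the LLM in one step.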

Memory truncation

An overly long conversation history burns through tokens, so it needs to be truncated:

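class Memory {
  // Note: estimateTokens(msg) is not defined in this snippet; it must be provided
  // on the class, using a tokenizer or a heuristic
  getContext(maxTokens = 4000) {
    let tokens = 0;
    const messages = [];

    // Walk backwards from the newest message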
    for (let i = this.messages.length - 1; i >= 0; i--) {
      const msg = this.messages[i];
      const msgTokens = this.estimateTokens(msg);

      if (tokens + msgTokens > maxTokens) {
        break;
      }

      messages.unshift(msg);
      tokens += msgTokens;
    }

    return messages;
  }
}
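
The snippet above calls `estimateTokens` without defining it. As a rough sketch, a character-count heuristic can stand in for a real tokenizer. The 4-characters-per-token ratio used here is an assumption that is only approximately right for English text and is noticeably off for Chinese, so a real tokenizer is preferable when accuracy matters. `truncateMessages` is a hypothetical standalone version of the same truncation logic.

```javascript
// Rough heuristic (assumption): about 4 characters per token.
// Real tokenizers differ, especially for CJK text.
function estimateTokens(message) {
  return Math.ceil(JSON.stringify(message.content).length / 4);
}

// Keep the newest messages that fit within maxTokens, preserving their order.
function truncateMessages(messages, maxTokens) {
  const kept = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i]);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}

const history = [
  { role: 'user', content: 'a'.repeat(38) }, // 40 characters once quoted: 10 tokens
  { role: 'tool', content: 'b'.repeat(18) }, // 20 characters once quoted: 5 tokens
  { role: 'user', content: 'c'.repeat(18) }  // 5 tokens
];

console.log(truncateMessages(history, 10).map(m => m.role)); // [ 'tool', 'user' ]
```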

Streaming responses

Use SSE (Server-Sent Events) to stream responses and improve the user experience:

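// Express route (reading req.body requires app.use(express.json()))
app.post('/api/agent/stream', async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  // Note: the SimpleAgent shown earlier does not accept onStep; this assumes it has
  // been extended to call onStep after each iteration
  const agent = new SimpleAgent({
    apiKey: config.llm.apiKey,
    onStep: (step) => {
      // Push every step to the front end
      res.write(`data: ${JSON.stringify(step)}\n\n`);
    }
  });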

  try {
    const result = await agent.execute(req.body.input);
    res.write(`data: ${JSON.stringify({ type: 'done', result })}\n\n`);
  } catch (error) {
    res.write(`data: ${JSON.stringify({ type: 'error', error: error.message })}\n\n`);
  } finally {
    res.end();
  }
});
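
On the front end, the browser's `EventSource` only issues GET requests, so a POST route like this one is usually consumed with `fetch` and a stream reader instead. The part that is easy to get wrong is splitting the incoming text into complete events, because a network chunk can end in the middle of one. The sketch below shows that splitting step; `parseSSEBuffer` and `handleStep` are hypothetical names, and it assumes every event is a single JSON payload on `data: ` lines, as in the route above.

```javascript
// Splits buffered SSE text into complete events. Returns the parsed payloads plus
// the unfinished remainder, which should be prepended to the next chunk.
function parseSSEBuffer(buffer) {
  const parts = buffer.split('\n\n');
  const rest = parts.pop(); // the last piece may be an incomplete event
  const events = parts
    .map(part => part
      .split('\n')
      .filter(line => line.startsWith('data: '))
      .map(line => line.slice(6))
      .join('\n'))
    .filter(data => data.length > 0)
    .map(data => JSON.parse(data));
  return { events, rest };
}

// Hypothetical browser usage (not executed here):
// const res = await fetch('/api/agent/stream', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: JSON.stringify({ input: 'What is the weather like in Beijing today?' })
// });
// const reader = res.body.getReader();
// const decoder = new TextDecoder();
// let buffer = '';
// for (;;) {
//   const { done, value } = await reader.read();
//   if (done) break;
//   const parsed = parseSSEBuffer(buffer + decoder.decode(value, { stream: true }));
//   buffer = parsed.rest;
//   parsed.events.forEach(handleStep);
// }

const sample = 'data: {"type":"step","n":1}\n\ndata: {"type":"done"}\n\ndata: {"ty';
console.log(parseSSEBuffer(sample));
```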

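Pitfalls I ran into

Problem 1: the Agent gets stuck in a loop

Symptom: the Agent repeatedly calls the same tool and never completes the task

Cause: the LLM does not realize the tool has already been called, or the format of the tool's result is unclear

Solution: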

class SmartMemory {
  constructor() {
    this.messages = [];
    this.toolCallHistory = new Map(); // record of past tool calls
  }

  async shouldCallTool(toolName, params) {
    const key = `${toolName}-${JSON.stringify(params)}`;

    // Check whether the same call was made recently
    if (this.toolCallHistory.has(key)) {
      const lastCall = this.toolCallHistory.get(key);
      if (Date.now() - lastCall.timestamp < 60000) { // within the last minute
        return {
          shouldCall: false,
          cachedResult: lastCall.result
        };
      }
    }

    return { shouldCall: true };
  }

  recordToolCall(toolName, params, result) {
    const key = `${toolName}-${JSON.stringify(params)}`;
    this.toolCallHistory.set(key, {
      timestamp: Date.now(),
      result
    });
  }
}
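
The class above records calls, but it still has to be wired into the Agent's "act" step. One possible way, sketched below, is to wrap tool execution so that a repeated call returns the cached result along with a note the LLM can see. `ToolCallCache` and `callToolOnce` are hypothetical names; the cache is re-declared in compact form only so the example runs on its own.

```javascript
// Compact cache with the same idea as SmartMemory (re-declared so the sketch is self-contained).
class ToolCallCache {
  constructor(ttlMs = 60000) {
    this.ttlMs = ttlMs;
    this.calls = new Map();
  }
  key(toolName, params) {
    return `${toolName}-${JSON.stringify(params)}`;
  }
  lookup(toolName, params) {
    const entry = this.calls.get(this.key(toolName, params));
    return entry && Date.now() - entry.timestamp < this.ttlMs ? entry : null;
  }
  record(toolName, params, result) {
    this.calls.set(this.key(toolName, params), { timestamp: Date.now(), result });
  }
}

// Hypothetical wrapper for the "act" step: reuse a recent result and tell the LLM
// it was a repeat, so the next prompt nudges it towards finishing.
async function callToolOnce(cache, tool, params) {
  const cached = cache.lookup(tool.name, params);
  if (cached) {
    return { result: cached.result, note: 'Already called with these params; reusing the result.' };
  }
  const result = await tool.execute(params);
  cache.record(tool.name, params, result);
  return { result };
}

// Demo: the second identical call does not execute the tool again.
let executions = 0;
const countingTool = {
  name: 'get_weather',
  execute: async ({ city }) => { executions++; return { city, temperature: 25 }; }
};

(async () => {
  const cache = new ToolCallCache();
  await callToolOnce(cache, countingTool, { city: 'Beijing' });
  const second = await callToolOnce(cache, countingTool, { city: 'Beijing' });
  console.log(executions, second.note);
})();
```

Writing the `note` into memory alongside the result addresses the stated cause directly: the LLM is told explicitly that the tool has already been called.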

Problem 2: a tool failure crashes the Agent

Symptom: when one tool throws an error, the whole Agent stops working

Solution:

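async executeToolSafely(tool, params) {
  let timer;
  try {
    // Race the tool against a 30-second timeout
    const result = await Promise.race([
      tool.execute(params),
      new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error('Tool timeout')), 30000);
      })
    ]);

    return {
      success: true,
      result
    };
  } catch (error) {
    console.error(`Tool ${tool.name} failed:`, error);

    // Return a friendly error message to the LLM
    return {
      success: false,
      error: `Tool execution failed: ${error.message}. Please try another approach.`
    };
  } finally {
    // Clear the timer so a finished call does not leave a pending timeout behind
    clearTimeout(timer);
  }
}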
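
The configuration earlier lists `retryAttempts: 3`, but the article does not show a retry mechanism. For transient failures such as a flaky network call, a small retry wrapper with exponential backoff is one common approach. The sketch below is an assumption about how that could look; `withRetry` and `sleep` are hypothetical helpers, and the delays are illustrative.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical helper: retries `fn` up to `attempts` times, doubling the delay each time.
async function withRetry(fn, { attempts = 3, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  // All attempts failed: surface the last error
  throw lastError;
}

// Usage: a flaky call that succeeds on the third attempt.
withRetry(async attempt => {
  if (attempt < 3) throw new Error(`attempt ${attempt} failed`);
  return 'ok';
}, { baseDelayMs: 10 }).then(value => console.log(value)); // ok
```

Retries suit idempotent tools such as lookups and searches. For tools with side effects, retrying blindly can repeat the action, so it is safer to report the failure to the LLM instead.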

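Problem 3: tokens are consumed too quickly

Symptom: when running an Agent on the 英博云 platform, tokens were consumed far faster than expected

Causes:

Every iteration sends the complete conversation history to the LLM
Tool results are overly detailed

Solution: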

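// 1. Summarize the conversation history
async summarizeHistory() {
  if (this.memory.messages.length > 10) {
    const summary = await this.llm.summarize(
      this.memory.messages.slice(0, -3) // everything except the 3 most recent messages
    );

    // Replace the older messages with the summary and keep the 3 most recent as they are
    this.memory.messages = [
      { role: 'system', content: `Summary of earlier conversation: ${summary}` },
      ...this.memory.messages.slice(-3)
    ];
  }
}

// 2. Trim tool results
function sanitizeToolResult(result) {
  // Measure objects too, since tools usually return objects rather than strings
  const text = typeof result === 'string' ? result : JSON.stringify(result);
  if (text.length > 500) {
    return text.substring(0, 500) + '... (result truncated)';
  }
  return result;
}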

Problem 4: the LLM does not follow the output format

Symptom: the LLM returns content that is not valid JSON, so parsing fails

Solution:

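function extractJSON(text) {
  // Try several extraction strategies in turn

  // Strategy 1: parse the text directly
  try {
    return JSON.parse(text);
  } catch {}

  // Strategy 2: extract JSON from a fenced code block
  const codeBlockMatch = text.match(/```json\s*([\s\S]*?)\s*```/);
  if (codeBlockMatch) {
    try {
      return JSON.parse(codeBlockMatch[1]);
    } catch {}
  }

  // Strategy 3: extract any JSON object (greedy: spans from the first "{" to the last "}")
  const jsonMatch = text.match(/\{[\s\S]*\}/);
  if (jsonMatch) {
    try {
      return JSON.parse(jsonMatch[0]);
    } catch {}
  }

  // Strategy 4 (not implemented here): ask the LLM to repair the format.
  // For now, fail with a clear error.
  throw new Error('Unable to parse LLM response');
}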
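
The greedy pattern `/\{[\s\S]*\}/` matches from the first `{` to the last `}`, so it fails when the reply contains text with braces after the JSON, or more than one object. A brace-matching scan is one way to handle that case. The sketch below is a possible extra strategy, not part of the original function; `extractFirstJSONObject` is a hypothetical name.

```javascript
// Returns the first balanced {...} span that parses as JSON, or null if none does.
// String literals are tracked so that braces inside quoted values are ignored.
function extractFirstJSONObject(text) {
  for (let start = text.indexOf('{'); start !== -1; start = text.indexOf('{', start + 1)) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < text.length; i++) {
      const ch = text[i];
      if (inString) {
        if (escaped) escaped = false;
        else if (ch === '\\') escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === '{') depth++;
      else if (ch === '}' && --depth === 0) {
        try {
          return JSON.parse(text.slice(start, i + 1));
        } catch {
          break; // not valid JSON from this start; try the next "{"
        }
      }
    }
  }
  return null;
}

const reply = 'Sure! {"action": "finish", "answer": "a } b"} Let me know. {"x": 1}';
console.log(extractFirstJSONObject(reply));
```

For the sample reply, the greedy regex would capture everything up to the final `}` and fail to parse, while the scan stops at the end of the first object.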

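Summary

After this period of study and practice, my understanding of AI Agents comes down to the following:

Key takeaways:

Agent = LLM + Tools + Memory + Loop
- The LLM is the brain, responsible for reasoning and decisions
- Tools are the hands and feet, letting the Agent interact with the world
- Memory stores context and state
- The loop keeps the "perceive, reason, act" cycle going
Understanding Agents from a front-end perspective
- The LLM is like the business logic layer
- Tools are like API calls and browser APIs
- Memory is like state management (Redux/Zustand)
- The Agent as a whole is like an application that makes its own decisions
Practical advice
- Start with the simple ReAct pattern
- Keep tool design clear and atomic
- Handle errors properly and add a retry mechanism
- Watch token consumption and performance
Deployment essentials
- Use streaming responses to improve the experience
- Truncate the conversation history
- Set timeouts on tool calls
- Log enough to make debugging easy

What I plan to learn next:

Dig deeper into multi-Agent collaboration (Multi-Agent System)
Learn more advanced memory mechanisms (vector databases, knowledge graphs)
Explore Agents' ability to learn and improve on their own
Deploy more complex Agent applications on the 英博云 platform

Advice for you:

If you are also a front-end developer who wants to get started with AI Agents:

Start by understanding the basic concepts, using front-end knowledge as an analogy
Build a simple Agent yourself to get a feel for the whole flow
Deploy it for real, on the 英博云 platform or elsewhere
Don't panic when you hit problems; most pitfalls have well-established solutions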

If this article helped you, feel free to get in touch and discuss! AI Agents are still a young field, so let's explore it together.
