我要投稿

Agent 记忆拆解 | Gemini CLI

发布日期：2025-07-08 07:51:11 浏览次数： 1834

作者：Max的技术摸鱼时间

微信搜一搜，关注“Max的技术摸鱼时间”

Gemini CLI 是 Google 最近推出的一个开源命令行 AI 助手工具，专为开发者设计。它能够理解代码、执行复杂查询、自动化任务，并利用 Gemini 的多模态能力（如图像识别）生成创意内容，下面我们开始分析Gemini CLI 记忆的实现方式。

1. 短期记忆

Gemini CLI没有常见的短期记忆形式（比如最近沟通的N条记录），它采用的是全部对话历史数据，每一次与Gemini 沟通的过程中，会传入所有对话历史记录。为了解决记忆丢失问题，Gemini CLI会对历史对话进行压缩。

注意：1. Gemini CLI中的对话历史只会存储到内存中，重启即丢失。
2. Gemini CLI中最多进行100轮对话，防止AI出现无限循环。

2. 对话压缩（情景记忆）

当对话历史的 token 数量达到模型限制的 70% 时触发压缩。压缩后的对话数据通过User消息注入到后续的对话历史中。这个设计理念与图尔文的长期记忆-情景记忆的定义一致，表示模型从过去的对话中学习到的历史经验和教训。

记忆示例：

<compressed_chat_history>    <overall_goal>用户的高层目标</overall_goal>    <key_knowledge>关键知识</key_knowledge>    <file_system_state>文件系统状态</file_system_state>    <recent_actions>最近的行动</recent_actions>    <current_plan>当前计划</current_plan></compressed_chat_history>

压缩对话的Prompt如下：

You are the component that summarizes internal chat history into a given structure.

When the conversation history grows too large, you will be invoked to distill the entire history into a concise, structured XML snapshot. This snapshot is CRITICAL, as it will become the Agent's *only* memory of the past. The agent will resume its work based solely on this snapshot. All crucial details, plans, errors, and user directives MUST be preserved.

First, you will think through the entire history in a private <scratchpad>. Review the user's overall goal, the agent's actions, tool outputs, file modifications, and any unresolved questions. Identify every piece of information that is essential for future actions.

After your reasoning is complete, generate the final <compressed_chat_history> XML object. Be incredibly dense with information. Omit any irrelevant conversational filler.

The structure MUST be as follows:

<compressed_chat_history>
    <overall_goal>
        <!-- A single, concise sentence describing the user's high-level objective. -->
        <!-- Example: "Refactor the authentication service to use a new JWT library." -->
    </overall_goal>

    <key_knowledge>
        <!-- Crucial facts, conventions, and constraints the agent must remember based on the conversation history and interaction with the user. Use bullet points. -->
        <!-- Example:
         - Build Command: \`npm run build\`
         - Testing: Tests are run with \`npm test\`. Test files must end in \`.test.ts\`.
         - API Endpoint: The primary API endpoint is \`https://api.example.com/v2\`.

        -->
    </key_knowledge>

    <file_system_state>
        <!-- List files that have been created, read, modified, or deleted. Note their status and critical learnings. -->
        <!-- Example:
         - CWD: \`/home/user/project/src\`
         - READ: \`package.json\` - Confirmed 'axios' is a dependency.
         - MODIFIED: \`services/auth.ts\` - Replaced 'jsonwebtoken' with 'jose'.
         - CREATED: \`tests/new-feature.test.ts\` - Initial test structure for the new feature.
        -->
    </file_system_state>

    <recent_actions>
        <!-- A summary of the last few significant agent actions and their outcomes. Focus on facts. -->
        <!-- Example:
         - Ran \`grep 'old_function'\` which returned 3 results in 2 files.
         - Ran \`npm run test\`, which failed due to a snapshot mismatch in \`UserProfile.test.ts\`.
         - Ran \`ls -F static/\` and discovered image assets are stored as \`.webp\`.
        -->
    </recent_actions>

    <current_plan>
        <!-- The agent's step-by-step plan. Mark completed steps. -->
        <!-- Example:
         1. [DONE] Identify all files using the deprecated 'UserAPI'.
         2. [IN PROGRESS] Refactor \`src/components/UserProfile.tsx\` to use the new 'ProfileAPI'.
         3. [TODO] Refactor the remaining files.
         4. [TODO] Update tests to reflect the API change.
        -->
    </current_plan>
</compressed_chat_history>

3. 用户偏好（语义记忆）

Gemini CLI要记录用户的偏好，两种触发方式：
1. 用户主动说：“记住我当前的选择”
2. 模型自动识别：“好的，我已经记住了你喜欢吃苹果。”

实现方式：
Gemini CLI 偏好记忆通过调用记忆工具（Function Calling技术）实现的，数据会存储到Gemini.md的文件中

工具描述如下：

Saves a specific piece of information or fact to your long-term memory

Use this tool:
- When the user explicitly asks you to remember something
- When the user states a clear, concise fact about themselves

Do NOT use this tool:
- To remember conversational context that is only relevant for the current session
- To save long, complex, or rambling pieces of text