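A Multi-Agent Experiment with AutoGen and the Moonshot API

Introduction to the Moonshot API

Moonshot is the large language model (LLM) service from the Chinese company Moonshot AI (月之暗面), and Kimi is its powerful, distinctively positioned conversational AI assistant. Among Chinese LLMs, Kimi has recently become known for reading and analyzing very long texts of up to 2 million characters without loss, and for summarizing and translating them. Kimi has become my go-to reading tool, and I recently wanted to try some hands-on engineering experiments with an LLM, so the Moonshot API was my first choice. The Moonshot API follows the same protocol as the OpenAI API and is considerably cheaper.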

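The official API currently supports the following models:

• moonshot-v1-8k: a model with an 8k context length, suited to generating short texts.
• moonshot-v1-32k: a model with a 32k context length, suited to generating long texts.
• moonshot-v1-128k: a model with a 128k context length, suited to generating very long texts.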
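
One way to use the list above is to pick the smallest model whose context window fits the request. The following is a minimal sketch, not part of any SDK: `pick_model` and `CONTEXT_WINDOWS` are hypothetical names, the token counts are assumed to be known in advance, and "8k" is assumed to mean 8 × 1024 tokens (check the official model documentation for the exact limits).

```python
# Context windows in tokens. ASSUMPTION: "8k" is read as 8 * 1024 tokens.
CONTEXT_WINDOWS = {
    "moonshot-v1-8k": 8 * 1024,
    "moonshot-v1-32k": 32 * 1024,
    "moonshot-v1-128k": 128 * 1024,
}


def pick_model(prompt_tokens: int, max_output_tokens: int = 1024) -> str:
    """Return the smallest model whose window fits the prompt plus the reply.

    Hypothetical helper for illustration only.
    """
    needed = prompt_tokens + max_output_tokens
    # Try the models from the smallest window to the largest.
    for name, window in sorted(CONTEXT_WINDOWS.items(), key=lambda kv: kv[1]):
        if needed <= window:
            return name
    raise ValueError(f"{needed} tokens exceed the largest context window")


print(pick_model(3_000))   # moonshot-v1-8k
print(pick_model(20_000))  # moonshot-v1-32k
```

Choosing the smallest sufficient model keeps short requests on the short-context model and reserves the larger windows for inputs that need them.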

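To use the Moonshot API, first apply for an API key at https://platform.moonshot.cn/console/api-keys (new accounts receive a 15 yuan credit, which is enough for a fair amount of experimentation).

Below is a simple test of the Moonshot API.

Make sure your Python version is at least 3.7.1 and the openai SDK is at least 1.0.0 (pip install --upgrade 'openai>=1.0').

Then manually create the LLM API configuration in a file named OAI_CONFIG_LIST in the current directory (this is the configuration format AutoGen uses later), as shown below: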

[
    {
        "model": "moonshot-v1-8k",
        "api_key": "[replace with your API key]",
        "base_url": "https://api.moonshot.cn/v1",
        "api_type": "openai"
    }
]
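
If you would rather not paste the key into the file by hand, the same structure can be generated from an environment variable. This is only a sketch under stated assumptions: `build_config_list` and `write_config_file` are hypothetical helpers, and the variable name `MOONSHOT_API_KEY` is illustrative, not something the Moonshot platform or AutoGen requires.

```python
import json
import os


def build_config_list(api_key: str, model: str = "moonshot-v1-8k") -> list:
    """Build the one-entry config list shown above for the given key and model."""
    return [
        {
            "model": model,
            "api_key": api_key,
            "base_url": "https://api.moonshot.cn/v1",
            "api_type": "openai",
        }
    ]


def write_config_file(path: str = "OAI_CONFIG_LIST", api_key: str = None) -> None:
    """Write the config list to `path` as JSON."""
    if api_key is None:
        # ASSUMPTION: the key was exported beforehand as MOONSHOT_API_KEY.
        api_key = os.environ["MOONSHOT_API_KEY"]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_config_list(api_key), f, indent=4)
```

Calling `write_config_file()` after setting the variable produces a file with the same fields as the hand-written one, while keeping the key out of the notebook itself.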

Then run the following inference code:

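from openai import OpenAI
from autogen import config_list_from_json

# Load the API settings from the OAI_CONFIG_LIST file in the current directory.
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")
config = config_list[0]

client = OpenAI(
    api_key=config["api_key"],
    base_url=config["base_url"],
)

# The conversation starts with a system message; chat() appends to this list.
history = [
    {"role": "system", "content": "You are Kimi, an AI assistant provided by Moonshot AI. You are especially good at conversing in Chinese and English. You provide users with safe, helpful, and accurate answers, and you refuse to answer any question involving terrorism, racism, pornography, or violence. Moonshot AI is a proper noun and must not be translated into other languages."}
]


def chat(query, history):
    """Send one user turn and record both the query and the reply in history."""
    history.append({"role": "user", "content": query})
    completion = client.chat.completions.create(
        model=config["model"],  # "moonshot-v1-8k" in the config file
        messages=history,
        temperature=0.3,
    )
    result = completion.choices[0].message.content
    history.append({"role": "assistant", "content": result})
    return result


# The second question only makes sense given the history of the first one.
print(chat("What is the rotation period of the Earth?", history))
print(chat("What about the Moon?", history))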

[Figure: output of the test script]

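Introduction to AutoGen

AutoGen is an open-source multi-agent framework from Microsoft. It coordinates multiple LLM agents that can converse with one another and work together to solve a user's problem. AutoGen agents are customizable and conversable, and users can step into the conversation seamlessly to guide them. For more information, see my earlier article: "AutoGenStudio: A Guide to Using AutoGen with the Kimi API, Opening Up the Next Generation of Multi-Agent LLM Applications".

Key features of AutoGen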

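• Simplified multi-agent conversations: AutoGen makes it easy to build next-generation LLM applications based on multi-agent conversations. It simplifies the orchestration, automation, and optimization of complex LLM workflows.
• Maximized performance: it aims to get the most out of LLMs and to compensate for weaknesses such as hallucination.
• Support for diverse conversation patterns: with customizable and conversable agents, developers can use AutoGen to build a wide range of conversation patterns that differ in conversation autonomy, number of agents, and agent conversation topology.
• A collection of working systems of varying complexity: these systems span a wide range of applications across domains and complexity levels, and demonstrate how readily AutoGen supports diverse conversation patterns.
• Driven by collaborative research: AutoGen is backed by collaborative research from Microsoft, Penn State University, and the University of Washington.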

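At its core, the AutoGen framework integrates the conversational abilities of multiple agents into an LLM application, producing systems that can organize themselves and solve problems. The agents can talk not only to each other but also to human users, which makes applications more flexible and adaptable. Developers can customize each agent's behavior and conversation pattern to build a solution suited to a specific task or workflow.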
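
The conversation pattern described above can be illustrated with a toy loop in plain Python. This is a sketch for intuition only and not AutoGen's implementation: `ToyAgent`, `run_chat`, and the scripted reply functions are hypothetical, and the scripted replies stand in for a real LLM and a real code executor.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ToyAgent:
    """A stand-in for an agent: a name plus a function that produces a reply.

    Hypothetical and far simpler than AutoGen's ConversableAgent.
    """

    def __init__(self, name: str, reply_fn: Callable[[List[Message]], str]):
        self.name = name
        self.reply_fn = reply_fn

    def reply(self, history: List[Message]) -> Message:
        return {"name": self.name, "content": self.reply_fn(history)}


def run_chat(first: ToyAgent, second: ToyAgent, task: str, max_turns: int = 10) -> List[Message]:
    """Alternate turns between two agents until one of them says TERMINATE."""
    history = [{"name": first.name, "content": task}]
    speakers = [second, first]  # the second agent answers the opening message
    for turn in range(max_turns):
        message = speakers[turn % 2].reply(history)
        history.append(message)
        if "TERMINATE" in message["content"]:
            break
    return history


def writer_fn(history: List[Message]) -> str:
    # Scripted "LLM": propose code first, stop once the execution succeeded.
    if "exitcode: 0" in history[-1]["content"]:
        return "TERMINATE"
    return "print(1 + 1)"


def executor_fn(history: List[Message]) -> str:
    # Scripted "executor": pretend the proposed code ran successfully.
    return "exitcode: 0\noutput: 2"


writer = ToyAgent("writer", writer_fn)
executor = ToyAgent("executor", executor_fn)

for m in run_chat(executor, writer, "Compute 1 + 1."):
    print(f"{m['name']}: {m['content']}")
```

The loop shows the two ingredients that matter in practice: agents take turns reading a shared message history, and a termination condition decides when the exchange is over.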

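Trying AutoGen with the Moonshot API

The demo code for this article can be tried online in Colab: https://github.com/greengerong/awesome-llm/blob/main/colab/autogen/autogen_kimi.ipynb

First, install the autogen dependencies by running the following commands in a notebook cell: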

!mkdir autogen
%cd /content/autogen
!touch OAI_CONFIG_LIST
!pip install pyautogen
!pip install -qqq matplotlib numpy

Then run the following agent conversation, in which two agents collaborate to plot and analyze the stock prices of TSLA and META:

import datetime
import tempfile

from autogen import ConversableAgent, config_list_from_json
from autogen.coding import LocalCommandLineCodeExecutor

config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")
llm_config={"config_list": config_list}

# Create a temporary directory to store the code files.
temp_dir = tempfile.TemporaryDirectory()

# Create a local command line code executor.
executor = LocalCommandLineCodeExecutor(
    timeout=10,  # Timeout for each code execution in seconds.
    work_dir=temp_dir.name,  # Use the temporary directory to store the code files.
)

# Create an agent with code executor configuration.
code_executor_agent = ConversableAgent(
    "code_executor_agent",
    system_message="Reply 'TERMINATE' in the end when code execute success.",
    llm_config=False,  # Turn off LLM for this agent.
    code_execution_config={"executor": executor},  # Use the local command line code executor.
    human_input_mode="NEVER",  # Run without human input; set to "ALWAYS" to review each step by hand for safety.
    # End the chat when the code writer's reply contains "Great!". Note that this
    # depends on the model's wording, not on the 'TERMINATE' marker that the code
    # writer's system message asks for.
    is_termination_msg=lambda msg: msg.get("content") is not None and "Great!" in msg["content"],
)

# The code writer agent's system message is to instruct the LLM on how to use
# the code executor in the code executor agent.
code_writer_system_message = """You are a helpful AI assistant.
Solve tasks using your coding and language skills.
In the following cases, suggest python code (in a python coding block) or shell script (in a sh coding block) for the user to execute.
1. When you need to collect info, use the code to output the info you need, for example, browse or search the web, download/read a file, print the content of a webpage or a file, get the current date/time, check the operating system. After sufficient info is printed and the task is ready to be solved based on your language skill, you can solve the task by yourself.
2. When you need to perform some task with code, use the code to perform the task and output the result. Finish the task smartly.
Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.
When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.
If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.
Reply 'TERMINATE' in the end when everything is done.
"""

code_writer_agent = ConversableAgent(
    "code_writer_agent",
    system_message=code_writer_system_message,
    llm_config=llm_config,
    code_execution_config=False,  # Turn off code execution for this agent.     
)

today = datetime.datetime.now().strftime("%Y-%m-%d")
chat_result = code_executor_agent.initiate_chat(
    code_writer_agent,
    message=f"Today is {today}. Write Python code to plot TSLA's and META's "
    "stock price gains YTD, and save the plot to a file named 'stock_gains.png'. "
    "Reply 'successfully saved' when everything is done.",
)

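[Figure: output]

[Figure: the agents' problem-solving process]

The figure shows the coding agent asking the LLM to generate code for the request and the code-execution agent running that code. The two collaborate until the problem is solved, and the whole process needs no human intervention.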
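
To make the execution half of that collaboration concrete, here is a rough sketch of what a code-execution step involves: pull the fenced Python block out of the LLM's reply, run it in a temporary directory with a timeout, and report the exit code and output back. It is a simplified illustration using only the standard library, not the implementation of AutoGen's `LocalCommandLineCodeExecutor`; `extract_python` and `run_python` are hypothetical helper names.

```python
import re
import subprocess
import sys
import tempfile
from pathlib import Path

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Matches the body of the first fenced python block in a reply.
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_python(reply: str):
    """Return the first fenced python block in an LLM reply, or None."""
    match = CODE_BLOCK.search(reply)
    return match.group(1) if match else None


def run_python(code: str, timeout: int = 10):
    """Run code in a fresh temporary directory; return (exit code, output)."""
    with tempfile.TemporaryDirectory() as work_dir:
        script = Path(work_dir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        proc = subprocess.run(
            [sys.executable, str(script)],
            cwd=work_dir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    return proc.returncode, proc.stdout + proc.stderr


# A hand-written reply stands in for what the code writer agent would send.
reply = "Here is the code:\n" + FENCE + "python\nprint(6 * 7)\n" + FENCE
exit_code, output = run_python(extract_python(reply))
print(f"exitcode: {exit_code}\n{output}")
```

Sending the exit code and output back as the next message is what lets the code writer notice an error and produce a corrected version on its following turn.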

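In conclusion, AutoGen's LLM-based multi-agent collaboration, with autonomous planning and autonomous tool use, gives it strong problem-solving ability and makes it a promising engineering approach for building the next generation of LLM applications.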
