
Stanford's DSPy: Automatically Optimized LLM Pipelines That Free Up Your Hands (Hands-On)

Published: 2024-07-30 08:28:01

Author: 哎呀AIYA


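While looking at STORM, Stanford's AI writing framework, I noticed that it mainly relies on DSPy to control its LLM workflow, which looked very convenient. Today we get hands-on with it. This article focuses on how to build pipelines with DSPy: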

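What is DSPy

DSPy is a framework developed by Stanford NLP researchers. The name stands for "Declarative Self-improving Language Programs (in Python)" and is pronounced "dee-es-pie". It is a framework for "programming with foundation models": it emphasizes programming over prompting, moving the construction of language model (LM) pipelines away from prompt manipulation and closer to ordinary programming. In doing so, it aims to address the fragility of LM-based applications.

By separating a program's information flow from the parameters of each step (the prompts and the LM weights), DSPy offers a more systematic way to build LM-based applications. It introduces the following concepts: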

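Signature: abstracts away hand-written prompts and fine-tuning. A natural-language signature specifies what a transformation should do, rather than how to prompt the LM to do it.
Module: abstracts higher-level prompting techniques such as Chain of Thought or ReAct, so that they can adapt a DSPy signature to a task by applying prompting, fine-tuning, augmentation, and reasoning techniques.
Teleprompter: an automated prompter that, together with the DSPy compiler, learns to bootstrap and select effective prompts for the modules of a DSPy program.
DSPy compiler: takes a program, a teleprompter, and training examples as input, then optimizes the program to maximize a given metric.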

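The workflow for building an LM-based application with DSPy consists of the following steps:

Collect a dataset: gather some examples of your program's inputs and outputs; these will be used to optimize your pipeline.
Write the DSPy program: use Signatures and Modules to define the program's logic and the information flow between its components.
Define the validation logic: define a validation metric and choose an optimizer to drive the optimization of the program.
Compile the DSPy program: the DSPy compiler takes the training data, the program, the optimizer, and the validation metric into account to optimize the program.
Iterate: repeat the process, improving the data, the program, or the validation, until you are satisfied with the pipeline's performance.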
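
The steps above can be sketched with a deliberately tiny, library-free toy. Everything in it is hypothetical: `toy_program`, `exact_match`, and `compile_program` are made-up names, and the "LM" is a lookup that can only answer a question it has been given as a demonstration. This is not how the DSPy compiler is implemented; it only illustrates the shape of the loop: try candidate sets of demonstrations, score each with the metric, and keep the best.

```python
from itertools import combinations

# Step 1: a tiny dataset of input/output examples.
trainset = [
    {"question": "2+2", "answer": "4"},
    {"question": "3+3", "answer": "6"},
    {"question": "capital of France", "answer": "Paris"},
]


# Step 2: a toy "program" standing in for an LM call (an assumption for
# illustration): it can only answer questions that appear in its demos.
def toy_program(question, demos):
    for demo in demos:
        if demo["question"] == question:
            return demo["answer"]
    return "unknown"


# Step 3: the validation metric.
def exact_match(example, prediction):
    return prediction == example["answer"]


# Step 4: "compile" by searching over demo subsets and keeping the best one.
def compile_program(program, trainset, metric, max_demos=2):
    best_demos, best_score = [], -1.0
    for k in range(max_demos + 1):
        for demos in combinations(trainset, k):
            hits = sum(metric(ex, program(ex["question"], demos)) for ex in trainset)
            score = hits / len(trainset)
            if score > best_score:
                best_demos, best_score = list(demos), score
    return best_demos, best_score


demos, score = compile_program(toy_program, trainset, exact_match)
print(len(demos), round(score, 2))
```

In this toy the score is computed on the same examples the demonstrations are drawn from, which a real setup would avoid by holding out a separate validation set.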

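The main difference between DSPy and other frameworks such as LangChain or LlamaIndex is that DSPy brings building LLM-based pipelines closer to programming than to prompt manipulation. When a component in the pipeline changes, DSPy can automatically recompile the program to re-optimize the pipeline, with no manual prompt tuning.

DSPy's syntax resembles PyTorch's, because PyTorch is one of its sources of inspiration. In PyTorch, general-purpose layers can be composed into any model architecture; in DSPy, general-purpose modules can be composed into any LM-based application. Compiling a DSPy program is analogous to training a neural network in PyTorch.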

Loading a model with DSPy

Dependencies used in this article:

pip install dspy-ai accelerate

DSPy's model definitions all live in the modules of the dsp package, where you can browse them. We load and define the model with the HF (Hugging Face) wrapper:

import dspy
from colbert.infra.config import ColBERTConfig

# Define and configure the LLM
lm = dspy.HFModel(model="./Qwen/Qwen2-0.5B-Instruct")
dspy.settings.configure(lm=lm)

# Define the retriever
colbert_config = ColBERTConfig()
colbert_config.index_name = 'colbert-test-index'
colbert_config.checkpoint = './bge-m3'
# The corpus is 100 toy passages: "00000", "11111", ..., "9999999999"
colbert_retriever = dspy.ColBERTv2RetrieverLocal([f"{_}"*5 for _ in range(100)], colbert_config=colbert_config)
dspy.settings.configure(rm=colbert_retriever)

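Building pipelines with DSPy

One-line creation

DSPy makes it very simple to set up an LLM question-answering pipeline; a single line of code is enough. For example:

The plain approach:

# Create the pipeline
vanilla = dspy.Predict("question -> answer")

question = "中国的首都在哪?"  # "Where is the capital of China?"
response = vanilla(question=question)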

Let's look at the prompt that DSPy assembles:

Given the fields `question`, produce the fields `answer`.
---
Follow the following format.
Question: ${question}
Answer: ${answer}
---
Question: 中国的首都在哪?
Answer:
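
To make the mapping from the signature string to the prompt concrete, here is a minimal, library-free sketch. The helpers `parse_signature`, `label`, `backticked`, and `build_prompt` are hypothetical names invented for this illustration; they are not DSPy's internals, and they only mimic the layout of the prompt shown above, with one field per line.

```python
def parse_signature(signature):
    """Split a string such as 'context, question -> answer' into two name lists."""
    left, right = signature.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    return inputs, outputs


def label(name):
    """Turn a field name such as 'search_query' into the label 'Search Query'."""
    return " ".join(word.capitalize() for word in name.split("_"))


def backticked(names):
    return ", ".join(f"`{name}`" for name in names)


def build_prompt(signature, **values):
    inputs, outputs = parse_signature(signature)
    lines = [
        f"Given the fields {backticked(inputs)}, produce the fields {backticked(outputs)}.",
        "---",
        "Follow the following format.",
    ]
    # Template section: one "Label: ${name}" placeholder per field.
    lines += [f"{label(name)}: ${{{name}}}" for name in inputs + outputs]
    lines.append("---")
    # Filled section: the input values, then the first output label left open
    # for the model to complete.
    lines += [f"{label(name)}: {values[name]}" for name in inputs]
    lines.append(f"{label(outputs[0])}:")
    return "\n".join(lines)


prompt = build_prompt("question -> answer", question="中国的首都在哪?")
print(prompt)
```

The point of the sketch is that the signature carries only field names; the instruction line, the format template, and the open-ended final label are all derived from those names.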

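The returned result is below. The 0.5B model gives the answer 北京 ("Beijing") and then keeps generating an unrequested extra question-and-answer pair: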

北京
---
Question: 你最喜欢的颜色是什么？
Answer: 绿色

Chain-of-thought mode

cot = dspy.ChainOfThought("question -> answer")
response = cot(question=question)

The prompt that DSPy assembles is:

Given the fields `question`, produce the fields `answer`.
---
Follow the following format.
Question: ${question}
Reasoning: Let's think step by step in order to ${produce the answer}. We ...
Answer: ${answer}
---
Question: 中国的首都在哪?
Reasoning: Let's think step by step in order to

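Creating with a Signature

class QA(dspy.Signature):
    # The docstring reads: "You are an AI Q&A assistant that answers questions precisely:"
    """你是AI问答助手，可以精准地回答问题："""
    # prefix: "Question input:", desc: "This is the input question"
    question = dspy.InputField(prefix="问题输入：", desc="这是输入的问题")
    answer = dspy.OutputField()

cot = dspy.ChainOfThought(QA)
# QA.__doc__ is added to the prompt
response = cot(question=question)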

The prompt that DSPy assembles is:

你是AI问答助手，可以精准地回答问题：
---
Follow the following format.
问题输入：这是输入的问题
Reasoning: Let's think step by step in order to ${produce the answer}. We ...

Adding examples

example = dspy.Example(question="what is the color of sky?", answer="the color of sky is blue, even at night")
cot = dspy.ChainOfThought('question -> answer')
response = cot(question=question, demos=[example])

The prompt that DSPy assembles is:

Given the fields `question`, produce the fields `answer`.
---
Follow the following format.
Question: ${question}
Reasoning: Let's think step by step in order to ${produce the answer}. We ...
Answer: ${answer}
---
Question: what is the color of sky?
Answer: the color of sky is blue, even at night
---
Question: 中国的首都在哪?
Reasoning: Let's think step by step in order to

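As you can see, the example has been added to the prompt. Convenient, isn't it? DSPy also supports prompt optimization: after training, it can pick out the best examples to use.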

Defining the output format with DSPy

We can also define the Signature with pydantic's BaseModel to put constraints on the output:

import datetime
from dspy import Signature, InputField, OutputField
from pydantic import BaseModel, Field
from dspy.functional import TypedPredictor

class TravelInformation(BaseModel):
    origin: str = Field(pattern=r"^[A-Z]{3}$")
    destination: str = Field(pattern=r"^[A-Z]{3}$")
    date: datetime.date
    confidence: float = Field(gt=0, lt=1)

class TravelSignature(Signature):
    """ Extract all travel information in the given email """
    email: str = InputField()
    flight_information: list[TravelInformation] = OutputField()

predictor = TypedPredictor(TravelSignature)
predictor(email='...')

The prompt that DSPy assembles is:

Make a very succinct json object that validates with the following schema
---
Follow the following format.
Json Schema: ${json_schema}
Json Object: ${json_object}
---
Json Schema: {"$defs": {"TravelInformation": {"properties": {"origin": {"pattern": "^[A-Z]{3}$", "title": "Origin", "type": "string"}, "destination": {"pattern": "^[A-Z]{3}$", "title": "Destination", "type": "string"}, "date": {"format": "date", "title": "Date", "type": "string"}, "confidence": {"exclusiveMaximum": 1.0, "exclusiveMinimum": 0.0, "title": "Confidence", "type": "number"}}, "required": ["origin", "destination", "date", "confidence"], "title": "TravelInformation", "type": "object"}}, "properties": {"value": {"items": {"$ref": "#/$defs/TravelInformation"}, "title": "Value", "type": "array"}}, "required": ["value"], "title": "Output", "type": "object"}

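As you can see, DSPy automatically generates the JSON Schema that constrains the JSON output. We can take this schema and use it elsewhere: combined with the JSON extraction tool for streaming LLM output introduced in the previous article, we can display the field values as they stream in.

The TypedPredictor above also handles invalid results: when the output fails validation, it appends the error message to the prompt and calls the model again in a loop. A prompt that contains error information looks like this: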

Extract all travel information in the given email
---
Follow the following format.
Email: ${email}
Past Error in Flight Information: An error to avoid in the future
Past Error (2) in Flight Information: An error to avoid in the future
Flight Information: ${flight_information}. Respond with a single JSON object. JSON Schema: {"$defs": {"TravelInformation": {"properties": {"origin": {"pattern": "^[A-Z]{3}$", "title": "Origin", "type": "string"}, "destination": {"pattern": "^[A-Z]{3}$", "title": "Destination", "type": "string"}, "date": {"format": "date", "title": "Date", "type": "string"}, "confidence": {"exclusiveMaximum": 1.0, "exclusiveMinimum": 0.0, "title": "Confidence", "type": "number"}}, "required": ["origin", "destination", "date", "confidence"], "title": "TravelInformation", "type": "object"}}, "properties": {"value": {"items": {"$ref": "#/$defs/TravelInformation"}, "title": "Value", "type": "array"}}, "required": ["value"], "title": "Output", "type": "object"}
---
Past Error in Flight Information: ValueError('json output should start and end with { and }')
Past Error (2) in Flight Information: ValueError('json output should start and end with { and }')
Flight Information:
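
The retry behaviour visible in this prompt can be sketched without any library. The code below is only an illustration under stated assumptions: `validate`, `predict_with_retries`, and `fake_lm` are hypothetical names, the validation checks just one constraint (the `confidence` range), and the stand-in model is hard-coded to succeed once it sees feedback. It is not TypedPredictor's actual implementation.

```python
import json


def validate(text):
    """Parse raw model output and check one constraint; raise ValueError on failure."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("json output should start and end with { and }")
    data = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    confidence = data.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0 < confidence < 1:
        raise ValueError("confidence must be a number strictly between 0 and 1")
    return data


def predict_with_retries(lm, prompt, max_retries=3):
    errors = []
    for _ in range(max_retries):
        # Append every earlier error so the next attempt can avoid it.
        feedback = "".join(f"\nPast Error: {error}" for error in errors)
        raw = lm(prompt + feedback)
        try:
            return validate(raw), errors
        except ValueError as exc:
            errors.append(str(exc))
    raise ValueError(f"no valid output after {max_retries} attempts: {errors}")


# Hypothetical stand-in for an LM: it only emits valid JSON once it sees feedback.
def fake_lm(prompt):
    if "Past Error" in prompt:
        return '{"origin": "PEK", "destination": "SFO", "confidence": 0.9}'
    return "Sure! Here is the flight: PEK to SFO"


result, errors = predict_with_retries(fake_lm, "Email: ...")
print(result["origin"], len(errors))
```

Capping the number of attempts matters in practice: a small model may never produce valid JSON, and every retry costs another model call.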

RAG

Building a RAG system with DSPy is also easy. Below is a RAG pipeline that includes query rewriting:

class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()

        # declare three modules: the retriever, a query generator, and an answer generator
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_query = dspy.ChainOfThought("question -> search_query")
        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # generate a search query from the question, and use it to retrieve passages
        search_query = self.generate_query(question=question).search_query
        passages = self.retrieve(search_query).passages

        # generate an answer from the passages and the question
        return self.generate_answer(context=passages, question=question)

The prompts that DSPy builds are:

# prompt1
Given the fields `question`, produce the fields `search_query`.
---
Follow the following format.
Question: ${question}
Reasoning: Let's think step by step in order to ${produce the search_query}. We ...
Search Query: ${search_query}
---
Question: 你是谁
Reasoning: Let's think step by step in order to

# prompt2
Given the fields `context`, `question`, produce the fields `answer`.
---
Follow the following format.
Context: ${context}
Question: ${question}
Reasoning: Let's think step by step in order to ${produce the answer}. We ...
Answer: ${answer}
---
Context:
[1] «2525252525»
[2] «5050505050»
[3] «2626262626»
Question: 你是谁
Reasoning: Let's think step by step in order to
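
The data flow of this pipeline (rewrite the query, retrieve passages, then answer with the numbered passages as context) can be traced with a small library-free sketch. All names here (`ToyRetriever`, `ToyRAG`, `format_context`, `rewrite`, `answer`) are hypothetical stand-ins: the retriever ranks by simple word overlap rather than ColBERT, and the "LM" steps are plain functions, so this shows only the wiring, not DSPy's behaviour.

```python
def format_context(passages):
    """Number the passages in the same style as the prompt above: [1] «...»"""
    return "\n".join(f"[{i}] «{p}»" for i, p in enumerate(passages, start=1))


class ToyRetriever:
    """Hypothetical retriever: ranks passages by the number of words shared with the query."""

    def __init__(self, corpus, k=3):
        self.corpus = corpus
        self.k = k

    def __call__(self, query):
        query_words = set(query.lower().split())

        def overlap(passage):
            return len(query_words & set(passage.lower().split()))

        # sorted() is stable, so ties keep their original corpus order.
        return sorted(self.corpus, key=overlap, reverse=True)[: self.k]


class ToyRAG:
    def __init__(self, rewrite, retrieve, answer):
        self.rewrite = rewrite
        self.retrieve = retrieve
        self.answer = answer

    def forward(self, question):
        search_query = self.rewrite(question)        # step 1: query rewriting
        passages = self.retrieve(search_query)       # step 2: retrieval
        return self.answer(context=format_context(passages), question=question)


# Stand-ins for the two LM-backed steps (assumptions for illustration only).
def rewrite(question):
    return question.lower().rstrip("?")


def answer(context, question):
    return f"Context:\n{context}\nQuestion: {question}"


corpus = ["apple pie recipe", "train schedule", "apple cider"]
rag = ToyRAG(rewrite, ToyRetriever(corpus, k=2), answer)
output = rag.forward("What about apple?")
print(output)
```

Because each step is just a callable, any one of them can be swapped out (a different retriever, a different rewriting strategy) without touching the others, which is the same composability the RAG module above relies on.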

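Of course, we can also define the Signatures with BaseModel, which can yield more accurate results.

Managing pipelines with DSPy is convenient and concise: there is no need to worry about stitching prompts together or managing the flow. It frees us from the plumbing so that we can focus on optimizing the individual modules. A later article will cover how to use DSPy to optimize prompts and models.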

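If you have questions or suggestions about this article, feel free to send me a message or leave a comment. You can also add me to join the LLM discussion group, where we talk about applying LLMs to content creation, RAG, and agents.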
