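Master next-generation RAG techniques and build efficient AI applications. Key topics: 1. RAG application architecture and core components 2. Environment setup and starting the Ollama service 3. Model download and Elasticsearch deployment guide
RAG Application Architecture Overview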
Cloud Native
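Ollama: a local runtime engine for large models, the "Docker" of the large-model era, which makes it quick to try out and deploy large models.
Spring AI Alibaba: an enhancement of Spring AI that integrates the DashScope model platform for quickly building large-model applications.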
Environment Setup
services:
  ollama:
    container_name: ollama
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3005:8080"
    environment:
      - 'OLLAMA_BASE_URL=http://host.docker.internal:11434'
    # Allow the container to reach the host network
    extra_hosts:
      - host.docker.internal:host-gateway
docker exec -it ollama ollama pull deepseek-r1:8b
docker exec -it ollama ollama pull nomic-embed-text:latest
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.16.1
    container_name: elasticsearch
    privileged: true
    environment:
      - "cluster.name=elasticsearch"
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms512m -Xmx1096m"
      - bootstrap.memory_lock=true
    volumes:
      - ./config/es.yaml:/usr/share/elasticsearch/config/elasticsearch.yml
    ports:
      - "9200:9200"
      - "9300:9300"
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 1000M
        reservations:
          memory: 200M
cluster.name: docker-es
node.name: es-node-1
network.host: 0.0.0.0
network.publish_host: 0.0.0.0
http.port: 9200
http.cors.enabled: true
http.cors.allow-origin: "*"
bootstrap.memory_lock: true
# Disable authentication and authorization (enabled by default in ES 8.x)
xpack.security.enabled: false
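Once both containers are running, it can help to confirm that the two services answer on the ports published above before wiring them into Spring. The following is a minimal sketch using only the JDK HTTP client; the class name `ServiceProbe` and its helper methods are hypothetical, and it assumes Ollama's model-listing endpoint (`/api/tags`) and Elasticsearch's cluster health endpoint (`/_cluster/health`) are reachable on `127.0.0.1`.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper for illustration; not part of the article's project.
final class ServiceProbe {

    private ServiceProbe() {}

    // Ollama lists locally pulled models at /api/tags.
    static HttpRequest ollamaTagsRequest(String baseUrl) {
        return get(baseUrl + "/api/tags");
    }

    // Elasticsearch reports cluster status at /_cluster/health.
    static HttpRequest esHealthRequest(String baseUrl) {
        return get(baseUrl + "/_cluster/health");
    }

    private static HttpRequest get(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(3))
                .build();
        HttpRequest[] requests = {
            ollamaTagsRequest("http://127.0.0.1:11434"),
            esHealthRequest("http://127.0.0.1:9200")
        };
        for (HttpRequest request : requests) {
            try {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println(request.uri() + " -> HTTP " + response.statusCode());
            } catch (IOException e) {
                // A connection failure usually means the container is not up yet.
                System.out.println(request.uri() + " -> unreachable: " + e.getMessage());
            }
        }
    }
}
```

An HTTP 200 from both endpoints suggests the containers are up; the Ollama response body should also list the two models pulled earlier.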
Project Configuration
<!-- Spring Boot Web Starter -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <version>3.3.4</version>
</dependency>

<!-- Spring AI Ollama Starter -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>1.0.0-M5</version>
</dependency>

<!-- Vector store -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-elasticsearch-store</artifactId>
    <version>1.0.0-M5</version>
</dependency>

<!-- PDF parsing -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pdf-document-reader</artifactId>
    <version>1.0.0-M5</version>
</dependency>
spring:
  ai:
    # Ollama configuration
    ollama:
      base-url: http://127.0.0.1:11434
      chat:
        model: deepseek-r1:8b
      embedding:
        model: nomic-embed-text:latest
    # Vector store configuration
    vectorstore:
      elasticsearch:
        index-name: ollama-rag-embedding-index
        similarity: cosine
        dimensions: 768
  elasticsearch:
    uris: http://127.0.0.1:9200
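The `similarity: cosine` and `dimensions: 768` settings tell the vector store how embeddings are compared and how long each vector is. As a rough illustration of what cosine similarity computes, here is a minimal plain-Java sketch; the class `CosineSimilarity` is a hypothetical helper for explanation only, and Elasticsearch performs this comparison internally rather than through code like this.

```java
import java.util.Arrays;

// Illustrative helper; not part of Spring AI or Elasticsearch.
final class CosineSimilarity {

    private CosineSimilarity() {}

    /** Returns dot(a, b) / (|a| * |b|), or 0 when either vector has zero length. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Two 768-dimensional vectors pointing in the same direction.
        double[] query = new double[768];
        double[] doc = new double[768];
        Arrays.fill(query, 0.5);
        Arrays.fill(doc, 1.0);
        System.out.println(cosine(query, doc)); // close to 1.0 for parallel vectors
    }
}
```

The dimension check mirrors why `dimensions` must match the embedding model's output length: vectors of different lengths cannot be compared.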
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-elasticsearch-store</artifactId>
    <version>1.0.0-M5</version>
</dependency>
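You are a macOS expert. Answer based on the following context:
---------------------
{question_answer_context}
---------------------
Combine the given context with the provided history and answer in Chinese, using Markdown format. If the answer is not in the context, say so explicitly.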
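To make the role of the `{question_answer_context}` placeholder concrete, the sketch below shows retrieved chunks being joined and substituted into a template. It is a simplified stand-in using plain string replacement; the class `PromptTemplateSketch` and its `render` method are hypothetical, and the real substitution is handled by the advisor and template engine inside Spring AI.

```java
import java.util.List;

// Illustrative stand-in for template rendering; not the Spring AI implementation.
final class PromptTemplateSketch {

    static final String PLACEHOLDER = "{question_answer_context}";

    private PromptTemplateSketch() {}

    /** Joins the retrieved chunks with newlines and substitutes them for the placeholder. */
    static String render(String template, List<String> retrievedChunks) {
        String context = String.join("\n", retrievedChunks);
        return template.replace(PLACEHOLDER, context);
    }

    public static void main(String[] args) {
        String template = "Answer based on the following context:\n"
                + "---------------------\n"
                + PLACEHOLDER + "\n"
                + "---------------------\n"
                + "If the answer is not in the context, say so explicitly.";
        // Hypothetical chunks standing in for vector store results.
        List<String> chunks = List.of(
                "Chunk one retrieved from the vector store.",
                "Chunk two retrieved from the vector store.");
        System.out.println(render(template, chunks));
    }
}
```

Whatever the vector store returns ends up between the two separator lines, which is why the template asks the model to say so when the answer is not in that context.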
Core Implementation
public class KnowledgeInitializer implements ApplicationRunner {

    // Injected VectorStore instance, responsible for adding and querying vectorized data
    private final VectorStore vectorStore;

    // Vector database client; Elasticsearch is used here
    private final ElasticsearchClient elasticsearchClient;

    // .....

    @Override
    public void run(ApplicationArguments args) {
        // 1. load pdf resources.
        List<Resource> pdfResources = loadPdfResources();
        // 2. parse pdf resources to Documents.
        List<Document> documents = parsePdfResource(pdfResources);
        // 3. import to ES.
        importToES(documents);
    }

    private List<Document> parsePdfResource(List<Resource> pdfResources) {
        // Split the text with the chosen strategy and convert it into Document objects
        List<Document> resList = new ArrayList<>();
        for (Resource springAiResource : pdfResources) {
            // 1. parse document
            DocumentReader reader = new PagePdfDocumentReader(springAiResource);
            List<Document> documents = reader.get();
            logger.info("{} documents loaded", documents.size());
            // 2. split into chunks
            List<Document> splitDocuments = new TokenTextSplitter().apply(documents);
            logger.info("{} documents split", splitDocuments.size());
            // 3. add to the result list
            resList.addAll(splitDocuments);
        }
        return resList;
    }

    // ......
}
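The splitting step exists because a whole PDF page is usually too large and too mixed in topic to embed as one vector. The sketch below illustrates the idea with a deliberately simplified, word-count-based splitter; `SimpleChunker` is a hypothetical class, and it is not how `TokenTextSplitter` works internally (that splitter counts model tokens rather than whitespace-separated words).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified word-based chunker for illustration; TokenTextSplitter counts tokens instead.
final class SimpleChunker {

    private SimpleChunker() {}

    /** Splits text into consecutive chunks of at most maxWords whitespace-separated words. */
    static List<String> split(String text, int maxWords) {
        if (maxWords <= 0) {
            throw new IllegalArgumentException("maxWords must be positive");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start += maxWords) {
            int end = Math.min(start + maxWords, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        String page = "Open System Settings then choose Trackpad to adjust tracking speed";
        for (String chunk : split(page, 4)) {
            System.out.println(chunk);
        }
    }
}
```

Each chunk would then be embedded and stored as its own document, so a query can match the specific passage rather than the whole page.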
public class AIRagService {

    // System prompt template
    @Value("classpath:/prompts/system-qa.st")
    private Resource systemResource;

    // Injected beans
    private final ChatModel ragChatModel;
    private final VectorStore vectorStore;

    // Text field used for filtering, to improve vector retrieval precision
    private static final String textField = "content";

    // ......

    public Flux<String> retrieve(String prompt) {
        // Load the prompt template
        String promptTemplate = getPromptTemplate(systemResource);

        // Enable hybrid search, covering both embedding and full-text search
        SearchRequest searchRequest = SearchRequest.builder()
                .topK(4)
                .similarityThresholdAll()
                .build();

        // Build the ChatClient and call the model service
        return ChatClient.builder(ragChatModel)
                .build()
                .prompt()
                .advisors(new QuestionAnswerAdvisor(vectorStore, searchRequest, promptTemplate))
                .user(prompt)
                .stream()
                .content();
    }
}
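The `topK(4)` and similarity threshold settings decide which stored chunks reach the prompt. A minimal sketch of that selection logic follows, assuming each candidate already carries a similarity score and that "accept all" corresponds to a threshold of 0.0; the class `TopKSketch` and record `Scored` are hypothetical, and in the real application this filtering happens inside the vector store query.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative selection logic; the vector store performs this server-side.
final class TopKSketch {

    /** A retrieved chunk paired with its similarity score (hypothetical type). */
    record Scored(String content, double score) {}

    private TopKSketch() {}

    /** Keeps candidates scoring at least threshold, best first, limited to k results. */
    static List<Scored> topK(List<Scored> candidates, int k, double threshold) {
        return candidates.stream()
                .filter(c -> c.score() >= threshold)
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(k)
                .toList();
    }

    public static void main(String[] args) {
        // Hypothetical scores for five stored chunks.
        List<Scored> candidates = List.of(
                new Scored("chunk A", 0.42),
                new Scored("chunk B", 0.91),
                new Scored("chunk C", 0.10),
                new Scored("chunk D", 0.77),
                new Scored("chunk E", 0.65));
        // Threshold 0.0 accepts everything, so only the k limit applies.
        topK(candidates, 4, 0.0).forEach(s -> System.out.println(s.content() + " " + s.score()));
    }
}
```

With an accept-all threshold, weakly related chunks can still fill the four slots, which is one motivation for the reranking step discussed later.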
public class AIRagController {

    public AIRagService aiRagService;

    public Flux<String> chat(String prompt, HttpServletResponse response) {
        // Avoid garbled characters in the response when streaming.
        response.setCharacterEncoding("UTF-8");

        if (!StringUtils.hasText(prompt)) {
            return Flux.just("prompt is null.");
        }

        return aiRagService.retrieve(prompt);
    }
}
Request Demo
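Take the question "I'm new to the Mac and want to configure the trackpad so it is easier to use. Do you have any suggestions?" as an example. When the model is called directly, the answer reads like generic official guidance and is of little practical use.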
RAG Optimization
spring:
  application:
    name: ollama-rag
  ai:
    dashscope:
      api-key: ${AI_DASHSCOPE_API_KEY}
      chat:
        options:
          model: deepseek-r1
      embedding:
        enabled: false
    ollama:
      base-url: http://127.0.0.1:11434
      chat:
        model: deepseek-r1:8b
        enabled: false
      embedding:
        model: nomic-embed-text:latest
    vectorstore:
      elasticsearch:
        index-name: ollama-rag-embedding-index
        similarity: cosine
        dimensions: 768
  elasticsearch:
    uris: http://127.0.0.1:9200
<!-- Spring AI Alibaba DashScope -->
<dependency>
    <groupId>com.alibaba.cloud.ai</groupId>
    <artifactId>spring-ai-alibaba-starter</artifactId>
    <version>1.0.0-M6.1</version>
</dependency>
public Flux<String> retrieve(String prompt) {

    // Get the vector store prompt tmpl.
    String promptTemplate = getPromptTemplate(systemResource);

    // Enable hybrid search, both embedding and full text search
    SearchRequest searchRequest = SearchRequest.builder()
            .topK(4)
            .similarityThresholdAll()
            .build();

    // Build ChatClient with retrieval rerank advisor:
    ChatClient runtimeChatClient = ChatClient.builder(chatModel)
            .defaultAdvisors(new RetrievalRerankAdvisor(vectorStore, rerankModel, searchRequest, promptTemplate, 0.1))
            .build();

    // Spring AI RetrievalAugmentationAdvisor
    Advisor retrievalAugmentationAdvisor = RetrievalAugmentationAdvisor.builder()
            .queryTransformers(RewriteQueryTransformer.builder()
                    .chatClientBuilder(ChatClient.builder(ragChatModel).build().mutate())
                    .build())
            .documentRetriever(VectorStoreDocumentRetriever.builder()
                    .similarityThreshold(0.50)
                    .vectorStore(vectorStore)
                    .build())
            .build();

    // Retrieve and llm generate
    return ragClient.prompt()
            .advisors(retrievalAugmentationAdvisor)
            .user(prompt)
            .stream()
            .content();
}
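Reranking rescores the retrieved chunks against the question and drops those that fall below a minimum score. The sketch below illustrates that idea only: it assumes the trailing `0.1` passed to `RetrievalRerankAdvisor` acts as such a minimum score, and it swaps in a toy term-overlap scorer where the real application would call a rerank model. The class `RerankSketch`, the record `Ranked`, and `overlapScore` are all hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative rerank-and-filter step; a real rerank model replaces overlapScore.
final class RerankSketch {

    /** A document paired with its rerank score (hypothetical type). */
    record Ranked(String content, double score) {}

    private RerankSketch() {}

    /** Toy relevance score: fraction of distinct query terms that appear in the document. */
    static double overlapScore(String query, String doc) {
        if (query.isBlank()) {
            return 0.0;
        }
        Set<String> docTerms = new HashSet<>(
                Arrays.asList(doc.toLowerCase(Locale.ROOT).trim().split("\\s+")));
        Set<String> queryTerms = new HashSet<>(
                Arrays.asList(query.toLowerCase(Locale.ROOT).trim().split("\\s+")));
        long hits = queryTerms.stream().filter(docTerms::contains).count();
        return (double) hits / queryTerms.size();
    }

    /** Scores every document, drops those below minScore, and orders the rest best first. */
    static List<Ranked> rerank(String query, List<String> docs, double minScore) {
        return docs.stream()
                .map(d -> new Ranked(d, overlapScore(query, d)))
                .filter(r -> r.score() >= minScore)
                .sorted(Comparator.comparingDouble(Ranked::score).reversed())
                .toList();
    }

    public static void main(String[] args) {
        List<String> retrieved = List.of(
                "keyboard shortcuts overview",
                "trackpad speed",
                "enable trackpad gestures in settings");
        rerank("trackpad gestures", retrieved, 0.1)
                .forEach(r -> System.out.println(r.score() + "  " + r.content()));
    }
}
```

The document with no overlapping terms is filtered out, and the remaining two are reordered so the closest match comes first.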
https://java2ai.com/docs/1.0.0-M5.1/tutorials/rag/
Troubleshooting
<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>spring-snapshots</id>
        <name>Spring Snapshots</name>
        <url>https://repo.spring.io/snapshot</url>
        <releases>
            <enabled>false</enabled>
        </releases>
    </repository>
</repositories>
Summary