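Choosing a deployment tool for large models does not have to be hard. Using the DeepSeek-R1 32B model as an example, this article gives a detailed selection guide for Ollama and llama.cpp.
Key points:
1. The background of Ollama and llama.cpp as large-model deployment tools, and how they differ
2. The technical relationship between Ollama and llama.cpp and their underlying implementation
3. Performance evaluation and deployment practice for Ollama and llama.cpp with the DeepSeek-R1 32B model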
FROM ./bartowski/DeepSeek-R1-Distill-Qwen-32B-Q5_K_M.gguf
ollama create my-deepseek-r1-32b-gguf -f .\deepseek-r1-32b.gguf
ollama run my-deepseek-r1-32b-gguf:latest
NAME                              ID              SIZE     PROCESSOR          UNTIL
my-deepseek-r1-32b-gguf:latest    ad9f11c41b7a    25 GB    87%/13% CPU/GPU    3 minutes from now
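To make the PROCESSOR column above more concrete, here is a minimal Go sketch that turns the reported split into approximate sizes. It assumes the percentages describe how the loaded model's SIZE is divided between system memory and GPU memory. The helper name `splitByProcessor` is hypothetical and not part of Ollama, and it only handles the mixed "CPU/GPU" form shown in this output.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitByProcessor is a hypothetical helper (not an Ollama API). It parses a
// PROCESSOR value such as "87%/13% CPU/GPU" and returns the CPU and GPU
// shares of sizeGB, assuming the percentages apply to the model's SIZE.
func splitByProcessor(processor string, sizeGB float64) (float64, float64, error) {
	fields := strings.Fields(processor) // e.g. ["87%/13%", "CPU/GPU"]
	if len(fields) != 2 || fields[1] != "CPU/GPU" {
		return 0, 0, fmt.Errorf("unsupported PROCESSOR value: %q", processor)
	}
	parts := strings.Split(fields[0], "/")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("unsupported PROCESSOR value: %q", processor)
	}
	cpuPct, err := strconv.ParseFloat(strings.TrimSuffix(parts[0], "%"), 64)
	if err != nil {
		return 0, 0, err
	}
	gpuPct, err := strconv.ParseFloat(strings.TrimSuffix(parts[1], "%"), 64)
	if err != nil {
		return 0, 0, err
	}
	return sizeGB * cpuPct / 100, sizeGB * gpuPct / 100, nil
}

func main() {
	// Values copied from the output above: 25 GB at 87%/13% CPU/GPU.
	cpu, gpu, err := splitByProcessor("87%/13% CPU/GPU", 25)
	if err != nil {
		panic(err)
	}
	fmt.Printf("CPU: %.2f GB, GPU: %.2f GB\n", cpu, gpu) // CPU: 21.75 GB, GPU: 3.25 GB
}
```

Under that assumption, only about 3.25 GB of the 25 GB model sits in GPU memory, and the remaining 21.75 GB or so is served from system RAM.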
https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#git-bash-mingw64
build/bin/Release/llama-cli -m "/path/to/DeepSeek-R1-Distill-Qwen-32B-Q5_K_M.gguf" -ngl 100 -c 16384 -t 10 -n -2 -cnv
ggml_vulkan: Device memory allocation of size 1025355776 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
llama_model_load: error loading model: unable to allocate Vulkan0 buffer
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model 'D:/llm/Model/bartowski/DeepSeek-R1-Distill-Qwen-32B-Q5_K_M.gguf'
main: error: unable to load model
// Given a model and one or more GPU targets, predict how many layers and bytes we can load, and the total size
// The GPUs provided must all be the same Library
func EstimateGPULayers(gpus []discover.GpuInfo, f *ggml.GGML, projectors []string, opts api.Options) MemoryEstimate {
	// Graph size for a partial offload, applies to all GPUs
	var graphPartialOffload uint64

	// Graph size when all layers are offloaded, applies to all GPUs
	var graphFullOffload uint64

	// Final graph offload once we know full or partial
	var graphOffload uint64
	...
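The idea behind an estimate like this can be illustrated with a much simpler calculation. The sketch below is not Ollama's implementation. It assumes a single GPU, a fixed compute-graph reservation, and an identical size for every layer. The function name `estimateOffloadLayers` and all numbers are hypothetical, chosen only for illustration.

```go
package main

import "fmt"

// estimateOffloadLayers is a hypothetical, simplified estimator: it reserves
// graphBytes of freeVRAM for the compute graph, then counts how many layers
// of layerBytes each fit in the remainder, capped at totalLayers.
func estimateOffloadLayers(freeVRAM, graphBytes, layerBytes uint64, totalLayers int) int {
	if totalLayers <= 0 || layerBytes == 0 || freeVRAM <= graphBytes {
		return 0 // nothing fits, or the inputs are not usable
	}
	fit := (freeVRAM - graphBytes) / layerBytes
	if fit > uint64(totalLayers) {
		return totalLayers // the whole model fits: full offload
	}
	return int(fit) // partial offload
}

func main() {
	const GiB = 1 << 30
	const MiB = 1 << 20
	// Illustrative inputs only, not measurements from this article:
	// 8 GiB free VRAM, 1 GiB graph reservation, 350 MiB per layer, 64 layers.
	layers := estimateOffloadLayers(8*GiB, 1*GiB, 350*MiB, 64)
	fmt.Println(layers) // 20
}
```

With llama.cpp this kind of calculation has to be done by hand: a value like the one computed here would be passed to `-ngl` in place of `-ngl 100`, which requests more layers than the device can hold and produces the out-of-memory error shown earlier.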