DeepSeek-R1 Local Deployment
NVIDIA GPU Management with nvidia-smi
lspci | grep -i nvidia # confirm the NVIDIA device is visible on the PCI bus
lspci -v -s 00:06.0 # verbose details for the device at bus address 00:06.0
Fan: fan speed, shown as a percentage (0–100%) of the speed the system requests; if the machine is not fan-cooled or the fan has failed, this reads N/A;
Temp: GPU core temperature, in degrees Celsius;
Perf: performance state, from P0 (maximum performance) down to P12 (minimum performance);
Pwr: power draw;
Bus-Id: the GPU's PCI bus ID;
Disp.A: short for Display Active, i.e. whether a display output is initialized on the GPU;
Memory Usage: GPU memory usage;
Volatile GPU-Util: instantaneous GPU utilization;
Compute M: compute mode. The same fields can also be queried in machine-readable form, as shown below.
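The property names below are standard --query-gpu fields; run nvidia-smi --help-query-gpu to confirm the exact set supported by your driver version:
nvidia-smi --query-gpu=fan.speed,temperature.gpu,pstate,power.draw,memory.used,memory.total,utilization.gpu --format=csv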
watch -n 5 nvidia-smi # refresh the status view every 5 seconds
nvidia-smi -L # list all GPUs with their UUIDs
nvidia-smi --query-gpu=index,name,uuid,serial --format=csv # selected properties as CSV
nvidia-smi -i 0 -q # full property dump for GPU 0
nvidia-smi dmon # rolling per-device stats (power, temperature, utilization, clocks)
nvidia-smi pmon # rolling per-process GPU usage
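Both dmon and pmon accept sampling options; a typical invocation looks like the sketch below (see nvidia-smi dmon -h for the full flag list):
nvidia-smi dmon -s u -d 2 # utilization metrics only, sampled every 2 seconds
nvidia-smi pmon -c 5 # take 5 samples of per-process GPU usage, then exit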
nvidia-smi -pm 1 # enable persistence mode (keeps the driver loaded between jobs)
nvidia-smi -e 1 # enable ECC memory (takes effect after a reboot or GPU reset)
nvidia-smi -r -i 0 # reset GPU 0
DeepSeek-R1 Local Deployment
curl -fsSL https://ollama.com/install.sh | sh # install Ollama with the official script
sudo usermod -aG ollama $USER # add the current user to the ollama group
newgrp ollama # refresh group membership in the current shell
systemctl start ollama # start the service
systemctl enable ollama # start automatically on boot
ollama --version # a version number here means the install succeeded
vim /etc/systemd/system/ollama.service # expose the API beyond localhost
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
systemctl daemon-reload # reload the edited unit file
systemctl restart ollama # apply the changes
CUDA_VISIBLE_DEVICES=0,1 # restrict Ollama to GPUs 0 and 1, set as another Environment line in the unit file
systemctl start ollama
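Putting the pieces together, the [Service] section might end up looking like this sketch (CUDA_VISIBLE_DEVICES=0,1 is just an example for a two-GPU host; adjust for your hardware, and re-run daemon-reload and restart afterwards):
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
Environment="CUDA_VISIBLE_DEVICES=0,1"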
ollama -v
http://172.16.100.36:11434 # the API should now answer on the host's LAN address
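To confirm the server is reachable over the network, query the HTTP API from another machine; /api/version and /api/tags are standard Ollama endpoints:
curl http://172.16.100.36:11434/api/version # returns the server version as JSON
curl http://172.16.100.36:11434/api/tags # lists the models installed on the server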
ollama pull deepseek-r1:7b # officially recommended model name
ollama run deepseek-r1:7b # interactive chat with the 7B model
ollama run deepseek-r1:14b # the 14B variant, which needs correspondingly more VRAM
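Besides the interactive CLI, the model can also be called over Ollama's standard generate endpoint; the prompt below is just a placeholder:
curl http://localhost:11434/api/generate -d '{"model": "deepseek-r1:7b", "prompt": "Why is the sky blue?", "stream": false}'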
ollama --help
Available Commands:
serve Start ollama
create Create a model from a Modelfile
show Show information for a model
run Run a model
stop Stop a running model
pull Pull a model from a registry
push Push a model to a registry
list List models
ps List running models
cp Copy a model
rm Remove a model
help Help about any command
OLLAMA_NUM_THREADS=4 ollama run deepseek-r1 # cap inference at 4 CPU threads (set the variable on the same line so it reaches the process)
ollama pull deepseek-r1:7b-q4_0 # 4-bit quantized build
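Per-model resource limits can also be baked into a custom model via a Modelfile; num_thread and num_ctx are standard Modelfile parameters, while the model name deepseek-r1-4t below is a hypothetical example:
FROM deepseek-r1:7b
PARAMETER num_thread 4 # cap CPU threads for this model
PARAMETER num_ctx 2048 # shrink the context window to save memory
Build and run it with:
ollama create deepseek-r1-4t -f Modelfile
ollama run deepseek-r1-4t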
gcc --version # the CUDA installer needs a host compiler
sudo apt install gcc # Debian/Ubuntu
sudo yum groupinstall "Development Tools" # RHEL/CentOS (build-essential is a Debian package and is not available via yum)
wget https://developer.download.nvidia.com/compute/cuda/12.6.3/local_installers/cuda_12.6.3_560.35.05_linux.run
sudo sh cuda_12.6.3_560.35.05_linux.run # deselect the bundled driver if one is already installed
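After the installer finishes, the CUDA binaries and libraries must be on the search paths; the standard post-install step (assuming the default install prefix /usr/local/cuda-12.6) is:
export PATH=/usr/local/cuda-12.6/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH
nvcc --version # confirms the toolkit is on PATH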
nvidia-smi # verify the driver is loaded after installation
nvidia-smi # check GPU utilization while a model is running
sudo ./Chatbox-1.9.5-x86_64.AppImage # launch the Chatbox desktop client
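If the AppImage fails to launch, make sure it has the executable bit set first; running it as a regular user (without sudo) is normally sufficient:
chmod +x Chatbox-1.9.5-x86_64.AppImage
./Chatbox-1.9.5-x86_64.AppImage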