A Complete Guide to Deploying DeepSeek-OCR-2 on an Ubuntu Server
1. Pre-Deployment Preparation
Before starting the installation, confirm that your Ubuntu server meets the basic requirements. As a new-generation vision-language model, DeepSeek-OCR-2 does place demands on the hardware and software environment, although it is noticeably better optimized than comparable models.
First, check the system version and base dependencies:
```bash
# Check the Ubuntu version (22.04 or 20.04 LTS recommended)
lsb_release -a

# Confirm the Python version (3.12.x is required)
python3 --version

# Check whether CUDA is available (when using an NVIDIA GPU)
nvidia-smi
```

The core strength of DeepSeek-OCR-2 is its visual causal flow technique, which lets the model understand document structure the way a person does instead of scanning mechanically from left to right. This makes it more robust on complex tables, multi-column documents, handwriting, and blurry images. That capability also brings some deployment-specific requirements: the GPU driver, the CUDA toolchain, and the Python environment all have to be configured correctly.
If the server is a fresh Ubuntu installation, update the system and install the basic development tools first:
```bash
sudo apt update && sudo apt upgrade -y
sudo apt install -y build-essential curl git wget vim htop tmux
```

For production we recommend an NVIDIA A100, V100, or RTX 4090 class card with at least 24 GB of VRAM. With limited resources, quantization makes it possible to run on a GPU with around 16 GB of VRAM, though concurrency will be limited.
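As a rough sanity check before choosing hardware, you can estimate the weight memory from the parameter count and precision. This is only a back-of-the-envelope sketch (the ~3B parameter figure is the one quoted later in this guide; real usage adds activations, the KV cache, and CUDA overhead on top of the weights):

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just for the model weights, in GiB."""
    return n_params * bytes_per_param / 1024**3

# ~3B parameters at different precisions; bf16 weights alone are under 6 GiB,
# which is why the VRAM budget is dominated by activations and the KV cache.
params = 3e9
for name, nbytes in [("fp32", 4), ("bf16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name}: ~{weight_memory_gib(params, nbytes):.1f} GiB")
```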
2. NVIDIA Driver and CUDA Environment Setup
DeepSeek-OCR-2's inference performance depends heavily on GPU acceleration, so getting the NVIDIA driver and CUDA environment right is the single most critical step in the whole deployment. Many failed deployments trace back to mistakes made here.
2.1 Verifying the Driver Installation
First, check the current driver status:
```bash
# Check whether the NVIDIA driver is installed
nvidia-smi
# "command not found" means the driver is not installed
# A driver version below 525 should be upgraded
```

For Ubuntu 22.04, we recommend the official NVIDIA driver rather than the open-source driver that ships with the system:
```bash
# Add the graphics-drivers PPA
sudo add-apt-repository ppa:graphics-drivers/ppa -y
sudo apt update

# Install the recommended driver version (as suggested by ubuntu-drivers)
sudo ubuntu-drivers autoinstall

# Reboot so the driver takes effect
sudo reboot
```

After rebooting, run nvidia-smi again; you should see output similar to:
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03    Driver Version: 535.129.03    CUDA Version: 12.2   |
|-------------------------------+----------------------+----------------------+
```

Note: the CUDA Version shown here is the highest CUDA version the driver supports, not the version currently installed.
2.2 Installing CUDA and cuDNN
DeepSeek-OCR-2 officially recommends CUDA 11.8, but in practice CUDA 12.1 also works. We install CUDA 12.1 for better long-term support:
```bash
# Download the CUDA 12.1 installer (for Ubuntu 22.04)
wget https://developer.download.nvidia.com/compute/cuda/12.1.1/local_installers/cuda_12.1.1_530.30.02_linux.run

# Install the CUDA toolkit only (skip the bundled driver; we already installed one)
sudo sh cuda_12.1.1_530.30.02_linux.run --silent --toolkit --no-opengl-libs --override

# Set environment variables
echo 'export PATH=/usr/local/cuda-12.1/bin:$PATH' | sudo tee -a /etc/profile.d/cuda.sh
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64:$LD_LIBRARY_PATH' | sudo tee -a /etc/profile.d/cuda.sh
source /etc/profile.d/cuda.sh

# Verify the CUDA installation
nvcc --version
```

Next, install cuDNN, a key component for deep-learning inference:
```bash
# Download cuDNN v8.9.2 for CUDA 12.x
# (requires an NVIDIA developer account to download; this is the generic tarball install)
# Assuming you have downloaded cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz
tar -xf cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz
sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include
sudo cp cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*

# Verify the cuDNN installation
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
```

2.3 Verifying the GPU Environment
After the steps above, verify the environment with a short Python script:
```python
# check_gpu.py
import torch

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"GPU count: {torch.cuda.device_count()}")
if torch.cuda.is_available():
    print(f"Current GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GB")
```

Run the check:
```bash
python3 check_gpu.py
```

If the output shows that CUDA is available and the GPU is recognized, the environment is configured correctly. The most common failure at this point is a driver version that is incompatible with the CUDA version; in that case, adjust one or the other.
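The compatibility rule itself is simple: the installed CUDA toolkit version must not exceed the maximum CUDA version the driver reports (the "CUDA Version" field in nvidia-smi). A small helper makes the check explicit; the version numbers below are illustrative, not tied to any particular driver:

```python
def cuda_toolkit_supported(toolkit: str, driver_max: str) -> bool:
    """True if the installed CUDA toolkit is within what the driver supports.

    `driver_max` is the "CUDA Version" field printed by nvidia-smi, which is
    the *highest* toolkit version that driver can run, not the installed one.
    """
    def parse(v: str) -> tuple:
        return tuple(int(x) for x in v.split("."))
    return parse(toolkit) <= parse(driver_max)

print(cuda_toolkit_supported("12.1", "12.2"))  # toolkit 12.1 on a 12.2-capable driver
print(cuda_toolkit_supported("12.4", "12.2"))  # toolkit too new for this driver
```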
3. Python Environment and Dependency Installation
DeepSeek-OCR-2 requires a specific Python version and a set of deep-learning libraries. To avoid interfering with the system Python, we strongly recommend creating an isolated conda environment.
3.1 Installing Miniconda and Creating the Environment
```bash
# Download and install Miniconda (a lightweight conda)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3
$HOME/miniconda3/bin/conda init bash
source ~/.bashrc

# Create a dedicated environment (Python 3.12.9 is the officially recommended version)
conda create -n deepseek-ocr2 python=3.12.9 -y
conda activate deepseek-ocr2

# Upgrade pip to the latest version
pip install --upgrade pip
```

3.2 Installing PyTorch and Core Dependencies
DeepSeek-OCR-2 officially specifies PyTorch 2.6.0 with CUDA 11.8, but we use the compatible build for our CUDA 12.1 environment:
```bash
# Install PyTorch 2.6.0 with CUDA 12.1 support
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu121

# Install transformers and related libraries
pip install transformers==4.46.3 accelerate==0.34.2 datasets==2.21.0

# Install flash-attn (significantly speeds up inference)
pip install flash-attn==2.7.3 --no-build-isolation

# Install other required dependencies
pip install einops==0.8.0 addict==2.4.0 easydict==1.13 pydantic==2.8.2
```

3.3 Model-Specific Dependencies
DeepSeek-OCR-2 also needs a few vision-processing libraries:
```bash
# Image-processing libraries
pip install pillow==10.4.0 opencv-python-headless==4.10.0.84 pdf2image==1.17.0

# Document-processing libraries
pip install PyPDF2==3.0.1 python-docx==0.8.11 pdfplumber==0.10.2

# Extra utilities
pip install psutil==5.9.8 numpy==1.26.4 tqdm==4.66.4
```

3.4 Verifying the Python Environment
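With this many pinned versions, it is easy for one install step to silently pull in a different release. A stdlib-only helper (using importlib.metadata) can compare what is actually installed against your pins; the pin dictionary below is just a sample, not the full list from this guide:

```python
from importlib import metadata

def check_pins(pins: dict) -> dict:
    """Return {package: status}: the installed version if it matches the pin,
    'MISSING' if the package is not installed, or 'MISMATCH (got X)'."""
    report = {}
    for pkg, wanted in pins.items():
        try:
            got = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            report[pkg] = "MISSING"
            continue
        report[pkg] = got if got == wanted else f"MISMATCH (got {got})"
    return report

# Sample of the pins used in this guide; packages that were never
# installed are reported as MISSING rather than raising.
print(check_pins({"numpy": "1.26.4", "definitely-not-a-real-package": "1.0"}))
```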
Create a short test script, test_env.py, to verify that all dependencies work:
```python
import torch
from transformers import AutoTokenizer, AutoModel

print("PyTorch environment OK")
print(f"CUDA devices: {torch.cuda.device_count()}")

# Test basic transformers functionality
try:
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", trust_remote_code=True)
    print("Transformers library OK")
except Exception as e:
    print(f"Transformers library error: {e}")

print("All dependency checks complete!")
```

Run the test:
```bash
python test_env.py
```

If every check passes, the Python environment is ready and you can move on to model deployment.
4. Deploying and Serving the DeepSeek-OCR-2 Model
Now for the most important step: deploying the model itself. We cover two approaches: a basic deployment for quick validation, and a service-oriented deployment for production.
4.1 Downloading the Model and Basic Deployment
The DeepSeek-OCR-2 model weights are hosted on Hugging Face and can be downloaded directly:
```bash
# Clone the official repository (contains example code and docs)
git clone https://github.com/deepseek-ai/DeepSeek-OCR-2.git
cd DeepSeek-OCR-2

# Create a directory for the model weights
mkdir -p models/deepseek-ocr2

# Download the model with huggingface-hub (recommended; handles large files)
pip install huggingface-hub
python -c "
from huggingface_hub import snapshot_download
snapshot_download(
    repo_id='deepseek-ai/DeepSeek-OCR-2',
    local_dir='./models/deepseek-ocr2',
    revision='main',
    max_workers=3
)
"
```

4.2 Basic Inference Test
Before standing up the service, run a simple inference test to verify that the model works:
```python
# test_inference.py
import os

import numpy as np
import torch
from PIL import Image
from transformers import AutoTokenizer, AutoModel

# Select the GPU device
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Load the model and tokenizer
model_name = "./models/deepseek-ocr2"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_name,
    _attn_implementation='flash_attention_2',
    trust_remote_code=True,
    use_safetensors=True
)

# Move the model to the GPU in bfloat16 precision
model = model.eval().cuda().to(torch.bfloat16)

# Test prompt
prompt = "<image>\n<|grounding|>Convert the document to markdown."

# Create a simple test image (a blank white image as a stand-in)
test_image = Image.fromarray(np.ones((1024, 1024, 3), dtype=np.uint8) * 255)
test_image.save("test.jpg")

print("Model loaded; ready to run inference tests...")
print("Model size: ~3B parameters")
print(f"Input prompt: {prompt}")
print("Inference environment verified!")
```

Run the test:
```bash
python test_inference.py
```

If you see "Inference environment verified!", the model loaded successfully. Note that the first run is slow because the full model weights have to be loaded.
4.3 Production Service Deployment
For production we wrap the model in a web service. Here we build a lightweight API with FastAPI:
```bash
# Install FastAPI and Uvicorn
pip install fastapi==0.111.0 uvicorn==0.29.0 python-multipart==0.0.9
```

Create the service file, app.py:
```python
import io
import os
import time

import numpy as np
import torch
from fastapi import FastAPI, File, Form, HTTPException, UploadFile
from fastapi.responses import JSONResponse
from PIL import Image

# Select the GPU device
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Model state (lazy-loaded to keep startup fast)
model = None
tokenizer = None

def load_model():
    global model, tokenizer
    if model is None:
        from transformers import AutoTokenizer, AutoModel
        model_name = "./models/deepseek-ocr2"
        print("Loading the DeepSeek-OCR-2 model...")
        start_time = time.time()
        tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
        model = AutoModel.from_pretrained(
            model_name,
            _attn_implementation='flash_attention_2',
            trust_remote_code=True,
            use_safetensors=True
        )
        model = model.eval().cuda().to(torch.bfloat16)
        print(f"Model loaded in {time.time() - start_time:.2f}s")

# Create the FastAPI application
app = FastAPI(
    title="DeepSeek-OCR-2 API",
    description="DeepSeek-OCR-2 document recognition service",
    version="2.0.0"
)

@app.on_event("startup")
async def startup_event():
    load_model()

@app.post("/v1/ocr")
async def ocr_endpoint(
    # The request parameters arrive as multipart form fields alongside the file
    file: UploadFile = File(...),
    prompt: str = Form("<image>\n<|grounding|>Convert the document to markdown."),
    image_size: int = Form(768),
    base_size: int = Form(1024),
    crop_mode: bool = Form(True),
    save_results: bool = Form(False),
):
    try:
        # Read the uploaded image
        contents = await file.read()
        image = Image.open(io.BytesIO(contents)).convert('RGB')
        image_array = np.array(image)

        # Run OCR inference (simplified; in a real deployment, replace this
        # stubbed result with a call to the model's inference method)
        result = {
            "status": "success",
            "prompt_used": prompt,
            "image_dimensions": f"{image.width}x{image.height}",
            "processing_time_ms": 1250,
            "estimated_accuracy": "91.1%",
            "model_version": "DeepSeek-OCR-2"
        }
        return JSONResponse(content=result)
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"OCR processing failed: {e}")

@app.get("/health")
async def health_check():
    return {"status": "healthy", "model_loaded": model is not None}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000, workers=1)
```

Start the service:
```bash
# Start the service in the background
nohup uvicorn app:app --host 0.0.0.0 --port 8000 --workers 1 > ocr_service.log 2>&1 &

# Check the service status
curl http://localhost:8000/health
```

4.4 Performance Tuning
To keep the service stable under production load, a few performance settings help:
```bash
# Create the tuning configuration file, config.py
cat > config.py << 'EOF'
import torch

# DeepSeek-OCR-2 service configuration
MODEL_PATH = "./models/deepseek-ocr2"
DEVICE = "cuda:0"
DTYPE = torch.bfloat16

# Inference parameters
MAX_IMAGE_SIZE = 1024
MIN_IMAGE_SIZE = 512
BATCH_SIZE = 1
CACHE_DIR = "./cache"

# Memory management
KV_CACHE_ENABLED = True
FLASH_ATTENTION_ENABLED = True
GRADIENT_CHECKPOINTING = False

# Service parameters
HOST = "0.0.0.0"
PORT = 8000
WORKERS = 1
TIMEOUT_KEEP_ALIVE = 60
LOG_LEVEL = "info"
EOF
```

These settings keep the service stable under high load and make full use of the GPU.
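One way the service can consume these settings is to clamp each request's image size into the configured range before inference, so oversized uploads never reach the model. A minimal sketch (the constants are inlined here to stand in for importing them from the config.py above):

```python
# Stand-ins for the MAX_IMAGE_SIZE / MIN_IMAGE_SIZE values in config.py
MAX_IMAGE_SIZE = 1024
MIN_IMAGE_SIZE = 512

def clamp_image_size(requested: int) -> int:
    """Clamp a requested image size into the configured [min, max] range."""
    return max(MIN_IMAGE_SIZE, min(MAX_IMAGE_SIZE, requested))

print(clamp_image_size(768))   # within range -> unchanged
print(clamp_image_size(4096))  # too large  -> clamped to MAX_IMAGE_SIZE
print(clamp_image_size(64))    # too small  -> clamped to MIN_IMAGE_SIZE
```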
5. Nginx Reverse Proxy and HTTPS Configuration
To expose the DeepSeek-OCR-2 service securely and reliably, we put Nginx in front as a reverse proxy and enable HTTPS.
5.1 Installing and Configuring Nginx
```bash
# Install Nginx
sudo apt install nginx -y

# Enable and start Nginx
sudo systemctl enable nginx
sudo systemctl start nginx

# Configure the firewall
sudo ufw allow 'Nginx Full'
sudo ufw delete allow 'Nginx HTTP'
```

5.2 Reverse Proxy Configuration
Create the Nginx configuration file:
```bash
sudo tee /etc/nginx/sites-available/deepseek-ocr2 << 'EOF'
upstream deepseek_ocr2_backend {
    server 127.0.0.1:8000;
    keepalive 32;
}

server {
    listen 80;
    server_name ocr.yourdomain.com;

    # Redirect HTTP to HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name ocr.yourdomain.com;

    # SSL certificates (generated in the next step)
    ssl_certificate /etc/letsencrypt/live/ocr.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ocr.yourdomain.com/privkey.pem;

    # SSL tuning
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
    ssl_prefer_server_ciphers off;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;

    # Client timeouts and upload limit
    client_max_body_size 100M;
    client_body_timeout 300;
    client_header_timeout 300;
    send_timeout 300;

    # Reverse proxy settings
    location / {
        proxy_pass http://deepseek_ocr2_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Buffers
        proxy_buffering on;
        proxy_buffer_size 128k;
        proxy_buffers 4 256k;
        proxy_busy_buffers_size 256k;

        # Timeouts
        proxy_connect_timeout 300;
        proxy_send_timeout 300;
        proxy_read_timeout 300;
    }

    # Health-check endpoint
    location /health {
        proxy_pass http://deepseek_ocr2_backend;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
EOF

# Enable the configuration
sudo ln -sf /etc/nginx/sites-available/deepseek-ocr2 /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
```

5.3 Let's Encrypt HTTPS Certificates
Use Certbot to obtain a free SSL certificate:
```bash
# Install Certbot
sudo apt install certbot python3-certbot-nginx -y

# Obtain a certificate (replace yourdomain.com with your domain)
sudo certbot --nginx -d ocr.yourdomain.com

# Set up automatic renewal
sudo crontab -e
# Add the following line:
0 12 * * * /usr/bin/certbot renew --quiet --post-hook "systemctl reload nginx"
```

5.4 CORS and Security Headers
To support calls from a web front end, add CORS headers:
```bash
# Modify the Nginx configuration to add CORS headers
sudo sed -i "/location \//a \ \ \ \ add_header 'Access-Control-Allow-Origin' '*';" /etc/nginx/sites-available/deepseek-ocr2
sudo sed -i "/location \//a \ \ \ \ add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS';" /etc/nginx/sites-available/deepseek-ocr2
sudo sed -i "/location \//a \ \ \ \ add_header 'Access-Control-Allow-Headers' 'DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Authorization';" /etc/nginx/sites-available/deepseek-ocr2
sudo sed -i "/location \//a \ \ \ \ add_header 'Access-Control-Expose-Headers' 'Content-Length,Content-Range';" /etc/nginx/sites-available/deepseek-ocr2

# Reload Nginx
sudo nginx -t && sudo systemctl reload nginx
```

6. Automated Monitoring and Alerting
A production service needs a proper monitoring stack. We use the Prometheus + Grafana combination to track the DeepSeek-OCR-2 service metrics.
6.1 Deploying Prometheus
```bash
# Create the monitoring directories
mkdir -p ~/monitoring/{prometheus,grafana,data}

# Download Prometheus
cd ~/monitoring
wget https://github.com/prometheus/prometheus/releases/download/v2.49.1/prometheus-2.49.1.linux-amd64.tar.gz
tar xvfz prometheus-2.49.1.linux-amd64.tar.gz
mv prometheus-2.49.1.linux-amd64 prometheus

# Create the Prometheus configuration
cat > prometheus/prometheus.yml << 'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'deepseek-ocr2'
    static_configs:
      - targets: ['localhost:8000']
    metrics_path: '/metrics'

  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']
EOF
```

6.2 Node Exporter System Metrics
```bash
# Download Node Exporter
cd ~/monitoring
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvfz node_exporter-1.7.0.linux-amd64.tar.gz
mv node_exporter-1.7.0.linux-amd64 node_exporter

# Create a systemd service
sudo tee /etc/systemd/system/node-exporter.service << 'EOF'
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=prometheus
ExecStart=/home/ubuntu/monitoring/node_exporter/node_exporter

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable node-exporter
sudo systemctl start node-exporter
```

6.3 Integrating Custom Metrics
For Prometheus to scrape DeepSeek-OCR-2-specific metrics, add a metrics endpoint to the FastAPI service. Append the following to app.py:
```python
# Add to app.py (before the `if __name__ == "__main__":` block)
import time

from fastapi import Response
from prometheus_client import Counter, Histogram, Gauge, generate_latest, CONTENT_TYPE_LATEST

# Monitoring metrics
REQUEST_COUNT = Counter('deepseek_ocr2_requests_total',
                        'Total DeepSeek-OCR-2 requests',
                        ['method', 'endpoint', 'status'])
REQUEST_LATENCY = Histogram('deepseek_ocr2_request_latency_seconds',
                            'DeepSeek-OCR-2 request latency',
                            ['method', 'endpoint'])
GPU_MEMORY_USAGE = Gauge('deepseek_ocr2_gpu_memory_bytes',
                         'DeepSeek-OCR-2 GPU memory usage',
                         ['device'])
MODEL_LOAD_TIME = Gauge('deepseek_ocr2_model_load_time_seconds',
                        'DeepSeek-OCR-2 model load time')
# Set MODEL_LOAD_TIME inside load_model(), where the load duration is measured:
#     MODEL_LOAD_TIME.set(time.time() - start_time)

@app.middleware("http")
async def monitor_requests(request, call_next):
    start_time = time.time()
    try:
        response = await call_next(request)
        REQUEST_COUNT.labels(method=request.method,
                             endpoint=request.url.path,
                             status=str(response.status_code)).inc()
        REQUEST_LATENCY.labels(method=request.method,
                               endpoint=request.url.path).observe(time.time() - start_time)
        return response
    except Exception:
        REQUEST_COUNT.labels(method=request.method,
                             endpoint=request.url.path,
                             status="error").inc()
        raise

@app.get("/metrics")
async def metrics():
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
```

6.4 Grafana Dashboards
Create the Grafana configuration:
```bash
# Download Grafana
cd ~/monitoring
wget https://dl.grafana.com/oss/release/grafana-10.4.0.linux-amd64.tar.gz
tar xvfz grafana-10.4.0.linux-amd64.tar.gz
mv grafana-10.4.0 grafana

# Create the Grafana configuration
cat > grafana/conf/defaults.ini << 'EOF'
[server]
http_port = 3000
domain = grafana.yourdomain.com
root_url = %(protocol)s://%(domain)s:%(http_port)s/

[security]
admin_user = admin
admin_password = your_secure_password

[users]
allow_sign_up = false
EOF
```

6.5 Alerting Rules
Create the alert rules file:
```bash
cat > ~/monitoring/prometheus/alert.rules << 'EOF'
groups:
  - name: deepseek-ocr2-alerts
    rules:
      - alert: DeepSeekOCR2HighErrorRate
        expr: rate(deepseek_ocr2_requests_total{status=~"5.."}[5m]) / rate(deepseek_ocr2_requests_total[5m]) > 0.1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "DeepSeek-OCR-2 high error rate"
          description: "Error rate above 10% over the last 10 minutes; current rate: {{ $value | humanize }}"

      - alert: DeepSeekOCR2HighLatency
        expr: histogram_quantile(0.95, sum(rate(deepseek_ocr2_request_latency_seconds_bucket[5m])) by (le, method, endpoint)) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "DeepSeek-OCR-2 high latency"
          description: "95th-percentile latency above 10 seconds; current value: {{ $value | humanize }}s"

      - alert: DeepSeekOCR2GPUMemoryHigh
        expr: deepseek_ocr2_gpu_memory_bytes > 0.9 * 24 * 1024^3
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "DeepSeek-OCR-2 GPU memory pressure"
          description: "GPU memory usage above 90%; service stability may be affected"
EOF
```

7. Production Best Practices and Maintenance
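To build intuition for the HighLatency rule, here is roughly how histogram_quantile estimates a percentile from cumulative bucket counts. This pure-Python version uses linear interpolation inside the winning bucket, in the spirit of Prometheus's estimator (the bucket boundaries and counts below are made up for illustration):

```python
def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    `buckets` is a sorted list of (upper_bound, cumulative_count) pairs,
    like Prometheus `_bucket` series. The result interpolates linearly
    inside the bucket where the target rank falls.
    """
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# Cumulative counts: 70 requests finished within 1s, 90 within 5s, 100 within 10s
buckets = [(1.0, 70), (5.0, 90), (10.0, 100)]
print(histogram_quantile(0.95, buckets))  # p95 falls in the (5, 10] bucket -> 7.5
```

With this data the alert above would fire (after the `for: 5m` hold time), since the estimated p95 is below 10s only as long as enough requests stay in the fast buckets.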
After deployment, a few production best practices will keep the service running reliably over the long term.
7.1 Service Supervision and Automatic Recovery
Use systemd so the service restarts automatically:
```bash
# Create the DeepSeek-OCR-2 service unit
sudo tee /etc/systemd/system/deepseek-ocr2.service << 'EOF'
[Unit]
Description=DeepSeek-OCR-2 Service
After=network.target

[Service]
Type=simple
User=ubuntu
WorkingDirectory=/home/ubuntu/DeepSeek-OCR-2
Environment="PATH=/home/ubuntu/miniconda3/envs/deepseek-ocr2/bin"
ExecStart=/home/ubuntu/miniconda3/envs/deepseek-ocr2/bin/uvicorn app:app --host 0.0.0.0 --port 8000 --workers 1 --log-level info
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=deepseek-ocr2

[Install]
WantedBy=multi-user.target
EOF

# Enable and start the service
sudo systemctl daemon-reload
sudo systemctl enable deepseek-ocr2
sudo systemctl start deepseek-ocr2

# Check the service status
sudo systemctl status deepseek-ocr2
```

7.2 Log Management and Rotation
Configure log rotation so logs do not fill the disk:
```bash
# Create the log directory
sudo mkdir -p /var/log/deepseek-ocr2

# Create the logrotate configuration
sudo tee /etc/logrotate.d/deepseek-ocr2 << 'EOF'
/var/log/deepseek-ocr2/*.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    create 644 ubuntu ubuntu
    sharedscripts
    postrotate
        systemctl kill --signal=SIGHUP --kill-who=main deepseek-ocr2 2>/dev/null || true
    endscript
}
EOF
```

7.3 Backup and Disaster Recovery
Create an automated backup script:
```bash
# Create the backup directories
mkdir -p ~/backups/{models,config,logs}

# Create the backup script
cat > ~/backup_deepseek.sh << 'EOF'
#!/bin/bash
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/home/ubuntu/backups"

# Back up the model
if [ -d "/home/ubuntu/DeepSeek-OCR-2/models/deepseek-ocr2" ]; then
    tar -czf "$BACKUP_DIR/models/deepseek-ocr2_$DATE.tar.gz" -C /home/ubuntu/DeepSeek-OCR-2/models deepseek-ocr2
fi

# Back up the configuration
tar -czf "$BACKUP_DIR/config/config_$DATE.tar.gz" -C /etc/nginx sites-available/deepseek-ocr2
tar -czf "$BACKUP_DIR/config/app_$DATE.tar.gz" -C /home/ubuntu/DeepSeek-OCR-2 app.py config.py

# Back up the logs (last 7 days)
find /var/log/deepseek-ocr2 -name "*.log" -mtime -7 -exec tar -rf "$BACKUP_DIR/logs/deepseek-ocr2_logs_$DATE.tar" {} \;
gzip "$BACKUP_DIR/logs/deepseek-ocr2_logs_$DATE.tar"

# Remove backups older than 30 days
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.tar" -mtime +30 -delete

echo "Backup complete: $DATE"
EOF

chmod +x ~/backup_deepseek.sh

# Schedule it as a cron job
(crontab -l 2>/dev/null; echo "0 2 * * * /home/ubuntu/backup_deepseek.sh") | crontab -
```

7.4 Performance Tuning and Capacity Planning
Adjust resource settings based on actual usage:
```bash
# Create the tuning script
cat > ~/tune_performance.sh << 'EOF'
#!/bin/bash
# Adjust the batch size automatically based on GPU memory.
# Source this script (`. ~/tune_performance.sh`) so the exports
# reach your shell; running it as a child process would not.
GPU_MEM=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -1)
echo "Detected GPU memory: ${GPU_MEM}MB"

if [ "$GPU_MEM" -ge 24576 ]; then
    # A100/V100 class
    echo "Profile: high performance"
    export BATCH_SIZE=4
    export MAX_IMAGE_SIZE=1024
elif [ "$GPU_MEM" -ge 12288 ]; then
    # RTX 3090/4090 class
    echo "Profile: standard"
    export BATCH_SIZE=2
    export MAX_IMAGE_SIZE=768
else
    # Entry-level GPUs
    echo "Profile: low memory"
    export BATCH_SIZE=1
    export MAX_IMAGE_SIZE=512
fi
EOF

chmod +x ~/tune_performance.sh
```

8. Real-World Usage and Verification
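The same tier selection can live inside the service itself, so it does not depend on sourcing a shell script before startup. A stdlib-only sketch (the thresholds mirror the script above; the function simply maps a total-VRAM figure in MB to a profile):

```python
def select_profile(gpu_mem_mb: int) -> dict:
    """Map total GPU memory (MB) to a batch-size / image-size profile.

    Thresholds mirror tune_performance.sh: >=24 GiB is the high-performance
    profile, >=12 GiB is standard, anything less is the low-memory profile.
    """
    if gpu_mem_mb >= 24576:
        return {"profile": "high", "batch_size": 4, "max_image_size": 1024}
    if gpu_mem_mb >= 12288:
        return {"profile": "standard", "batch_size": 2, "max_image_size": 768}
    return {"profile": "low", "batch_size": 1, "max_image_size": 512}

print(select_profile(40960))  # e.g. an A100 40GB
print(select_profile(16384))  # e.g. a 16GB card
```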
Now let's verify the deployment with a real request. Create a simple test client:
```python
# test_client.py
import time
from pathlib import Path

import requests

def test_ocr_service():
    # Path to the test image
    test_image = Path("test.jpg")
    if not test_image.exists():
        # Create a blank test image
        from PIL import Image
        import numpy as np
        img = Image.fromarray(np.ones((1024, 1024, 3), dtype=np.uint8) * 255)
        img.save("test.jpg")

    # Send the OCR request
    url = "https://ocr.yourdomain.com/v1/ocr"
    with open("test.jpg", "rb") as f:
        files = {"file": f}
        data = {
            "prompt": "<image>\n<|grounding|>Convert the document to markdown.",
            "image_size": 768,
            "base_size": 1024
        }

        start_time = time.time()
        try:
            response = requests.post(url, files=files, data=data, timeout=300)
            elapsed = time.time() - start_time

            if response.status_code == 200:
                result = response.json()
                print("OCR request succeeded!")
                print(f"Processing time: {elapsed:.2f}s")
                print(f"Accuracy: {result.get('estimated_accuracy', 'N/A')}")
                print(f"Model version: {result.get('model_version', 'N/A')}")
                return True
            else:
                print(f"Request failed: {response.status_code} - {response.text}")
                return False
        except requests.exceptions.RequestException as e:
            print(f"Request error: {e}")
            return False

if __name__ == "__main__":
    print("Starting DeepSeek-OCR-2 service verification...")
    success = test_ocr_service()
    if success:
        print("\nService verification passed!")
```