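PyTorch 2.8 Image Deployment Tutorial: Configuring an Enterprise-Grade AI API Service with JWT Authentication

1. Environment Preparation and Quick Deployment

Before you begin, make sure your server meets the following hardware requirements:
- GPU: RTX 4090D with 24 GB of VRAM
- RAM: ≥120 GB
- Storage: 50 GB system disk + 40 GB data disk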
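
The RAM and disk figures above can be checked before pulling anything. The following is a minimal sketch using only the Python standard library; `check_host` and `total_ram_gib` are hypothetical helper names, the thresholds are the ones from the list, and sizes are reported in GiB as an approximation of the GB values quoted. It checks only the filesystem that contains `path`, so the data disk would need a second call with its mount point.

```python
import os
import shutil

GIB = 1024 ** 3


def total_ram_gib():
    """Return total physical RAM in GiB, or None where sysconf is unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GIB
    except (AttributeError, ValueError, OSError):
        return None


def check_host(min_ram_gib=120, min_disk_gib=50, path="/"):
    """Compare this host against the RAM and disk figures listed above."""
    ram = total_ram_gib()
    disk = shutil.disk_usage(path).total / GIB
    return {
        "ram_gib": ram,
        "ram_ok": ram is not None and ram >= min_ram_gib,
        "disk_gib": disk,
        "disk_ok": disk >= min_disk_gib,
    }


if __name__ == "__main__":
    for key, value in check_host().items():
        print(f"{key}: {value}")
```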
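1.1 Pulling and Starting the Image

Use the following commands to pull and start the PyTorch 2.8 image. The tag shown is the one used in this tutorial; confirm it against the tags published on Docker Hub before pulling.

    docker pull pytorch/pytorch:2.8-cuda12.4-cudnn8-devel
    docker run -it --gpus all -p 8000:8000 -v /data:/data pytorch/pytorch:2.8-cuda12.4-cudnn8-devel

1.2 Environment Verification

Inside the container, verify that the GPU is visible to PyTorch:

    python -c "import torch; print('PyTorch:', torch.__version__); print('CUDA available:', torch.cuda.is_available()); print('GPU count:', torch.cuda.device_count())"

The output should report that CUDA is available and that at least one GPU device was detected.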
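
For a slightly more detailed check than the one-liner, the sketch below also lists each device's name and memory. `gpu_report` is a hypothetical helper name; it uses `torch.cuda.get_device_properties` when PyTorch is present and returns a stub report otherwise, so it can be run on any machine.

```python
import importlib.util


def gpu_report():
    """Summarize the PyTorch/CUDA state; works even if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    import torch

    devices = []
    if torch.cuda.is_available():
        for index in range(torch.cuda.device_count()):
            props = torch.cuda.get_device_properties(index)
            devices.append({
                "index": index,
                "name": props.name,
                "total_memory_gib": round(props.total_memory / 1024 ** 3, 1),
            })
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "devices": devices,
    }


if __name__ == "__main__":
    print(gpu_report())
```

On the hardware listed in section 1, the `devices` list would be expected to contain one entry with roughly 24 GiB of memory.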
2. Basic API Service Configuration

2.1 Installing the Required Dependencies

    pip install fastapi uvicorn "python-jose[cryptography]" "passlib[bcrypt]" python-multipart

2.2 Creating a Basic API Service

Create a new file named main.py:

    from fastapi import FastAPI

    app = FastAPI()


    @app.get("/")
    def read_root():
        return {"message": "PyTorch 2.8 API service is running"}

Start the service:

    uvicorn main:app --host 0.0.0.0 --port 8000

3. Implementing JWT Authentication
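
For orientation, an HS256 JWT is three base64url segments (header, payload, signature) joined by dots, where the signature is an HMAC-SHA256 over the first two segments. The sketch below reproduces that mechanism with the standard library only, to show roughly what python-jose does when it encodes and decodes a token. It is an illustration, not a replacement for the library: `sign_hs256` and `verify_hs256` are hypothetical names, and the sketch skips checks a real library performs, such as validating the header's `alg` field.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: str) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    payload = b64url(json.dumps(claims, separators=(",", ":")).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(signature)}"


def verify_hs256(token: str, secret: str) -> dict:
    """Return the claims if the signature matches and `exp` is in the future."""
    header, payload, signature = token.split(".")
    signing_input = f"{header}.{payload}".encode("ascii")
    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError("signature mismatch")
    claims = json.loads(b64url_decode(payload))
    if "exp" in claims and claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims


# Demo with a throwaway secret: sign a 30-minute token, then verify it.
token = sign_hs256({"sub": "admin", "exp": int(time.time()) + 1800}, "demo-secret")
print(verify_hs256(token, "demo-secret")["sub"])
```

Because the payload is only encoded, not encrypted, anything placed in the claims is readable by whoever holds the token; the secret protects integrity, not confidentiality.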
3.1 User Authentication Module

Create a file named auth.py:

    from datetime import datetime, timedelta, timezone
    from typing import Optional

    from jose import JWTError, jwt  # JWTError is also used by main.py
    from passlib.context import CryptContext

    # Placeholder value: replace with a long random secret before deploying.
    SECRET_KEY = "your-secret-key"
    ALGORITHM = "HS256"
    ACCESS_TOKEN_EXPIRE_MINUTES = 30

    pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")


    def verify_password(plain_password: str, hashed_password: str) -> bool:
        """Check a plaintext password against its bcrypt hash."""
        return pwd_context.verify(plain_password, hashed_password)


    def get_password_hash(password: str) -> str:
        """Return the bcrypt hash of a password."""
        return pwd_context.hash(password)


    def create_access_token(data: dict, expires_delta: Optional[timedelta] = None) -> str:
        """Sign a JWT carrying `data` plus an `exp` (expiry) claim."""
        to_encode = data.copy()
        if expires_delta is None:
            # Fall back to the configured lifetime when none is given.
            expires_delta = timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
        expire = datetime.now(timezone.utc) + expires_delta
        to_encode.update({"exp": expire})
        return jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)

3.2 Protecting API Endpoints
Update main.py, keeping the `app = FastAPI()` instance from section 2.2 and adding the following:

    from datetime import timedelta

    from fastapi import Depends, FastAPI, HTTPException, status
    from fastapi.security import OAuth2PasswordBearer, OAuth2PasswordRequestForm
    from jose import JWTError, jwt

    from auth import (
        ACCESS_TOKEN_EXPIRE_MINUTES,
        ALGORITHM,
        SECRET_KEY,
        create_access_token,
        get_password_hash,
        verify_password,
    )

    oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")

    # Demo only: a single fixed user, hashed once at startup.
    # A real deployment should look the user up in a database.
    DEMO_USER_HASH = get_password_hash("admin123")


    @app.post("/token")
    async def login(form_data: OAuth2PasswordRequestForm = Depends()):
        if form_data.username != "admin" or not verify_password(form_data.password, DEMO_USER_HASH):
            raise HTTPException(
                status_code=status.HTTP_401_UNAUTHORIZED,
                detail="Incorrect username or password",
                headers={"WWW-Authenticate": "Bearer"},
            )
        access_token = create_access_token(
            data={"sub": form_data.username},
            expires_delta=timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES),
        )
        return {"access_token": access_token, "token_type": "bearer"}


    @app.get("/protected")
    async def protected_route(token: str = Depends(oauth2_scheme)):
        credentials_error = HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid authentication credentials",
            headers={"WWW-Authenticate": "Bearer"},
        )
        try:
            # Verifies the signature and the exp claim.
            payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        except JWTError:
            raise credentials_error
        username = payload.get("sub")
        if username is None:
            raise credentials_error
        return {"message": "Protected route accessed successfully", "username": username}

4. Integrating the Model Inference Service
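
The inference endpoint in this section passes the request body straight to `torch.tensor`, so malformed input surfaces only as a generic 400 carrying a PyTorch error message. A minimal validation sketch, assuming the input shape implied by the `nn.Linear(10, 1)` model in section 4.1 (a list of rows with 10 numbers each), might look like the following. `validate_batch` and `N_FEATURES` are hypothetical names, not part of the tutorial's code.

```python
from numbers import Real

N_FEATURES = 10  # matches nn.Linear(10, 1) in section 4.1


def validate_batch(data, n_features: int = N_FEATURES):
    """Return data as a list of float rows, or raise ValueError describing the problem."""
    if not isinstance(data, list) or not data:
        raise ValueError("expected a non-empty list of rows")
    rows = []
    for i, row in enumerate(data):
        if not isinstance(row, list) or len(row) != n_features:
            raise ValueError(f"row {i} must be a list of {n_features} numbers")
        for value in row:
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, Real):
                raise ValueError(f"row {i} contains a non-numeric value: {value!r}")
        rows.append([float(v) for v in row])
    return rows
```

Calling a helper like this before building the tensor would let the endpoint return a specific message for a bad request while reserving other errors for genuine server faults.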
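4.1 Adding a Model Inference Endpoint

Extend main.py:

    import torch
    from torch import nn
    from fastapi import File, UploadFile


    class SimpleModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.linear = nn.Linear(10, 1)

        def forward(self, x):
            return self.linear(x)


    # Placeholder model with random weights; load a trained checkpoint in practice.
    model = SimpleModel().cuda().eval()


    @app.post("/predict")
    async def predict(data: list, token: str = Depends(oauth2_scheme)):
        # oauth2_scheme only extracts the bearer token, so verify it explicitly.
        try:
            jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        except JWTError:
            raise HTTPException(
                status_code=status.HTTP_401_UNAUTHORIZED,
                detail="Invalid authentication credentials",
            )
        try:
            # Each row of `data` must hold 10 numbers to match nn.Linear(10, 1).
            input_tensor = torch.tensor(data, dtype=torch.float32).cuda()
            with torch.no_grad():
                output = model(input_tensor)
            return {"prediction": output.cpu().tolist()}
        except Exception as e:
            raise HTTPException(status_code=400, detail=str(e))

4.2 Handling File Uploads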
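Add a file upload endpoint:

    @app.post("/upload")
    async def upload_file(file: UploadFile = File(...), token: str = Depends(oauth2_scheme)):
        # oauth2_scheme only extracts the bearer token, so verify it explicitly.
        try:
            jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        except JWTError:
            raise HTTPException(
                status_code=status.HTTP_401_UNAUTHORIZED,
                detail="Invalid authentication credentials",
            )
        try:
            contents = await file.read()
            # File processing logic can be added here.
            return {"filename": file.filename, "size": len(contents)}
        except Exception as e:
            raise HTTPException(status_code=400, detail=str(e))

5. Production Optimization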
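
One hardening step worth considering before the deployment details: the /upload handler in section 4.2 calls `file.read()` with no size limit, so a single large request is held entirely in memory. The sketch below shows one way to read in chunks and stop at a cap. `read_limited` and `UploadTooLarge` are hypothetical names, and `FakeUpload` is only a stand-in so the example runs without FastAPI; the assumption is that the real upload object exposes an async `read(size)` method, as Starlette's `UploadFile` does.

```python
import asyncio
import io


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured limit (hypothetical type)."""


async def read_limited(file, max_bytes: int, chunk_size: int = 1024 * 1024) -> bytes:
    """Read an async file-like object in chunks, aborting once max_bytes is exceeded."""
    chunks = []
    total = 0
    while True:
        chunk = await file.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


class FakeUpload:
    """Stand-in for an upload object so the sketch runs without FastAPI."""

    def __init__(self, data: bytes):
        self._buffer = io.BytesIO(data)

    async def read(self, size: int = -1) -> bytes:
        return self._buffer.read(size)


# Demo: a 10-byte payload read in 4-byte chunks under a 16-byte cap.
print(len(asyncio.run(read_limited(FakeUpload(b"x" * 10), max_bytes=16, chunk_size=4))))
```

In the endpoint, the exception would typically be translated into an HTTP 413 response rather than the generic 400.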
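5.1 Running Multiple Processes with Gunicorn

Install Gunicorn:

    pip install gunicorn

Create a startup script named start.sh. Each worker process imports main.py and therefore loads its own copy of the model onto the GPU, so choose the worker count (-w) with available VRAM in mind.

    #!/bin/bash
    gunicorn -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8000 main:app

5.2 Configuring an Nginx Reverse Proxy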
Example Nginx configuration:

    server {
        listen 80;
        server_name your-domain.com;

        location / {
            proxy_pass http://localhost:8000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }

5.3 Logging and Monitoring
Configure logging:

    import logging

    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
        handlers=[
            logging.FileHandler("api.log"),  # persist logs to a file
            logging.StreamHandler(),         # and echo them to the console
        ],
    )
    logger = logging.getLogger(__name__)

6. Summary
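This tutorial walked through deploying an enterprise-grade AI API service with JWT authentication on the PyTorch 2.8 image. It covered:
- Environment preparation: verifying GPU availability and installing the required dependencies
- Basic API service: creating RESTful endpoints with FastAPI
- Authentication: generating and verifying JWT tokens
- Model integration: wrapping a PyTorch model as an API service
- Production optimization: multi-process serving, reverse proxy configuration, and logging

The result is a working, token-protected inference API that can be hardened for enterprise use. For an actual deployment, it is recommended that you:
- Replace the hard-coded secret key and user credentials used in the examples
- Add proper user management backed by a database
- Extend the model inference functionality to fit your business needs
- Configure HTTPS to encrypt traffic in transit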
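
As a starting point for the first recommendation, the signing key can be read from the environment instead of being written into auth.py. The sketch below is one possible approach under stated assumptions: `load_secret_key` is a hypothetical helper, `JWT_SECRET_KEY` is an assumed variable name, and the 32-character minimum is an illustrative threshold rather than a requirement of python-jose.

```python
import os
import secrets


def load_secret_key(env_var: str = "JWT_SECRET_KEY", min_length: int = 32) -> str:
    """Read the signing key from the environment and refuse weak or missing values."""
    key = os.environ.get(env_var, "")
    if len(key) < min_length:
        raise RuntimeError(
            f"{env_var} must be set to at least {min_length} characters; "
            "generate one with: python -c 'import secrets; print(secrets.token_urlsafe(48))'"
        )
    return key


if __name__ == "__main__":
    # Demo only: set a freshly generated key, then load it back.
    os.environ["JWT_SECRET_KEY"] = secrets.token_urlsafe(48)
    print(len(load_secret_key()))
```

In auth.py this would replace the hard-coded assignment, for example `SECRET_KEY = load_secret_key()`, so the service fails fast at startup when the key is missing.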