Janus-Pro-7B Multimodal Model in Practice: Data Collection with a Python Crawler and Image Generation - 平芜编程栈

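I've been working on an e-commerce project recently that needs product hero images in bulk. The traditional options are to have a designer produce them one at a time or to use an off-the-shelf template tool, and neither is flexible enough. Then I came across Janus-Pro-7B, the model open-sourced by DeepSeek, which can both understand image content and generate images from text. That is exactly the solution I was looking for.

Better still, we can combine a Python crawler with the model: crawl product descriptions straight from an e-commerce platform, then generate matching hero images automatically. The whole workflow becomes automated end to end, and the efficiency gain is substantial.

In this post I'll share the hands-on approach, from building the crawler to calling the model, and walk you step by step through a complete automated content production system.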

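1. Scenario Analysis: Pain Points and Opportunities in E-commerce Content Production

Anyone who sells online knows how much the hero image matters. A good one directly drives click-through rate, but producing it is where the trouble starts:

Three pain points of the traditional approach:

High cost: hiring a designer runs from tens to hundreds of yuan per image, and the bill for a large batch adds up fast
Low efficiency: from briefing to revisions, a single product can take several days of back-and-forth
Inconsistent style: different designers produce different looks, which makes a unified store identity hard to maintain

Our approach:

Use a Python crawler to collect product descriptions and competitor images automatically
Use Janus-Pro-7B to understand the product's characteristics and generate images that match the description
Process in batches to keep the style consistent and cut costs sharply

This approach suits sellers who list many new products or run seasonal promotions. A clothing store, for example, may refresh several hundred SKUs every quarter, and with the traditional approach the design fees alone are a significant expense.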

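2. Environment Setup: Building the Development Environment in One Go

2.1 Base Environment

First, I recommend creating a dedicated Python environment to avoid package conflicts. Either conda or venv works:

    # Create a new environment
    conda create --name janus-crawler python=3.9
    conda activate janus-crawler

    # Check the CUDA version (if you have a GPU)
    nvcc --version  # should report 11.8 or later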

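2.2 Installing Core Dependencies

Janus-Pro-7B needs PyTorch plus a few specific packages:

    # Install PyTorch (adjust for your CUDA version)
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

    # Install the Janus model code and related dependencies
    pip install git+https://github.com/deepseek-ai/Janus
    pip install "transformers>=4.40.0"  # quoted so the shell does not treat > as a redirect
    pip install numpy==1.26.3  # note: this specific version is required

    # Install crawler libraries
    pip install requests beautifulsoup4 scrapy selenium
    pip install pillow  # image processing

    # Used by the later sections: model download, word segmentation, progress bars
    pip install modelscope jieba tqdm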

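2.3 Downloading the Model Files

Janus-Pro-7B is a large model with 7B parameters. It can be downloaded from Hugging Face or ModelScope:

    # Option 1: download via ModelScope (faster from mainland China)
    from modelscope import snapshot_download

    model_dir = snapshot_download('deepseek-ai/Janus-Pro-7B')
    print(f"Model downloaded to: {model_dir}")

    # Option 2: with a good network connection, load directly from Hugging Face;
    # the model is downloaded to the cache directory automatically

Once downloaded, the model files take roughly 20-30 GB, so make sure the disk has enough room. If you are short on GPU memory (say, only 8 GB), you can fall back to CPU inference, at the cost of much slower generation.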
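To make the disk and memory requirements concrete, here is a minimal back-of-the-envelope sketch. It assumes a nominal 7 billion parameters and counts only the raw weights; activations, the KV cache, and any duplicate weight formats in the download are not included. The helper names are hypothetical, not part of any library.

```python
import shutil

# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_size_gib(num_params, dtype="bfloat16"):
    """Approximate size of the raw weights in GiB (nothing else is counted)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


def has_free_disk(path, required_gib):
    """True if the filesystem holding `path` has at least `required_gib` GiB free."""
    return shutil.disk_usage(path).free / 1024**3 >= required_gib


if __name__ == "__main__":
    print(f"bf16 weights: {weight_size_gib(7e9):.1f} GiB")
    print(f"fp32 weights: {weight_size_gib(7e9, 'float32'):.1f} GiB")
    print("30 GiB free in current directory:", has_free_disk(".", 30))
```

By this estimate the bfloat16 weights alone need about 13 GiB, which is why an 8 GB card cannot hold the model without offloading or quantization.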

3. Python Crawler in Practice: Collecting Product Data

3.1 A Simple Product Information Crawler

We start with a simple crawler that collects product titles and descriptions from an e-commerce platform. The example targets a generic platform (when using it for real, respect the platform's robots.txt):

    import requests
    from bs4 import BeautifulSoup
    import json
    import time


    class ProductCrawler:
        def __init__(self):
            self.headers = {
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
                'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
                'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
            }
            self.session = requests.Session()

        def fetch_product_page(self, url):
            """Fetch a product page and return its HTML, or None on failure."""
            try:
                response = self.session.get(url, headers=self.headers, timeout=10)
                response.raise_for_status()
                response.encoding = 'utf-8'
                return response.text
            except requests.RequestException as e:
                print(f"Failed to fetch page: {e}")
                return None

        def parse_product_info(self, html):
            """Parse product fields out of the page HTML."""
            soup = BeautifulSoup(html, 'html.parser')

            # Product title (adjust the selectors to the actual site structure)
            title_elem = soup.select_one('.product-title, .goods-title, h1.title')
            title = title_elem.get_text(strip=True) if title_elem else "Title not found"

            # Product description
            desc_elem = soup.select_one('.product-desc, .goods-desc, .description')
            description = desc_elem.get_text(strip=True) if desc_elem else ""

            # Price
            price_elem = soup.select_one('.price, .current-price')
            price = price_elem.get_text(strip=True) if price_elem else ""

            # Image links (absolute URLs only)
            images = []
            img_elems = soup.select('.product-image img, .goods-img img')
            for img in img_elems[:5]:  # keep the first 5 images only
                src = img.get('src') or img.get('data-src')
                if src and src.startswith('http'):
                    images.append(src)

            return {
                'title': title,
                'description': description,
                'price': price,
                'images': images,
                'crawled_at': time.strftime('%Y-%m-%d %H:%M:%S'),
            }

        def crawl_multiple_products(self, urls, delay=1):
            """Crawl several products in sequence."""
            results = []
            for url in urls:
                print(f"Crawling: {url}")
                html = self.fetch_product_page(url)
                if html:
                    product_info = self.parse_product_info(html)
                    product_info['url'] = url
                    results.append(product_info)
                    print(f"Crawled: {product_info['title'][:50]}...")
                time.sleep(delay)  # crawl politely to avoid getting blocked
            return results


    # Usage example
    if __name__ == "__main__":
        crawler = ProductCrawler()

        # Sample product links (replace with real ones)
        product_urls = [
            "https://example.com/product/123",
            "https://example.com/product/456",
        ]
        products = crawler.crawl_multiple_products(product_urls)

        # Save the results
        with open('products.json', 'w', encoding='utf-8') as f:
            json.dump(products, f, ensure_ascii=False, indent=2)
        print(f"Done, {len(products)} products crawled")

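The requests-based crawler does not consult robots.txt on its own, so the compliance reminder has to be enforced by hand. The sketch below uses the standard library's `urllib.robotparser` to filter a URL list before crawling. The robots.txt content is a made-up sample parsed from a string so the example runs offline, and `build_robot_rules` and `filter_allowed` are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser


def build_robot_rules(robots_txt):
    """Parse the text of a robots.txt file into a reusable rule set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def filter_allowed(urls, rules, user_agent="*"):
    """Keep only the URLs that the rules allow this user agent to fetch."""
    return [u for u in urls if rules.can_fetch(user_agent, u)]


# Hypothetical robots.txt content, used here instead of a network fetch.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

rules = build_robot_rules(SAMPLE_ROBOTS)
candidates = [
    "https://example.com/product/123",
    "https://example.com/cart/checkout",
]
allowed = filter_allowed(candidates, rules)
print(allowed)
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` fetches the real file, and the filtered list can then be passed to `crawl_multiple_products`.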
3.2 Building a More Capable Crawler with Scrapy

For large-scale crawling I recommend the Scrapy framework, which is more robust and more fully featured:

    # products_spider.py
    import scrapy
    from urllib.parse import urljoin


    class ProductsSpider(scrapy.Spider):
        name = 'products'

        def start_requests(self):
            # The URL list can be read from a file
            with open('product_urls.txt', 'r') as f:
                urls = [line.strip() for line in f if line.strip()]
            for url in urls:
                yield scrapy.Request(url=url, callback=self.parse_product)

        def parse_product(self, response):
            """Parse a single product page."""
            # Extract data with XPath or CSS selectors
            title = response.css('h1.product-title::text').get(default='').strip()
            description = ''.join(response.css('.product-description *::text').getall()).strip()
            price = response.css('.price::text').get(default='').strip()

            # Images, resolved to absolute URLs
            images = []
            for img in response.css('.product-images img'):
                src = img.attrib.get('src') or img.attrib.get('data-src')
                if src:
                    images.append(urljoin(response.url, src))

            # Product attributes
            attributes = {}
            for attr in response.css('.product-attributes li'):
                key = attr.css('.attr-name::text').get(default='').strip()
                value = attr.css('.attr-value::text').get(default='').strip()
                if key and value:
                    attributes[key] = value

            yield {
                'url': response.url,
                'title': title,
                'description': description,
                'price': price,
                'images': images[:5],  # keep the first 5 only
                'attributes': attributes,
                'category': self.extract_category(response),
            }

        def extract_category(self, response):
            """Build the product category from the last breadcrumb levels."""
            breadcrumbs = response.css('.breadcrumb a::text').getall()
            return ' > '.join(breadcrumbs[-3:]) if breadcrumbs else ''

Running the Scrapy crawler:

    # Create the Scrapy project
    scrapy startproject product_crawler
    cd product_crawler

    # Save the spider code above as product_crawler/spiders/products_spider.py

    # Run the crawler
    scrapy crawl products -o products.json -s FEED_EXPORT_ENCODING=utf-8
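With Scrapy, polite crawling is mostly a matter of project settings rather than code. The fragment below is a sketch of what could go into `product_crawler/settings.py`. The setting names are standard Scrapy settings, but the values are illustrative assumptions and should be tuned to the target site.

```python
# product_crawler/settings.py (fragment)

# Respect robots.txt rules before requesting any page.
ROBOTSTXT_OBEY = True

# Wait between requests to the same site and cap per-domain concurrency.
DOWNLOAD_DELAY = 1.0
CONCURRENT_REQUESTS_PER_DOMAIN = 2

# Let Scrapy adapt the delay to observed server latency.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 1.0
AUTOTHROTTLE_MAX_DELAY = 10.0

# Retry transient failures a couple of times, then give up.
RETRY_TIMES = 2

# Write Chinese text as-is instead of \uXXXX escapes.
FEED_EXPORT_ENCODING = "utf-8"
```

Setting `FEED_EXPORT_ENCODING` here makes the `-s FEED_EXPORT_ENCODING=utf-8` flag on the command line unnecessary.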

3.3 Data Cleaning and Formatting

The crawled data has to be cleaned before it can be fed to the model:

    import re
    import json
    import jieba  # Chinese word segmentation
    from collections import Counter


    class DataProcessor:
        def __init__(self):
            self.stop_words = self.load_stop_words()

        def load_stop_words(self):
            """Load the stop-word list (a small built-in set of common Chinese words)."""
            stop_words = set(['的', '了', '在', '是', '我', '有', '和', '就', '不', '人', '都', '一', '一个', '上', '也', '很', '到', '说', '要', '去', '你', '会', '着', '没有', '看', '好', '自己', '这'])
            return stop_words

        def clean_text(self, text):
            """Clean a piece of text."""
            if not text:
                return ""
            # Remove HTML tags
            text = re.sub(r'<[^>]+>', '', text)
            # Remove special characters
            text = re.sub(r'[^\w\u4e00-\u9fff\s]', '', text)
            # Collapse runs of whitespace
            text = re.sub(r'\s+', ' ', text)
            # Strip leading and trailing whitespace
            return text.strip()

        def extract_keywords(self, text, top_n=10):
            """Extract keywords by word frequency."""
            words = jieba.lcut(text)
            # Drop stop words and single-character tokens
            filtered_words = [w for w in words if len(w) > 1 and w not in self.stop_words]
            # Count word frequencies
            word_counts = Counter(filtered_words)
            return [word for word, _ in word_counts.most_common(top_n)]

        def prepare_prompt(self, product_info):
            """Build the image-generation prompt from the product information."""
            title = self.clean_text(product_info.get('title', ''))
            desc = self.clean_text(product_info.get('description', ''))
            attrs = product_info.get('attributes', {})

            # Key features
            keywords = self.extract_keywords(f"{title} {desc}")

            # Assemble the descriptive prompt (Chinese, matching the crawled data)
            prompt_parts = []
            if title:
                prompt_parts.append(f"商品标题：{title}")  # "Product title: ..."
            if attrs:
                attr_str = '，'.join([f"{k}：{v}" for k, v in attrs.items()])
                prompt_parts.append(f"商品属性：{attr_str}")  # "Product attributes: ..."
            # Style requirements: HD hero image, white background, professional
            # photography style, emphasis on product detail
            prompt_parts.append("要求：高清产品主图，白色背景，专业摄影风格，突出产品细节")
            full_prompt = "。".join(prompt_parts)

            # Condensed form used as model input:
            # "Generate a product hero image: <title>, <keywords>, white background, HD detail"
            model_prompt = f"生成一张商品主图：{title}，{', '.join(keywords[:5])}，白色背景，高清细节"

            return {
                'full_prompt': full_prompt,
                'model_prompt': model_prompt,
                'keywords': keywords,
                'cleaned_title': title,
                'cleaned_desc': desc[:200],  # first 200 characters only
            }


    # Usage example
    processor = DataProcessor()

    # Load the crawled data
    with open('products.json', 'r', encoding='utf-8') as f:
        products = json.load(f)

    processed_products = []
    for product in products[:10]:  # process the first 10 to begin with
        processed = processor.prepare_prompt(product)
        product.update(processed)
        processed_products.append(product)
        print(f"Processed: {product['title'][:30]}...")
        print(f"Prompt: {product['model_prompt']}")
        print("-" * 50)
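One side effect of the special-character filter in `clean_text` is worth seeing on a concrete input: it also deletes decimal points and hyphens, so a spec such as "5.0" turns into "50". The sketch below compares that pattern with a hypothetical variant, `clean_spec_text`, which keeps `.`, `%` and `-` and turns other punctuation into a separator. It is one possible adjustment under those assumptions, not the pipeline's own method.

```python
import re

_TAGS = re.compile(r"<[^>]+>")
_JUNK = re.compile(r"[^\w\s.%\-]")  # \w already matches CJK characters in str patterns
_SPACES = re.compile(r"\s+")


def clean_text_original(text):
    """Same three substitutions as DataProcessor.clean_text."""
    text = re.sub(r"<[^>]+>", "", text)
    text = re.sub(r"[^\w\u4e00-\u9fff\s]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def clean_spec_text(text):
    """Variant that preserves '.', '%' and '-' so numeric specs survive."""
    if not text:
        return ""
    text = _TAGS.sub("", text)
    text = _JUNK.sub(" ", text)  # punctuation becomes a word separator
    return _SPACES.sub(" ", text).strip()


sample = "<b>蓝牙5.0耳机</b>！！ 续航40小时，降噪-35dB"
print(clean_text_original(sample))
print(clean_spec_text(sample))
```

Whether the extra characters help depends on the product category; for electronics, version numbers and decibel figures usually carry meaning that the prompt should keep.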

4. Calling Janus-Pro-7B: From Understanding to Generation

4.1 Initializing the Model

Now for the core part: calling Janus-Pro-7B. We start by initializing the model:

    import os
    import time

    import numpy as np
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM
    from janus.models import MultiModalityCausalLM, VLChatProcessor
    from janus.utils.io import load_pil_images


    class JanusModel:
        def __init__(self, model_path="deepseek-ai/Janus-Pro-7B", device="cuda"):
            """
            Initialize Janus-Pro-7B.

            Args:
                model_path: local path or Hugging Face model ID
                device: "cuda" or "cpu"
            """
            self.device = device
            self.model_path = model_path

            print("Loading model and processor...")
            # Load the processor
            self.vl_chat_processor = VLChatProcessor.from_pretrained(model_path)
            self.tokenizer = self.vl_chat_processor.tokenizer

            # Load the model
            self.vl_gpt = AutoModelForCausalLM.from_pretrained(
                model_path,
                trust_remote_code=True,
                torch_dtype=torch.bfloat16,  # bfloat16 saves GPU memory
            )

            # Move to the requested device
            if device == "cuda" and torch.cuda.is_available():
                self.vl_gpt = self.vl_gpt.cuda().eval()
                print(f"Model loaded on GPU, memory in use: {torch.cuda.memory_allocated()/1024**3:.2f} GB")
            else:
                self.vl_gpt = self.vl_gpt.cpu().eval()
                print("Model loaded on CPU")
            print("Model ready!")

        @torch.inference_mode()
        def analyze_product_image(self, image_path, question="请描述这个商品的主要特点"):
            """
            Analyze a product image and describe its features.

            Args:
                image_path: path to the image
                question: question to ask (default: "describe the main features of this product")

            Returns:
                The model's answer as text.
            """
            if not os.path.exists(image_path):
                return f"Image not found: {image_path}"

            try:
                # Build the conversation
                conversation = [
                    {
                        "role": "<|User|>",
                        "content": f"<image_placeholder>\n{question}",
                        "images": [image_path],
                    },
                    {"role": "<|Assistant|>", "content": ""},
                ]

                # Load the image and prepare the inputs
                pil_images = load_pil_images(conversation)
                prepare_inputs = self.vl_chat_processor(
                    conversations=conversation, images=pil_images, force_batchify=True
                ).to(self.vl_gpt.device)

                # Compute the image embeddings
                inputs_embeds = self.vl_gpt.prepare_inputs_embeds(**prepare_inputs)

                # Generate the answer
                outputs = self.vl_gpt.language_model.generate(
                    inputs_embeds=inputs_embeds,
                    attention_mask=prepare_inputs.attention_mask,
                    pad_token_id=self.tokenizer.eos_token_id,
                    bos_token_id=self.tokenizer.bos_token_id,
                    eos_token_id=self.tokenizer.eos_token_id,
                    max_new_tokens=512,
                    do_sample=False,
                    use_cache=True,
                )

                # Decode the answer
                answer = self.tokenizer.decode(outputs[0].cpu().tolist(), skip_special_tokens=True)

                # Keep only the answer text (drop the question part if present)
                if "<|Assistant|>:" in answer:
                    answer = answer.split("<|Assistant|>:")[-1].strip()
                return answer
            except Exception as e:
                return f"Error while analyzing image: {str(e)}"

        def generate_product_image(self, prompt, output_dir="generated_images", num_images=4):
            """
            Generate product images from a prompt.

            Args:
                prompt: text prompt
                output_dir: output directory
                num_images: number of images to generate

            Returns:
                List of paths of the generated images.
            """
            os.makedirs(output_dir, exist_ok=True)

            try:
                # Build the conversation
                conversation = [
                    {"role": "<|User|>", "content": prompt},
                    {"role": "<|Assistant|>", "content": ""},
                ]

                # Apply the chat template
                sft_format = self.vl_chat_processor.apply_sft_template_for_multi_turn_prompts(
                    conversations=conversation,
                    sft_format=self.vl_chat_processor.sft_format,
                    system_prompt="",
                )
                final_prompt = sft_format + self.vl_chat_processor.image_start_tag

                # Generate the images
                return self._generate_images(
                    final_prompt, output_dir=output_dir, parallel_size=num_images
                )
            except Exception as e:
                print(f"Error while generating images: {e}")
                return []

        @torch.inference_mode()
        def _generate_images(self, prompt, output_dir, parallel_size=4, **kwargs):
            """Internal: sample image tokens one at a time, then decode them to images."""
            # Generation parameters
            temperature = kwargs.get('temperature', 1.0)
            cfg_weight = kwargs.get('cfg_weight', 5.0)
            image_token_num_per_image = kwargs.get('image_token_num_per_image', 576)
            img_size = kwargs.get('img_size', 384)
            patch_size = kwargs.get('patch_size', 16)
            device = self.vl_gpt.device  # follow the model, whether it sits on GPU or CPU

            # Encode the prompt
            input_ids = torch.LongTensor(self.tokenizer.encode(prompt))

            # Two rows per image for CFG: even rows conditional, odd rows unconditional
            tokens = torch.zeros((parallel_size * 2, len(input_ids)), dtype=torch.int, device=device)
            for i in range(parallel_size * 2):
                tokens[i, :] = input_ids
                if i % 2 != 0:
                    # Unconditional row: pad out everything between the first and last token
                    tokens[i, 1:-1] = self.vl_chat_processor.pad_id

            # Input embeddings
            inputs_embeds = self.vl_gpt.language_model.get_input_embeddings()(tokens)

            # Sample the image tokens
            generated_tokens = torch.zeros(
                (parallel_size, image_token_num_per_image), dtype=torch.int, device=device
            )
            outputs = None
            for i in range(image_token_num_per_image):
                outputs = self.vl_gpt.language_model.model(
                    inputs_embeds=inputs_embeds,
                    use_cache=True,
                    past_key_values=outputs.past_key_values if i != 0 else None,
                )
                hidden_states = outputs.last_hidden_state
                logits = self.vl_gpt.gen_head(hidden_states[:, -1, :])

                # CFG (classifier-free guidance)
                logit_cond = logits[0::2, :]    # conditional
                logit_uncond = logits[1::2, :]  # unconditional
                logits = logit_uncond + cfg_weight * (logit_cond - logit_uncond)

                # Sample the next token
                probs = torch.softmax(logits / temperature, dim=-1)
                next_token = torch.multinomial(probs, num_samples=1)
                generated_tokens[:, i] = next_token.squeeze(dim=-1)

                # Feed the same token to both rows of each pair in the next step
                next_token = torch.cat(
                    [next_token.unsqueeze(dim=1), next_token.unsqueeze(dim=1)], dim=1
                ).view(-1)
                img_embeds = self.vl_gpt.prepare_gen_img_embeds(next_token)
                inputs_embeds = img_embeds.unsqueeze(dim=1)

            # Decode the tokens into images
            dec = self.vl_gpt.gen_vision_model.decode_code(
                generated_tokens.to(dtype=torch.int),
                shape=[parallel_size, 8, img_size // patch_size, img_size // patch_size],
            )

            # Post-process: [-1, 1] floats to 0-255 uint8, channels last
            dec = dec.to(torch.float32).cpu().numpy().transpose(0, 2, 3, 1)
            dec = np.clip((dec + 1) / 2 * 255, 0, 255).astype(np.uint8)

            # Save the images
            image_paths = []
            for i in range(parallel_size):
                img = Image.fromarray(dec[i])
                filename = f"product_{int(time.time())}_{i}.jpg"
                save_path = os.path.join(output_dir, filename)
                img.save(save_path)
                image_paths.append(save_path)
                print(f"Image saved: {save_path}")
            return image_paths


    # Usage example
    if __name__ == "__main__":
        # Initialize the model (the first run downloads it automatically)
        model = JanusModel(device="cuda")  # if a GPU is available

        # Example 1: analyze an existing product image
        # Question: "What is this product? What are its features? What is it suited for?"
        analysis_result = model.analyze_product_image(
            "sample_product.jpg",
            "这个商品是什么？有什么特点？适合什么场景使用？",
        )
        print("Image analysis:", analysis_result)

        # Example 2: generate new product images
        # Prompt: premium Bluetooth earphones, matte black, white background,
        # product photography style, emphasis on detail
        prompt = "高端蓝牙耳机，黑色磨砂材质，白色背景，产品摄影风格，突出细节"
        generated_images = model.generate_product_image(prompt, num_images=2)
        print(f"Generated {len(generated_images)} images")
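The classifier-free guidance step inside `_generate_images` is easier to follow on a toy example. The sketch below reproduces the same row layout (even rows conditional, odd rows unconditional) and the same mixing formula with plain Python lists. The logit values are made up and the three-entry "codebook" is only for illustration.

```python
def cfg_mix(logit_cond, logit_uncond, cfg_weight):
    """Element-wise guidance: uncond + w * (cond - uncond)."""
    return [u + cfg_weight * (c - u) for c, u in zip(logit_cond, logit_uncond)]


def split_interleaved(rows):
    """Even rows are conditional, odd rows unconditional."""
    return rows[0::2], rows[1::2]


# Toy batch: 2 images, so 4 rows of logits over a 3-entry codebook (made-up numbers).
rows = [
    [2.0, 0.0, 1.0],  # image 0, conditional
    [1.0, 1.0, 1.0],  # image 0, unconditional
    [0.0, 3.0, 1.0],  # image 1, conditional
    [1.0, 1.0, 1.0],  # image 1, unconditional
]

cond, uncond = split_interleaved(rows)
guided = [cfg_mix(c, u, cfg_weight=5.0) for c, u in zip(cond, uncond)]
print(guided)  # [[6.0, -4.0, 1.0], [-4.0, 11.0, 1.0]]
```

With a weight of 1 the formula returns the conditional logits unchanged, and with 0 it returns the unconditional ones, so larger weights push sampling further toward what the prompt asks for at the expense of diversity.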

4.2 Processing Products in Batches

Now we combine the crawler and the model into an automated pipeline:

    import json
    import os
    import time

    import requests
    from urllib.parse import urlparse
    from tqdm import tqdm

    # JanusModel comes from section 4.1 and DataProcessor from section 3.3


    class ProductImagePipeline:
        def __init__(self, model_path="deepseek-ai/Janus-Pro-7B"):
            self.model = JanusModel(model_path)
            self.processor = DataProcessor()

        def process_single_product(self, product_info, output_base_dir="output"):
            """Process one product: analyze existing images, then generate new ones."""
            product_id = product_info.get('id', str(int(time.time())))
            output_dir = os.path.join(output_base_dir, f"product_{product_id}")
            os.makedirs(output_dir, exist_ok=True)

            results = {
                'product_id': product_id,
                'original_info': product_info,
                'generated_images': [],
                'analysis_results': [],
                'prompts_used': [],
            }

            # 1. Analyze existing images, if any
            existing_images = product_info.get('images', [])
            for img_url in existing_images[:2]:  # first 2 images only
                try:
                    # Download the image (simplified; see download_image below)
                    local_path = self.download_image(img_url, output_dir)
                    if local_path:
                        # Question: "Describe this product's appearance, material,
                        # features and use cases in detail"
                        analysis = self.model.analyze_product_image(
                            local_path,
                            "请详细描述这个商品的外观、材质、特点和适用场景",
                        )
                        results['analysis_results'].append({
                            'image_url': img_url,
                            'analysis': analysis,
                        })
                except Exception as e:
                    print(f"Failed to analyze image {img_url}: {e}")

            # 2. Prepare the generation prompt
            prompt_info = self.processor.prepare_prompt(product_info)
            results['processed_prompt'] = prompt_info

            # 3. Generate new images
            generation_prompts = self.create_generation_prompts(product_info, prompt_info)
            for i, prompt in enumerate(generation_prompts[:3]):  # at most 3 variants
                try:
                    results['prompts_used'].append(prompt)
                    generated = self.model.generate_product_image(
                        prompt,
                        output_dir=os.path.join(output_dir, f"variant_{i}"),
                        num_images=2,  # 2 images per prompt
                    )
                    results['generated_images'].extend(generated)
                    print(f"Product {product_id}, variant {i} generated")
                    time.sleep(1)  # give the GPU a short pause
                except Exception as e:
                    print(f"Failed to generate images: {e}")

            # 4. Save the results
            result_file = os.path.join(output_dir, "results.json")
            with open(result_file, 'w', encoding='utf-8') as f:
                json.dump(results, f, ensure_ascii=False, indent=2)
            return results

        def create_generation_prompts(self, product_info, prompt_info):
            """Create several prompt variants for generation."""
            base_prompt = prompt_info['model_prompt']
            keywords = prompt_info['keywords']

            # Prompt suffixes for different styles
            styles = [
                "专业产品摄影风格，白色背景，高清细节",    # professional product photography
                "电商平台主图风格，吸引眼球，突出卖点",    # eye-catching marketplace hero image
                "简约现代风格，干净背景，突出设计感",      # clean, modern, design-focused
                "场景化展示，使用场景背景，生活化",        # lifestyle shot in a usage scene
            ]
            prompts = [f"{base_prompt}，{style}" for style in styles]

            # Keyword variant: "emphasize <kw0> and <kw1>" (needs at least two keywords)
            if len(keywords) >= 2:
                prompts.append(f"{base_prompt}，重点突出{keywords[0]}和{keywords[1]}特征")
            return prompts

        def download_image(self, url, save_dir):
            """Download an image to disk (simplified)."""
            # A production version should handle retries, proxies, headers, and so on
            try:
                filename = os.path.basename(urlparse(url).path)
                if not filename:
                    filename = f"image_{int(time.time())}.jpg"
                save_path = os.path.join(save_dir, filename)

                response = requests.get(url, timeout=10, stream=True)
                if response.status_code == 200:
                    with open(save_path, 'wb') as f:
                        for chunk in response.iter_content(1024):
                            f.write(chunk)
                    return save_path
            except Exception as e:
                print(f"Failed to download image {url}: {e}")
            return None

        def batch_process(self, products_file, max_products=10):
            """Process several products in one run."""
            with open(products_file, 'r', encoding='utf-8') as f:
                products = json.load(f)

            all_results = []
            for i, product in enumerate(tqdm(products[:max_products])):
                print(f"\nProcessing product {i+1}/{min(len(products), max_products)}")
                try:
                    result = self.process_single_product(product)
                    all_results.append(result)
                    print(f"✓ Product {product.get('title', 'N/A')[:30]}... done")
                except Exception as e:
                    print(f"✗ Processing failed: {e}")

                # Pause after every 5 products
                if (i + 1) % 5 == 0:
                    print("Pausing for 10 seconds...")
                    time.sleep(10)

            # Summary report
            self.generate_report(all_results)
            return all_results

        def generate_report(self, results):
            """Write a summary report of the run."""
            total_products = len(results)
            total_generated = sum(len(r['generated_images']) for r in results)
            total_analyzed = sum(len(r['analysis_results']) for r in results)

            report = {
                'summary': {
                    'total_products_processed': total_products,
                    'total_images_generated': total_generated,
                    'total_images_analyzed': total_analyzed,
                    'average_images_per_product': total_generated / total_products if total_products > 0 else 0,
                    'processing_time': time.strftime('%Y-%m-%d %H:%M:%S'),
                },
                'product_details': [
                    {
                        'id': r['product_id'],
                        'title': r['original_info'].get('title', ''),
                        'generated_count': len(r['generated_images']),
                        'analyzed_count': len(r['analysis_results']),
                    }
                    for r in results
                ],
            }

            report_file = "processing_report.json"
            with open(report_file, 'w', encoding='utf-8') as f:
                json.dump(report, f, ensure_ascii=False, indent=2)

            print(f"\n{'='*50}")
            print("Run complete!")
            print(f"Products processed: {total_products}")
            print(f"Images generated: {total_generated}")
            print(f"Images analyzed: {total_analyzed}")
            print(f"Detailed report saved to: {report_file}")
            print(f"{'='*50}")


    # Run the full workflow
    if __name__ == "__main__":
        # 1. Crawl the product data (if you have not done so yet)
        # crawler = ProductCrawler()
        # products = crawler.crawl_multiple_products([...])
        # with open('products.json', 'w', encoding='utf-8') as f:
        #     json.dump(products, f, ensure_ascii=False)

        # 2. Run the processing pipeline on the first 5 products
        pipeline = ProductImagePipeline()
        results = pipeline.batch_process('products.json', max_products=5)
        print("\nDone! Generated images are under the output/ directory")
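The crawled records carry no `id` field, so `process_single_product` falls back to a timestamp, which means a rerun writes the same product into a new directory each time. One possible alternative, sketched below under that assumption, derives a stable identifier from the product URL with `hashlib`. The helper name `stable_product_id` is hypothetical and not part of the pipeline above.

```python
import hashlib


def stable_product_id(product_info):
    """Return the product's own id if present, else a short digest of its URL or title."""
    if product_info.get("id"):
        return str(product_info["id"])
    # Fall back to the title when the record has no URL.
    key = (product_info.get("url") or product_info.get("title") or "").strip()
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


first = stable_product_id({"url": "https://example.com/product/123"})
again = stable_product_id({"url": "https://example.com/product/123"})
other = stable_product_id({"url": "https://example.com/product/456"})
print(first, first == again, first != other)
```

Because the digest depends only on the URL, a rerun lands in the same `output/product_<id>` directory, which makes it easy to skip products that already have a `results.json`.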

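5. Results and Optimization Tips

5.1 Results

I tested the pipeline on a handful of e-commerce products, and the results were a pleasant surprise:

Case 1: Bluetooth earphones

Prompt: "高端无线蓝牙耳机，黑色磨砂材质，金属装饰，白色背景，产品摄影" (premium wireless Bluetooth earphones, matte black finish, metal accents, white background, product photography)
Result: a sharp image with convincing material texture and a clean background, suitable as an e-commerce hero image

Case 2: Sports bottle

Prompt: "不锈钢保温运动水杯，磨砂黑色，简约设计，白色背景，水滴效果" (insulated stainless-steel sports bottle, matte black, minimalist design, white background, water droplets)
Result: natural-looking reflections on the bottle, realistic droplets, and a professional overall look

Case 3: Book cover

Prompt: "Python编程书籍封面，现代简约风格，蓝色调，代码元素，白色背景" (Python programming book cover, modern minimalist style, blue palette, code motifs, white background)
Result: strong design and a harmonious palette, but the rendered text is not yet crisp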

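5.2 Challenges and Workarounds

I also ran into a few problems in practice:

1. Output resolution is fixed at 384×384

Problem: that is on the low side for an e-commerce hero image
Workaround: post-process with a super-resolution model (such as Real-ESRGAN) to reach 768×768 or higher

2. Chinese prompts give unstable results

Problem: Chinese descriptions sometimes produce worse images than English ones
Workaround: mix Chinese and English in the prompt, or have the model translate the prompt into English before generating

3. Generation is slow

Problem: one image takes 10-20 seconds, so batches take a while
Workaround: generate several images per call, or switch to the smaller 1B version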
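The fixed resolution and the per-image latency both follow from how many tokens the model has to sample. The arithmetic below is a small sketch that assumes one image token per 16×16 patch, as the defaults in `_generate_images` (`img_size=384`, `patch_size=16`, 576 tokens) imply. The function names are illustrative only.

```python
def image_tokens(img_size=384, patch_size=16):
    """Tokens sampled per square image: one per patch."""
    side = img_size // patch_size
    return side * side


def upscale_factor(target_side, img_size=384):
    """Scale factor a super-resolution step must supply to reach target_side pixels."""
    return target_side / img_size


def batch_minutes(num_images, seconds_per_image):
    """Wall-clock minutes if images are generated one after another."""
    return num_images * seconds_per_image / 60


print(image_tokens())          # 576 tokens, a 24 x 24 grid
print(upscale_factor(768))     # 2.0
print(batch_minutes(60, 10), batch_minutes(60, 20))  # 10.0 20.0
```

Each of the 576 tokens needs its own forward pass through the language model, which is where the 10-20 seconds per image goes, and 60 images at that rate take 10 to 20 minutes when generated sequentially.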

5.3 Performance Optimization Tips

    # Generation wrapper with prompt grouping and caching
    # (JanusModel comes from section 4.1)
    import os
    import torch


    class OptimizedJanusModel(JanusModel):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.prompt_cache = {}  # optimized-prompt cache
            self.image_cache = {}   # generated-image cache, keyed by prompt

        def batch_generate(self, prompts, output_dir="batch_output"):
            """Generate images for many prompts, reusing cached results."""
            os.makedirs(output_dir, exist_ok=True)
            all_images = []
            batch_size = 2  # adjust to the available GPU memory

            for i in range(0, len(prompts), batch_size):
                batch_prompts = prompts[i:i+batch_size]
                print(f"Batch {i//batch_size + 1}: {len(batch_prompts)} prompts")

                batch_images = []
                for prompt in batch_prompts:
                    # Check the cache first
                    if prompt in self.image_cache:
                        batch_images.extend(self.image_cache[prompt])
                    else:
                        images = self.generate_product_image(prompt, output_dir, num_images=2)
                        self.image_cache[prompt] = images
                        batch_images.extend(images)
                all_images.extend(batch_images)

                # Release cached GPU memory between batches
                if torch.cuda.is_available():
                    torch.cuda.empty_cache()
            return all_images

        def optimize_prompt(self, original_prompt):
            """Append quality and style terms to a prompt that lacks them."""
            if original_prompt in self.prompt_cache:
                return self.prompt_cache[original_prompt]

            # Quality terms: HD, high quality, professional photography, rich detail, 8K
            quality_words = ["高清", "高质量", "专业摄影", "细节丰富", "8K分辨率"]
            # Style terms: marketplace hero image, white background, close-up, commercial photography
            style_words = ["电商主图风格", "白色背景", "产品特写", "商业摄影"]

            optimized = original_prompt
            # Add a quality term if none is present
            if not any(word in original_prompt for word in quality_words):
                optimized += f"，{quality_words[0]}"
            # Add a style term if none is present
            if not any(word in original_prompt for word in style_words):
                optimized += f"，{style_words[0]}"

            # Cache the result
            self.prompt_cache[original_prompt] = optimized
            return optimized
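The in-memory `image_cache` disappears when the process exits, so a rerun regenerates everything. A minimal sketch of a cache that survives restarts is shown below, assuming a JSON file is an acceptable store. The class name `PromptImageCache` is hypothetical. It keys entries by a SHA-256 digest because Python's built-in `hash()` of a string is salted per process and is not stable across runs.

```python
import hashlib
import json
import os


class PromptImageCache:
    """Maps a prompt to the image paths generated for it, persisted as JSON."""

    def __init__(self, cache_file):
        self.cache_file = cache_file
        self.entries = {}
        if os.path.exists(cache_file):
            with open(cache_file, "r", encoding="utf-8") as f:
                self.entries = json.load(f)

    @staticmethod
    def key_for(prompt):
        # A content digest is identical in every run, unlike hash(prompt).
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get(self, prompt):
        """Return cached paths, or None if missing or if any file was deleted."""
        paths = self.entries.get(self.key_for(prompt))
        if paths and all(os.path.exists(p) for p in paths):
            return paths
        return None

    def put(self, prompt, paths):
        """Record the paths for a prompt and write the cache to disk."""
        self.entries[self.key_for(prompt)] = list(paths)
        with open(self.cache_file, "w", encoding="utf-8") as f:
            json.dump(self.entries, f, ensure_ascii=False, indent=2)
```

In `batch_generate`, a call to `cache.get(prompt)` would replace the dictionary lookup, and `cache.put(prompt, images)` would follow each successful generation.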

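5.4 Cost-Benefit Analysis

Let's run the numbers:

Traditional approach:

Designer fee: 50-200 yuan per image
10 products, 3 images each: 10 × 3 × 100 = 3,000 yuan
Turnaround: 3-5 days

Our approach:

Server cost: a GPU instance at about 5 yuan per hour
10 products, 6 generated images each (3 variants × 2 images): 60 images in total
Generation time: about 30 minutes
Total cost: 5 × 0.5 = 2.5 yuan (GPU time for generation only)
Time cost: 1 hour (crawling, processing, and generation included)

Savings:

Direct cost: 3,000 yuan vs 2.5 yuan
Time: 3-5 days vs 1 hour
And images can be regenerated at any time, for unlimited iteration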
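The comparison above can be reproduced, and adapted to other prices, with a few lines of arithmetic. The figures are the ones quoted in this section; the function names are illustrative only.

```python
def designer_cost(num_products, images_per_product, price_per_image):
    """Total fee when every image is made by a designer."""
    return num_products * images_per_product * price_per_image


def gpu_cost(hours, price_per_hour):
    """Rental cost of a GPU instance for the given number of hours."""
    return hours * price_per_hour


def cost_per_image(total_cost, num_images):
    return total_cost / num_images


traditional = designer_cost(10, 3, 100)  # 3000 yuan for 30 images
generated = gpu_cost(0.5, 5)             # 2.5 yuan for 60 images

print(traditional, generated)
print(cost_per_image(traditional, 30), round(cost_per_image(generated, 60), 3))
```

Even if the instance is billed for the full hour of crawling, processing, and generation, `gpu_cost(1, 5)` is still only 5 yuan. The estimate leaves out the time spent on manual review and any super-resolution pass.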

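6. Further Applications

The approach is not limited to e-commerce; it extends to many other areas:

6.1 Content Creation

Images for self-published media
Article illustrations
Social media content

6.2 Education and Training

Images for teaching materials
Slide illustrations
Flashcards

6.3 Marketing and Advertising

Ad creatives
Promotional posters
Images for campaign pages

6.4 Product Design

Concept images
Design inspiration
Visualizing proposals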

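7. Summary and Outlook

Combining a Python crawler with Janus-Pro-7B opens up a lot of possibilities. In practice, the biggest advantages are very low cost and high speed, which makes the approach a good fit for any scenario that needs images in bulk.

It is worth staying objective, though: AI-generated images still have room to improve in the details, such as the legibility of text and the accuracy of complex structures. As the models keep evolving, these issues are likely to improve gradually.

For anyone who wants to try this, my suggestions are:

Start small: process 10-20 products first to get familiar with the whole workflow
Experiment with prompts: different phrasings give very different results
Keep a human in the loop: have someone review and filter the generated images
Follow model updates: the open-source community moves fast, and new versions bring improvements

Going forward, I plan to build more features on top of this, such as automatic A/B testing of images in different styles, tuning the generation strategy from click-through feedback, and support for video generation. AI tools are changing how content gets produced, and picking up these skills early gives you a head start on the competition.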
