A Step-by-Step CRITIC Weighting Tutorial: The Full Python Workflow from Data Normalization to Result Visualization - 平芜编程栈

CRITIC Weighting in Practice: An End-to-End Python Workflow for Data Evaluation and Visualization

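When a dataset contains several evaluation indicators, how do you decide how much each one should count? The CRITIC weighting method offers an objective scheme based on properties of the data itself. Unlike subjective scoring, which is easily swayed by human judgment, CRITIC derives the weights automatically from each indicator's contrast intensity and its conflict with the other indicators, which makes it a good fit for multi-indicator evaluation problems where the indicators are correlated.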

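1. Environment Setup and Understanding the Data

Before starting, make sure the following libraries are installed in your Python environment: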

pip install numpy pandas matplotlib seaborn

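Suppose we have an urban development assessment dataset covering 6 indicators for 5 cities:

City	GDP (100 million yuan)	Per capita income (10,000 yuan)	Air quality index	Unemployment rate (%)	Medical resource index	Education spending share
City A	1200	6.8	85	3.2	78	4.1
City B	980	5.9	92	4.1	65	3.8
City C	1500	7.5	78	2.8	82	4.5
City D	850	5.2	88	5.0	60	3.5
City E	1100	6.5	82	3.5	75	4.0

Note: the type of each indicator must be decided up front. GDP, per capita income, medical resource index, and education spending share are positive indicators (larger is better); air quality index and unemployment rate are negative indicators (smaller is better).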
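One way to keep the indicator types tied to the right columns is to hold the table in a pandas DataFrame and derive the type list from the column order. The sketch below is only an illustration: the English column names and the `indicator_direction` mapping are my own labels, not part of the original dataset.

```python
import pandas as pd

# The table above as a DataFrame, indexed by city (column names are illustrative labels)
df = pd.DataFrame(
    {
        "GDP": [1200, 980, 1500, 850, 1100],
        "income_per_capita": [6.8, 5.9, 7.5, 5.2, 6.5],
        "air_quality_index": [85, 92, 78, 88, 82],
        "unemployment_rate": [3.2, 4.1, 2.8, 5.0, 3.5],
        "medical_resource_index": [78, 65, 82, 60, 75],
        "education_spending_share": [4.1, 3.8, 4.5, 3.5, 4.0],
    },
    index=["City A", "City B", "City C", "City D", "City E"],
)

# Hypothetical mapping: 1 = positive indicator, 2 = negative indicator
indicator_direction = {
    "GDP": 1,
    "income_per_capita": 1,
    "air_quality_index": 2,
    "unemployment_rate": 2,
    "medical_resource_index": 1,
    "education_spending_share": 1,
}

# Build the type list in the DataFrame's column order so the two cannot drift apart
indicators_type = [indicator_direction[col] for col in df.columns]
raw_data = df.to_numpy(dtype=float)

print(indicators_type)  # [1, 1, 2, 2, 1, 1]
print(raw_data.shape)   # (5, 6)
```

Looking the type up by column name means that reordering or dropping a column does not silently mislabel an indicator.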

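2. Data Preprocessing and Normalization

Indicators of different types must be normalized first to remove the effect of their units. We use min-max normalization: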

import numpy as np
import pandas as pd

def data_normalization(data, indicators_type):
    """
    Min-max normalize each indicator column.

    :param data: raw data matrix (rows = objects, columns = indicators)
    :param indicators_type: list of indicator types, 1 = positive, 2 = negative
    :return: normalized matrix
    """
    normalized_data = np.zeros_like(data, dtype=float)
    for i in range(data.shape[1]):
        col = data[:, i]
        max_val, min_val = np.max(col), np.min(col)
        if max_val == min_val:
            # A constant column would cause a division by zero below
            raise ValueError(f"Indicator column {i} is constant and cannot be min-max normalized")
        if indicators_type[i] == 1:  # positive indicator
            normalized_data[:, i] = (col - min_val) / (max_val - min_val)
        else:  # negative indicator
            normalized_data[:, i] = (max_val - col) / (max_val - min_val)
    return normalized_data

# Example data
raw_data = np.array([
    [1200, 6.8, 85, 3.2, 78, 4.1],
    [980, 5.9, 92, 4.1, 65, 3.8],
    [1500, 7.5, 78, 2.8, 82, 4.5],
    [850, 5.2, 88, 5.0, 60, 3.5],
    [1100, 6.5, 82, 3.5, 75, 4.0]
])
indicators_type = [1, 1, 2, 2, 1, 1]  # indicator types

normalized_data = data_normalization(raw_data, indicators_type)
print("Normalized matrix:\n", normalized_data)

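After normalization, every value lies in the interval [0, 1] and every indicator points in the same direction (larger is better), which makes the columns directly comparable.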
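To make the two formulas concrete, the following sketch normalizes one positive column (GDP) and one negative column (air quality index) from the table by hand. It uses only NumPy; the variable names are illustrative.

```python
import numpy as np

# Two columns taken from the city table, in the order City A .. City E
gdp = np.array([1200, 980, 1500, 850, 1100], dtype=float)  # positive indicator
aqi = np.array([85, 92, 78, 88, 82], dtype=float)          # negative indicator

# Positive indicator: (x - min) / (max - min)
gdp_norm = (gdp - gdp.min()) / (gdp.max() - gdp.min())

# Negative indicator: (max - x) / (max - min), so smaller raw values score higher
aqi_norm = (aqi.max() - aqi) / (aqi.max() - aqi.min())

print(np.round(gdp_norm, 3))
print(np.round(aqi_norm, 3))
```

For City A, GDP maps to (1200 - 850) / (1500 - 850) = 350/650 ≈ 0.538, and the air quality index maps to (92 - 85) / (92 - 78) = 0.5. City C has both the highest GDP and the lowest air quality index, so it gets 1.0 in both columns.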

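3. Core Steps of the CRITIC Weight Calculation

CRITIC weights rest on two key concepts:

Contrast intensity: the spread of values within an indicator, measured by its standard deviation
Conflict: how strongly an indicator is correlated with the other indicators, measured by correlation coefficients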
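In symbols, the two quantities and the weight built from them in Section 3.3 can be written as follows. The notation is my own shorthand, chosen to match the code in this article: $x_{ij}$ is the normalized value of object $i$ on indicator $j$, with $n$ objects and $m$ indicators, and $r_{jk}$ is the Pearson correlation between indicators $j$ and $k$.

```latex
\begin{aligned}
\sigma_j &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_{ij}-\bar{x}_j\right)^2}
  && \text{(contrast intensity)} \\
R_j &= \sum_{k=1}^{m}\left(1 - r_{jk}\right)
  && \text{(conflict)} \\
C_j &= \sigma_j \, R_j
  && \text{(information content)} \\
w_j &= \frac{C_j}{\sum_{k=1}^{m} C_k}
  && \text{(weight)}
\end{aligned}
```

Since $r_{jj} = 1$, the diagonal term contributes nothing to $R_j$; only the correlations with the other indicators matter.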

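3.1 Computing Contrast Intensity

Contrast intensity reflects how much the evaluated objects differ from one another on a given indicator: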

def calculate_contrast_intensity(normalized_data):
    """Compute contrast intensity (sample standard deviation of each column)."""
    return np.std(normalized_data, axis=0, ddof=1)

contrast = calculate_contrast_intensity(normalized_data)
print("Contrast intensity per indicator:\n", contrast)
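A natural question is whether the choice of `ddof=1` (sample standard deviation) over `ddof=0` (population standard deviation) affects the result. The two differ only by the constant factor sqrt(n / (n - 1)), which is the same for every column, so it cancels once the weights are normalized in Section 3.3. The sketch below checks this on a small made-up matrix; `critic_weights` is a hypothetical helper written for this check only.

```python
import numpy as np

# Made-up normalized matrix: 4 objects, 3 indicators (illustration only)
x = np.array([
    [0.0, 1.0, 0.2],
    [0.5, 0.0, 1.0],
    [1.0, 0.4, 0.0],
    [0.3, 0.8, 0.6],
])

def critic_weights(x, ddof):
    """Contrast intensity times conflict, normalized to sum to 1."""
    sigma = np.std(x, axis=0, ddof=ddof)
    conflict = np.sum(1 - np.corrcoef(x, rowvar=False), axis=0)
    info = sigma * conflict
    return info / info.sum()

w_sample = critic_weights(x, ddof=1)
w_population = critic_weights(x, ddof=0)

print(np.allclose(w_sample, w_population))  # True
```

The intermediate contrast values do differ between the two settings, so stay consistent if you report them alongside the weights.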

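3.2 Computing the Conflict Measure

Conflict reflects how an indicator correlates with the others. It is computed from the Pearson correlation matrix by summing 1 - r over each column: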

def calculate_conflict(normalized_data):
    """Compute the conflict of each indicator from the Pearson correlation matrix."""
    corr_matrix = np.corrcoef(normalized_data, rowvar=False)
    return np.sum(1 - corr_matrix, axis=0)

conflict = calculate_conflict(normalized_data)
print("Conflict per indicator:\n", conflict)
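A tiny made-up example shows how the conflict value behaves at the extremes. In the sketch below, columns `a` and `b` are perfectly positively correlated, while `c` is perfectly negatively correlated with both; the data are invented purely for illustration.

```python
import numpy as np

# Made-up data: b = 2 * a (r = +1), c decreases as a increases (r = -1)
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 4.0, 6.0, 8.0])
c = np.array([4.0, 3.0, 2.0, 1.0])
data = np.column_stack([a, b, c])

corr = np.corrcoef(data, rowvar=False)   # 3 x 3 Pearson correlation matrix
conflict = np.sum(1 - corr, axis=0)      # sum of (1 - r) over each column

print(np.round(corr, 3))
print(np.round(conflict, 3))  # approximately [2. 2. 4.]
```

Each perfectly correlated pair contributes 0 and each perfectly anti-correlated pair contributes 2, so `c`, which disagrees with both other columns, gets the largest conflict value.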

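3.3 Computing Information Content and Weights

The information content of an indicator is the product of its contrast intensity and its conflict; the weights are the information content values normalized to sum to 1: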

def calculate_critic_weights(contrast, conflict):
    """Compute CRITIC weights."""
    information_capacity = contrast * conflict
    weights = information_capacity / np.sum(information_capacity)
    return weights

weights = calculate_critic_weights(contrast, conflict)
print("Weight per indicator:\n", weights)
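For reporting, it can help to see the weights labelled and sorted. The sketch below recomputes the three steps in vectorized form on the city data and wraps the result in a pandas Series. It is a compact restatement under the same formulas, not a replacement for the functions above, and the English indicator names are illustrative labels.

```python
import numpy as np
import pandas as pd

raw = np.array([
    [1200, 6.8, 85, 3.2, 78, 4.1],
    [980, 5.9, 92, 4.1, 65, 3.8],
    [1500, 7.5, 78, 2.8, 82, 4.5],
    [850, 5.2, 88, 5.0, 60, 3.5],
    [1100, 6.5, 82, 3.5, 75, 4.0],
])
names = ["GDP", "Per capita income", "Air quality", "Unemployment rate",
         "Medical resources", "Education spending"]
is_positive = np.array([True, True, False, False, True, True])

# Min-max normalization, flipping negative indicators
lo, hi = raw.min(axis=0), raw.max(axis=0)
norm = np.where(is_positive, (raw - lo) / (hi - lo), (hi - raw) / (hi - lo))

# Contrast intensity, conflict, information content, weights
sigma = norm.std(axis=0, ddof=1)
conflict = (1 - np.corrcoef(norm, rowvar=False)).sum(axis=0)
info = sigma * conflict
weights = pd.Series(info / info.sum(), index=names).sort_values(ascending=False)

print(weights.round(3))
```

Sorting the Series makes it easy to see which indicators dominate before moving on to the charts.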

4. Visualizing and Analyzing the Results

4.1 Bar Chart of the Weight Distribution

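import matplotlib.pyplot as plt
import seaborn as sns

# Only needed if your labels contain Chinese characters (requires the SimHei font):
# plt.rcParams['font.sans-serif'] = ['SimHei']
# plt.rcParams['axes.unicode_minus'] = False

# Indicator names
indicators = ['GDP', 'Per capita income', 'Air quality', 'Unemployment rate',
              'Medical resources', 'Education spending']

# Draw the weight bar chart
# Note: seaborn 0.13+ emits a FutureWarning for palette without hue
# and suggests passing hue=<the x variable> with legend=False instead
plt.figure(figsize=(10, 6))
sns.barplot(x=indicators, y=weights, palette="viridis")
plt.title('CRITIC Weight Distribution by Indicator')
plt.ylabel('Weight')
plt.xticks(rotation=45)

# Annotate each bar with its weight
for i, v in enumerate(weights):
    plt.text(i, v + 0.01, f"{v:.3f}", ha='center')

plt.tight_layout()
plt.show()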

4.2 Radar Chart of City Scores

Compute each city's composite score and visualize it:

def calculate_scores(normalized_data, weights):
    """Compute composite scores."""
    scores = np.dot(normalized_data, weights)
    return 100 * scores / np.max(scores)  # rescale so the top score is 100

scores = calculate_scores(normalized_data, weights)
print("Composite score per city:\n", scores)

# Radar chart
def plot_radar_chart(cities, scores):
    angles = np.linspace(0, 2 * np.pi, len(cities), endpoint=False).tolist()
    angles += angles[:1]  # close the polygon

    fig, ax = plt.subplots(figsize=(8, 8), subplot_kw=dict(polar=True))
    ax.fill(angles, [*scores, scores[0]], alpha=0.25)
    ax.plot(angles, [*scores, scores[0]], marker='o')

    ax.set_theta_offset(np.pi / 2)   # start at the top
    ax.set_theta_direction(-1)       # go clockwise
    ax.set_thetagrids(np.degrees(angles[:-1]), cities)

    ax.set_rlabel_position(0)
    plt.yticks([20, 40, 60, 80, 100], ["20", "40", "60", "80", "100"], color="grey", size=7)
    plt.ylim(0, 110)

    plt.title('Radar Chart of Composite City Development Scores', pad=20)
    plt.show()

cities = ['City A', 'City B', 'City C', 'City D', 'City E']
plot_radar_chart(cities, scores)
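The scoring step is a weighted sum followed by a rescale, which a small example makes easy to verify by hand. The values below are made up for illustration (three objects, two indicators, weights 0.75 and 0.25) and are not taken from the city data.

```python
import numpy as np
import pandas as pd

# Made-up normalized values for three objects and two indicators (illustration only)
norm = np.array([
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
])
weights = np.array([0.75, 0.25])  # made-up weights that sum to 1

raw_scores = norm @ weights                   # weighted sum per object: 0.75, 0.5, 0.25
scaled = 100 * raw_scores / raw_scores.max()  # best object becomes 100

ranking = pd.Series(scaled, index=["X", "Y", "Z"]).sort_values(ascending=False)
print(ranking.round(2))
```

Because the scores are divided by the maximum, the top object always scores 100. The numbers are therefore relative to the best object in this dataset and should not be compared across different datasets.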

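5. Practical Considerations

Indicator types: always distinguish positive from negative indicators correctly. A wrong type flips the sign of that indicator's correlations with the others, which distorts its conflict value, the weights, and the final scores
Data quality checks:
- Handle missing values (drop them or fill them sensibly)
- Check for outliers (identify them with box plots or the 3σ rule)
Interpreting the results:
- High-weight indicators usually show large internal variation and low correlation with the other indicators
- Low-weight indicators may vary little or be highly correlated with other indicators
Comparison with other methods:
- Unlike the entropy weight method, CRITIC takes the correlation between indicators into account
- Unlike AHP, CRITIC derives the weights entirely from the data, avoiding subjective bias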
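The data quality checks in the list above can be sketched with pandas as follows. The DataFrame is made up for illustration, median filling is just one reasonable choice, and the outlier test shown is the interquartile-range rule that box plots use for their whiskers.

```python
import numpy as np
import pandas as pd

# Made-up data with one missing value and one extreme value (illustration only)
df = pd.DataFrame({
    "indicator_a": [10.0, 11.0, np.nan, 9.5, 10.5, 10.2, 9.8, 50.0],
    "indicator_b": [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1],
})

# 1. Count missing values per column
missing_per_column = df.isna().sum()

# 2. Fill missing values with the column median (one option among several)
filled = df.fillna(df.median())

# 3. Flag outliers with the IQR rule: outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
q1, q3 = filled.quantile(0.25), filled.quantile(0.75)
iqr = q3 - q1
outlier_mask = (filled < q1 - 1.5 * iqr) | (filled > q3 + 1.5 * iqr)

print(missing_per_column)
print(outlier_mask.sum())
```

Here `indicator_a` has one missing value and one flagged outlier (the value 50.0), while `indicator_b` has neither. Outliers deserve attention before min-max normalization in particular, because a single extreme value sets the minimum or maximum and compresses every other value in that column.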

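# Example: wrapping the full workflow
def critic_method(data, indicators_type, indicator_names=None):
    """Run the full CRITIC workflow and return all intermediate results."""
    # Normalize the data
    normalized_data = data_normalization(data, indicators_type)

    # Contrast intensity
    contrast = calculate_contrast_intensity(normalized_data)

    # Conflict
    conflict = calculate_conflict(normalized_data)

    # Weights
    weights = calculate_critic_weights(contrast, conflict)

    # Composite scores
    scores = calculate_scores(normalized_data, weights)

    result = {
        'normalized_data': normalized_data,
        'contrast': contrast,
        'conflict': conflict,
        'weights': weights,
        'scores': scores
    }

    # Optionally label the weights when indicator names are supplied
    if indicator_names is not None:
        result['weights_by_name'] = dict(zip(indicator_names, weights))

    return result

# Usage example
result = critic_method(raw_data, indicators_type)
print("Final weights:", result['weights'])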

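In real projects, I have found CRITIC especially well suited to evaluation problems where the indicators are correlated to some degree. In one regional economic development assessment, the method showed that although the GDP indicator varied widely, its high correlation with the other indicators left its final weight lower than expected. That finding gave us a fuller picture of the main drivers of regional development differences.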