yolov26改进 | Neck/颈部改进篇 | 利用Damo-YOLO的RepGFPN改进特征融合网络结构（含独家整理版本）-平芜编程栈

开始讲解之前推荐一下我的专栏，本专栏的内容支持(分类、检测、分割、追踪、关键点检测),专栏目前为限时折扣，欢迎大家订阅本专栏，本专栏每周更新5-7篇最新机制，更有包含我所有改进的文件和交流群提供给大家，本人定期在群内分享发表论文方法和经验。

一、本文介绍

本文给大家带来的最新改进机制是Damo-YOLO的RepGFPN（重参数化泛化特征金字塔网络），利用其优化YOLOv26的Neck部分，可以在不影响计算量的同时大幅度涨点（亲测在小目标和大目标检测的数据集上效果均表现良好）。RepGFPN不同于以往提出的改进模块，其更像是一种结构一种思想(一种处理事情的方法)，RepGFPN相对于BiFPN和之前的FPN均有一定程度上的优化效果，本文含两个版本一个是个人总结的使用方法，另一个是官方的使用方法。

专栏链接：YOLOv26有效涨点专栏包含：Conv、注意力机制、主干/Backbone、损失函数、优化器、后处理等改进机制

一、本文介绍

二、RepGFPN的框架原理

三、RepGFPN的核心代码

四、手把手教你添加RepGFPN

4.1 修改一

4.2 修改二

4.3 修改三

4.4 修改四

4.5 修改五

五、正式训练

5.1 yaml文件

5.1.1 yaml文件1

5.1.2 yaml文件2

5.2 训练代码

5.3 训练过程截图

五、本文总结

二、RepGFPN的框架原理

官方论文地址：官方论文地址

官方代码地址：官方代码地址

RepGFPN（重参数化泛化特征金字塔网络）是DAMO-YOLO框架中用于实时目标检测的新方法。其主要主要原理是：RepGFPN改善了用于目标检测的特征金字塔网络（FPN）的概念，更高效地融合多尺度特征，对于捕捉高层语义和低层空间细节至关重要。

其主要改进机制包括->

不同尺度通道：它为不同尺度的特征图采用不同的通道维度，优化了计算资源下的性能。
优化的皇后融合机制：该方法通过修改的皇后融合机制增强了特征交互，通过去除额外的上采样操作减少延迟。
整合CSPNet和ELAN：它结合了CSPNet和高效层聚合网络（ELAN）以及重参数化，改善了特征融合，而不显著增加计算需求。

总结：RepGFPN更像是一种结构一种思想，其中的模块我们是可以用其它的机制替换的。

下面的图片是Damo-YOLO的网络结构图，其中我用红框标出来的部分就是RepGFPN的路径聚合图。

根据图片我们来说一下GFPN(重参数化特征金字塔网络)：作为“颈部（也就是YOLOv8中的neck），用于优化和融合高层语义和低层空间特征。

在左上角的融合块（Fusion Block）中，我们可以看到反复出现的结构单元，它们由多个1x1卷积，一个3x3卷积组成，这些卷积后面通常跟着批量归一化（BN）和激活函数（Act）。这个复合结构在训练时和推理时有所不同，这是通过“简化Rep 3x3”结构来实现的，它在训练时使用3x3卷积，而在推理时则简化为1x1卷积，以提高效率(现在很多结构都使用在何种思想训练时候用复杂的模块，推理时换为简单的模块，这在大家自己的改进中也可以是一种思想)。

三、RepGFPN的核心代码

下面的代码是GFPN的核心代码，我们将其复制导'ultralytics/nn/modules'目录下，在其中创建一个文件，我这里起名为GFPN然后粘贴进去，其余使用方式看章节四。

import torch import torch.nn as nn import numpy as np class swish(nn.Module): def forward(self, x): return x * torch.sigmoid(x) def autopad(k, p=None, d=1): # kernel, padding, dilation """Pad to 'same' shape outputs.""" if d > 1: k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k] # actual kernel-size if p is None: p = k // 2 if isinstance(k, int) else [x // 2 for x in k] # auto-pad return p class Conv(nn.Module): """Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation).""" default_act = swish() # default activation def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True): """Initialize Conv layer with given arguments including activation.""" super().__init__() self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False) self.bn = nn.BatchNorm2d(c2) self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity() def forward(self, x): """Apply convolution, batch normalization and activation to input tensor.""" return self.act(self.bn(self.conv(x))) def forward_fuse(self, x): """Perform transposed convolution of 2D data.""" return self.act(self.conv(x)) class RepConv(nn.Module): default_act = swish() # default activation def __init__(self, c1, c2, k=3, s=1, p=1, g=1, d=1, act=True, bn=False, deploy=False): """Initializes Light Convolution layer with inputs, outputs & optional activation function.""" super().__init__() assert k == 3 and p == 1 self.g = g self.c1 = c1 self.c2 = c2 self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity() self.bn = nn.BatchNorm2d(num_features=c1) if bn and c2 == c1 and s == 1 else None self.conv1 = Conv(c1, c2, k, s, p=p, g=g, act=False) self.conv2 = Conv(c1, c2, 1, s, p=(p - k // 2), g=g, act=False) def forward_fuse(self, x): """Forward process.""" return self.act(self.conv(x)) def forward(self, x): """Forward process.""" id_out = 0 if self.bn is None else self.bn(x) return self.act(self.conv1(x) + self.conv2(x) + id_out) def get_equivalent_kernel_bias(self): """Returns equivalent kernel and bias by adding 3x3 kernel, 1x1 kernel and identity kernel with their biases.""" kernel3x3, bias3x3 = self._fuse_bn_tensor(self.conv1) kernel1x1, bias1x1 = self._fuse_bn_tensor(self.conv2) kernelid, biasid = self._fuse_bn_tensor(self.bn) return kernel3x3 + self._pad_1x1_to_3x3_tensor(kernel1x1) + kernelid, bias3x3 + bias1x1 + biasid def _pad_1x1_to_3x3_tensor(self, kernel1x1): """Pads a 1x1 tensor to a 3x3 tensor.""" if kernel1x1 is None: return 0 else: return torch.nn.functional.pad(kernel1x1, [1, 1, 1, 1]) def _fuse_bn_tensor(self, branch): """Generates appropriate kernels and biases for convolution by fusing branches of the neural network.""" if branch is None: return 0, 0 if isinstance(branch, Conv): kernel = branch.conv.weight running_mean = branch.bn.running_mean running_var = branch.bn.running_var gamma = branch.bn.weight beta = branch.bn.bias eps = branch.bn.eps elif isinstance(branch, nn.BatchNorm2d): if not hasattr(self, 'id_tensor'): input_dim = self.c1 // self.g kernel_value = np.zeros((self.c1, input_dim, 3, 3), dtype=np.float32) for i in range(self.c1): kernel_value[i, i % input_dim, 1, 1] = 1 self.id_tensor = torch.from_numpy(kernel_value).to(branch.weight.device) kernel = self.id_tensor running_mean = branch.running_mean running_var = branch.running_var gamma = branch.weight beta = branch.bias eps = branch.eps std = (running_var + eps).sqrt() t = (gamma / std).reshape(-1, 1, 1, 1) return kernel * t, beta - running_mean * gamma / std def fuse_convs(self): """Combines two convolution layers into a single layer and removes unused attributes from the class.""" if hasattr(self, 'conv'): return kernel, bias = self.get_equivalent_kernel_bias() self.conv = nn.Conv2d(in_channels=self.conv1.conv.in_channels, out_channels=self.conv1.conv.out_channels, kernel_size=self.conv1.conv.kernel_size, stride=self.conv1.conv.stride, padding=self.conv1.conv.padding, dilation=self.conv1.conv.dilation, groups=self.conv1.conv.groups, bias=True).requires_grad_(False) self.conv.weight.data = kernel self.conv.bias.data = bias for para in self.parameters(): para.detach_() self.__delattr__('conv1') self.__delattr__('conv2') if hasattr(self, 'nm'): self.__delattr__('nm') if hasattr(self, 'bn'): self.__delattr__('bn') if hasattr(self, 'id_tensor'): self.__delattr__('id_tensor') class BasicBlock_3x3_Reverse(nn.Module): def __init__(self, ch_in, ch_hidden_ratio, ch_out, shortcut=True): super(BasicBlock_3x3_Reverse, self).__init__() assert ch_in == ch_out ch_hidden = int(ch_in * ch_hidden_ratio) self.conv1 = Conv(ch_hidden, ch_out, 3, s=1) self.conv2 = RepConv(ch_in, ch_hidden, 3, s=1) self.shortcut = shortcut def forward(self, x): y = self.conv2(x) y = self.conv1(y) if self.shortcut: return x + y else: return y class SPP(nn.Module): def __init__( self, ch_in, ch_out, k, pool_size ): super(SPP, self).__init__() self.pool = [] for i, size in enumerate(pool_size): pool = nn.MaxPool2d(kernel_size=size, stride=1, padding=size // 2, ceil_mode=False) self.add_module('pool{}'.format(i), pool) self.pool.append(pool) self.conv = Conv(ch_in, ch_out, k) def forward(self, x): outs = [x] for pool in self.pool: outs.append(pool(x)) y = torch.cat(outs, axis=1) y = self.conv(y) return y class CSPStage(nn.Module): def __init__(self, ch_in, ch_out, n, block_fn='BasicBlock_3x3_Reverse', ch_hidden_ratio=1.0, act='silu', spp=False): super(CSPStage, self).__init__() split_ratio = 2 ch_first = int(ch_out // split_ratio) ch_mid = int(ch_out - ch_first) self.conv1 = Conv(ch_in, ch_first, 1) self.conv2 = Conv(ch_in, ch_mid, 1) self.convs = nn.Sequential() next_ch_in = ch_mid for i in range(n): if block_fn == 'BasicBlock_3x3_Reverse': self.convs.add_module( str(i), BasicBlock_3x3_Reverse(next_ch_in, ch_hidden_ratio, ch_mid, shortcut=True)) else: raise NotImplementedError if i == (n - 1) // 2 and spp: self.convs.add_module('spp', SPP(ch_mid * 4, ch_mid, 1, [5, 9, 13])) next_ch_in = ch_mid self.conv3 = Conv(ch_mid * n + ch_first, ch_out, 1) def forward(self, x): y1 = self.conv1(x) y2 = self.conv2(x) mid_out = [y1] for conv in self.convs: y2 = conv(y2) mid_out.append(y2) y = torch.cat(mid_out, axis=1) y = self.conv3(y) return y

四、手把手教你添加RepGFPN

下面的步骤如果你不会或者不想麻烦操作，可以联系作者获得本专栏添加所有项目文件的源代码，可直接训练.

4.1 修改一

第一还是建立文件，我们找到如下ultralytics/nn文件夹下建立一个目录名字呢就是'Addmodules'文件夹！

4.2 修改二

然后在Addmodules文件夹内建立一个新的py文件，将本文章节三中的“核心代码"复制粘贴进去。

4.3 修改三

第二步我们在该目录下创建一个新的py文件名字为'__init__.py'，然后在其内部导入我们的文件，如下图所示。

4.4 修改四

第三步我门中到如下文件'ultralytics/nn/tasks.py'进行导入和注册我们的模块(此处只需要添加一次即可，如果你用我其它的改进机制这里的步骤只需要添加一次)！

4.5 修改五

在'ultralytics/nn/tasks.py'文件内的parse_model方法函数内（位置大概在1500+行左右），按照图示位置添加即可（此处需要自己有一定的判别能力，如果不会可联系作者获得视频教程）。

五、正式训练

5.1 yaml文件

5.1.1 yaml文件1

训练信息：YOLO26-Neck-DamoYOLO-1 summary: 287 layers, 2,823,196 parameters, 2,823,196 gradients, 6.2 GFLOPs

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license # Ultralytics YOLO26 object detection model with P3/8 - P5/32 outputs # Model docs: https://docs.ultralytics.com/models/yolo26 # Task docs: https://docs.ultralytics.com/tasks/detect # Parameters nc: 80 # number of classes end2end: True # whether to use end-to-end mode reg_max: 1 # DFL bins scales: # model compound scaling constants, i.e. 'model=yolo26n.yaml' will call yolo26.yaml with scale 'n' # [depth, width, max_channels] n: [0.50, 0.25, 1024] # summary: 260 layers, 2,572,280 parameters, 2,572,280 gradients, 6.1 GFLOPs s: [0.50, 0.50, 1024] # summary: 260 layers, 10,009,784 parameters, 10,009,784 gradients, 22.8 GFLOPs m: [0.50, 1.00, 512] # summary: 280 layers, 21,896,248 parameters, 21,896,248 gradients, 75.4 GFLOPs l: [1.00, 1.00, 512] # summary: 392 layers, 26,299,704 parameters, 26,299,704 gradients, 93.8 GFLOPs x: [1.00, 1.50, 512] # summary: 392 layers, 58,993,368 parameters, 58,993,368 gradients, 209.5 GFLOPs # YOLO26n backbone backbone: # [from, repeats, module, args] - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2 - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4 - [-1, 2, C3k2, [256, False, 0.25]] - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8 - [-1, 2, C3k2, [512, False, 0.25]] - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16 - [-1, 2, C3k2, [512, True]] - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32 - [-1, 2, C3k2, [1024, True]] - [-1, 1, SPPF, [1024, 5, 3, True]] # 9 - [-1, 2, C2PSA, [1024]] # 10 # DAMO-YOLO GFPN Head head: - [-1, 1, Conv, [512, 1, 1]] # 11 - [6, 1, Conv, [512, 3, 2]] - [[-1, 11], 1, Concat, [1]] - [-1, 2, C3k2, [512, True]] # 14 - [-1, 1, nn.Upsample, [None, 2, 'nearest']] #15 - [4, 1, Conv, [256, 3, 2]] # 16 - [[15, -1, 6], 1, Concat, [1]] - [-1, 2, C3k2, [512, True]] # 18 - [-1, 1, nn.Upsample, [None, 2, 'nearest']] - [[-1, 4], 1, Concat, [1]] - [-1, 2, C3k2, [256, True]] # 21 - [-1, 1, Conv, [256, 3, 2]] - [[-1, 18], 1, Concat, [1]] - [-1, 2, C3k2, [512, True]] # 24 - [18, 1, Conv, [256, 3, 2]] # 25 - [24, 1, Conv, [256, 3, 2]] # 26 - [[14, 25, -1], 1, Concat, [1]] - [-1, 2, C3k2, [1024, True, 0.5, True]] # 28 - [[21, 24, 28], 1, Detect, [nc]] # Detect(P3, P4, P5)

5.1.2 yaml文件2

训练信息：YOLO26-Neck-DamoYOLO-2 summary: 265 layers, 2,916,444 parameters, 2,916,444 gradients, 6.5 GFLOPs

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license # Ultralytics YOLO26 object detection model with P3/8 - P5/32 outputs # Model docs: https://docs.ultralytics.com/models/yolo26 # Task docs: https://docs.ultralytics.com/tasks/detect # Parameters nc: 80 # number of classes end2end: True # whether to use end-to-end mode reg_max: 1 # DFL bins scales: # model compound scaling constants, i.e. 'model=yolo26n.yaml' will call yolo26.yaml with scale 'n' # [depth, width, max_channels] n: [0.50, 0.25, 1024] # summary: 260 layers, 2,572,280 parameters, 2,572,280 gradients, 6.1 GFLOPs s: [0.50, 0.50, 1024] # summary: 260 layers, 10,009,784 parameters, 10,009,784 gradients, 22.8 GFLOPs m: [0.50, 1.00, 512] # summary: 280 layers, 21,896,248 parameters, 21,896,248 gradients, 75.4 GFLOPs l: [1.00, 1.00, 512] # summary: 392 layers, 26,299,704 parameters, 26,299,704 gradients, 93.8 GFLOPs x: [1.00, 1.50, 512] # summary: 392 layers, 58,993,368 parameters, 58,993,368 gradients, 209.5 GFLOPs # YOLO26n backbone backbone: # [from, repeats, module, args] - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2 - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4 - [-1, 2, C3k2, [256, False, 0.25]] - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8 - [-1, 2, C3k2, [512, False, 0.25]] - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16 - [-1, 2, C3k2, [512, True]] - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32 - [-1, 2, C3k2, [1024, True]] - [-1, 1, SPPF, [1024, 5, 3, True]] # 9 - [-1, 2, C2PSA, [1024]] # 10 # DAMO-YOLO GFPN Head head: - [-1, 1, Conv, [512, 1, 1]] # 11 - [6, 1, Conv, [512, 3, 2]] - [[-1, 11], 1, Concat, [1]] - [-1, 2, CSPStage, [512]] # 14 - [-1, 1, nn.Upsample, [None, 2, 'nearest']] #15 - [4, 1, Conv, [256, 3, 2]] # 16 - [[15, -1, 6], 1, Concat, [1]] - [-1, 2, CSPStage, [512]] # 18 - [-1, 1, nn.Upsample, [None, 2, 'nearest']] - [[-1, 4], 1, Concat, [1]] - [-1, 2, CSPStage, [256]] # 21 - [-1, 1, Conv, [256, 3, 2]] - [[-1, 18], 1, Concat, [1]] - [-1, 2, CSPStage, [512]] # 24 - [18, 1, Conv, [256, 3, 2]] # 25 - [24, 1, Conv, [256, 3, 2]] # 26 - [[14, 25, -1], 1, Concat, [1]] - [-1, 2, CSPStage, [1024]] # 28 - [[21, 24, 28], 1, Detect, [nc]] # Detect(P3, P4, P5)

5.2 训练代码

大家可以创建一个py文件将我给的代码复制粘贴进去，配置好自己的文件路径即可运行。

import warnings warnings.filterwarnings('ignore') from ultralytics import YOLO if __name__ == '__main__': model = YOLO('模型配置文件地址,也就是5.1你保存到本地文件的地址') # 如何切换模型版本, 上面的ymal文件可以改为 yolo26s.yaml就是使用的26s, # 类似某个改进的yaml文件名称为yolo26-XXX.yaml那么如果想使用其它版本就把上面的名称改为yolo26l-XXX.yaml即可（改的是上面YOLO中间的名字不是配置文件的）！ # model.load('yolo26n.pt') # 是否加载预训练权重,科研不建议大家加载否则很难提升精度 model.train( data=r"数据集文件地址", # 如果大家任务是其它的'ultralytics/cfg/default.yaml'找到这里修改task可以改成detect, segment, classify, pose cache=False, imgsz=640, epochs=20, single_cls=False, # 是否是单类别检测 batch=16, close_mosaic=0, workers=0, device='0', optimizer='MuSGD', # using SGD/MuSGD # resume=, # 这里是填写last.pt地址 amp=True, # 如果出现训练损失为Nan可以关闭amp project='runs/train', name='exp', )