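In a previous post, we built a defect detection and segmentation system for culvert structures on top of the n-series model, the most lightweight member of the YOLOv5 family. If you are interested, you can read it here:
"Building a Culvert Wall Defect Detection and Segmentation System Based on the Lightweight yolov5n"
The core idea of this post is close to that one: helping automate manual operation and maintenance work in tunnel inspection and quality-check scenarios. Beyond that, this post trains the full YOLOv5 series at every parameter scale, so that we can analyze and compare how to choose the most suitable model during real application development.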
First, a look at the results:
Next, the dataset:
The data was collected from real scenes and annotated with labelme.
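As a rough illustration of how labelme polygons can end up in the YOLO segmentation label format, here is a minimal sketch. It assumes the standard labelme JSON keys (`shapes`, `label`, `points`, `shape_type`, `imageWidth`, `imageHeight`). The class names `class_a` and `class_b` and the function name are hypothetical placeholders, not the actual names used in this project.

```python
import json


def labelme_to_yolo_seg(labelme_json: str, class_names: list) -> list:
    """Convert one labelme annotation (as JSON text) into YOLO segmentation label lines."""
    data = json.loads(labelme_json)
    width, height = data["imageWidth"], data["imageHeight"]
    lines = []
    for shape in data["shapes"]:
        if shape.get("shape_type", "polygon") != "polygon":
            continue  # only polygons describe a segmentation outline
        class_id = class_names.index(shape["label"])
        coords = []
        for x, y in shape["points"]:
            # normalize pixel coordinates by the image size
            coords += [x / width, y / height]
        lines.append(" ".join([str(class_id)] + [f"{v:.6f}" for v in coords]))
    return lines


# Hypothetical annotation with a single triangular polygon.
example = json.dumps({
    "imageWidth": 200,
    "imageHeight": 100,
    "shapes": [
        {"label": "class_b", "shape_type": "polygon",
         "points": [[20, 10], [100, 50], [40, 90]]},
    ],
})
yolo_lines = labelme_to_yolo_seg(example, ["class_a", "class_b"])
print(yolo_lines[0])
```

Each output line is a class index followed by the normalized `x y` pairs of the polygon.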
Sample annotation data is shown below:
2 0.9898554752640356 0.45831017231795446 0.6580739299610895 0.48533135692668766 0.6459143968871596 0.4644864430856649 0.5891699092088197 0.4629423753937373 0.5387947007596813 0.4637144092397011 0.48262923846581435 0.4783830523130134 0.45773114693348155 0.47297881539126674 0.44383453770613307 0.4698906800074115 0.4241476746340559 0.47915508615897723 0.4212525477116917 0.4930516953863258 0.4525199184732258 0.49845593230807245 0.5231610153789142 0.5046322030757829 0.5579025384472855 0.49613983077018103 0.5909069853622383 0.49845593230807245 0.6349129145821752 0.4930516953863258 0.6517046507318881 0.4899635600024705 0.6499675745784695 0.5108084738434934 0.659811006114508 0.5146686430733124 0.6621271076523995 0.5262491507627695 0.6910783768760422 0.5370576246062627 0.7594033722438391 0.5432338953739732 0.8474152306837133 0.5447779630659009 0.9076338706688902 0.5463220307578284 0.9475866221975172 0.5687110122907788 0.9950667037242914 0.5748872830584892 0.9956457291087643 0.4629423753937373
4 0.6730859010270775 1.2249844382197324 0.6781921101774043 1.1919156551509493 0.6913223622782446 1.1666277622159975 0.6811099439775911 1.127723311546841 0.6803804855275444 1.0868736383442266 0.6964285714285714 1.0538048552754435 0.7073704481792717 1.0275443510737627 0.7044526143790849 0.9954481792717087 0.6956991129785247 0.9652972300031123 0.6646628641017992 0.9040328727176327 0.6582941892832289 0.8916061901448123 0.6577672735760971 0.878734827264239 0.6665207749766573 0.8359399315281668 0.6688568694701262 0.8141465354408987 0.6628734827264239 0.7766106442577031 0.6665207749766573 0.7338157485216309 0.6723564425770308 0.6598972922502335 0.6723564425770308 0.6151571739807035 0.6691675365344467 0.6004075951883885 0.671808206581171 0.5948155880306194 0.6738153594771242 0.5383208839091193 0.6767331932773109 0.4799642079053844 0.6701680672268908 0.4157718643012761 0.6657913165266107 0.3739495798319328 0.6716528730490108 0.3566375053848958 0.6722742071776517 0.34586771382178483 0.6727402077741326 0.3359263677635285 0.6760037348272642 0.31851073762838467 0.6744488766278953 0.30755210922225534 0.6746042101600556 0.29636809490671706 0.6792642161248632 0.2907760877489479 0.6794195496570234 0.283112966829042 0.6801962173178248 0.27669251416641816 0.6803804855275444 0.2630718954248366 0.684700889750472 0.2580524903071876 0.6878075603936772 0.2466613646154356 0.6898268963117605 0.24024091195281175 0.6913223622782446 0.21346872082166202 0.7066409897292251 0.1638655462184874 0.7219596171802054 0.13760504201680673 0.7358193277310925 0.23875661375661378 0.741654995331466 0.2436196700902583 0.7394666199813259 0.15219421101774044 0.7489495798319328 0.11523498288204173 0.7606209150326797 0.10064581388110802 0.7744806255835668 0.07049486461251168 0.7701038748832867 0.01700124494242143 0.7102882819794585 0.013110799875505775 0.7212301587301587 0.04131652661064428 0.7190417833800187 0.08022097727980083 0.7146650326797386 0.11426237161531282 0.708829365079365 0.1337145969498911 
0.7098458235753318 0.14307398733628246 0.7000878220140516 0.16746899123948306 0.690736403851158 0.19891144071471942 0.6826047358834244 0.24173822534478273 0.6716269841269842 0.2834807875791483 0.6683743169398907 0.3165495706479314 0.6675611501431175 0.34636568652962096 0.6682355353414852 0.3556019485038275 0.6606492323705438 0.37347124642206614 0.6683743169398907 0.44231936854887677 0.6691874837366641 0.4694249284413219 0.6464188134270101 0.44828259172521473 0.6220238095238096 0.4260560326134097 0.6008814728077023 0.4087084742822448 0.5878708040593287 0.4087084742822448 0.5785193858964351 0.4087084742822448 0.5642889669529014 0.40816636308439597 0.559816549570648 0.4059979182930003 0.5439597970335676 0.41141903027148935 0.5403005464480876 0.4179243646456761 0.5248503773093938 0.4309350333940498 0.5142792089513402 0.4325613669875965 0.4951697892271663 0.43852459016393447 0.4858183710642727 0.44665625813166804 0.467928701535259 0.45424581490155264 0.45695094977881867 0.4607511492757395 0.46955503512880564 0.4618353716714373 0.48378545407233936 0.4574984820886461 0.4996422066094198 0.4488247029230636 0.5110265417642468 0.4396088125596323 0.520784543325527 0.4390667013617834 0.5350149622690606 0.42876658860265426 0.5468058808222743 0.4206349206349207 0.5569704657819413 0.4146716974585828 0.5695745511319282 0.42117703183276956 0.5866510538641686 0.41738225344782726 0.5980353890189957 0.41684014224997834 0.6122658079625294 0.4276823662069564 0.6155184751496228 0.43418770058114325 0.6285291438979964 0.441235146153179 0.6334081446786366 0.4471983693295169 0.6533307311995836 0.4661722612542285 0.6671545667447307 0.485688264376789 0.6691874837366641 0.5073727122907451 0.6671545667447307 0.563752276867031 0.6667479833463441 0.5897736143637783 0.6622755659640907 0.5984473935293608 0.6648181976339596 0.604549822712662 0.6665268664877223 0.6107631639990722 0.6667479833463441 0.625010842223957 0.6671545667447307 0.6575375140948911 0.6639018995576373 0.7274698586173995 
0.6594294821753839 0.7730071992367075 0.6614623991673172 0.792523202359268 0.6641968635053185 0.8044123007588561 0.6643521970374787 0.821602544984591 0.662643528183716 0.8379643437054711 0.6561195198329853 0.8520479172880008 0.6536341833184212 0.8698594956423766 0.6527021821254597 0.8845644033535475 0.6561195198329853 0.8965768631739405 0.6742935430957351 0.9305431288729827 0.6887395615866387 0.9701014017297942 0.6864095586042349 1.0117307883487423 0.6860988915399144 1.0471468336812804 0.6778662143354209 1.0732428670842031 0.6746042101600556 1.0962322298439209 0.6758468784173376 1.1266776021473306 0.6808175514464657 1.1507025217881168 0.6834582214931901 1.165614540875501 0.6753808778208569 1.188396792259005 0.6721188736454915 1.2022732544653212 0.6699442041952479 1.221327501076979 0.6699442041952479 1.2273337309871755
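To make the label format above concrete, the following sketch parses one label line into a class index and a list of polygon vertices. It assumes that each object occupies a single line of the form `class x1 y1 x2 y2 ...`. The helper names are illustrative only, and the two sample strings are shortened prefixes of the lines shown above. Because some `y` values in the second sample exceed 1, a simple range check is included as a sanity test that may be worth running before training.

```python
def parse_seg_label(line: str):
    """Split a YOLO segmentation label line into (class_id, [(x, y), ...])."""
    parts = line.split()
    class_id = int(parts[0])
    values = [float(v) for v in parts[1:]]
    if len(values) < 6 or len(values) % 2 != 0:
        raise ValueError("expected at least 3 (x, y) pairs after the class id")
    return class_id, list(zip(values[0::2], values[1::2]))


def points_outside_unit_square(points):
    """Return the vertices whose normalized coordinates fall outside [0, 1]."""
    return [(x, y) for x, y in points if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0)]


# Shortened prefixes of the two sample annotations above.
sample_a = ("2 0.9898554752640356 0.45831017231795446 0.6580739299610895 "
            "0.48533135692668766 0.6459143968871596 0.4644864430856649")
sample_b = ("4 0.6730859010270775 1.2249844382197324 0.6781921101774043 "
            "1.1919156551509493 0.6913223622782446 1.1666277622159975")

for sample in (sample_a, sample_b):
    class_id, points = parse_seg_label(sample)
    print(class_id, len(points), len(points_outside_unit_square(points)))
```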
All models are trained with exactly the same default configuration. Let's go through the results model by model:
【n】
The model file is as follows:
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 3 # number of classes
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.25 # layer channel multiple
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
# YOLOv5 v6.0 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]
# YOLOv5 v6.0 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 13
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 17 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 14], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 20 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 23 (P5/32-large)
[[17, 20, 23], 1, Segment, [nc, anchors, 32, 256]], # Detect(P3, P4, P5)
]
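The `depth_multiple` and `width_multiple` values are what separate the variants in this post. The sketch below shows how these two multiples are applied to the repeat counts and channel widths written in the YAML, as I understand the model-parsing logic in the YOLOv5 repository (repeats are rounded, channels are rounded up to a multiple of 8). The function names here are my own, so treat this as an approximation rather than the repository's exact code.

```python
import math


def make_divisible(x: float, divisor: int = 8) -> int:
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scaled_repeats(n: int, depth_multiple: float) -> int:
    """Scale a module repeat count; modules with a single repeat are left as-is."""
    return max(round(n * depth_multiple), 1) if n > 1 else n


def scaled_channels(c: int, width_multiple: float) -> int:
    """Scale an output channel count and keep it a multiple of 8."""
    return make_divisible(c * width_multiple, 8)


# (depth_multiple, width_multiple) as listed in the model files of this post.
VARIANTS = {
    "n": (0.33, 0.25),
    "s": (0.33, 0.50),
    "m": (0.67, 0.75),
    "l": (1.00, 1.00),
    "x": (1.33, 1.25),
}


def summarize(variants=VARIANTS):
    """Effective C3 repeats (for 3, 6, 9) and channels (for 64, 256, 1024) per variant."""
    rows = {}
    for name, (depth, width) in variants.items():
        repeats = [scaled_repeats(n, depth) for n in (3, 6, 9)]
        channels = [scaled_channels(c, width) for c in (64, 256, 1024)]
        rows[name] = (repeats, channels)
    return rows


for name, (repeats, channels) in summarize().items():
    print(name, repeats, channels)
```

Under these assumptions the n model runs its backbone C3 blocks 1, 2 and 3 times with at most 256 channels, while the x model runs them 4, 8 and 12 times with up to 1280 channels. That gap is where the difference in parameter count comes from.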
Detailed results are shown below:
【s】
The model file is as follows:
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 80 # number of classes
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.5 # layer channel multiple
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
# YOLOv5 v6.0 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]
# YOLOv5 v6.0 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 13
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 17 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 14], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 20 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 23 (P5/32-large)
[[17, 20, 23], 1, Segment, [nc, anchors, 32, 256]], # Detect(P3, P4, P5)
]
Detailed results are shown below:
【m】
The model file is as follows:
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 80 # number of classes
depth_multiple: 0.67 # model depth multiple
width_multiple: 0.75 # layer channel multiple
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
# YOLOv5 v6.0 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]
# YOLOv5 v6.0 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 13
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 17 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 14], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 20 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 23 (P5/32-large)
[[17, 20, 23], 1, Segment, [nc, anchors, 32, 256]], # Detect(P3, P4, P5)
]
Detailed results are shown below:
【l】
The model file is as follows:
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 80 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
# YOLOv5 v6.0 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]
# YOLOv5 v6.0 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 13
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 17 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 14], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 20 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 23 (P5/32-large)
[[17, 20, 23], 1, Segment, [nc, anchors, 32, 256]], # Detect(P3, P4, P5)
]
Detailed results are shown below:
【x】
The model file is as follows:
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 80 # number of classes
depth_multiple: 1.33 # model depth multiple
width_multiple: 1.25 # layer channel multiple
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
# YOLOv5 v6.0 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]
# YOLOv5 v6.0 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 13
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 17 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 14], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 20 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 23 (P5/32-large)
[[17, 20, 23], 1, Segment, [nc, anchors, 32, 256]], # Detect(P3, P4, P5)
]
Detailed results are shown below:
Finally, to make it easy to compare the different model series as a whole, their metrics are visualized together below:
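【Precision curve】
The precision curve is a visualization tool for evaluating a classifier's precision at different thresholds. It plots precision against the confidence threshold, which shows how the model behaves as the threshold changes. It should not be confused with the Precision-Recall curve, which plots precision against recall.
Precision is the proportion of samples predicted as positive that are actually positive. Recall is the proportion of actual positives that are correctly predicted as positive.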
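The steps for drawing a precision curve are:
1. Convert the predicted probabilities into binary class labels using different thresholds. Typically, a sample is classified as positive when its predicted probability exceeds the threshold, and as negative otherwise.
2. For each threshold, compute the corresponding precision and recall.
3. Plot the precision obtained at each threshold on one chart to form the precision curve.
4. Based on the shape and trend of the curve, choose a threshold that meets the required performance.
By examining the precision curve, we can pick a threshold that balances precision and recall for the task at hand. Higher precision means fewer false alarms, while higher recall means fewer missed detections. The operating point can be chosen on the curve according to business requirements and cost trade-offs.
The precision curve is usually read together with the recall curve to give a fuller picture of classifier performance and to help evaluate and compare different models.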
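The steps above can be sketched in a few lines. This is a minimal illustration on made-up data, not the plotting code used for the figures in this post: each detection is a hypothetical `(confidence, is_true_positive)` pair, and returning a precision of 1.0 when no detection survives the threshold is a convention assumed here.

```python
def precision_at_thresholds(detections, thresholds):
    """Return [(threshold, precision)] for detections given as (confidence, is_tp)."""
    curve = []
    for t in thresholds:
        # keep only the detections whose confidence reaches the threshold
        kept = [is_tp for conf, is_tp in detections if conf >= t]
        precision = sum(kept) / len(kept) if kept else 1.0
        curve.append((t, precision))
    return curve


# Made-up detections: three correct, three wrong.
detections = [(0.95, True), (0.90, True), (0.80, False),
              (0.70, True), (0.40, False), (0.30, False)]
precision_curve = precision_at_thresholds(detections, [0.25, 0.50, 0.75])
for threshold, precision in precision_curve:
    print(f"threshold={threshold:.2f} precision={precision:.3f}")
```

On this toy data precision goes from 0.500 to 0.750 and then to 0.667 as the threshold rises. The curve is not guaranteed to be monotonic, because raising the threshold can drop a correct detection as well as a wrong one.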
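【Recall curve】
The recall curve is a visualization tool for evaluating a classifier's recall at different thresholds. It plots recall against the confidence threshold, which shows how the model behaves as the threshold changes.
Recall is the proportion of actual positives that are correctly predicted as positive. It is also known as sensitivity or the true positive rate.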
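The steps for drawing a recall curve are:
1. Convert the predicted probabilities into binary class labels using different thresholds. Typically, a sample is classified as positive when its predicted probability exceeds the threshold, and as negative otherwise.
2. For each threshold, compute the corresponding recall and precision.
3. Plot the recall obtained at each threshold on one chart to form the recall curve.
4. Based on the shape and trend of the curve, choose a threshold that meets the required performance.
By examining the recall curve, we can pick a threshold that balances recall and precision for the task at hand. Higher recall means fewer missed detections, while higher precision means fewer false alarms. The operating point can be chosen on the curve according to business requirements and cost trade-offs.
The recall curve is usually read together with the precision curve to give a fuller picture of classifier performance and to help evaluate and compare different models.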
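Recall needs one extra piece of information that precision does not: the total number of ground-truth objects. The sketch below makes that explicit on made-up data. The `(confidence, is_true_positive)` pairs and the count of 4 ground-truth objects are assumptions for illustration only.

```python
def recall_at_thresholds(detections, num_ground_truth, thresholds):
    """Return [(threshold, recall)] for detections given as (confidence, is_tp)."""
    curve = []
    for t in thresholds:
        # true positives that survive the confidence threshold
        tp = sum(1 for conf, is_tp in detections if is_tp and conf >= t)
        curve.append((t, tp / num_ground_truth))
    return curve


# Made-up detections; one of the 4 ground-truth objects is never detected.
detections = [(0.95, True), (0.90, True), (0.80, False),
              (0.70, True), (0.40, False), (0.30, False)]
recall_curve = recall_at_thresholds(detections, 4, [0.25, 0.75, 0.92])
for threshold, recall in recall_curve:
    print(f"threshold={threshold:.2f} recall={recall:.2f}")
```

Unlike precision, recall can only stay flat or fall as the threshold rises, since a higher threshold never adds detections.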
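【F1 curve】
The F1 curve is a visualization tool for evaluating a classifier's overall performance at different thresholds. It plots the F1 score, which combines precision and recall, against the confidence threshold.
The F1 score is the harmonic mean of precision and recall, so it takes both into account. The F1 curve helps us find a balance point between precision and recall and choose the best threshold.
The steps for drawing an F1 curve are:
1. Convert the predicted probabilities into binary class labels using different thresholds. Typically, a sample is classified as positive when its predicted probability exceeds the threshold, and as negative otherwise.
2. For each threshold, compute the precision and recall, and from them the F1 score.
3. Plot the F1 score obtained at each threshold on one chart to form the F1 curve.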
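For reference, the F1 score is F1 = 2 · P · R / (P + R). The sketch below computes it for a few thresholds and picks the one with the highest F1. The `(threshold, precision, recall)` triples are made-up numbers, not measurements from the models in this post.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(points):
    """points: iterable of (threshold, precision, recall). Returns (threshold, f1)."""
    scored = [(t, f1_score(p, r)) for t, p, r in points]
    return max(scored, key=lambda item: item[1])


# Made-up operating points.
points = [(0.25, 0.50, 0.75), (0.50, 0.75, 0.75), (0.75, 2 / 3, 0.50)]
best_threshold, best_f1 = best_f1_threshold(points)
print(best_threshold, round(best_f1, 3))
```

The threshold at the peak of the F1 curve is a common default operating point when false alarms and missed detections cost roughly the same.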
【Loss curve】
【mAP0.5】
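mAP0.5 (mean Average Precision at IoU 0.5) is one of the most commonly used evaluation metrics in object detection. It is the mean of the per-class Average Precision (AP) when the Intersection over Union (IoU) threshold is 0.5, and it reflects how accurately a detector recognizes and localizes objects.
In object detection, the algorithm must predict both the bounding box and the class of each object, and mAP0.5 evaluates the accuracy of those predictions. Specifically, for each class the AP (the area under that class's precision-recall curve) is computed at an IoU threshold of 0.5, and the APs of all classes are averaged to obtain the final mAP0.5 score.
IoU measures the overlap between a predicted bounding box and a ground-truth box. A prediction whose IoU with a ground-truth box is at least 0.5 is counted as a correct detection. For each class, the predictions are sorted by confidence, precision is computed at successive confidence thresholds to obtain the AP, and the per-class APs are averaged to give the final score.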
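The rule "IoU of at least 0.5 counts as a correct detection" can be sketched as follows. This is a simplified greedy matcher for a single image and a single class, with boxes given as `(x1, y1, x2, y2)`. The function names and example boxes are illustrative assumptions, and real evaluators handle more cases, such as multiple classes and, for segmentation, mask IoU.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(predictions, ground_truths, iou_threshold=0.5):
    """Label each (confidence, box) prediction as true or false positive.

    Predictions are visited in descending confidence; each ground-truth box
    can be matched at most once, so duplicates become false positives.
    """
    used = set()
    results = []
    for conf, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in used:
                continue
            iou = box_iou(box, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        is_tp = best_idx is not None and best_iou >= iou_threshold
        if is_tp:
            used.add(best_idx)
        results.append((conf, is_tp))
    return results


ground_truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [(0.9, (1, 1, 10, 10)), (0.8, (0, 0, 10, 10)),
               (0.6, (20, 20, 30, 26)), (0.3, (50, 50, 60, 60))]
matches = match_detections(predictions, ground_truths)
print(matches)
```

The resulting true/false-positive flags, ordered by confidence, are the input from which a precision-recall curve and its AP are computed.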
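【mAP0.5:0.95】
mAP0.5 and mAP0.5:0.95 are two commonly used evaluation metrics in object detection. Both are based on Average Precision, but they differ in the IoU thresholds they use.
mAP0.5 is the mean AP at an IoU threshold of 0.5. Because this threshold is fairly loose, it mainly reflects whether objects are found and roughly localized. To compute it, a predicted box counts as correct when its IoU with a ground-truth box is at least 0.5. The AP of each class is computed from these matches, and the per-class APs are averaged to obtain mAP0.5.
mAP0.5:0.95 is the average of the mAP values computed at IoU thresholds from 0.5 to 0.95 in steps of 0.05. Higher thresholds demand a tighter overlap between prediction and ground truth, so this metric is a stricter measure of localization quality. At each threshold it is computed in the same way as mAP0.5, with only the IoU threshold changed.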
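The averaging over IoU thresholds can be illustrated with a deliberately simplified sketch for a single class. It assumes each detection has already been paired with a distinct ground-truth object and carries the IoU of that pairing, whereas real evaluators redo the matching at each threshold. AP here is the area under the precision-recall curve after making precision non-increasing, which is close to but not identical to the interpolation used by common toolkits. All numbers are made up.

```python
def average_precision(flags, num_ground_truth):
    """AP from true/false-positive flags ordered by descending confidence."""
    if num_ground_truth == 0:
        return 0.0
    tp = 0
    precisions, recalls = [], []
    for rank, is_tp in enumerate(flags, start=1):
        tp += is_tp
        precisions.append(tp / rank)
        recalls.append(tp / num_ground_truth)
    # precision envelope: make precision non-increasing as recall grows
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def ap50_and_ap50_95(matches, num_ground_truth):
    """matches: list of (confidence, iou_with_matched_ground_truth) for one class."""
    thresholds = [0.5 + 0.05 * i for i in range(10)]  # 0.50, 0.55, ..., 0.95
    ordered = sorted(matches, key=lambda m: m[0], reverse=True)
    aps = [average_precision([iou >= t for _, iou in ordered], num_ground_truth)
           for t in thresholds]
    return aps[0], sum(aps) / len(aps)


# Made-up detections for one class with 3 ground-truth objects.
matches = [(0.9, 0.97), (0.8, 0.72), (0.7, 0.30), (0.6, 0.58)]
ap50, ap50_95 = ap50_and_ap50_95(matches, 3)
print(round(ap50, 4), round(ap50_95, 4))
```

On this toy data the AP at IoU 0.5 is about 0.917, while the average over the ten thresholds is 0.55. The stricter metric is typically the lower of the two, because loosely localized detections stop counting as the threshold rises.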
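Above, we compared and visualized the performance metrics of models at different parameter scales from every angle. Each figure includes both an overall comparison plot and per-model metric plots. Overall, the m-series model offers the best balance between computational complexity and performance. If you are interested, feel free to try it out yourself!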