ECCV 2022 | Fixing the large-object performance drop caused by FPN: You Should Look at All Objects
2022-07-22 16:36:00 【PaperWeekly】
Author | Heat lamp lamp
This article sets out to fix the drop in large-object performance caused by FPN. The authors trace the drop mainly to the inappropriate back-propagation paths that FPN introduces, and they propose an uncertainty-based auxiliary loss and a new FPN structure to address it.
Paper title:
You Should Look at All Objects
Conference:
ECCV 2022
Paper link:
https://arxiv.org/abs/2207.07889
Code link:
https://github.com/CharlesPikachu/YSLAO
Preface
FPN (Feature Pyramid Network) is one of the basic building blocks of object detection: adding it to a backbone network effectively improves detector performance, and a lot of follow-up work has optimized the FPN structure, such as PANet and NAS-FPN. FPN is generally credited with two advantages. First, by fusing multi-level backbone features it yields better representations. Second, by handling objects of different sizes at different pyramid levels it realizes a divide-and-conquer strategy. Both should, in principle, improve detection at every object scale, but in practice AP for small and medium objects goes up while AP for large objects goes down.
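The divide-and-conquer idea can be illustrated with the RoI-to-level assignment rule from the original FPN paper, k = floor(k0 + log2(sqrt(w*h) / 224)) with k0 = 4. The sketch below assumes pyramid levels P2 to P5 and clamps k to that range; the function name `assign_fpn_level` and its parameter names are hypothetical and do not come from the article or from any detection framework.

```python
import math

def assign_fpn_level(width, height, k0=4, canonical=224, k_min=2, k_max=5):
    """Map a box of size width x height (pixels) to a pyramid level index.

    Follows k = floor(k0 + log2(sqrt(w * h) / canonical)) and clamps the
    result to [k_min, k_max]. The level range P2..P5 is an assumption here.
    """
    scale = math.sqrt(width * height)
    k = math.floor(k0 + math.log2(scale / canonical))
    return max(k_min, min(k_max, k))

# Small boxes go to fine (low-index) levels, large boxes to coarse ones.
for size in (32, 112, 224, 448, 1000):
    print(size, "-> P%d" % assign_fpn_level(size, size))
```

Under this rule each level is trained mostly on objects of one size range, which is the property the rest of the article examines.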
As Fig. 1 of the paper shows, the authors use two detection frameworks, MMDetection and Detectron2, to compare performance metrics with and without FPN. With FPN, overall AP improves, but large-object AP (AP_l) drops to varying degrees. How can this be fixed? The authors find that FPN changes more than multi-level feature fusion and divide-and-conquer: it also changes the back-propagation path, which directly affects the detector's performance.
One caveat: the two models the authors compare are ResNet-50-DC5 and ResNet-50-FPN, which does not seem entirely fair. ResNet-50-FPN should be compared against plain ResNet-50; here adding FPN gains only a few points, whereas FPN usually brings a larger gain. DC5 presumably adds dilated convolution at the end of the backbone, so the output has a larger receptive field and should detect large objects better.
The authors propose introducing additional auxiliary loss functions to extend the back-propagation path, so that the extra supervision signals assist the learning of the corresponding backbone layers. The key technique is to use uncertainty to balance the many loss terms. In addition, the authors design a new FPN to modify the back-propagation path. The proposed method consistently improves multiple detectors by 2 percentage points, including one-stage, two-stage, anchor-based, and anchor-free detectors.
First, a brief review of FPN. It consists of three parts: the bottom-up pathway, the top-down pathway, and lateral connections; the paper illustrates the main differences from a plain backbone in a figure. The authors' main finding is that in an FPN-free detection framework the shallow backbone layers receive no effective supervision signal during back-propagation, whereas in an FPN-based framework all backbone feature levels receive direct supervision. To verify this, the authors run a simple experiment: they add auxiliary supervision losses to the feature layers of the FPN-free detector.
The results show that once the auxiliary losses are added, the FPN-free and FPN-based detectors perform about the same (FPN-Aux vs. DC5-Aux). So why does FPN also suppress large-object detection? The reason is that the lowest-level features in the FPN structure are supervised mainly by small objects, so the learned features detect only small objects well. These low-level features are then propagated onward through the network, and the ability to detect large objects keeps weakening along the way, so the final features are relatively weak at detecting large objects.
Method
2.1 Auxiliary loss
First, the authors propose an auxiliary loss with uncertainty (the formula is given in the paper).
The loss covers both classification and regression, and α is the uncertainty, obtained from a prediction function.
In that function, x is the feature map, and w and b are learnable parameters. What remains unclear is why the α obtained this way can represent uncertainty.
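To make the idea of balancing losses with a predicted uncertainty concrete, here is a minimal sketch. It assumes the weighting commonly used in uncertainty-weighted multi-task learning, exp(-α) * L + α, and a linear predictor α = w · x + b over a pooled feature vector. Both choices are illustrative assumptions and may differ from the paper's exact formulation; the names `predict_alpha` and `weighted_aux_loss` are hypothetical.

```python
import math

def predict_alpha(x, w, b):
    """Hypothetical linear predictor: alpha = w . x + b.

    x is a pooled feature vector; w and b stand in for learnable parameters.
    """
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def weighted_aux_loss(cls_loss, reg_loss, alpha):
    """Assumed uncertainty weighting: exp(-alpha) * (cls + reg) + alpha.

    A larger alpha down-weights the loss term, while the additive alpha
    penalizes the trivial solution of predicting a huge uncertainty.
    """
    return math.exp(-alpha) * (cls_loss + reg_loss) + alpha

x = [0.5, 1.0, -0.5]   # toy pooled feature
w = [0.2, 0.1, 0.4]    # toy weights
alpha = predict_alpha(x, w, b=0.0)
print(alpha, weighted_aux_loss(1.0, 0.5, alpha))
```

In a real detector one such term would be attached to each supervised feature level, and the per-level terms summed with the main detection loss.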
2.2 Feature pyramid generation paradigm
The authors' main goal is to let the supervision loss see every object. There are two changes: Feature Grouping and Cascade Structure. Put simply, Feature Grouping increases the interaction between pyramid levels, mainly by exchanging feature channels across levels: a feature converter made of a few network layers first transforms each level's features, the result is split into groups along the channel dimension, and the channel groups are then exchanged across levels. Cascade Structure passes the intermediate features obtained above through several more nonlinear layers to transform them again. The paper details the specific operations.
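The channel exchange described above can be pictured with a toy example. The article does not give the exact exchange rule, so the sketch below assumes one plausible scheme: each level's channels are split into as many groups as there are levels, and output level i collects group i from every input level. Channels are represented by string labels, the feature converter and the resizing needed to align spatial resolutions are omitted, and the function names are hypothetical.

```python
def split_into_groups(channels, num_groups):
    """Split a list of channels into num_groups equal, contiguous groups.

    Assumes len(channels) is divisible by num_groups.
    """
    size = len(channels) // num_groups
    return [channels[g * size:(g + 1) * size] for g in range(num_groups)]

def exchange_groups(levels):
    """Assumed exchange rule: output level i gathers group i of every input level."""
    n = len(levels)
    grouped = [split_into_groups(ch, n) for ch in levels]
    return [
        [c for src in range(n) for c in grouped[src][dst]]
        for dst in range(n)
    ]

# Three pyramid levels with three channels each (labels only, no tensors).
levels = [
    ["C3_0", "C3_1", "C3_2"],
    ["C4_0", "C4_1", "C4_2"],
    ["C5_0", "C5_1", "C5_2"],
]
for mixed in exchange_groups(levels):
    print(mixed)
```

After the exchange every output level contains channels that originated at every input level, so a loss attached to any one level back-propagates into all of them.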
Experiment
▲ Ablation study of the auxiliary loss: introducing uncertainty brings a significant improvement
▲ Ablation study of the feature pyramid generation paradigm
As can be seen, combining Feature Grouping and Cascade Structure brings a clear improvement, and more cascade stages work better.
Summary
The paper raises a problem: FPN harms large-object detection. It finds that the main cause lies in the back-propagation path and proposes two strategies. The first adds multi-level auxiliary losses and uses uncertainty to balance them. The second modifies the FPN structure to change the back-propagation path, and proposes Feature Grouping to make the features more compact and fully fuse features from different levels, so that every level's features see all objects. Experiments show significant improvements on many detectors.
Some points in the paper remain unclear:
1. In the comparison in Fig. 1, is it appropriate to compare ResNet-50-DC5 with ResNet-50-FPN? DC5 may itself help large-object detection; if so, the paper's conclusion that FPN harms large-object detection is open to question.
2. The paper is essentially an improvement to FPN, yet it does not compare against other FPN variants such as PANet and NAS-FPN; those methods may already have resolved the large-object problem in some way.