PyTorch object detection data processing (II): extracting hard samples and low-AP samples
2022-07-20 08:24:00 【Visual feast】
Abstract
Competitions call for many kinds of data processing: analyzing the image data, and then strengthening the categories whose AP turns out to be relatively low. In this post I explain two techniques I have used recently: hard-sample extraction, and data augmentation for low-AP categories.
A hard sample is one with a large loss. Such samples account for a large share of the loss in each training batch, which makes it difficult for the loss to keep decreasing.
Hard sample extraction
I am using the PyTorch version of EfficientDet, and the overall process is fairly simple: change the __getitem__ function in the dataloader so that it also returns the image name. Because most training is done in batches, the collate function has to return the image name as well. Then, in train.py, the names of images whose loss is greater than about 0.5 are recorded and written to a txt file. The specific steps are as follows, starting with the changes to train.py.
When a batch is loaded, load the image names along with it.
I focus on the classification loss here, so the names are simply written to a txt file during processing. Next, take a careful look at how the dataloader handles this.
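The recording step in train.py can be sketched roughly as follows. This is a minimal illustration, not the author's actual code: the helper name `record_hard_samples`, the output file name, and the assumption that a per-image loss value is available as a plain float are all hypothetical.

```python
import os
import tempfile


def record_hard_samples(names, losses, out_path, threshold=0.5):
    """Append the names of images whose loss exceeds `threshold` to `out_path`."""
    hard = [name for name, loss in zip(names, losses) if loss > threshold]
    # Append mode, so names accumulate over all batches and epochs.
    with open(out_path, "a") as f:
        for name in hard:
            f.write(name + "\n")
    return hard


# Example with made-up image names and loss values.
demo_path = os.path.join(tempfile.mkdtemp(), "hard_samples.txt")
demo_hard = record_hard_samples(
    ["a.jpg", "b.jpg", "c.jpg"], [0.9, 0.1, 0.6], demo_path
)
print(demo_hard)
```

Because the file is opened in append mode, the same image can be written many times over a training run, which is why the extraction script later de-duplicates the names.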
In the dataloader, the sample now also carries the image name. One more place needs to be modified, and it is relatively simple: the batch collation step of data loading. See the blog post I wrote earlier; data loading is easy to master.
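To make the collation change concrete, here is a small sketch of a collate function that keeps the names next to the images and annotations. It assumes each sample is a dict with `img`, `annot`, and `name` keys and that annotations are padded with -1 rows to a common length; both are assumptions for illustration. The function name `collate_with_names` is hypothetical, and a real collate function would also stack the images into a tensor.

```python
def collate_with_names(batch):
    """Merge per-sample dicts into one batch dict, keeping the file names."""
    max_boxes = max(len(sample["annot"]) for sample in batch)
    padded = []
    for sample in batch:
        rows = [list(row) for row in sample["annot"]]
        # Pad with -1 rows so every image has the same number of boxes.
        rows += [[-1.0] * 5 for _ in range(max_boxes - len(rows))]
        padded.append(rows)
    return {
        "img": [sample["img"] for sample in batch],
        "annot": padded,
        "name": [sample["name"] for sample in batch],
    }


# Two made-up samples; each box is [x1, y1, x2, y2, class_id].
demo_batch = [
    {"img": "image-0", "annot": [[0, 0, 10, 10, 1]], "name": "a.jpg"},
    {"img": "image-1", "annot": [[5, 5, 20, 20, 0], [1, 1, 4, 4, 2]], "name": "b.jpg"},
]
demo_out = collate_with_names(demo_batch)
print(demo_out["name"])
```

The names stay in the same order as the images, so a loss computed for position i in the batch can be matched to `demo_out["name"][i]`.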
Finally, move the images listed in this txt file out of the training set into a separate folder. The code is as follows:
import os
import shutil

import numpy as np

# Read the image names recorded during training, one per line.
with open('./haha.txt', 'r') as file:
    aa = [line.strip() for line in file if line.strip()]
print(len(aa))

# The same image can be recorded many times; keep each name once.
dd = list(np.unique(aa))
print(len(dd))

xml_train = './coco/train2017/'
xml_val = './coco/kunnan/'

i = 0
while i < len(dd):
    random_file = dd[i]
    source_file = "%s/%s" % (xml_train, random_file)
    # Move the image only if it is not already in the target folder.
    if random_file not in os.listdir(xml_val):
        shutil.move(source_file, xml_val)
    i = i + 1
The code is simple. The only detail is that source_file is built as a relative path and the file is then moved to the xml_val path. The xml files are handled the same way, after which you just regenerate the json file.
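Moving the xml files "the same way" could look like the sketch below. It assumes each annotation file shares its stem with the image (a.jpg and a.xml); the function name `move_matching_xml` and the directory names in the demo are hypothetical.

```python
import os
import shutil
import tempfile


def move_matching_xml(image_names, xml_src, xml_dst):
    """Move the .xml file that shares a stem with each image name."""
    os.makedirs(xml_dst, exist_ok=True)
    moved = []
    for image_name in image_names:
        xml_name = os.path.splitext(image_name)[0] + ".xml"
        src = os.path.join(xml_src, xml_name)
        dst = os.path.join(xml_dst, xml_name)
        # Skip images without an annotation file and files already moved.
        if os.path.isfile(src) and not os.path.exists(dst):
            shutil.move(src, dst)
            moved.append(xml_name)
    return moved


# Demo in a throwaway directory with two dummy annotation files.
demo_root = tempfile.mkdtemp()
demo_src = os.path.join(demo_root, "xml")
demo_dst = os.path.join(demo_root, "kunnan_xml")
os.makedirs(demo_src)
for stem in ("a", "b"):
    with open(os.path.join(demo_src, stem + ".xml"), "w") as f:
        f.write("<annotation/>")
demo_moved = move_matching_xml(["a.jpg", "missing.jpg"], demo_src, demo_dst)
print(demo_moved)
```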
Augmenting low-AP categories
No training run reaches a high AP in every category, so we process the json file to extract the images that contain the lower-AP categories:
import os
import shutil

from pycocotools.coco import COCO

# Category ids with low AP; adjust these to match your own categories.
low_ap_ids = {0, 1, 5, 18, 19, 22, 27, 35, 36, 42}

coco = COCO('./test/all.json')
ids1 = coco.getAnnIds()

# Collect the image id of every annotation that belongs to a low-AP category.
items = []
for ann_id in ids1:
    data = coco.loadAnns(ann_id)
    if data[0]['category_id'] in low_ap_ids:
        items.append(data[0]['image_id'])

# An image can hold several matching annotations; keep each image id once.
item = sorted(set(items))

# Look up the file name of each selected image id.
name = []
for img_id in item:
    data = coco.loadImgs([img_id])
    name.append(data[0]['file_name'])

xml_train = './coco/train2017/'
xml_val = './coco/lowap/'

i = 0
while i < len(name):
    random_file = name[i].split('.')[0] + '.jpg'
    source_file = "%s/%s" % (xml_train, random_file)
    print(i)
    # Move the image only if it is not already in the target folder.
    if random_file not in os.listdir(xml_val):
        shutil.move(source_file, xml_val)
    i = i + 1
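As a cross-check on the selection logic, the same lookup can be sketched on a plain COCO-style dict (for example, the result of `json.load` on the annotation file) without pycocotools. The function name `select_file_names` and the tiny dataset are made up for illustration.

```python
def select_file_names(coco_dict, category_ids):
    """Return file names of images with at least one annotation in `category_ids`."""
    wanted = set(category_ids)
    image_ids = {
        ann["image_id"]
        for ann in coco_dict["annotations"]
        if ann["category_id"] in wanted
    }
    return sorted(
        img["file_name"] for img in coco_dict["images"] if img["id"] in image_ids
    )


# A tiny made-up dataset in COCO layout.
demo_coco = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
        {"id": 3, "file_name": "c.jpg"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 7},
        {"id": 12, "image_id": 2, "category_id": 7},
        {"id": 13, "image_id": 3, "category_id": 5},
    ],
}
demo_names = select_file_names(demo_coco, {1, 5})
print(demo_names)
```

The key point is that file names are looked up by the selected image ids, not by position in the full image list.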
That covers the whole process, which makes full use of the json file. Next, we augment the extracted images:
import glob
import os

import cv2
import matplotlib.pyplot as plt
import numpy as np
from albumentations import GridDropout


def imread(image):
    # OpenCV loads images as BGR; convert to RGB for augmentation and display.
    image = cv2.imread(image)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = image.astype(np.uint8)
    return np.array(image)


def show(image):
    plt.imshow(image)
    plt.axis('off')
    plt.show()


# Path to data
data_folder = "./lowap/"
# Read filenames in the data folder
filenames = glob.glob(f"{data_folder}*.jpg")
for i in range(len(filenames)):
    b = filenames[i]
    print(b)
    a = imread(b)
    image2 = GridDropout(0.2, 10, p=1)(image=a)['image']
    dd = './haha/' + os.path.basename(b)
    # cv2.imwrite expects BGR, so convert back before saving.
    cv2.imwrite(dd, cv2.cvtColor(image2, cv2.COLOR_RGB2BGR))
I use the albumentations package because it makes augmentation convenient. This is offline augmentation: the augmented images are written to disk before training.
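To show what a grid-dropout style augmentation does to an image, here is a simplified NumPy sketch. It is not the albumentations implementation: the function `grid_dropout` and its parameters are hypothetical, and it assumes one square hole in the top-left corner of every grid cell, with the hole side equal to `unit_size * ratio`.

```python
import numpy as np


def grid_dropout(image, unit_size=10, ratio=0.2, fill_value=0):
    """Fill a square hole in the top-left corner of every grid cell."""
    out = image.copy()
    hole = max(1, int(unit_size * ratio))
    height, width = out.shape[:2]
    for y in range(0, height, unit_size):
        for x in range(0, width, unit_size):
            # Slicing clips automatically at the image border.
            out[y:y + hole, x:x + hole] = fill_value
    return out


# A white 20x20 RGB image: four cells of 10x10, each with a 2x2 hole.
demo_image = np.full((20, 20, 3), 255, dtype=np.uint8)
demo_result = grid_dropout(demo_image)
demo_dropped = int((demo_result == 0).all(axis=2).sum())
print(demo_dropped)
```

Masking of this kind does not move any pixels, so the existing box coordinates still line up with the augmented copy.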
Summary
There are many data augmentation methods, and I have only used one of them here. A more advanced option is CutMix, which is worth trying; you can also test other augmentations.