PyTorch object detection: the COCO API explained and COCO-format data generation

2022-07-20 08:24:00 【Visual feast】

Abstract

Before training an object detector, the data can be prepared in several different formats. Today I will walk through the common ones. The most widely used is still the COCO dataset format.

VOC and COCO datasets

The VOC we usually talk about is the 2007 release. It consists of photos plus annotation information stored in XML files. XML is the raw file the labeling tool saves after annotation, while JSON holds the same information extracted into dictionary form, which is much faster to work with than XML. In the standard VOC layout, JPEGImages stores the photos (ImageSets only holds the text files that list the splits), Annotations stores the XML files, and the Seg folders hold the labels used for semantic segmentation, which we do not need to study here.
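To make the XML layout concrete, here is a minimal sketch that parses one VOC-style annotation with Python's standard library. The sample XML string, its file name, and the helper name `read_voc_objects` are made up for illustration; files produced by a real labeling tool usually carry more fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOC-style annotation, shortened to the fields used for detection.
SAMPLE_XML = """
<annotation>
    <filename>example.jpg</filename>
    <size><width>750</width><height>690</height><depth>3</depth></size>
    <object>
        <name>starfish</name>
        <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
    </object>
</annotation>
"""


def read_voc_objects(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.findall('object'):
        name = obj.findtext('name')
        box = obj.find('bndbox')
        coords = [int(box.findtext(tag)) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        objects.append((name, *coords))
    return objects


print(read_voc_objects(SAMPLE_XML))
```

Each box is stored as two corners (xmin, ymin, xmax, ymax), which is why a conversion step is needed before the data can be written in COCO form.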
Recent object detection papers almost all report mAP on the COCO dataset to show how good an algorithm is, so let's first look at how the COCO dataset is organized. For object detection we only need val2017 and train2017, which both just store photos, plus annotations, which holds the JSON files.
The files under train2017 are just photos.

The pictures under val2017.

Here we only need the two JSON files for training and validation. Let's open val2017.json and look at the details.

It looks messy, and that is expected: the information recorded for each photo is very detailed, covering not only object detection but several other tasks as well. When we generate our own COCO-format dataset we do not need all of that. Three keys are enough: images, annotations and categories. images records the photos, annotations records the box information, and categories records the class information. Let's use my own data to get a quick feel for it.
Each entry in images records a photo's file name, height and width, along with a unique id. It feels a bit like a database table, and the id also doubles as a running count of the photos.

annotations records the box information. Each box also has to say which photo it belongs to, so it stores the corresponding image id, together with the category and the area of the box.
categories simply records the different classes. The name here can be Chinese or an abbreviated form like this one. At this point you should have a fairly complete understanding of COCO-format data. Next we will learn to use the API to process the data quickly.
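As a compact illustration of the three keys described above, the sketch below builds a minimal COCO-style dictionary in plain Python. The image, box and category values are copied from the example outputs printed later in this post; the variable name `coco_dict` is only illustrative.

```python
import json

# Minimal COCO-style structure with the three keys used for object detection.
coco_dict = {
    'images': [
        {'file_name': '20190816_095633.jpg', 'id': 2, 'height': 4032, 'width': 3024},
    ],
    'annotations': [
        {'id': 2, 'image_id': 2, 'bbox': [933, 88, 2178, 2559],
         'category_id': 1, 'area': 5573502, 'iscrowd': 0},
    ],
    'categories': [
        {'id': 1, 'name': 'Books, paper'},
    ],
}

# bbox is [x, y, width, height]; for a plain box the area is width * height.
x, y, w, h = coco_dict['annotations'][0]['bbox']
print(w * h == coco_dict['annotations'][0]['area'])

# The whole structure serializes directly to JSON text.
text = json.dumps(coco_dict, indent=4)
```

The annotation points at its photo through image_id and at its class through category_id, which is the database-like linking mentioned above.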

The COCO API explained

The COCO API is built specifically for handling these JSON files, and it makes working with them very convenient. The JSON itself stores the image information as dictionaries, so without the API we would have to write the reading and lookup code ourselves, which is more trouble. With the COCO API, a few simple calls are enough to extract the data and load it for training.

from pycocotools.coco import COCO

coco = COCO('./test/annotations.json')
ids = list(coco.imgs.keys())   # coco.imgs is a dict keyed by image id
value = list(coco.imgs.values())
print(ids)
print(value)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[{'file_name': '20190816_095611.jpg', 'id': 1, 'height': 4032, 'width': 3024},
 {'file_name': '20190816_095633.jpg', 'id': 2, 'height': 4032, 'width': 3024},
 {'file_name': '81872f020e8ac5489c0c51cad67c435.jpg', 'id': 3, 'height': 1440, 'width': 1080},
 {'file_name': '9d0ac700d8dafd5c568fd3d78224ffb.jpg', 'id': 4, 'height': 1440, 'width': 1080},
 {'file_name': 'eed7d90379acd8c427b5b73f0a229e6.jpg', 'id': 5, 'height': 1440, 'width': 1080},
 {'file_name': 'img_10.jpg', 'id': 6, 'height': 690, 'width': 750},
 {'file_name': 'img_100.jpg', 'id': 7, 'height': 357, 'width': 688},
 {'file_name': 'img_111.jpg', 'id': 8, 'height': 227, 'width': 287},
 {'file_name': 'img_18.jpg', 'id': 9, 'height': 500, 'width': 375},
 {'file_name': 'img_22.jpg', 'id': 10, 'height': 145, 'width': 210},
 {'file_name': 'img_35.jpg', 'id': 11, 'height': 800, 'width': 800},
 {'file_name': 'img_36.jpg', 'id': 12, 'height': 220, 'width': 293},
 {'file_name': 'img_44.jpg', 'id': 13, 'height': 415, 'width': 475},
 {'file_name': 'img_54.jpg', 'id': 14, 'height': 369, 'width': 429},
 {'file_name': 'img_65.jpg', 'id': 15, 'height': 768, 'width': 1024},
 {'file_name': 'img_78.jpg', 'id': 16, 'height': 645, 'width': 700},
 {'file_name': 'img_83.jpg', 'id': 17, 'height': 736, 'width': 800},
 {'file_name': 'img_92.jpg', 'id': 18, 'height': 210, 'width': 295},
 {'file_name': 'img_97.jpg', 'id': 19, 'height': 768, 'width': 1024}]

That is the simplest way to load the file. The extracted image ids and values let us look at each photo's entry in images. Everything is read in when the COCO object is constructed, so here we only need to pull out the information we want. Next, let's go over some commonly used functions.
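Conceptually, coco.imgs is a dictionary keyed by image id that is built once from the images list when the file is loaded. The snippet below is a simplified plain-Python imitation of that indexing step, not the library's actual code; `build_index` and the small `dataset` dictionary are hypothetical and exist only so the example runs without pycocotools.

```python
from collections import defaultdict


def build_index(dataset):
    """Index a COCO-style dict by id, loosely mirroring coco.imgs and coco.anns."""
    imgs = {img['id']: img for img in dataset.get('images', [])}
    anns = {ann['id']: ann for ann in dataset.get('annotations', [])}
    # Group annotations by the image they belong to.
    img_to_anns = defaultdict(list)
    for ann in anns.values():
        img_to_anns[ann['image_id']].append(ann)
    return imgs, anns, img_to_anns


# Hypothetical two-image dataset using entries that appear in this post.
dataset = {
    'images': [
        {'file_name': '20190816_095611.jpg', 'id': 1, 'height': 4032, 'width': 3024},
        {'file_name': '20190816_095633.jpg', 'id': 2, 'height': 4032, 'width': 3024},
    ],
    'annotations': [
        {'id': 2, 'image_id': 2, 'bbox': [933, 88, 2178, 2559],
         'category_id': 1, 'area': 5573502, 'iscrowd': 0},
    ],
}

imgs, anns, img_to_anns = build_index(dataset)
print(list(imgs.keys()))      # image ids, like list(coco.imgs.keys())
print(len(img_to_anns[2]))    # number of boxes on image 2
```

Because the lookup tables are built once up front, every later query by id is a plain dictionary access.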

The three get functions

getAnnIds, getCatIds and getImgIds do what their names suggest (the functions in the COCO API are named sensibly): they return the ids of the box annotations, the ids of the categories, and the ids of the photos. These ids are what the next step operates on.


from pycocotools.coco import COCO

coco = COCO('./test/annotations.json')
ids1 = coco.getAnnIds()
print(ids1)
ids2 = coco.getImgIds()
print(ids2)
ids3 = coco.getCatIds()
print(ids3)
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43]

As you can see, these calls simply return the ids that were recorded in the file. This step is necessary: the load functions in the next step take these ids as input.
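In pycocotools, getAnnIds also accepts filters such as imgIds and catIds, so you can ask only for the boxes on one image or of one class. The plain-Python sketch below mimics that kind of filtering on a raw annotation list so it runs without the library; `filter_ann_ids` and the sample annotations are hypothetical and are not the library's implementation.

```python
def filter_ann_ids(annotations, img_ids=(), cat_ids=()):
    """Return ids of annotations matching the given image and category ids.

    An empty filter means "no restriction", similar in spirit to calling
    getAnnIds() with no arguments.
    """
    img_ids, cat_ids = set(img_ids), set(cat_ids)
    return [
        ann['id'] for ann in annotations
        if (not img_ids or ann['image_id'] in img_ids)
        and (not cat_ids or ann['category_id'] in cat_ids)
    ]


# Hypothetical annotations, reduced to the fields the filter needs.
annotations = [
    {'id': 1, 'image_id': 1, 'category_id': 3},
    {'id': 2, 'image_id': 2, 'category_id': 1},
    {'id': 3, 'image_id': 2, 'category_id': 3},
]

print(filter_ann_ids(annotations))                            # every annotation id
print(filter_ann_ids(annotations, img_ids=[2]))               # boxes on image 2
print(filter_ann_ids(annotations, img_ids=[2], cat_ids=[3]))  # class 3 on image 2
```

With the real library the corresponding call would look like coco.getAnnIds(imgIds=[2], catIds=[3]).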

The three load functions

loadAnns, loadCats and loadImgs do the actual loading of the data.

from pycocotools.coco import COCO

coco = COCO('./test/annotations.json')
ids1 = coco.getAnnIds()
ids2 = coco.getImgIds()
ids3 = coco.getCatIds()
data1 = coco.loadAnns(ids1[1])   # annotation for the second annotation id
print(data1)
data2 = coco.loadImgs(ids2[1])   # image ids come from getImgIds, not getCatIds
print(data2)
data3 = coco.loadCats(ids3[1])   # category for the second category id
print(data3)
Output:
[{'id': 2, 'image_id': 2, 'bbox': [933, 88, 2178, 2559], 'category_id': 1, 'area': 5573502, 'iscrowd': 0}]
[{'file_name': '20190816_095633.jpg', 'id': 2, 'height': 4032, 'width': 3024}]
[{'id': 1, 'name': 'Books, paper'}]

With these simple operations you can analyze the data and compute all kinds of statistics over the JSON file.
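As one example of such a statistic, the sketch below counts how many boxes each category has by combining the get and load functions. The helper name `boxes_per_category` is hypothetical; it assumes it is handed a pycocotools COCO object, or anything exposing the same getCatIds, loadCats and getAnnIds methods.

```python
def boxes_per_category(coco):
    """Map each category name to its number of annotated boxes.

    `coco` is expected to be a pycocotools COCO object (assumption), or any
    object with compatible getCatIds / loadCats / getAnnIds methods.
    """
    counts = {}
    for cat in coco.loadCats(coco.getCatIds()):
        # getAnnIds with a catIds filter returns only that category's boxes.
        ann_ids = coco.getAnnIds(catIds=[cat['id']])
        counts[cat['name']] = len(ann_ids)
    return counts
```

With the file used in this post, the call would be boxes_per_category(COCO('./test/annotations.json')); the resulting numbers depend entirely on your own data. A heavily skewed count is often the first hint that a class needs more samples.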

Generating the COCO dataset JSON file

import os
import cv2
import json
import xml.dom.minidom

# Root folder containing the image folder and the box (XML) folder.
# Change this path to match your own setup.
data_dir = './data'

image_file_dir = os.path.join(data_dir, 'image')
xml_file_dir = os.path.join(data_dir, 'box')

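# Top-level COCO structure: the three keys needed for object detection.
annotations_info = {'images': [], 'annotations': [], 'categories': []}

# Class name -> category id. Replace with your own classes.
categories_map = {'holothurian': 1, 'echinus': 2, 'scallop': 3, 'starfish': 4}

for key in categories_map:
    category_info = {"id": categories_map[key], "name": key}
    annotations_info['categories'].append(category_info)

# Image names without extension. splitext also handles names containing dots,
# and sorting keeps the image ids reproducible between runs.
file_names = [os.path.splitext(image_file_name)[0]
              for image_file_name in sorted(os.listdir(image_file_dir))
              if image_file_name.endswith('.jpg')]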
ann_id = 1
for i, file_name in enumerate(file_names):
    print(i)
    image_file_name = file_name + '.jpg'
    xml_file_name = file_name + '.xml'
    image_file_path = os.path.join(image_file_dir, image_file_name)
    xml_file_path = os.path.join(xml_file_dir, xml_file_name)

    # Only the image size is needed, so no color conversion is required.
    image = cv2.imread(image_file_path)
    if image is None:
        raise ValueError('Cannot read image: ' + image_file_path)
    height, width = image.shape[:2]
    image_info = {'file_name': image_file_name, 'id': i + 1,
                  'height': height, 'width': width}
    annotations_info['images'].append(image_info)

    DOMTree = xml.dom.minidom.parse(xml_file_path)
    collection = DOMTree.documentElement

    # Read each <object> on its own so a name always stays paired with its box,
    # even if the file contains other <name> tags.
    for obj in collection.getElementsByTagName('object'):
        name = obj.getElementsByTagName('name')[0].firstChild.data
        if name not in categories_map:
            continue
        box = obj.getElementsByTagName('bndbox')[0]
        x1, y1, x2, y2 = [int(box.getElementsByTagName(tag)[0].firstChild.data)
                          for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        # Shift from 1-based (VOC convention) to 0-based pixel indices.
        x1, y1, x2, y2 = x1 - 1, y1 - 1, x2 - 1, y2 - 1

        # Pull in boxes that still touch the right or bottom image border.
        if x2 == width:
            x2 -= 1
        if y2 == height:
            y2 -= 1

        # COCO boxes are [x, y, width, height] with (x, y) the top-left corner.
        w, h = x2 - x1 + 1, y2 - y1 + 1
        annotation_info = {"id": ann_id, "image_id": i + 1, "bbox": [x1, y1, w, h],
                           "category_id": categories_map[name], "area": w * h,
                           "iscrowd": 0}
        annotations_info['annotations'].append(annotation_info)
        ann_id += 1

with open(os.path.join(data_dir, 'annotations.json'), 'w') as f:
    json.dump(annotations_info, f, indent=4)

print('--- Converted annotation file ---')
print('Number of images:', len(annotations_info['images']))
print('Number of annotations:', len(annotations_info['annotations']))
print('Number of categories:', len(annotations_info['categories']))

That is the code for generating the JSON file. You only need to change the file locations and the categories. It converts the information stored in the XML files into a single JSON file, and it is easy to follow once you read through the code carefully.
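The core of the conversion is the box arithmetic, so it may help to see it in isolation. The sketch below repeats the same convention as the script above (shift each corner by one pixel, then count both end pixels) in a small function. The name `voc_to_coco_bbox` is hypothetical, and the input corners are made-up values chosen so the result matches the box printed earlier in this post.

```python
def voc_to_coco_bbox(xmin, ymin, xmax, ymax):
    """Convert 1-based inclusive VOC corners to a 0-based COCO [x, y, w, h] box.

    Returns the box together with its area (w * h).
    """
    # Shift every corner from 1-based to 0-based pixel indices.
    x1, y1, x2, y2 = xmin - 1, ymin - 1, xmax - 1, ymax - 1
    # Both end pixels belong to the box, hence the + 1.
    w, h = x2 - x1 + 1, y2 - y1 + 1
    return [x1, y1, w, h], w * h


# Hypothetical VOC corners chosen to reproduce the box shown earlier.
bbox, area = voc_to_coco_bbox(934, 89, 3111, 2647)
print(bbox, area)  # → [933, 88, 2178, 2559] 5573502
```

If your XML files already use 0-based coordinates, the shift by one should be dropped; that depends on the tool that produced them.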

Summary

After this step you should understand how data is prepared before it goes into training for object detection. Whatever format a competition hands you, you can now convert it. Combined with the data-reading part explained in my earlier blog posts, you are ready to train an object detector.

Copyright notice
This article was written by [Visual feast]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/201/202207190501378111.html