Caffe detection network: training and testing steps (RefineDet)

The original blog has moved to: https://blog.csdn.net/u013171226/article/details/107680293

I. Creating the LMDB data

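Organize the data in the VOC format by creating the following directories under the project directory:

Annotations: holds the generated xml files.

JPEGImages: holds all the images.

labels: holds all the txt annotation files.

ImageSets/Main: create an ImageSets folder and, inside it, a Main folder.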
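
The directory layout above can also be created from a script. The following is a minimal sketch, assuming Python 3; the helper name make_voc_dirs and the constant VOC_SUBDIRS are made up for this illustration and are not part of the original workflow.

```python
import os

# Sub-directories described above; "ImageSets/Main" creates both levels at once.
VOC_SUBDIRS = ["Annotations", "JPEGImages", "labels", os.path.join("ImageSets", "Main")]


def make_voc_dirs(project_dir):
    """Create the VOC-style sub-directories under project_dir and return their paths."""
    created = []
    for sub in VOC_SUBDIRS:
        path = os.path.join(project_dir, sub)
        os.makedirs(path, exist_ok=True)  # no error if the directory already exists
        created.append(path)
    return created
```

Calling make_voc_dirs(".") from the project directory should leave existing folders untouched and create only the missing ones.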

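The image data and the annotation txt files contain some errors, so both need some preprocessing before the LMDB is built.

1. If you upload the data from Windows to Ubuntu with WinSCP, configure WinSCP to use UTF-8 before uploading; otherwise the image and txt file names will be garbled on Ubuntu.

2. Some jpg images have the upper-case extension ".JPG", which has to be changed to the lower-case ".jpg". The script is as follows: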

import os
jpgs_dir = "./JPEGImages"

for jpg_name in os.listdir(jpgs_dir):
    portion = os.path.splitext(jpg_name)
    if portion[1] == ".JPG":
        new_name = portion[0] + ".jpg"
        jpg_name = os.path.join(jpgs_dir, jpg_name)  # os.rename needs the path as well as the file name, otherwise the script fails.
        new_name = os.path.join(jpgs_dir, new_name)
        print(jpg_name)
        print(new_name)
        os.rename(jpg_name, new_name)
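
The script above only matches the exact extension ".JPG". If mixed-case extensions such as ".Jpg" may also occur (an assumption, the original data is not described in that detail), the new name can be computed with a small helper like the sketch below; lowercase_ext is a hypothetical name used only here.

```python
import os


def lowercase_ext(file_name):
    """Return file_name with its extension lower-cased; the stem is left untouched."""
    stem, ext = os.path.splitext(file_name)
    return stem + ext.lower()
```

The result can be passed to os.rename in the same way as new_name in the script above, after joining it with the directory path.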

3. Some jpg and txt names contain spaces, which have to be removed. The script below replaces every space with an underscore (it handles the txt names as well):

import os

image_dir   = "./delete_images"
txt_dir     = "./delete_labels"


def delete_space(buff_dir):
    for files_name in os.listdir(buff_dir):
        if len(files_name.split(" ")) >1:
            os.rename(os.path.join(buff_dir,files_name),
                      os.path.join(buff_dir,files_name.replace(" ","_")))
            print(os.path.join(buff_dir,files_name.replace(" ","_")))


delete_space(image_dir)
delete_space(txt_dir)

4. The annotation txt files may come in different formats.

Format 1:

1 1023 93 1079 137
1 1033 21 1077 59
1 1036 499 1234 645
0 1047 112 1071 164
1 1069 67 1117 105

Format 2:

biker,1463,404,126,69
pedestrian,1514,375,14,35
pedestrian,1543,385,14,36

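The two formats differ in three main ways:

(1) The class is represented differently: one uses a number, the other an English word.

(2) The separator is different: one uses spaces, the other commas.

(3) The coordinates are represented differently: one is x1 y1 x2 y2, the other is x,y,w,h.

The following script converts the first format into the second: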

import os

# Class IDs used in format 1 and the names used in format 2.
class_names = {'0': 'pedestrian', '1': 'vehicle', '2': 'biker'}
change_num = 0
dirpath = "labels/label4"
#dirpath = "test_delete"
for txt_name in os.listdir(dirpath):  # every file name in the directory
    rename_flg = 0  # reset for each file; otherwise an empty file inherits the previous flag
    src_path = os.path.join(dirpath, txt_name)
    bak_path = os.path.join(dirpath, "bak%s" % txt_name)
    print(bak_path)
    with open(src_path, 'r') as f, open(bak_path, 'w') as f_new:
        for line in f.readlines():
            fields = line.split(" ")
            if 1 == len(fields):  # no spaces: this file does not need converting, so stop
                rename_flg = 0
                break
            # Space-separated, i.e. format 1: only this format is converted.
            rename_flg = 1
            if fields[0] in class_names:
                # Build the new line field by field. str.replace() must not be used here,
                # because it would also replace the digits 0, 1, 2 inside the coordinates.
                line = ','.join([class_names[fields[0]], fields[1], fields[2],
                                 str(int(fields[3]) - int(fields[1])),
                                 str(int(fields[4]) - int(fields[2]))])
            f_new.write(line)
            f_new.write("\n")
    if rename_flg == 1:  # without this check, files that need no conversion would be changed too
        change_num = change_num + 1  # count how many files were converted
        os.remove(src_path)
        os.rename(bak_path, src_path)
    elif rename_flg == 0:
        os.remove(bak_path)
print('change_num:', change_num)
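
To make the conversion rule easier to check on a single line, here is a small sketch of the same mapping as a pure function. The names CLASS_NAMES and convert_line are introduced only for this illustration; the class mapping (0 pedestrian, 1 vehicle, 2 biker) is the one used by the script above.

```python
# Class IDs used by format 1, mapped to the names used by format 2.
CLASS_NAMES = {"0": "pedestrian", "1": "vehicle", "2": "biker"}


def convert_line(line):
    """Convert 'cls x1 y1 x2 y2' to 'name,x,y,w,h'; return other lines unchanged."""
    parts = line.split()
    if len(parts) != 5 or parts[0] not in CLASS_NAMES:
        return line.strip()
    x1, y1, x2, y2 = (int(v) for v in parts[1:])
    # Width and height are the differences between the two corners.
    return "{},{},{},{},{}".format(CLASS_NAMES[parts[0]], x1, y1, x2 - x1, y2 - y1)


print(convert_line("1 1023 93 1079 137"))  # vehicle,1023,93,56,44
```

A line that is already in format 2, such as biker,1463,404,126,69, contains no spaces and is returned unchanged.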

5. Some txt files contain lines with only two coordinate values. The script below deletes such txt files outright:

"""
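Some annotation txt files are wrong. For example, a line should read pedestrian,1138,306,18,56
but only two coordinates follow the class name: pedestrian,1138,306. This breaks the later
txt-to-xml conversion, so this script finds the broken txt files and deletes them.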
"""

import os 

delete_labels = []
labels_dir = "./labels"
#labels_dir = "./delete_labels"

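for label in os.listdir(labels_dir):
    with open(os.path.join(labels_dir, label), 'r') as f:
        for line in f.readlines():
            if 5 != len(line.split(",")):  # too few coordinates: this file has to be deleted
                print(label)
                delete_labels.append(label)
                break  # one bad line is enough; avoids listing (and removing) the same file twice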

for label in delete_labels:
    os.remove(os.path.join(labels_dir, label))
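
The check above only counts comma-separated fields. A slightly stricter test, sketched below under the assumption that the four values after the class name must be numeric, can be used to inspect individual lines before deleting anything; is_valid_label_line is a hypothetical helper name.

```python
def is_valid_label_line(line):
    """True if line has exactly five comma-separated fields: a name plus four numbers."""
    fields = line.strip().split(",")
    if len(fields) != 5 or not fields[0]:
        return False
    try:
        for value in fields[1:]:
            float(value)  # raises ValueError for non-numeric text
    except ValueError:
        return False
    return True
```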

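6. The detection classes are supposed to be only biker, pedestrian, and vehicle, yet building the LMDB reported an unknown name "face": the annotation txt files turned out to contain a face class. So a script is needed to check whether the txt files contain only the three classes biker, pedestrian, and vehicle; if there are extra classes, the corresponding txt files are deleted.

First, the script that lists the classes, to check whether only these three are present: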

import os
import numpy as np

txt_dir     = "./labels"

buff_list = []
buff_dir = txt_dir
for files_name in os.listdir(buff_dir):
    with open(os.path.join(buff_dir,files_name),'r') as f:
        for line in f.readlines():
            buff_list.append(line.split(",")[0])
print(np.unique(buff_list))
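
If it is also useful to know how often each class occurs, the same scan can be done with collections.Counter from the standard library instead of numpy. This is only a sketch; count_classes is a name introduced here, and it takes the lines as a list so that it can be tried without touching any files.

```python
from collections import Counter


def count_classes(lines):
    """Count the class name (first comma-separated field) of each non-empty line."""
    return Counter(line.split(",")[0].strip() for line in lines if line.strip())
```

For a real run, the lines of every file under ./labels would be passed in, for example by concatenating the result of f.readlines() for each file.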

If there are extra classes, use the script below to delete the offending txt files:

import os


labels_dir = "./labels"

for label_name in os.listdir(labels_dir):
    label_path = os.path.join(labels_dir, label_name)
    with open(label_path, 'r') as f:
        has_face = any("face" == line.split(",")[0] for line in f)
    if has_face:  # remove once, after the file has been closed
        print(label_name)
        os.remove(label_path)

7. Every txt file must correspond to exactly one jpg image, so the script below removes the txt and jpg files that have no counterpart:

"""
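Because of annotation mistakes there are a few extra images or a few extra txt files,
so the extra files are deleted to make sure every jpg has a matching txt file.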
"""

import os

images_dir = "./JPEGImages"
labels_dir = "./labels"

# Remove the extra images.
labels = []
for label in os.listdir(labels_dir):
    # Do not use label.split('.')[0]: some file names contain a '.' before the extension,
    # which would give the wrong stem.
    labels.append(os.path.splitext(label)[0])
#print(labels)

for image_name in os.listdir(images_dir):
    image_name = os.path.splitext(image_name)[0]  # not split('.')[0], for the same reason
    #print(image_name)
    if image_name not in labels:
        image_name = image_name + ".jpg"
        print(image_name)
        # Deletes the image. Keep this line commented out at first and check the printed
        # names, so that a wrong deletion does not force you to rebuild the data.
        #os.remove(os.path.join(images_dir, image_name))

# Remove the extra labels.
images = []
for image in os.listdir(images_dir):
    images.append(os.path.splitext(image)[0])  # not split('.')[0], see above

for label_name in os.listdir(labels_dir):
    label_name = os.path.splitext(label_name)[0]  # not split('.')[0], see above
    if label_name not in images:
        label_name = label_name + ".txt"
        print(label_name)
        # Deletes the label. Keep this line commented out at first, as above.
        #os.remove(os.path.join(labels_dir, label_name))
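
The same comparison can be written with sets, which avoids scanning a list for every file when the directories are large. This is a sketch rather than a replacement for the script above; unmatched_stems is a hypothetical helper that only reports the names and deletes nothing.

```python
import os


def unmatched_stems(image_names, label_names):
    """Return (images without a label, labels without an image) as sorted stem lists."""
    image_stems = {os.path.splitext(n)[0] for n in image_names}
    label_stems = {os.path.splitext(n)[0] for n in label_names}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```

It would typically be called as unmatched_stems(os.listdir("./JPEGImages"), os.listdir("./labels")), and the two returned lists reviewed before any file is removed.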

8. Use a script to collect the list of all images and write the absolute path of each one to a txt file:

import os
image_dir = "./JPEGImages"
with open("all_image_name.txt",'w') as f:
    for image_name in os.listdir(image_dir):
        f.write(os.path.abspath(os.path.join(image_dir,image_name))+"\n")

9. Generate the train, val, and test txt files:

import os
import random

# train.txt: training; val.txt: validation; test.txt: test.
# Note: as written, trainval.txt receives the same names as train.txt;
# the validation names are not added to it.
with open("./ImageSets/Main/train.txt", 'w') as f_train, \
     open("./ImageSets/Main/val.txt", 'w') as f_val, \
     open("./ImageSets/Main/trainval.txt", 'w') as f_trainval, \
     open("./ImageSets/Main/test.txt", 'w') as f_test:
    for image_name in os.listdir("./JPEGImages"):
        image_name = os.path.splitext(image_name)[0]
        feed = random.randint(0, 10)  # 11 equally likely values, 0 to 10 inclusive
        if feed <= 8:    # 9 in 11: training
            f_train.write(image_name + "\n")
            f_trainval.write(image_name + "\n")
        if feed == 9:    # 1 in 11: validation
            f_val.write(image_name + "\n")
        if feed == 10:   # 1 in 11: test
            f_test.write(image_name + "\n")
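
Because the script above draws a random number per image, the split sizes vary from run to run. If a reproducible split with fixed proportions is preferred, one possible approach is to shuffle once with a fixed seed and slice the list. The sketch below is an alternative technique, not the original method; split_names, the 10% fractions, and the seed value are all assumptions.

```python
import random


def split_names(names, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle names reproducibly and split them into (train, val, test) lists."""
    names = sorted(names)               # fixed starting order, independent of os.listdir
    random.Random(seed).shuffle(names)  # private generator, so the seed is local
    n_val = int(len(names) * val_frac)
    n_test = int(len(names) * test_frac)
    val = names[:n_val]
    test = names[n_val:n_val + n_test]
    train = names[n_val + n_test:]
    return train, val, test
```

The three returned lists are disjoint, and calling the function again with the same names and seed gives the same split.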

10. Use a script to convert the txt files into xml files:

import os
import cv2

out0 ='''<?xml version="1.0" encoding="utf-8"?>
<annotation>
    <folder>None</folder>
    <filename>%(name)s</filename>
    <source>
        <database>None</database>
        <annotation>None</annotation>
        <image>None</image>
        <flickrid>None</flickrid>
    </source>
    <owner>
        <flickrid>None</flickrid>
        <name>None</name>
    </owner>
    <segmented>0</segmented>
    <size>
        <width>%(width)d</width>
        <height>%(height)d</height>
        <depth>3</depth>
    </size>
'''
out1 = '''    <object>
        <name>%(class)s</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>%(xmin)d</xmin>
            <ymin>%(ymin)d</ymin>
            <xmax>%(xmax)d</xmax>
            <ymax>%(ymax)d</ymax>
        </bndbox>
    </object>
'''

out2 = '''</annotation>
'''

#names = ["CD"]

def translate(lists): 
    source = {}
    label = {}
    for jpg in lists:
        if os.path.splitext(jpg)[1] == '.jpg':
            #img=cv2.imread(jpg)
            print(jpg)
            #h,w,_=img.shape[:]
            image= cv2.imread(jpg)
            h,w,_ = image.shape
#            h=320
#            w=320
            
            fxml = jpg.replace('JPEGImages','Annotations')
            fxml = fxml.replace('.jpg','.xml')
            fxml = open(fxml, 'w');
            imgfile = jpg.split('/')[-1]
            source['name'] = imgfile            

            source['width'] = w
            source['height'] = h
            
            fxml.write(out0 % source)
            txt = jpg.replace('.jpg','.txt')
            txt = txt.replace('JPEGImages','labels')
            print(txt)
            with open(txt,'r') as f:
                lines = [i.replace('\n','') for i in f.readlines()]

            for box in lines:
                box = box.split(',')
                label['class'] = box[0]
#                _x = int(float(box[1])*w)
#                _y = int(float(box[2])*h)
#                _w = int(float(box[3])*w)
#                _h = int(float(box[4])*h)

                _x = int(float(box[1]))
                _y = int(float(box[2]))
                _w = int(float(box[3]))
                _h = int(float(box[4]))
                
                label['xmin'] = max(_x,0)
                label['ymin'] = max(_y,0)
                label['xmax'] = min(int(_x+_w),w-1)
                label['ymax'] = min(int(_y+_h),h-1)

                # if label['xmin']>=w or label['ymin']>=h or label['xmax']>=w or label['ymax']>=h:
                #     continue
                # if label['xmin']<0 or label['ymin']<0 or label['xmax']<0 or label['ymax']<0:
                #     continue
                    
                fxml.write(out1 % label)
            fxml.write(out2)
            fxml.close()  # close each xml file so its contents are flushed to disk

if __name__ == '__main__':
    with open('all_image_name.txt','r') as f:
        lines = [i.replace('\n','') for i in f.readlines()]
        translate(lines)
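
The core of the conversion is turning an x,y,w,h box into the xmin, ymin, xmax, ymax corners that the VOC xml expects, clamped to the image. The sketch below isolates that step so it can be checked by hand; xywh_to_voc_box is a name introduced here, and the 1920x1080 image size in the example is an assumption, not something stated in the post.

```python
def xywh_to_voc_box(x, y, w, h, img_w, img_h):
    """Convert a top-left (x, y, w, h) box to clamped (xmin, ymin, xmax, ymax)."""
    xmin = max(int(x), 0)
    ymin = max(int(y), 0)
    xmax = min(int(x) + int(w), img_w - 1)  # keep the corner inside the image
    ymax = min(int(y) + int(h), img_h - 1)
    return xmin, ymin, xmax, ymax


print(xywh_to_voc_box(1463, 404, 126, 69, 1920, 1080))  # (1463, 404, 1589, 473)
```

A box that runs past the image border, for example x=1900 with w=50 in a 1920-pixel-wide image, is cut off at column 1919.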

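11. Generate the trainval.txt and val.txt list files needed for building the data (note that these two txt files differ from the ones generated earlier: each line pairs an image path with its xml path):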

import os


def write_list_file(src_path, dst_path):
    # Each output line is "JPEGImages/<name>.jpg Annotations/<name>.xml".
    new_lines = ""
    with open(src_path, 'r') as ff:
        for line in ff.readlines():
            image_name = line.split("\n")[0]
            new_lines += "JPEGImages/" + image_name + ".jpg" + " " + "Annotations/" + image_name + ".xml" + "\n"
    with open(dst_path, 'w') as f:
        f.write(new_lines)


write_list_file("./ImageSets/Main/trainval.txt", "./trainval.txt")
write_list_file("./ImageSets/Main/val.txt", "./val.txt")

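12. Use the creat_lmdb.sh script to generate the LMDB data. The script has to be run twice, once for the training LMDB and once for the test LMDB. To generate the test LMDB, change train to val in "for subset in train" and change trainval.txt to val.txt in the second-to-last line. (To resize, set width=608 and height=608; these values must match the input size in the prototxt used for training later. Setting both to 0 means no resizing.) The script is as follows: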

caffe_root=/data/chw/caffe_master
root_dir=/data/chw/refineDet_20200409/data
LINK_DIR=$root_dir/lmdb/
redo=1
db_dir="$root_dir/lmdb1/"
data_root_dir="$root_dir"
dataset_name="trian"
mapfile="/data/chw/refineDet_20200409/data/labelmap_MyDataSet.prototxt"
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
#width=608
#height=608

width=0
height=0




extra_cmd="--encode-type=jpg --encoded"
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in train
do
python2 $caffe_root/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir /data/chw/refineDet_20200409/data/trainval.txt $db_dir/$db/$dataset_name"_"$subset"_"$db $LINK_DIR/$dataset_name
done

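The labelmap_MyDataSet.prototxt file used by the script above is shown below. Its content has to be adapted to each project: the first item is the background and stays unchanged, and the remaining items are simply numbered in increasing order.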

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "pedestrian"
  label: 1
  display_name: "pedestrian"
}
item {
  name: "vehicle"
  label: 2
  display_name: "vehicle"
}
item {
  name: "biker"
  label: 3
  display_name: "biker"
}
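
Since the label map follows a fixed pattern, it can be generated from a list of class names instead of being edited by hand. The following sketch produces the same text layout as the file above; make_labelmap is a hypothetical helper, and the order of the class list is what determines the label numbers.

```python
def make_labelmap(class_names):
    """Return labelmap prototxt text: background as label 0, then one item per class."""
    items = [("none_of_the_above", 0, "background")]
    items += [(name, i, name) for i, name in enumerate(class_names, start=1)]
    blocks = []
    for name, label, display in items:
        blocks.append(
            'item {\n'
            '  name: "%s"\n'
            '  label: %d\n'
            '  display_name: "%s"\n'
            '}\n' % (name, label, display)
        )
    return "".join(blocks)
```

For this project the call would be make_labelmap(["pedestrian", "vehicle", "biker"]), with the returned string written to labelmap_MyDataSet.prototxt.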

The create_annoset.py script is shown below. The line sys.path.insert(0,'/data/chw/caffe_master/python') has to be changed to match your own path.

import argparse
import os
import shutil
import subprocess
import sys
sys.path.insert(0,'/data/chw/caffe_master/python')
from caffe.proto import caffe_pb2
from google.protobuf import text_format

if __name__ == "__main__":
  parser = argparse.ArgumentParser(description="Create AnnotatedDatum database")
  parser.add_argument("root",
      help="The root directory which contains the images and annotations.")
  parser.add_argument("listfile",
      help="The file which contains image paths and annotation info.")
  parser.add_argument("outdir",
      help="The output directory which stores the database file.")
  parser.add_argument("exampledir",
      help="The directory to store the link of the database files.")
  parser.add_argument("--redo", default = False, action = "store_true",
      help="Recreate the database.")
  parser.add_argument("--anno-type", default = "classification",
      help="The type of annotation {classification, detection}.")
  parser.add_argument("--label-type", default = "xml",
      help="The type of label file format for detection {xml, json, txt}.")
  parser.add_argument("--backend", default = "lmdb",
      help="The backend {lmdb, leveldb} for storing the result")
  parser.add_argument("--check-size", default = False, action = "store_true",
      help="Check that all the datum have the same size.")
  parser.add_argument("--encode-type", default = "",
      help="What type should we encode the image as ('png','jpg',...).")
  parser.add_argument("--encoded", default = False, action = "store_true",
      help="The encoded image will be save in datum.")
  parser.add_argument("--gray", default = False, action = "store_true",
      help="Treat images as grayscale ones.")
  parser.add_argument("--label-map-file", default = "",
      help="A file with LabelMap protobuf message.")
  parser.add_argument("--min-dim", default = 0, type = int,
      help="Minimum dimension images are resized to.")
  parser.add_argument("--max-dim", default = 0, type = int,
      help="Maximum dimension images are resized to.")
  parser.add_argument("--resize-height", default = 0, type = int,
      help="Height images are resized to.")
  parser.add_argument("--resize-width", default = 0, type = int,
      help="Width images are resized to.")
  parser.add_argument("--shuffle", default = False, action = "store_true",
      help="Randomly shuffle the order of images and their labels.")
  parser.add_argument("--check-label", default = False, action = "store_true",
      help="Check that there is no duplicated name/label.")

  args = parser.parse_args()
  root_dir = args.root
  list_file = args.listfile
  out_dir = args.outdir
  example_dir = args.exampledir

  redo = args.redo
  anno_type = args.anno_type
  label_type = args.label_type
  backend = args.backend
  check_size = args.check_size
  encode_type = args.encode_type
  encoded = args.encoded
  gray = args.gray
  label_map_file = args.label_map_file
  min_dim = args.min_dim
  max_dim = args.max_dim
  resize_height = args.resize_height
  resize_width = args.resize_width
  shuffle = args.shuffle
  check_label = args.check_label

  # check if root directory exists
  if not os.path.exists(root_dir):
    print("root directory: {} does not exist".format(root_dir))
    sys.exit()
  # add "/" to root directory if needed
  if root_dir[-1] != "/":
    root_dir += "/"
  # check if list file exists
  if not os.path.exists(list_file):
    print("list file: {} does not exist".format(list_file))
    sys.exit()
  # check list file format is correct
  with open(list_file, "r") as lf:
    for line in lf.readlines():
      img_file, anno = line.strip("\n").split(" ")
      if not os.path.exists(root_dir + img_file):
        print("image file: {} does not exist".format(root_dir + img_file))
      if anno_type == "classification":
        if not anno.isdigit():
          print("annotation: {} is not an integer".format(anno))
      elif anno_type == "detection":
        if not os.path.exists(root_dir + anno):
          print("annofation file: {} does not exist".format(root_dir + anno))
          sys.exit()
      break
  # check if label map file exist
  if anno_type == "detection":
    if not os.path.exists(label_map_file):
      print("label map file: {} does not exist".format(label_map_file))
      sys.exit()
    label_map = caffe_pb2.LabelMap()
    lmf = open(label_map_file, "r")
    try:
      text_format.Merge(str(lmf.read()), label_map)
    except:
      print("Cannot parse label map file: {}".format(label_map_file))
      sys.exit()
  out_parent_dir = os.path.dirname(out_dir)
  if not os.path.exists(out_parent_dir):
    os.makedirs(out_parent_dir)
  if os.path.exists(out_dir) and not redo:
    print("{} already exists and I do not hear redo".format(out_dir))
    sys.exit()
  if os.path.exists(out_dir):
    shutil.rmtree(out_dir)

  # get caffe root directory
  caffe_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
  if anno_type == "detection":
    cmd = "{}/build/tools/convert_annoset" \
        " --anno_type={}" \
        " --label_type={}" \
        " --label_map_file={}" \
        " --check_label={}" \
        " --min_dim={}" \
        " --max_dim={}" \
        " --resize_height={}" \
        " --resize_width={}" \
        " --backend={}" \
        " --shuffle={}" \
        " --check_size={}" \
        " --encode_type={}" \
        " --encoded={}" \
        " --gray={}" \
        " {} {} {}" \
        .format(caffe_root, anno_type, label_type, label_map_file, check_label,
            min_dim, max_dim, resize_height, resize_width, backend, shuffle,
            check_size, encode_type, encoded, gray, root_dir, list_file, out_dir)
  elif anno_type == "classification":
    cmd = "{}/build/tools/convert_annoset" \
        " --anno_type={}" \
        " --min_dim={}" \
        " --max_dim={}" \
        " --resize_height={}" \
        " --resize_width={}" \
        " --backend={}" \
        " --shuffle={}" \
        " --check_size={}" \
        " --encode_type={}" \
        " --encoded={}" \
        " --gray={}" \
        " {} {} {}" \
        .format(caffe_root, anno_type, min_dim, max_dim, resize_height,
            resize_width, backend, shuffle, check_size, encode_type, encoded,
            gray, root_dir, list_file, out_dir)
  print(cmd)
  process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
  output = process.communicate()[0]

  if not os.path.exists(example_dir):
    os.makedirs(example_dir)
  link_dir = os.path.join(example_dir, os.path.basename(out_dir))
  if os.path.exists(link_dir):
    os.unlink(link_dir)
  os.symlink(out_dir, link_dir)

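II. Training

Once the LMDB is built, the next step is training. Training needs train.prototxt and solver.prototxt, and a pretrained model can optionally be used as well. The train.prototxt and solver.prototxt files are company code, so they are not posted here.

Then start training from the command line: caffe/build/tools/caffe train --solver solver.prototxt --weights xxx.caffemodel &> log.txt

For example: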

/data/caffe_s3fd-ssd/build/tools/caffe train --solver="./solver.prototxt" --weights="./_iter_300000.caffemodel" --gpu 1,2,3 >&log.txt

To run the training in the background, use nohup, for example:

nohup  /data/caffe_s3fd-ssd/build/tools/caffe train --solver="./solver.prototxt" --weights="./_iter_300000.caffemodel" --gpu 1,2,3 >&log.txt &
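
Since the commands above redirect all output to log.txt, the training loss can be pulled out of that file afterwards. The sketch below assumes that the solver prints lines containing "Iteration N" followed later on the same line by "loss = X", which is the usual Caffe solver output but may differ between Caffe forks; parse_losses and LOSS_RE are names made up for this example.

```python
import re

# Assumed line shape: "... Iteration <N> ... loss = <X>" (anything may sit in between).
LOSS_RE = re.compile(
    r"Iteration (\d+).*?loss = ([0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)"
)


def parse_losses(log_lines):
    """Return a list of (iteration, loss) pairs found in Caffe-style log lines."""
    pairs = []
    for line in log_lines:
        match = LOSS_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs
```

A typical use would be parse_losses(open("log.txt")), after which the pairs can be printed or plotted to see whether the loss is decreasing.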

Training can also be resumed from a snapshot, for example:

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -snapshot examples/mnist/lenet_iter_5000.solverstate

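-snapshot and -weights cannot be used together.

III. Testing

Testing uses C++ code, which is company code, so it is not posted here.

Author: cumtchw
Source: http://www.cnblogs.com/cumtchw/
This blog is my study notes. I also repost good blog posts that I come across while studying; if anything infringes your rights, contact me and I will remove it promptly.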