john wang • chu wang • Charlie XU
Created November 28, 2020 © Apache-2.0

How to apply a new machine learning model on the Ultra96

We have successfully converted, trained, tested, quantized, compiled, and run ENet, DenseNet, and YOLO models on the Ultra96.

Intermediate · Protip · Over 8 days

Things used in this project

Hardware components

Avnet Ultra96-V2
×1
Webcam, Logitech® HD Pro
×1
Converter, USB 3.0 to Gigabit Ethernet
×1

Software apps and online services

AMD PYNQ Framework

Story


Schematics

Vivado hardware platform design

This is the hardware schematic design for the Ultra96 hardware platform; we use it as the base to construct the PYNQ hardware framework on the Ultra96.
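
Once the bitstream and the corresponding .hwh file are generated from this design, they can be loaded from Python through the pynq_dpu package. A minimal sketch, assuming the DPU-PYNQ overlay API; the file names (dpu.bit, dpu_tf_densenet.elf) are placeholders for illustration:

# Minimal sketch: load the DPU overlay built from the Vivado design above.
from pynq_dpu import DpuOverlay

overlay = DpuOverlay("dpu.bit")            # programs the PL with the DPU design
overlay.load_model("dpu_tf_densenet.elf")  # registers the compiled DPU kernel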

Code

YOLO program on Ultra96

Python
A Python program that runs the YOLO model on the Ultra96.
No preview (download only).
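
Since this listing is download-only, here is a minimal sketch of the DNNDK call sequence such a program follows; the kernel and node names are placeholders (the real ones come from the compiled model), and the same calls appear in the DenseNet program further below:

# Sketch of the DNNDK inference flow on the Ultra96; names are placeholders.
from dnndk import n2cube

n2cube.dpuOpen()
kernel = n2cube.dpuLoadKernel("tf_yolov3")      # placeholder kernel name
task = n2cube.dpuCreateTask(kernel, 0)
# ...preprocess a frame into a flat float32 array "data", then:
# n2cube.dpuSetInputTensorInHWCFP32(task, "input_node", data, len(data))
# n2cube.dpuRunTask(task)
# and read back the output tensors for YOLO post-processing.
n2cube.dpuDestroyTask(task)
n2cube.dpuDestroyKernel(kernel)
n2cube.dpuClose()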

DenseNet training script

BatchFile
It launches the training program with the configured environment parameters.
#!/bin/bash
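# NOTE: this script assumes the calling environment exports INPUT_HEIGHT,
# INPUT_WIDTH, INPUT_CHAN, EPOCHS, LEARNRATE, BATCHSIZE, TB_LOG, KERAS,
# LOG and TRAIN_LOG (typically via a sourced set-environment script).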

# train, evaluate and save trained keras model
train() {
  python3 trainrestore.py \
    --input_height ${INPUT_HEIGHT} \
    --input_width  ${INPUT_WIDTH} \
    --input_chan   ${INPUT_CHAN} \
    --epochs       ${EPOCHS} \
    --learnrate    ${LEARNRATE} \
    --batchsize    ${BATCHSIZE} \
    --tboard       ${TB_LOG} \
    --keras_hdf5   ${KERAS}/
#    --keras_hdf5   ${KERAS}/${K_MODEL}
}

echo "-----------------------------------------"
echo "TRAINING STARTED"
echo "-----------------------------------------"

#rm -rf ${KERAS}
mkdir -p ${KERAS}
#rm -rf ${TB_LOG}
mkdir -p ${TB_LOG}
train 2>&1 | tee ${LOG}/${TRAIN_LOG}

echo "-----------------------------------------"
echo "TRAINING FINISHED"
echo "-----------------------------------------"

Kernel compiling script

BatchFile
A script that compiles a model into a DPU kernel with the Vitis AI compiler.
#!/bin/bash

set -e
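# Usage: ./compile.sh <BOARD> <MODEL_NAME>
# e.g.:  ./compile.sh Ultra96 cf_yolotiny_voc_416_416_11.2G
# With no arguments, BOARD defaults to Ultra96 and MODEL_NAME to the model
# selected below.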

if [ "$#" -eq 2 ]; then
    BOARD=$1
    MODEL_NAME=$2
        echo "./compile.sh $MODEL_NAME"
#        MODEL_UNZIP=$2 
else
        BOARD=Ultra96
#        MODEL_NAME=cf_yolov3_voc_608_608_65.42G
        MODEL_NAME=cf_yolotiny_voc_416_416_11.2G
#        MODEL_NAME=tf_yolov3_voc_416_416_65.63G
#        MODEL_NAME=cf_yolov3_bdd_288_512_53.7G
#        MODEL_NAME=cf_yolov3_cityscapes_256_512_0.9_5.46G
##        MODEL_NAME=cf_ssdpedestrian_coco_360_640_0.97_5.9G
#        cf_yolov3_voc_416_416_65.42G
#        MODEL_NAME=tf_densenet_imagenet_512_512_7.7G
#       MODEL_NAME=cf_segmentation_imagenet_512_512_7.7G
#        MODEL_NAME=cf_inceptionv4_imagenet_299_299_24.5G_79.58
#	 MODEL_NAME=tf_resnetv1_152_imagenet_224_224_21.83G
#	echo "Error: please provide BOARD and MODEL_NAME as arguments."
	echo "Default: ./compile.sh $MODEL_NAME"
#	exit 1
fi

if [ $BOARD = "Ultra96" ] && [ ! -e dpu.hwh ]; then
	echo "Error: please make sure dpu.hwh is in the working directory."
	exit 1
fi

VAI_VERSION=1.1
#MODEL_ZIP=$(echo ${MODEL_NAME} | sed 's/_[1-9\.]\+G_/_/g').zip
#MODEL_UNZIP=$(echo ${MODEL_NAME} | sed "s/\(.*\)_${VAI_VERSION}\(.*\)/\1\2/")
MODEL_UNZIP=${MODEL_NAME}
echo "MODEL_UNZIP NAME: $MODEL_UNZIP"
MODEL=$(echo $MODEL_NAME | cut -d'_' -f2)
echo "MODEL NAME: $MODEL"
FRAMEWORK=$(echo $MODEL_NAME | cut -d'_' -f1)
echo "FRAMEWORK $FRAMEWORK"
# Activate Vitis AI conda environment
source /etc/profile.d/conda.sh
if [ $FRAMEWORK = 'cf' ]; then
	conda activate vitis-ai-caffe
elif [ $FRAMEWORK = 'tf' ]; then
	conda activate vitis-ai-tensorflow
else
	echo "Error: currently only caffe and tensorflow are supported."
	exit 1
fi

# If custom Ultra96 hwh file is provided, add DPU support
if [ $BOARD = "Ultra96" ]; then
	sudo mkdir -p /opt/vitis_ai/compiler/arch/dpuv2/Ultra96
	sudo cp -f Ultra96.json \
		/opt/vitis_ai/compiler/arch/dpuv2/Ultra96/Ultra96.json
	dlet -f dpu.hwh
        sudo rm -f /opt/vitis_ai/compiler/arch/dpuv2/${BOARD}/*.dcf
	sudo cp *.dcf /opt/vitis_ai/compiler/arch/dpuv2/${BOARD}/${BOARD}.dcf
fi

# ZCU111 and ZCU102 use equivalent DPU configurations
if [ $BOARD = "ZCU111" ]; then
	BOARD=ZCU102
fi

## Download model if it doesn't already exist in workspace
#if [ ! -f $MODEL_ZIP ]; then
#	wget -O ${MODEL_ZIP} \
#	"https://www.xilinx.com/bin/public/openDownload?filename=${MODEL_ZIP}"
#fi
#unzip -o ${MODEL_ZIP}

# Compile the model
if [ $FRAMEWORK = 'cf' ]; then
	vai_c_caffe \
		--prototxt ${MODEL_UNZIP}/quantized/deploy.prototxt \
		--caffemodel ${MODEL_UNZIP}/quantized/deploy.caffemodel \
		--arch /opt/vitis_ai/compiler/arch/dpuv2/${BOARD}/${BOARD}.json \
		--output_dir ./model \
		--net_name ${MODEL}
        sudo cp ./model/dpu_${MODEL}_0.elf ./modelbak

elif [ $FRAMEWORK = 'tf' ]; then
        echo "FRAMEWORK tensorflow"
	vai_c_tensorflow \
		--frozen_pb ${MODEL_UNZIP}/quantized/deploy_model.pb \
		--arch /opt/vitis_ai/compiler/arch/dpuv2/${BOARD}/${BOARD}.json \
		--output_dir ./model \
		--net_name tf_${MODEL}
#        sudo cp ./model/dpu_tf_${MODEL}_0.elf ./modelbak
#        sudo cp ./model/dpu_${MODEL}.elf ./modelbak

else
	echo "Error: currently only caffe and tensorflow are supported."
	exit 1
fi

DenseNet program on Ultra96

Python
The Python program that runs the DenseNet model on the Ultra96.
# Pre-processing for the DenseNet model (two documented variants):
#
# Variant 1:
#   data channel order: RGB (0~255)
#   crop: central region containing 87.5% of the original image
#   resize: 224 * 224 (tf.image.resize_bilinear(image, [height, width], align_corners=False))
#   input = input / 255
#   input = 2*(input - 0.5)
#
# Variant 2:
#   1. data channel order: RGB (0~255)
#   2. resize: short side resized to 256, keeping the aspect ratio
#   3. center crop: 224 * 224
#   4. input = input / 255
#   5. input = 2*(input - 0.5)

from ctypes import *
import cv2
import numpy as np
from dnndk import n2cube
import os
#import threading
from concurrent.futures import ThreadPoolExecutor, as_completed
import time
#import preprocess
#import queue
import sys

try:
    pyc_libdputils = cdll.LoadLibrary("libn2cube.so")
except Exception:
    print('Load libn2cube.so failed\nPlease install DNNDK first!')
    sys.exit(1)

top = 1 
resultname = "image.list.result"
threadPool = ThreadPoolExecutor(max_workers=2,)
#scale = 1
#shortsize = 256

classes = ['airplane','automobile','bird','cat','deer','dog','frog','horse','ship','truck']  
listPredictions = []

#from IPython.display import display
#from PIL import Image

#path = os.path.join(image_folder, listimage[2])
#print("path = %s" % path)
#img = cv2.imread(path)
#display(Image.open(path))

_R_MEAN = 123.68
_G_MEAN = 116.78
_B_MEAN = 103.94

MEANS = [_B_MEAN,_G_MEAN,_R_MEAN]

def BGR2RGB(image):
  # B, G, R = cv2.split(image)
  # image = cv2.merge([R, G, B])
  # image = image[:,:,::-1] 
  image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
  return image

def resize_shortest_edge(image, size):
  H, W = image.shape[:2]
  if H >= W:
    nW = size
    nH = int(float(H)/W * size)
  else:
    nH = size
    nW = int(float(W)/H * size)
  return cv2.resize(image,(nW,nH))

def central_crop(image, crop_height, crop_width):
  image_height = image.shape[0]
  image_width = image.shape[1]
  offset_height = (image_height - crop_height) // 2
  offset_width = (image_width - crop_width) // 2
  return image[offset_height:offset_height + crop_height, offset_width:
               offset_width + crop_width, :]

def normalize(image):
  image = image.astype(np.float32)
  # note: this divides by 256.0, slightly different from the "input / 255" in the spec above
  image=image/256.0
  image=image-0.5
  image=image*2.0
  return image

#def preprocess_fn(image_path):
#    '''
#    Image pre-processing.
#    Rearranges from BGR to RGB then normalizes to range 0:1
#    input arg: path of image file
#    return: numpy array
#    '''
#    image = cv2.imread(image_path)
#    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
#    image = image/255.0
#    return image

def preprocess_fn(image):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = image.astype(np.float32)
#    print(f"image.type = {type(image)}")
    image = image/255.0
    return image

#def preprocess_fn(image, crop_height, crop_width):
#    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
##    cv2.imshow("test", image)
##    cv2.waitKey(0)  
#    image = resize_shortest_edge(image, 256)
##    print("shape in = {}".format(image.shape))  
#    image = central_crop(image, crop_height, crop_width)
##    print("shape cr = {}".format(image.shape))
#    image = normalize(image)   
#    return image  

#def dpuSetInputImageWithScale(task, nodeName, image, mean, scale, height, width, channel, shortsize, idx=0):
#    (imageHeight, imageWidth, imageChannel) = image.shape
#    if height == imageHeight and width == imageWidth:
#        newImage = image
#    else:
#        newImage = preprocess_fn(image, height, width)
#    return newImage

def parameter(task, nodeName, idx=0):
    #inputscale =  n2cube.dpuGetInputTensorScale(task, nodeName, idx)
    #print("inputscale = %f"%inputscale)
    channel = n2cube.dpuGetInputTensorChannel(task, nodeName, idx)
    output = (c_float * channel)()
    outputMean = POINTER(c_float)(output)
    pyc_libdputils.loadMean(task, outputMean, channel)
    height = n2cube.dpuGetInputTensorHeight(task, nodeName, idx)
    print("height = %d"%height)
    width = n2cube.dpuGetInputTensorWidth(task, nodeName, idx)
    print("width = %d"%width)    
    for i in range(channel):
        outputMean[i] = float(outputMean[i])
#        print("outputMean[%i] = %f"%(i,outputMean[i]))
    return height, width, channel, outputMean
                 
def predict_label(img, task, inputscale, mean, height, width, inputchannel, shortsize, KERNEL_CONV_INPUT):
#    imageRun = preprocess_fn(img, height, width)
    imageRun = preprocess_fn(img)
#    imageRun = dpuSetInputImageWithScale(task, KERNEL_CONV_INPUT, img, mean, inputscale, height, width, inputchannel, shortsize, idx=0)
#    n2cube.dpuGetInputTensor(task, KERNEL_CONV_INPUT)
    imageRun = imageRun.reshape((imageRun.shape[0]*imageRun.shape[1]*imageRun.shape[2]))
#    input_len = 150528
#    print("imageRUN = {}".format(imageRun))
#    input_len = len(imageRun)
#    print(f"imageRun = {imageRun.shape}")
#    print(f"input_len = {input_len}")    
    return imageRun

def TopK(softmax, imagename, fo, correct, wrong):
    for i in range(top):
         num = np.argmax(softmax)
#         print("softmax = %f" % softmax[num])    
#         argmax = np.argmax((out_q[i]))
         prediction = classes[num]  
#         print(prediction)
#         softmax[num] = 0
#         num -1, should notice
#         num = num -1
#         fo.write(imagename+" "+str(num)+"\n")  
         # image files are assumed to be named "<label>_<index>.jpg"
         ground_truth = imagename.split('_')[0]
         fo.write(imagename+' p: '+prediction+' g: '+ground_truth+' : '+str(softmax[num])+'\n')
         if (ground_truth==prediction):
            correct += 1
#            print(f"correct = {correct}")
         else:
            wrong += 1
#            print(f"wrong = {wrong}")
    return correct, wrong   
#sem=threading.BoundedSemaphore(1)
def run_dpu_task(outsize, task, outputchannel, conf, outputscale, listimage, imageRun, KERNEL_CONV_INPUT, KERNEL_FC_OUTPUT): 
    input_len = len(imageRun)
#    print(f"input_len = {input_len}")
    n2cube.dpuSetInputTensorInHWCFP32(task,KERNEL_CONV_INPUT,imageRun,input_len)
    n2cube.dpuRunTask(task)
#    outputtensor = n2cube.dpuGetOutputTensorInHWCFP32(task, KERNEL_FC_OUTPUT, outsize)
#    print(outputtensor)
#    print(outputchannel)
#    print(outputscale)
    softmax = n2cube.dpuRunSoftmax(conf, outputchannel, outsize//outputchannel, outputscale)
#    print(f"softmax = {softmax}")
    return softmax, listimage

def run(image_folder, shortsize, KERNEL_CONV, KERNEL_CONV_INPUT, KERNEL_FC_OUTPUT, inputscale):

    start = time.time()
#    listimage = [i for i in os.listdir(image_folder) if i.endswith("JPEG")]
    listimage = [i for i in os.listdir(image_folder) if i.endswith("jpg")]
    listimage.sort()
#    wordstxt = os.path.join(image_folder, "words.txt")
#    with open(wordstxt, "r") as f:
#        lines = f.readlines()
    fo = open(resultname, "w")
    n2cube.dpuOpen()
    kernel = n2cube.dpuLoadKernel(KERNEL_CONV)
    task = n2cube.dpuCreateTask(kernel, 0)
    height, width, inputchannel, mean = parameter(task, KERNEL_CONV_INPUT)
#    print("mean = %f"%mean[0])
    outsize = n2cube.dpuGetOutputTensorSize(task, KERNEL_FC_OUTPUT)
#    print("size = %d"%size)
    outputchannel = n2cube.dpuGetOutputTensorChannel(task, KERNEL_FC_OUTPUT)
#    print("outputchannel = %d"%outputchannel)
    conf = n2cube.dpuGetOutputTensorAddress(task, KERNEL_FC_OUTPUT)
#    print("conf = {}".format(conf))
#    print("inputscale = %f"%inputscale)
    inputscale = n2cube.dpuGetInputTensorScale(task,KERNEL_CONV_INPUT)
#    print("inputscalenow = %f"%inputscale)
    outputscale = n2cube.dpuGetOutputTensorScale(task, KERNEL_FC_OUTPUT)
#    print("outputscale = %f"%outputscale)  
    imagenumber = len(listimage) 
    print("\nimagenumber = %d\n"%imagenumber)
    softlist = []
#    imagenumber = 1000
    correct = 0
    wrong = 0
    for i in range(imagenumber):
        print(f"i = {i+1}") 
        print(listimage[i]) 
#        path = os.path.join(image_folder, listimage[i])
#        if i % 50 == 0:
#        print("\r", listimage[i], end = "") 
        path = image_folder + listimage[i]
        img = cv2.imread(path)
        imageRun = predict_label(img, task, inputscale, mean, height, width, inputchannel, shortsize, KERNEL_CONV_INPUT)
        input_len = len(imageRun)
#        print(f"input_len = {input_len}")     
#        soft = threadPool.submit(run_dpu_task, outsize, task, outputchannel, conf, outputscale, listimage[i], imageRun, KERNEL_CONV_INPUT, KERNEL_FC_OUTPUT)
#        softlist.append(soft)
#    for future in as_completed(softlist):
#        softmax, listimage = future.result()
        softmax, listimage[i] = run_dpu_task(outsize, task, outputchannel, conf, outputscale, listimage[i], imageRun, KERNEL_CONV_INPUT, KERNEL_FC_OUTPUT)
        correct, wrong = TopK(softmax, listimage[i], fo, correct, wrong)
        print("")

    fo.close()
    accuracy = correct/imagenumber
    print('Correct:',correct,' Wrong:',wrong,' Accuracy:', accuracy)    
    n2cube.dpuDestroyTask(task)
    n2cube.dpuDestroyKernel(kernel)
    n2cube.dpuClose()
    print("")

    end = time.time()
    total_time = end - start 
    print('\nAll processing time: {} seconds.'.format(total_time))
    print('\n{} ms per frame\n'.format(1000*total_time/imagenumber))
   
#threadPool.shutdown(wait=True)          
#criteria = ((top1_accuracy% - 68.5)/15)*0.4 + (10/latency_ms)*0.6
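
The run() function above is intended to be called from a PYNQ notebook once the DPU overlay and kernel library are loaded. A minimal, hypothetical invocation; the folder, kernel and node names below are placeholders that depend on the compiled model:

# Hypothetical call of run(); every name below is a placeholder.
run(image_folder='./images/',          # must end with '/' (see path construction above)
    shortsize=256,
    KERNEL_CONV='tf_densenet',         # name passed to dpuLoadKernel()
    KERNEL_CONV_INPUT='input_node',    # input node of the compiled model
    KERNEL_FC_OUTPUT='output_node',    # output node of the compiled model
    inputscale=1.0)                    # re-read from the input tensor inside run()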

Python link library script

BatchFile
It builds a shared library from the compiled DPU kernel ELF so the model can be loaded from Python.
#!/bin/bash

set -e

model=tf_densenet
overlays=overlays_300M2304
cd ${overlays}
#pwd
#aarch64-linux-gnu-gcc -fPIC -shared dpu_${model}_0.elf -o libdpumodel${model}.so
aarch64-linux-gnu-gcc -fPIC -shared dpu_${model}.elf -o libdpumodel${model}.so
#echo "aarch64-linux-gnu-gcc -fPIC -shared dpu_${model}_0.elf -o libdpumodel{$model}.so"
echo "aarch64-linux-gnu-gcc -fPIC -shared dpu_${model}.elf -o libdpumodel{$model}.so"
cp libdpumodel${model}.so /usr/lib/
ls -l /usr/lib/libdpu*.so
cd ..
pwd
cp ./${overlays}/* /usr/local/lib/python3.6/dist-packages/pynq_dpu/overlays/
#python3 overlay.py
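
With the shared library installed in /usr/lib, the DNNDK runtime resolves the kernel by name at run time. A short sketch, assuming the model name above:

# dpuLoadKernel("tf_densenet") looks up /usr/lib/libdpumodeltf_densenet.so
from dnndk import n2cube
n2cube.dpuOpen()
kernel = n2cube.dpuLoadKernel("tf_densenet")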

Ultra96 DPU prj_config

BatchFile
A configuration file needed when linking the DPU kernel into the hardware platform.
[clock]
id=1:dpu_xrt_top_1.aclk
id=6:dpu_xrt_top_1.ap_clk_2

[connectivity]
sp=dpu_xrt_top_1.M_AXI_GP0:HPC0
sp=dpu_xrt_top_1.M_AXI_HP0:HP0
sp=dpu_xrt_top_1.M_AXI_HP2:HP1
nk=dpu_xrt_top:1

[advanced]
misc=:solution_name=link
param=compiler.addOutputTypes=sd_card

[vivado]
prop=run.impl_1.strategy=Performance_Explore
param=place.runPartPlacer=0
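
In this configuration, nk=dpu_xrt_top:1 instantiates a single DPU core, the sp= lines map the DPU's AXI master interfaces onto the PS memory ports (HPC0, HP0, HP1), and the [clock] section assigns the two DPU clocks by clock id.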

DPU Ultra96.json

BatchFile
The architecture description file needed by the Vitis AI compiler when targeting the Ultra96.
{
    "target"   : "dpuv2",
    "dcf"      : "/opt/vitis_ai/compiler/arch/dpuv2/Ultra96/Ultra96.dcf",
    "cpu_arch" : "arm64"
}
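
This is the file the compile script above installs under /opt/vitis_ai/compiler/arch/dpuv2/Ultra96/ and passes to vai_c_caffe/vai_c_tensorflow through the --arch option; its "dcf" entry points at the Ultra96.dcf generated from dpu.hwh by dlet.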

ENet program on Ultra96

C/C++
A C/C++ program that runs the ENet model on the Ultra96.
No preview (download only).

DenseNet training Python program

Python
A program that can save checkpoints and resume training of the model, making training and analysis easier.
'''
 Copyright 2020 Xilinx Inc.

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
'''

'''
Trains the DenseNetX model on the CIFAR-10 dataset

Author: Mark Harvey
'''
import warnings
warnings.filterwarnings("ignore")

import numpy as np
import os
import sys
import argparse

from datadownload import datadownload

# Silence TensorFlow messages
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# workaround for TF1.15 bug "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR"
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

import tensorflow as tf
from tensorflow.keras.optimizers import RMSprop, SGD
from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard, LearningRateScheduler, ReduceLROnPlateau
from tensorflow.keras.preprocessing.image import ImageDataGenerator

from DenseNetX import densenetx

from tensorflow.keras.models import load_model
#from tensorflow.keras.models import get_init_epoch

DIVIDER = '-----------------------------------------'

def train(input_height,input_width,input_chan,batchsize,learnrate,epochs,keras_hdf5,tboard):

    def step_decay(epoch):
        """
        Learning rate scheduler used by callback
        Reduces learning rate depending on number of epochs
        """
        lr = learnrate
        if epoch > 150:
            lr /= 1000
        elif epoch > 120:
            lr /= 100
        elif epoch > 80:
            lr /= 10
        elif epoch > 2:
            lr /= 2
        # test-only learning-rate override: must be commented out for real training
        lr = lr / 1000
        return lr
    

    # CIFAR10 dataset has 60k images. Training set is 50k, test set is 10k.
    # Each image is 32x32 pixels, 3 channels, 8 bits per channel
    (x_train, y_train), (x_test, y_test) = datadownload()
    print ('Dataset downloaded and pre-processed')

    '''
    -----------------------------------------------
    CALLBACKS
    -----------------------------------------------
    '''

    # chkpt_call = ModelCheckpoint(filepath=keras_hdf5,
                                 # monitor='val_acc',
                                 # verbose=1,
                                 # save_best_only=True)
     
    chkpt_call = ModelCheckpoint(filepath=keras_hdf5+"epoch.{epoch:03d}.val_acc.{val_acc:.2f}.h5",
                                  monitor='val_acc',
                                  verbose=1,
                                  save_best_only=True)                  
 
    tb_call = TensorBoard(log_dir=tboard,
                          batch_size=batchsize,
                          update_freq='epoch')

    lr_scheduler_call = LearningRateScheduler(schedule=step_decay,
                                              verbose=1)

    lr_plateau_call = ReduceLROnPlateau(factor=np.sqrt(0.1),
                                        cooldown=0,
                                        patience=5,
                                        min_lr=0.5e-6)

    callbacks_list = [tb_call, lr_scheduler_call, lr_plateau_call, chkpt_call]

    model_path=keras_hdf5
    listfile = [i for i in os.listdir(model_path) if i.endswith("h5")] 
    print("listfile = {}".format(listfile))
    listfile.sort()
    listfile=listfile[::-1]
    print("listfile = {}".format(listfile))

#    if listfile is not None:
#    while not listfile:
    if len(listfile) != 0:
       model_path = model_path + listfile[0]  
       model = load_model(model_path)  
       # latest=tf.train.latest_checkpoint(model_path)
       #    json_string = model.to_json()
       #    print(json_string)
       # Finding the epoch index from which we are resuming
       # initial_epoch = get_init_epoch(checkpoint_path)
       # e.g. "epoch.005.val_acc.0.83.h5" -> initial_epoch = 5
       initial_epoch = int(listfile[0].split(".")[1])
       print("initial_epoch = %d"%initial_epoch)
       # Calculating the correct value of count
       # count = initial_epoch*batchsize
       # Update the value of count in callback instance
       # callbacks_list[1].count = count 
    else:
       model = densenetx(input_shape=(input_height,input_width,input_chan),classes=10,theta=0.5,drop_rate=0.2,k=12,convlayers=[16,16,16])
       initial_epoch = 0  
        
    # prints a layer-by-layer summary of the network
    print('\n'+DIVIDER)
    print(' Model Summary')
    print(DIVIDER)
#    print(model.summary())
    print("Model Inputs: {ips}".format(ips=(model.inputs)))
    print("Model Outputs: {ops}".format(ops=(model.outputs)))

    model.summary()

    '''
    -----------------------------------------------
    TRAINING
    -----------------------------------------------
    '''

    '''
    Input image pipeline for training, validation
    
     data augmentation for training
       - random rotation
       - random horiz flip
       - random linear shift up and down
    '''
    data_augment = ImageDataGenerator(rotation_range=10,
                                      horizontal_flip=True,
                                      height_shift_range=0.1,
                                      width_shift_range=0.1,
                                      shear_range=0.1,
                                      zoom_range=0.1)

    train_generator = data_augment.flow(x=x_train,
                                        y=y_train,
                                        batch_size=batchsize,
                                        shuffle=True)
                                  
    '''
    Optimizer
    RMSprop used in this example.
    SGD  with Nesterov momentum was used in original paper
    '''
    #opt = SGD(lr=learnrate, momentum=0.9, nesterov=True)
    opt = RMSprop(lr=learnrate)
    
    model.compile(optimizer=opt,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    # calculate number of steps in one epoch
    # train_steps = train_generator.n//train_generator.batch_size
    # the /100 below shortens each epoch for quick tests; remove it for full training
    train_steps = train_generator.n//train_generator.batch_size/100
    print(f"train_generator.n = {train_generator.n}")
    print("train_steps = %d"%train_steps)
    # run training
    model.fit_generator(generator=train_generator,
                        epochs=epochs,
                        steps_per_epoch=train_steps,
                        validation_data=(x_test, y_test),
                        callbacks=callbacks_list,
                        verbose=1, initial_epoch=initial_epoch)

    print("\nTensorBoard can be opened with the command: tensorboard --logdir={dir} --host localhost --port 6006".format(dir=tboard))

    print('\n'+DIVIDER)
    # print(' Evaluate model accuracy with validation set..')
    # print(DIVIDER)

    # '''
    # -----------------------------------------------
    # EVALUATION
    # -----------------------------------------------
    # '''

    # scores = model.evaluate(x=x_test,y=y_test,batch_size=50, verbose=0)
    # print ('Evaluation Loss    : ', scores[0])
    # print ('Evaluation Accuracy: ', scores[1])


    # '''
    # -----------------------------------------------
    # PREDICTIONS
    # -----------------------------------------------
    # '''

    # # make predictions
    # predictions = model.predict(x_test,
                                # batch_size=batchsize,
                                # verbose=1)

    # # check accuracy
    # correct = 0
    # wrong = 0
    # for i in range(len(predictions)):
        # pred = np.argmax(predictions[i])
        # if (pred== np.argmax(y_test[i])):
            # correct+=1
        # else:
            # wrong+=1

    # print ('Correct predictions:',correct,' Wrong predictions:',wrong,' Accuracy:',(correct/len(predictions)))

    return


def run_main():
    
    print('\n'+DIVIDER)
    print('Keras version      : ',tf.keras.__version__)
    print('TensorFlow version : ',tf.__version__)
    print(sys.version)
    print(DIVIDER)

    # construct the argument parser and parse the arguments
    ap = argparse.ArgumentParser()
    ap.add_argument('-ih', '--input_height',
                    type=int,
                    default='32',
    	            help='Input image height in pixels.')
    ap.add_argument('-iw', '--input_width',
                    type=int,
                    default='32',
    	            help='Input image width in pixels.')
    ap.add_argument('-ic', '--input_chan',
                    type=int,
                    default='3',
    	            help='Number of input image channels.')
    ap.add_argument('-b', '--batchsize',
                    type=int,
                    default=100,
    	            help='Training batchsize. Must be an integer. Default is 100.')
    ap.add_argument('-e', '--epochs',
                    type=int,
                    default=300,
    	            help='number of training epochs. Must be an integer. Default is 300.')
    ap.add_argument('-lr', '--learnrate',
                    type=float,
                    default=0.001,
    	            help='optimizer initial learning rate. Must be floating-point value. Default is 0.001')
    ap.add_argument('-kh', '--keras_hdf5',
                    type=str,
                    default='./model.hdf5',
    	            help='path of Keras HDF5 file - must include file name. Default is ./model.hdf5')
    ap.add_argument('-tb', '--tboard',
                    type=str,
                    default='./tb_logs',
    	            help='path to folder for saving TensorBoard data. Default is ./tb_logs.')    
    args = ap.parse_args()
 
    args.learnrate = 0.002
    # final epochs
    args.epochs = 5

    print(' Command line options:')
    print ('--input_height : ',args.input_height)
    print ('--input_width  : ',args.input_width)
    print ('--input_chan   : ',args.input_chan)
    print ('--batchsize    : ',args.batchsize)
    print ('--learnrate    : ',args.learnrate)
    print ('--epochs       : ',args.epochs)
    print ('--keras_hdf5   : ',args.keras_hdf5)
    print ('--tboard       : ',args.tboard)
    print(DIVIDER)

    train(args.input_height,args.input_width,args.input_chan,args.batchsize,args.learnrate,args.epochs,args.keras_hdf5,args.tboard)


if __name__ == '__main__':
    run_main()

Credits

john wang
2 projects • 2 followers

chu wang
1 project • 1 follower

Charlie XU
2 projects • 0 followers