Published April 18, 2022

AI assisted video adjustment for Photosensitive Viewers

Use AI to detect video effects that are irritating to photosensitive viewers, so that the frames can be modified in real-time for display.

AdvancedFull instructions provided10 hours310

AI assisted video adjustment for Photosensitive Viewers

Things used in this project

Hardware components

AMD VCK5000 Versal Development Card

ES1 version of card (contest hardware)

Software apps and online services

AMD Vitis Unified Software Platform

Main page of the Xilinx tools

Vitis-AI

Vitis-AI repository containing all of the Xilinx AI tools used to build and deploy model to the VCK5000

ffmpeg

Really solid tool to manipulate videos and images for use with AI modelling

ImageMagick

Very helpful tool to resize images

Story

The Idea

While playing a recent video game I found myself visually disoriented after launching a specific powerful attack. To my surprise my friend did not have this experience. Curious, I utilized a video frame grabber to record gameplay. I found the game output 6 display frames of full-screen white flash animation when activating the move. I did not even recognize the flash, but my eyes and occipital lobe (brain) were overwhelmed with the stimulation.

I wanted to create a "selective filter" to edit the video frames in real time. Although I was successful in making the filter using color space conversion and applying thresholds, the filter was applied to a lot of material that did not need adjustment. Ultimately I found detecting the problem material was actually very difficult by hand. I need to detect content like a 'flash' to know when to apply the filter and when to release it. It is easy enough to detect a near full-screen bright white flash, but I would like to be able to detect and modify many other different types of effects. A trained AI classification model has the potential to perform well on many types of content: movies, games, as well as user generated content like shared videos.

This is the idea for the Datacenter AI Adaptive Computing challenge: Utilize deep learning engine AI to detect sudden 'bright' effects in video material so it can be modified by an adaptive threshold filter. In a mature implementation the full engine would utilize AI detection running alongside an adaptive filter, and could be deployed by any content delivery service to enable on-the-fly modification to photosensitive viewers without any need to store adjusted content.

Automatic detection and filter by media provider

The Tools

Hardware target : VCK5000-ES1 AI engine with DPU shell 4.4.6 deployed, in full PCIe3.0 x16 slot
Docker container running Xilinix Vitis-AI 1.4.1 (CPU build).
Clone Vitis-AI-Tutorials (specifically Design tutorials #12)
PC system used : AMD Ryzen9 PC with basic GPU, recommend a healthy amount of usable disk space, maybe 1TB for a project like this.
Ubuntu 20.04 with Kernel downgrade to 5.8.0.43 (for compatibility with VCK5000 drivers, completely remove all other kernel versions or driver install will detect them and fail)
Follow Xilinx published instructions to install VCK5000 on base machine and drivers
Pull down Xilinx tools and set up our container

git clone -b 1.4.1 --recurse-submodules https://github.com/Xilinx/Vitis-AI
git clone --recurse-submodules https://github.com/Xilinx/Vitis-AI-Tutorials
cd Vitis-AI
docker pull xilinx/vitis-ai-cpu:latest
./docker_run.sh xilinx/vitis-ai-cpu:latest

Finally, confirm you have a working X11 environment by following the setup instructions for your environment. X11 allows the docker container to display images and video on the host machine GUI environment.
For additional information on debug tools, see final section of this document.

Getting to know the VCK5000

VCK5000 runs 2 drivers on the PC (host): xclmgmt and xocl.
Base AI models for different toolchains can be found on Vitis-AI/models/AI-Model-Zoo/model-list/
From folder name you can infer quite a lot of information. Example: tf2_resnet50_imagenet_224_224_7.76G_1.4 : TensorFlow2 framework, ResNet50 classifier from ImageNet, input size 224x224, with 7.76Gflops performance, version 1.4
To download the actual model, view the model.yaml file in the folder with a text editor to find an https:// download link for your target device (VCK5000-ES1). The yaml file serves as a sort of 'look up' catalog for the zoo. Models are quite large, the ResNet50 is 180MB, that is why they are not included directly with the container.
The DPU is the 'Deep Learning Processor Unit' for VCK5000. Our DPU and shell will have this notation: 'DPUCVDX8H' - this name is described in Xilinx UG-1414 as: CNN, Versal DDR with AIE&PL, DESCENT quantization, 8-bit, High throughput

Baseline hardware check (from container with pretrained model)

Download the pretrained model referenced above and save in /workspace/vck5/resnet50. This folder will remain persistent between container sessions while /usr/share/... will not keep any changes after the container is exited or stopped.

Compile the demo/Vitis-AI_Library/samples/classification demo according to the instructions in its folder, then copy over the model and run the demo using the pretrained model:

sudo mkdir /usr/share/vitis_ai_library/models
sudo cp -r /workspace/vck5/resnet50 /usr/share/vitis_ai_library/models/resnet50
cd /workspace/demo/Vitis-AI-Library/samples/classification
source /workspace/setup/vck5000/setup.sh
./test_jpeg_classification resnet50 sample_classification.jpg

This should result in a set of classification scores from demo.hpp based on the default resnet50 training.

Build the model - Step 1 Prepare dataset

From your source videos extract high quality frame data using ffmpeg tool

ffmpeg -i <video.mp4> -qscale:v 2 <outdir>/out%04d.jpg

Convert the jpgs to our model input size 224x224 pixels. This will distort the images, but the result will still be effective because we are looking for colors and rough shapes, not specific details.

magick mogrify -resize 224x224! *.jpg

Now we will sort the images into folders to use for training and validation.

/Videos/event -or- /Videos/none1

Copy these into Training and Test such that we have this structure:

/Videos/AS_for_TFR/Training/event
/Videos/AS_for_TFR/Training/none1
/Videos/AS_for_TFR/Test/event
/Videos/AS_for_TFR/Test/none1

The samples are being prepared for a model with 2 output products (event, none1). I opted for none1 instead of 'none' because 'none' is a Python keyword.

Next we will build TensorFlow records from our folders using the tf_record.py script in Tutorial 12. We cloned Vitis-AI-Tutorials earlier along with Vitis-AI.

Update the tf_record.py with our new file locations

tf_record_dir='/workspace/Videos/tf_records' #output directory path for TF Records
raw_data_dir='/workspace/Videos/AS_to_TFR' #Directory path for raw dataset

Run the script, and you should get a directory that looks like this:

Python3 tf_record.py

Finally run gen_validation_set.py file (attached) to generate validation dataset.

Build the model - Step 2 Train floating point model

Many scripts in the Tutorial 12 project make use of the train_eval_h5.py source file. We will make a few adjustments :

Set the input shape from 100x100 to 224x224
Set the output from 131 to 2
My train_eval_h5.py (attached) is included as source

Edit the train_from_scratch.sh file to point to our evaluation images.

--eval_image_path=/workspace/Videos/tf_records \

Train until reasonable accuracy is achieved, or model hits an apex (meaning, it cannot improve)

bash train_from_scratch.sh

There are also tools to help set up 'resume training' where training can be paused and restarted with the last training result.

Build the model - Step 3 Quantize the floating point model

In order to convert the model from floating point to 8bit quantization for execution on our DPU, it must run through a quantization process. Xilinx provides a quantization tool that will apply necessary adjustments to the model without losing accuracy. The process will look similar to training, but is much faster.

Modify the following lines in quantize.py script

--model -> (the final trained model from step 2) \
--eval_image_path=/workspace/Videos/val/val \
--eval_image_list=/workspace/Videos/val/val_labels.txt \

Execute the script

Python3 quantize.py

Build the model - Step 4 Compile the model

Edit the compile.py script to point to the quantized model as well as our target information files. Execute the script to compile the model for VCK5000 DPU. Make note that the compiler needs to have access to a '.json' file that corresponds to the specific DPU being used. This.json file should have been installed with the VCK5000 installation setup. The compile.sh should be updated as shown:

vai_c_tensorflow2 -m tf2_resnet50/vai_q_output/quantized.h5 \
-a /opt/vitis_ai/compiler/arch/DPUCVDX8H/VCK5000/arch.json \
-o tf2_resnet50/vck5000 \
-n resnet50 \
--options '{"input_shape": "1,224,224,3"}'

Run the compilation

Python3 compile.py

Build the model - Step 5 Copy the model and test

Model outputs will be in ./tf2_resnet50/vck5000

md5sum.txtmeta.jsonresnet50.xmodel

We now rename and copy them to the /usr/share/vitis_ai_library/models/tf2_resnet50 directory. Also copy the resnet50.prototxt file from the resnet50 pretrained model we used in Baseline hardware check. resnet50.* files should be renamed tf2_resnet50.* Inspect the contents of the.prototxt file (it is text, so use nano or cat).

Attached to project is my pre-trained and compiled tf2_resnet50 model that can be deployed from this directory to run the demo.

Next edit the process_result.hpp to update our classification list and video output overlay

nano /workspace/demo/Vitis-AI-Library/samples/classification/process_result.hpp

I have included the process_result.hpp (attached) as source in this project.

With the changes completed it is time to rebuild the classification demo

cd /workspace/demo/Vitis-AI-Library/samples/classification
bash build.sh

Optional: modify the demo.hpp to slow down the video

I have found it helpful to add a usleep to be able to see how well the detection is running on video (frame by frame). Make changes as shown and rebuild demo.

sudo nano /usr/include/vitis/ai/demo.hpp

Search for a comment: // loop the video to find the correct section

virtual int run() override {
  auto& cap = *video_stream_.get();
  cv::Mat image;
  cap >> image;
  usleep(100000); // this will slow the video for analysis
  auto video_ended = image.empty();
  if (video_ended) {
    // loop the video
    open_stream();
    return 0;
  }
....

Run the system!

Prepare the container to display the video

sudo apt-get update
sudo apt-get install libcanberra-gtk*
export DISPLAY=:0.0

Finally run the video demo with our tf2_resnet50 model on an mp4 video file (replace with actual filename).

./test_video_classification tf2_resnet50 <video_filename.mp4> -t 4

A new window should spawn on Ubuntu desktop with the file playing and some overlay text. The -t 4 parameter sets the number of process threads that will queue DPU processing. Experiment with thread count to see if you get performance improvement.

On the command line window press CTRL+C to stop the application (the video window will close automatically).

Framerate throughput supports many simultaneous users!

Just under 500 frames-per-second performance on 1080P video

Full flow result

Warning! If you experience photosensitivity please use caution with the included videos. The content was selected to demonstrate extreme flashing and movement.

Launch video in large window, but pause before it plays. Drag the time slider back to the beginning of the video. Select high resolution playback (the yellow text should be sharp and legible)
Drag the window or use a piece of paper to cover the right side for the first playthrough. Take note of the intensity, and note if you experience any 'vision burn-in' where you blur focus or lose visual information for a period of time.
When the video has completed, move the slider to restart and cover the left side of the image for the second playthrough to see the detection accuracy and image adjustments.
Compare notes with me, first video is mild adjustment, second video is aggressive adjustment. The VCK5000 detector is the same in both.

detection only on the left / detection with minimal adjustment on the right

detection only on the left / detection with aggressive adjustment on the right

Clip 1 brain : The big flash gives me a 'visual burn-in' effect where I perceive it for much longer than it actually exists. In the adjusted video the flash does not disorient my vision, the artifacts from the adjustment are a little too obvious and could be tuned down with additional work on the filter. Clip 2 donut : The original was completely unwatchable for me, I cannot maintain focus with my eyes. The adjusted video is ok, it is still intense, but I can see what is going on.Clip 3 battle : The original was mostly ok but some block destructions were bright and I would lose focus on the screen. The adjusted video I did not experience any issue and felt like the video was smooth and clear for its duration.Clip 4 hallway : Original is very bright to me, so my vision is blurry when the lights are close and more sharp when they are far away. The adjusted video I have clear vision the entire time

Final project notes and future development

I am very happy with the trained resnet50's ability to detect a variety of irritating content, especially the DPU high framerate almost 500FPS. Frame by frame analysis did not reveal any detection 'misses' which indicate the training and subsequent quantization work very well to represent my intention. It can be even more accurate with additional training for specific content. Some of my training involved a popular first person video game, but its result could not be shown as to comply with contest rules.

Now that detection is working very well, the next item to tackle is to improve the adaptive content filter algorithm to dull bright effects without artifacts. Perhaps an inspired reader will take on the challenge!

Parting advice for development on containers and AI toolchain

Do not upgrade software in container. The purpose of a container is to hold a set of working and compatible tools. The Vitis-AI containers start from a working set, so be very deliberate about what you add. The base machine and build will also need to be compatible with hardware and recommended software drivers. This is not a case of 'download latest and go'
Learn how to mount persistent folders with your container launch script. A project like this will have a large batch of files that you might evaluate inside and outside of your container (like the video and image source files). Sometimes you might also want to edit or change folders with a more convenient tool that is not in the container.
conda activate.... when launching your container you will also select a toolchain. This will enable paths to the tools you will need to follow your flow.
Intermediate and output files are not interchangeable between tool chains. Meaning, it is not easy to take a TensorFlow2 model and convert it directly to an actual Caffe model. In some cases it is possible (like in this project where we use a trick), but do not assume it.
Models have specific input configurations, for images it might be 224x224 or 100x100, or something else. The model will expect to be trained and validated with images of that size. You will also want to confirm model input size is compatible with DPU. The DPU will be able to run the compatible model on much larger image/videos (1920x1080, 3840x2160, etc)
Xilinx provides example designs for specific toolchains. To start out, it will be helpful to find an example you can work through to learn a toolchain like TensorFlow2 or Caffe. I would highly recommend running through the full toolchain with a minimally trained model to make sure the flow is good before spending time/effort/money on training a model that might not be deployable.
CUDA, cuDNN, or not. I and several others have exciting GPU power that sadly could not be utilized for floating point training because we could not solve our workspace driver compatibility. New NVidia RTX cards require new CUDA and cuDNN versions, and some toolchains cannot support them. The new Vitis-AI 2.0 update supports modern CUDA but the VCK5000-ES1 card is limited to Vitis-AI 1.4.1. I still trained to high accuracy with CPU, but it takes approximately 10x longer even with a high-end CPU.
Docker container location. Many PCs have a very nice and fast boot drive where most programs reside. The Docker container workspace by default will be installed on the same drive where docker is installed, but it will become quite large with Vitis-AI, datasets, and training models. I followed a guide to move the actual container storage to a second drive. This enabled me allocate more space for the container.

Xilinx Tools and Examples helpful for our project

A flow chart is added (attached) to give visual representation of the steps and help navigate the sections of this project.
Vitis-AI-Tutorials / Design_Tutorials / 12-Alveo-U250-TF2-Classification : This is an example tutorial to train and deploy a ResNet50 model using TensorFlow2 toolset. I modified the tutorial quite a bit, but it is very helpful to understand the process. When the model is successfully compiled, I port it back to the classification demo instead of finishing the tutorial
Vitis-AI / demo / Vitis-AI-Library / samples / classification : Run test_jpeg_classification as a sanity check to confirm my hardware is working. Use test_video_classification demo as the framework to run a multi-threaded MP4 video -> VCK5000 DPU -> Overlay classification result on video -> display video on screen. Note that the changes in this project will 'break' the example design due to incompatible model output size (151 vs 2). The updated result process is only compatible with the new trained model.
Vitis-AI / demo / VART / resnet50 : This demo can also be used to evaluate JPG images, it is helpful that it will use the.xmodel file from compilation directly, the samples/classification demo used expects an additional.prototxt file from Caffe toolchain (that we borrow from the zoo....)
LSPCI tool will help identify the card and kernel drivers. Here is a snapshot of the working configuration on my machine. If kernel module is listed, but kernel driver is not listed as 'in use' then there is a problem with the driver or shell running on the VCK5000.

Working VCK5000, notice Kernel module and driver in use for each ID

XBUTIL is a powerful tool to check status of the VCK5000. There is an entire debug page on Xilinx github that is helpful to read and verify.

screenshot of Xilinx github page for XRT : use of the xbutil

Code

# Copyright 2021 Xilinx Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os, time
import numpy as np
import tensorflow as tf
#import pdb

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.layers import Input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
from tensorflow.compat.v1 import flags
from tensorflow.keras.optimizers import RMSprop
from dataset import synth_input_fn
from dataset import input_fn, NUM_IMAGES
from dataset import get_images_infor_from_file, ImagenetSequence

keras = tf.keras

flags.DEFINE_string(
    'model', '/workspace/Videos/resnet50_model_bk.h5',
    'TensorFlow \'GraphDef\' file to load.')
flags.DEFINE_bool(
    'eval_tfrecords', True,
    'If True then use tf_records data .')
flags.DEFINE_string(
    'data_dir', '/workspace/Videos/tf_records',
    'The directory where put the eval images')
flags.DEFINE_bool(
    'eval_images', False,
    'If True then use tf_records data .')
flags.DEFINE_string(
    'eval_image_path', '/workspace/Videos/val/val',
    'The directory where put the eval images')
flags.DEFINE_string(
    'eval_image_list', '/workspace/Videos/val/val_labels.txt', 'file has validation images list')
flags.DEFINE_string(
    'save_path', "train_dir",
    'The directory where save model')
flags.DEFINE_string(
    'filename', "resnet50_model_{epoch}.h5",
    'The name of sved model')
flags.DEFINE_integer(
    'label_offset', 1, 'label offset')
flags.DEFINE_string(
    'gpus', '0',
    'The gpus used for running evaluation.')
flags.DEFINE_bool(
    'eval_only', False,
    'If True then do not train model, only eval model.')
flags.DEFINE_bool(
    'save_whole_model', False,
    'as applications h5 file just include weights if true save whole model to h5 file.')
flags.DEFINE_bool(
    'use_synth_data', False,
    'If True then use synth data other than imagenet.')
flags.DEFINE_bool(
    'save_best_only', False,
    'If True then only save a model if `val_loss` has improved..')
flags.DEFINE_integer('train_step', None, 'Train step number')
flags.DEFINE_integer('batch_size', 32, 'Train batch size')
flags.DEFINE_integer('epochs', 200, 'Train epochs')
flags.DEFINE_integer('eval_batch_size', 50, 'Evaluate batch size')
flags.DEFINE_integer('save_every_epoch', 1, 'save every step number')
flags.DEFINE_integer('eval_every_epoch', 1, 'eval every step number')
flags.DEFINE_integer('steps_per_epoch', None, 'steps_per_epoch')
flags.DEFINE_integer('decay_steps', 10000, 'decay_steps')
flags.DEFINE_float('learning_rate', 1e-6, 'learning rate')
flags.DEFINE_bool('createnewmodel', False, 'Create a new model from the base Resnet50 model')
# Quantization Config
flags.DEFINE_bool('quantize', False, 'Whether to do quantization.')
flags.DEFINE_string('quantize_output_dir', './quantized/', 'Directory for quantize output results.')
flags.DEFINE_bool('quantize_eval', False, 'Whether to do quantize evaluation.')
flags.DEFINE_bool('dump', False, 'Whether to do dump.')
flags.DEFINE_string('dump_output_dir', './quantized/', 'Directory for dump output results.')

FLAGS = flags.FLAGS

TRAIN_NUM = NUM_IMAGES['train']
EVAL_NUM = NUM_IMAGES['validation']

def get_input_data(num_epochs=1):
  print("getting train_data and eval_data from dirs")
  train_data = input_fn(
      is_training=True, data_dir=FLAGS.data_dir,
      batch_size=FLAGS.batch_size,
      num_epochs=num_epochs,
      num_gpus=0,
      dtype=tf.float32)

  eval_data = input_fn(
      is_training=False, data_dir='/workspace/Videos/tf_records',
      batch_size=FLAGS.eval_batch_size,
      num_epochs=1,
      num_gpus=0,
      dtype=tf.float32)
  # data_dir=FLAGS.data_dir,
  print("train num : ",TRAIN_NUM)
  print("eval num  : ",EVAL_NUM)
  return train_data, eval_data


def main():
  #breakpoint()
  print("********",tf.__version__)
  ## run once to save h5 file (add model info)
  if FLAGS.save_whole_model:
    print("********set to save whole model")
    model = ResNet50(weights='imagenet')
    model.save(FLAGS.model)
    exit()

  if not FLAGS.eval_images:
    print("********getting input data (no image evaluation)")
    train_data, eval_data = get_input_data(FLAGS.epochs)

  if FLAGS.dump or FLAGS.quantize_eval:
      print("********loading model for quantization")
      from tensorflow_model_optimization.quantization.keras import vitis_quantize
      with vitis_quantize.quantize_scope():
          model = keras.models.load_model(FLAGS.model)

  elif FLAGS.createnewmodel:
      print("********creating new model")
      #for training the model from scratch use the following:
      basemodel = ResNet50(weights='imagenet', include_top=True,input_tensor=Input(shape=(224, 224, 3)))
      base_output = basemodel.layers[175].output 
      new_output = tf.keras.layers.Dense(activation="softmax", units=2)(base_output)
      model = tf.keras.models.Model(inputs=basemodel.inputs, outputs=new_output)
      print(model.summary())

  else:
      print("********loading model")
      model = keras.models.load_model(FLAGS.model)
      print(model.summary())

  print("********loading image information from files here")
  print("eval image path : ",FLAGS.eval_image_path)
  print("eval image list : ",FLAGS.eval_image_list)
  print("label offset : ",FLAGS.label_offset)
  img_paths, labels = get_images_infor_from_file(FLAGS.eval_image_path,
          FLAGS.eval_image_list, FLAGS.label_offset)
  imagenet_seq = ImagenetSequence(img_paths[0:400], labels[0:400], FLAGS.eval_batch_size)

  if FLAGS.quantize:
      print("********running quantization")
      # do quantization
      from tensorflow_model_optimization.quantization.keras import vitis_quantize
      model = vitis_quantize.VitisQuantizer(model).quantize_model(calib_dataset=imagenet_seq)
      #print(eval_data)
      #model = vitis_quantize.VitisQuantizer(model).quantize_model(calib_dataset=eval_data)
      print("********model call completed, now save quantized model")

      # save quantized model
      model.save(os.path.join(FLAGS.quantize_output_dir, 'quantized.h5'))
      print('Quantize finished, results in: {}'.format(FLAGS.quantize_output_dir))
      return

  print("********loading information from files")
  img_paths, labels = get_images_infor_from_file(FLAGS.eval_image_path,
          FLAGS.eval_image_list, FLAGS.label_offset)
  imagenet_seq = ImagenetSequence(img_paths[0:1], labels[0:1], FLAGS.eval_batch_size)

  if FLAGS.dump:
      print("********dumping quantization results")
      # do quantize dump
      quantizer = vitis_quantize.VitisQuantizer.dump_model(model, imagenet_seq, FLAGS.dump_output_dir)

      print('Dump finished, results in: {}'.format(FLAGS.dump_output_dir))
      return

  print("********setting learning parameters")
  initial_learning_rate = FLAGS.learning_rate
  lr_schedule = keras.optimizers.schedules.ExponentialDecay(
              initial_learning_rate, decay_steps=FLAGS.decay_steps, decay_rate=0.96,
              staircase=True

          )
  opt = RMSprop(learning_rate=lr_schedule)
  
  loss = keras.losses.SparseCategoricalCrossentropy()
  metric_top_5 = keras.metrics.SparseTopKCategoricalAccuracy()
  accuracy = keras.metrics.SparseCategoricalAccuracy()
  model.compile(optimizer=opt, loss=loss,
          metrics=[accuracy, metric_top_5])
  if not FLAGS.eval_only:
    if not os.path.exists(FLAGS.save_path):
      os.makedirs(FLAGS.save_path)
    callbacks = [
      keras.callbacks.ModelCheckpoint(
          filepath=os.path.join(FLAGS.save_path,FLAGS.filename),
          save_best_only=True,
          monitor="sparse_categorical_accuracy",
          verbose=1,
      )]
    steps_per_epoch = FLAGS.steps_per_epoch if FLAGS.steps_per_epoch else np.ceil(TRAIN_NUM/FLAGS.batch_size)
    model.fit(train_data,
            epochs=FLAGS.epochs,
            callbacks=callbacks,
            steps_per_epoch=steps_per_epoch,
            validation_freq=FLAGS.eval_every_epoch,
            validation_steps = EVAL_NUM/FLAGS.eval_batch_size,
            validation_data=train_data) #eval_data)
  if not FLAGS.eval_images:
    print("evaluate model using tf_records data format")
    model.evaluate(eval_data, steps=EVAL_NUM/FLAGS.eval_batch_size)
  if FLAGS.eval_images and FLAGS.eval_only:
    print("we are running this path")
    img_paths, labels = get_images_infor_from_file(FLAGS.eval_image_path,FLAGS.eval_image_list, FLAGS.label_offset)
    imagenet_seq = ImagenetSequence(img_paths, labels, FLAGS.eval_batch_size)
    res = model.evaluate(imagenet_seq, steps=EVAL_NUM/FLAGS.eval_batch_size, verbose=1)


if __name__ == "__main__":
  os.environ['CUDA_VISIBLE_DEVICES'] = FLAGS.gpus
  main()

# Copyright 2021 Xilinx Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import numpy as np
import cv2
import os
labels= ["event","none1"]

validation_images="/workspace/Videos/AS_to_TFR/Test"
validation_output="/workspace/Videos/val/val"
validation_labels="/workspace/Videos/val/val_labels.txt"


validation_labels_file= open(validation_labels,"w")

index=1

for (dirpath, dirnames, filenames) in os.walk(validation_images):
    for filename in filenames:
        folder=dirpath.split(os.sep)
        folder = folder[-1]
        output_label_name = folder
        output_label_name = output_label_name.replace(" ","")
        label_idx = labels.index(folder)
        image = cv2.imread(dirpath+ '/'+ filename)
        output_filepath = validation_output + '/' + output_label_name + filename
        cv2.imwrite(output_filepath,image)
        validation_labels_file.write(output_label_name + filename + " " + str(label_idx) + "\n")
        print("wrote: ", output_filepath)
validation_labels_file.close()

/*
 * Copyright 2019 Xilinx Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

cv::Mat process_result(cv::Mat& image, vitis::ai::ClassificationResult& result,
                       bool is_jpeg) {
  auto x = 10;
  auto y = 20;
  auto i = 0;
  char cats [5][10] = {"event","none","none","none","none"};
  for (auto& r : result.scores) {
    i++;
    LOG_IF(INFO, is_jpeg) << "r.index " << r.index << " "  //
                          << cats[r.index] << " "
                          << "r.score " << r.score << " "  //
                          << std::endl;
    auto cls = std::string("") + cats[r.index] + " prob. " +
               std::to_string(r.score);
    cv::putText(image, cls, cv::Point(x, y + 20 * i), cv::FONT_HERSHEY_SIMPLEX,
                0.5, cv::Scalar(20, 20, 180), 1, 1);
    if(i>=2) break;
  }
  return image;
}


/* original - BK
cv::Mat process_result(cv::Mat& image, vitis::ai::ClassificationResult& result,
                       bool is_jpeg) {
  auto x = 10;
  auto y = 20;
  auto i = 0;
  for (auto& r : result.scores) {
    i++;
    LOG_IF(INFO, is_jpeg) << "r.index " << r.index << " "  //
                          << result.lookup(r.index) << " "
                          << "r.score " << r.score << " "  //
                          << std::endl;
    auto cls = std::string("") + result.lookup(r.index) + " prob. " +
               std::to_string(r.score);
    cv::putText(image, cls, cv::Point(x, y + 20 * i), cv::FONT_HERSHEY_SIMPLEX,
                0.5, cv::Scalar(20, 20, 180), 1, 1);
  }
  return image;
}
*/

Credits

Brian Kincaid

1 project • 1 follower

Career electrical engineer, semiconductor

Thanks to Gorodenkoff, flashmovie, and gonin.

AI assisted video adjustment for Photosensitive Viewers

Things used in this project

Hardware components

Software apps and online services

Story

The Idea

The Tools

Getting to know the VCK5000

Baseline hardware check (from container with pretrained model)

Build the model - Step 1 Prepare dataset

Build the model - Step 2 Train floating point model

Build the model - Step 3 Quantize the floating point model

Build the model - Step 4 Compile the model

Build the model - Step 5 Copy the model and test

Optional: modify the demo.hpp to slow down the video

Run the system!

Full flow result

Final project notes and future development

Parting advice for development on containers and AI toolchain

Xilinx Tools and Examples helpful for our project

Schematics

Flow_Chart

tf2_resnet50 xmodel, ready to run on VCK5000 ES1

Code

train_eval_h5.py

gen_validation_set.py

process_result.hpp

Credits

Brian Kincaid

Comments

Embed the widget on your own site

AI assisted video adjustment for Photosensitive Viewers

AI assisted video adjustment for Photosensitive Viewers

Things used in this project

Hardware components

Software apps and online services

Story

The Idea

The Tools

Getting to know the VCK5000

Baseline hardware check (from container with pretrained model)

Build the model - Step 1 Prepare dataset

Build the model - Step 2 Train floating point model

Build the model - Step 3 Quantize the floating point model

Build the model - Step 4 Compile the model

Build the model - Step 5 Copy the model and test

Optional: modify the demo.hpp to slow down the video

Run the system!

Full flow result

Final project notes and future development

Parting advice for development on containers and AI toolchain

Xilinx Tools and Examples helpful for our project

Schematics

Flow_Chart

tf2_resnet50 xmodel, ready to run on VCK5000 ES1

Code

train_eval_h5.py

gen_validation_set.py

process_result.hpp

Credits

Brian Kincaid

Comments

Related channels and tags