The implemented image segmentation video color processing module uses the K-Means clustering algorithm on FPGA devices. It is designed with a standard Xilinx AXI4-Stream interface so that it can be inserted as a module within any image processing pipeline.
The design takes a 10-bit-per-channel RGB video stream and applies various filters and color space conversions. At a high level, the design comprises three pipelines: Capture, Process, and Output. In the Process pipeline, each RGB pixel is either filtered or converted to another color space by the video color processing (VCP) module.
The VCP module contains a collection of control registers and maintains small local line buffers that store video frame lines. It takes an input video pixel stream, performs computation or adds new content to the stream through filters, and then outputs the processed pixel stream. A VCP operation can be either a pixel transformation or a pixel generation.
The VCP provides the following filters and color space conversions, which can be enabled at build time:
Filters
- SHARP
- BLUR
- EMBOSS
- SOBEL EDGE DETECTION
- CONTRAST
Color Space Conversions
- RGB TO HSL
- HSL TO RGB
- RGB TO YCBCR
- YCBCR TO RGB
- RGB TO CMYK
- RGB TO YDBDR
- RGB TO CIEXYZ
- RGB TO CIEYUV
- RGB TO YIQ
- RGB TO YPBPR
- RGB TO LMS
- RGB TO ICTCP
- RGB TO HED
- RGB TO YC1C2
Color Correction Matrix
K-Mean Color Clustering
Test-Pattern
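As an illustration of one of the conversions above, the RGB-to-YCbCr step can be sketched in fixed-point integer arithmetic, as an FPGA datapath would compute it. The BT.601 full-range coefficients below are an assumption; the document does not state which standard the core implements.

```c
#include <stdint.h>

/* Clamp an intermediate result into the 8-bit output range. */
static uint8_t clamp8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Fixed-point RGB -> YCbCr conversion. Coefficients are BT.601
 * full-range values scaled by 256 (an ASSUMED standard), so the
 * conversion needs only integer multiply/add/shift -- the kind of
 * operations that map directly onto FPGA DSP slices. */
static void rgb_to_ycbcr(uint8_t r, uint8_t g, uint8_t b,
                         uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    *y  = clamp8(( 77 * r + 150 * g +  29 * b) >> 8);          /* 0.299R + 0.587G + 0.114B  */
    *cb = clamp8(((-43 * r -  85 * g + 128 * b) >> 8) + 128);  /* -0.169R - 0.331G + 0.500B */
    *cr = clamp8(((128 * r - 107 * g -  21 * b) >> 8) + 128);  /*  0.500R - 0.419G - 0.081B */
}
```

On two's-complement targets the right shift of the possibly negative Cb/Cr terms behaves arithmetically, which is the usual assumption in this kind of fixed-point code.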
The Sony IMX477 12.5 MP sensor is a 1/2.3-inch CMOS digital image sensor with an active imaging pixel array of 4056 (H) x 3040 (V), supporting up to 60 fps at full resolution. The sensor is configured over an I2C interface, and its MIPI output interface is connected to the on-chip MIPI CSI-2 RX subsystem.
The AR1335 13.0 MP sensor is a 1/3.2-inch CMOS digital image sensor with an active imaging pixel array of 4208 (H) x 3120 (V), supporting up to 30 fps at full resolution. The sensor is configured over an I2C interface, and its MIPI output interface is connected to the on-chip MIPI CSI-2 RX subsystem.
The MIPI CSI-2 receiver subsystem includes a MIPI D-PHY core, configured with 2 data lanes for the IMX477 camera and 4 data lanes for the AR1335 camera. The subsystem captures video frames from the AR1335 and IMX477 cameras in RAW10 format.
The Demosaic module converts the Bayer-pattern input frame to an RGB color frame.
The custom VCP module applies various filters, color space conversions, and K-Means color clustering. It also applies image enhancement controls that improve image quality, including contrast, brightness, saturation, white/black balance, and RGB gain, through AXI4-Lite configuration registers.
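A minimal sketch of how software might drive these enhancement controls through the AXI4-Lite register interface. The register offsets below are hypothetical placeholders (the actual VCP register map is not given here), and the register file is modeled as a plain array; on target hardware the base pointer would come from the memory-mapped AXI4-Lite address range instead.

```c
#include <stdint.h>

/* HYPOTHETICAL VCP register offsets -- the real register map is not
 * specified in this document. Offsets are in bytes, divided by 4 to
 * index 32-bit registers. */
enum {
    VCP_REG_CONTRAST   = 0x00 / 4,
    VCP_REG_BRIGHTNESS = 0x04 / 4,
    VCP_REG_SATURATION = 0x08 / 4,
    VCP_REG_RGB_GAIN   = 0x0C / 4,
    VCP_NUM_REGS       = 4
};

/* Stand-in for the memory-mapped AXI4-Lite register window. */
static volatile uint32_t vcp_regs[VCP_NUM_REGS];

static void vcp_write(unsigned reg, uint32_t value)
{
    vcp_regs[reg] = value;   /* single 32-bit AXI4-Lite write */
}

static uint32_t vcp_read(unsigned reg)
{
    return vcp_regs[reg];    /* single 32-bit AXI4-Lite read */
}
```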
The VDMA (Video Direct Memory Access) transfers the video stream to and from external memory and operates under software control.
The image frame resolution is set to 2048x1080 at 60 frames per second; the maximum full resolution of 4056x3040 is also supported, but is limited to 15 frames per second.
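The pixel rates implied by these two modes can be checked with a quick calculation (a sanity-check sketch, not part of the design):

```c
#include <stdint.h>

/* Active-pixel throughput for a given mode: width x height x fps.
 * Blanking intervals are ignored, so real pixel clocks are higher. */
static uint64_t pixel_rate(uint32_t width, uint32_t height, uint32_t fps)
{
    return (uint64_t)width * height * fps;
}
```

2048x1080 at 60 fps gives about 132.7 Mpixel/s, while 4056x3040 even at 15 fps gives about 185.0 Mpixel/s; at 30 bits per pixel those are roughly 3.98 and 5.55 Gbit/s of active video data, which illustrates why the full-resolution mode must run at a reduced frame rate.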
K-MEANS COLOR CLUSTER
This module takes a stream of camera data in pixel-pipeline format. The stream must be presented on the data inputs (iRgb.red, iRgb.green, iRgb.blue) and control signals (iRgb.valid, iRgb.eol, iRgb.eof, iRgb.sof). The module streams the K-Means clustered RGB result on the oRgb output channel.
The module is synthesized and implemented using Vivado 2022.1 for the KRIA KV260 board and verified using the ModelSim 2020 simulator.
The functional block diagram of the implemented RGB-to-n-cluster data path is shown in the figure below.
In this module, a K-Means-based color quantization algorithm is applied to the RGB input pixel stream:
- Select k reference RGB colors to define the number of clusters.
- Compute the distance from each input pixel to each of the k reference colors.
- Assign each pixel to the closest reference color using the Euclidean distance.
- Pixels nearest to a given reference color are allocated to that cluster.
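The assignment step above can be sketched as follows. Using the squared Euclidean distance preserves the ordering of distances, so the comparison picks the same winner without a square root; the 4-entry palette used in testing is purely illustrative.

```c
#include <stdint.h>

/* One RGB palette/reference color or pixel. */
typedef struct { uint8_t r, g, b; } rgb_t;

/* Return the index of the palette entry nearest to px, using the
 * squared Euclidean distance in RGB space. Because sqrt() is
 * monotonic, comparing squared distances selects the same cluster
 * while avoiding a square-root unit in hardware. */
static unsigned nearest_cluster(rgb_t px, const rgb_t *palette, unsigned k)
{
    unsigned best = 0;
    uint32_t best_d = UINT32_MAX;
    for (unsigned i = 0; i < k; i++) {
        int dr = (int)px.r - palette[i].r;
        int dg = (int)px.g - palette[i].g;
        int db = (int)px.b - palette[i].b;
        uint32_t d = (uint32_t)(dr * dr + dg * dg + db * db);
        if (d < best_d) {
            best_d = d;
            best = i;
        }
    }
    return best;
}
```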
The Euclidean distance is calculated between the original image pixel (Red1, Green1, Blue1) and each reference palette color (Red2, Green2, Blue2): d = sqrt((Red1-Red2)^2 + (Green1-Green2)^2 + (Blue1-Blue2)^2).
An RGB video frame holds values from 0 to 255 per channel, giving 256 x 256 x 256 possible colors; the goal of K-Means color clustering is to reduce this to a chosen number of clusters. For each source pixel, the reference color at the minimum distance is selected as the closest match. Starting from a 24-bit RGB image (about 16 million colors), K-Means clustering with parameter n converts the image to an n-color version.
The codebook created for K-Means is called the color palette or reference color scheme.
The K-Means cluster module converts 24-bit pixels, spanning 16 million RGB colors, into an n-color version. The module has clock and reset ports; the iRgb and oRgb ports each consist of red, green, and blue channels plus a valid signal.
K = 6 REFERENCE COLOR SCHEMES
The image below shows the result of K-Means clustering using a 6-color reference palette.
K = 51 REFERENCE COLOR SCHEMES
In this configuration, the k parameter is set to k = 51, where k is the number of clusters. K-Means color quantization quantizes the input image into 51 clusters drawn from 51 reference colors.
IMX477 CAMERA: K = 90 REFERENCE COLOR SCHEMES
In this configuration, the k parameter is set to k = 90, where k is the number of clusters. K-Means color quantization quantizes the input image into 90 clusters drawn from 90 reference colors.
90 PROGRAMMABLE REFERENCE COLOR SCHEMES
VIDEO CONTROLLER INTERFACE
The IMX477-MIPI-CS is a high-resolution digital camera. It incorporates a Sony 1/2.3-inch CMOS digital image sensor with an active imaging pixel array of 4056 (H) x 3040 (V).
SPECIFICATIONS
DRIVE MODE
FEATURES
- Back-illuminated and stacked CMOS image sensor Exmor RS.
- Digital Overlap High Dynamic Range (DOL-HDR) mode with raw data output.
- High signal to noise ratio (SNR).
- Full resolution @60 frame/s (Normal), 4K2K @60 frame/s (Normal), 1080p @240 frame/s.
- Full resolution @40 frame/s (12 bit Normal), Full resolution @30 frame/s (DOL-HDR, 2 frame).
- Output video format of RAW12/10/8, COMP8.
- Power Save Mode.
- Pixel binning readout and V sub-sampling function.
- Independent flipping and mirroring.
- Input clock frequency 6 to 27 MHz
- CSI-2 serial data output (MIPI 2lane/4lane)
- Two PLLs for independent clock generation for pixel control and data output interface.
- Ambient Light Sensor (ALS)
- Dual sensor synchronization operation (Multi camera compatible)
- 7 Kbit of OTP ROM for users
- Built-in temperature sensor
- 10-bit/12-bit A/D conversion on chip
INTERFACE
The IMX477 camera module and the KRIA KV260 development board are connected through an FPC flexible flat cable. Camera pixel data is transferred over a dual-lane MIPI CSI-2 interface, which connects to the development board via a 15-pin flat flexible cable.
The interface is defined as follows:
The Camera Serial Interface (CSI) is a specification from the MIPI Alliance, which has formulated a set of interface standards for mobile devices such as cameras and displays. MIPI stands for "Mobile Industry Processor Interface"; MIPI DSI covers video display, and MIPI CSI covers video input. The CSI interface is divided into a physical layer (D-PHY) and a protocol layer (CSI-2).
CONFIGURATION
For this design, the IMX477 camera is configured to output 1920x1080p video frames in 30-bit RGB pixel format at 60 frames per second. Both the camera device and the MIPI CSI-2 subsystem module are configured in advance.
The MIPI CSI-2 subsystem module captures images from the IMX477 camera sensor and outputs them as an AXI4-Stream video stream. The Demosaic module then converts the raw AXI4-Stream video data to an RGB-format video stream.
ETHERNET UDP VIDEO STREAMING
To run real-time UDP video streaming over the network, this reference design uses the KRIA KV260 development board from Xilinx.
The design is composed of two platforms: the transmitter and the receiver. The receiver platform, which receives the input video stream from the IMX477 camera, is implemented in the PL logic of the device. The transmitter platform, on the PS side of the device, implements a UDP/IP protocol stack that enables high-speed communication over a LAN or a point-to-point connection.
UDP/IP video streaming is implemented using lwIP (lightweight IP), an open-source networking stack for embedded systems. lwIP supports both TCP and UDP at the transport layer, as well as application-level protocols.
The KRIA KV260 board connects directly to the host PC without a DHCP server, so the host PC must be configured with an IP address of 192.168.0.42 and a subnet mask of 255.255.255.0. The default IP address of the FPGA development board is 192.168.0.10.
At startup, lwIP attempts to fetch an IP address from a DHCP server. If no DHCP server is present, the DHCP request times out and the board falls back to its preset IP address of 192.168.0.10.
The FFmpeg application on the host PC connects to the video transmitter application on the board, and video data transmission commences. When video images are received, FFmpeg invokes ffplay.exe, which decodes the first 54 bytes to determine the source video type. The figure below shows the 54-byte video header format.
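The 54-byte header size matches a standard BMP header (14-byte file header plus 40-byte BITMAPINFOHEADER), so a receiver could recover the frame geometry as sketched below. Treating the header as BMP-style is an assumption; the document does not spell out the format.

```c
#include <stdint.h>
#include <string.h>

/* Parsed frame geometry from a BMP-style 54-byte header. */
typedef struct { int32_t width, height; uint16_t bpp; } frame_info_t;

/* Parse a 54-byte BMP-style header (ASSUMED layout: 14-byte file
 * header + 40-byte BITMAPINFOHEADER). Fields are little-endian;
 * memcpy avoids unaligned access, and the code assumes a
 * little-endian host, as on a typical PC. Returns 0 on success. */
static int parse_bmp_header(const uint8_t hdr[54], frame_info_t *out)
{
    if (hdr[0] != 'B' || hdr[1] != 'M')
        return -1;                      /* missing "BM" signature */
    memcpy(&out->width,  hdr + 18, 4);  /* biWidth at offset 18 */
    memcpy(&out->height, hdr + 22, 4);  /* biHeight at offset 22 */
    memcpy(&out->bpp,    hdr + 28, 2);  /* biBitCount at offset 28 */
    return 0;
}
```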
A BSP is required to run the lwIP application, and it includes many settings that impact Ethernet performance. The figure below shows the lwIP BSP settings.
FPGA Firmware
- MIPI Camera interfacing
- Modular and scalable
- IP-Cores for Vivado Design Suite
- AXI4 / AXI-Stream compliancy
- Support for Xilinx 7 Series, UltraScale, UltraScale+, SoC, and MPSoC
Software
- Controlling and application
- Modular and scalable
- Written in C
The PL design includes the following IPs:
The PS design includes the following modules:
Configuration and initialization of the VTC, VDMA, camera sensors, and lwIP.
On the PC, run ffplay to watch the real-time UDP stream:
- ffplay.exe -vf vflip -framerate 60 -i "udp://127.0.0.0:8080" -loglevel quiet
Comments