Published April 18, 2022

Local Super-Resolution Transforms for Low-Resolution Images

By combining the local implicit image function model with the VCK5000, high-fidelity images can be produced at any resolution.

Things used in this project

Hardware components

AMD VCK5000 Versal Development Card

Raspberry Pi Camera Module

Software apps and online services

Snappy Ubuntu Core

AMD Vitis Unified Software Platform

MATLAB

Story

Our visual world is continuous, but machines can only store and present images in discrete form. Images are usually stored as two-dimensional arrays of pixels, and the complexity and precision of an image are governed by its resolution. Distortion, in general, refers to a change in the original shape (or other characteristics) of an object, image, sound, waveform, or other form of information. This project is mainly about extracting high-fidelity images at any resolution we want.

When we want to inspect one area of an image, we tend to zoom in. For a low-resolution image that area becomes blurry: too few pixel values are stored for the computer to display the features of the area in high definition, which distorts the image to some degree. The same problem appears in computer vision when training a neural network. Before training, the images are preprocessed so that every image in the dataset is reduced to the same size, which lowers memory consumption. The processed dataset is then packaged and loaded into the network for training, but this processing causes a dramatic drop in image fidelity.

Therefore, we use the VCK5000 to accelerate inference of the local implicit image function model, which predicts RGB values at the desired image resolution. We can take advantage of this accelerated model when capturing images with a low-resolution camera, or during dataset preprocessing, to obtain images of any resolution and size while preserving image fidelity.

The steps for my project are as follows:

First, let me introduce the local implicit image function (LIIF) proposed by Yinbo Chen et al. It takes as input the pixel coordinates of an image together with the two-dimensional deep feature map computed from that image. For each queried coordinate, it looks up the local deep features near that coordinate, predicts the RGB value at that point, and assembles the predictions into an output image at the new resolution.
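
The query step described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the trained model: it assumes a nearest-cell lookup only, and the names `make_coord`, `nearest_feature`, `toy_decoder`, and `upsample` are hypothetical. The toy decoder stands in for the learned MLP and simply returns the feature as the RGB value.

```python
def make_coord(n):
    """Centers of n equal cells spanning [-1, 1]."""
    return [-1 + (2 * i + 1) / n for i in range(n)]


def nearest_feature(feat, y, x):
    """Return the feature whose cell contains (y, x), plus that cell's center."""
    h, w = len(feat), len(feat[0])
    # Map a coordinate in [-1, 1] to a cell index, clamped to the grid
    i = min(h - 1, max(0, int((y + 1) / 2 * h)))
    j = min(w - 1, max(0, int((x + 1) / 2 * w)))
    return feat[i][j], make_coord(h)[i], make_coord(w)[j]


def toy_decoder(feature, rel_y, rel_x):
    """Stand-in for the MLP (assumption): returns the feature as RGB."""
    return tuple(feature)


def upsample(feat, out_h, out_w):
    """Query an RGB value at every pixel center of the output grid."""
    image = []
    for y in make_coord(out_h):
        row = []
        for x in make_coord(out_w):
            feature, cy, cx = nearest_feature(feat, y, x)
            # The decoder sees the feature and the offset to its cell center
            row.append(toy_decoder(feature, y - cy, x - cx))
        image.append(row)
    return image


# A 2x2 "feature map" whose features are already RGB triples
feat = [[(1, 0, 0), (0, 1, 0)],
        [(0, 0, 1), (1, 1, 1)]]
big = upsample(feat, 4, 4)
```

Because the output grid is just a list of query coordinates, any output height and width can be requested from the same feature map. In the real model the decoder is a trained network, so the result is sharper than this nearest-cell copy.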

Second, I will introduce the devices and functions that make up the project framework. A low-resolution Raspberry Pi CSI camera takes the pictures, which are fed into the local implicit image function model that has been quantized for the VCK5000. The VCK5000 accelerates inference and outputs an image at the desired, arbitrarily high resolution. This addresses the problem that low-cost, low-resolution cameras cannot capture high-resolution images, avoids the cost of a high-resolution sensor module, and converts images to high resolution while maintaining image fidelity.

Finally, I will introduce the steps for obtaining and generating pictures. The camera saves a low-resolution picture after shooting. In the self-made GUI, enter the resolution you want in the size input box, enter the save path in the path box, and click the "Generate" button. A picture of the requested resolution is generated and saved to the file path you specified.
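
The GUI code is not included in this write-up, so the following is only a hypothetical sketch of how the two inputs could be validated before an image is generated. The function name `parse_request`, the `HEIGHTxWIDTH` text format, and the default `.png` extension are all assumptions.

```python
import os


def parse_request(size_text, save_path):
    """Validate the size box and path box; return (height, width, path)."""
    parts = size_text.lower().replace(" ", "").split("x")
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        raise ValueError("size must look like 1080x1920")
    height, width = int(parts[0]), int(parts[1])
    if height == 0 or width == 0:
        raise ValueError("height and width must be positive")
    # Default to PNG when the user gives no file extension
    root, ext = os.path.splitext(save_path)
    if not ext:
        save_path = root + ".png"
    return height, width, save_path


print(parse_request("1080 x 1920", "out/photo"))
```

Rejecting malformed input before the model is called keeps a typing mistake from triggering a long inference run.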

In this way, images can be stored at low resolution to reduce memory consumption, while the local details can still be seen clearly when you zoom in on the image.
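
As a rough illustration of the memory argument, the uncompressed size of an 8-bit RGB image grows with the product of its height and width. The 48x48 size matches the input patch used in this project's code, and the 4x factor is taken from scale_max in the configuration file; the helper name `raw_rgb_bytes` is hypothetical.

```python
def raw_rgb_bytes(height, width, channels=3, bytes_per_value=1):
    """Uncompressed size of an image stored as a dense array."""
    return height * width * channels * bytes_per_value


low = raw_rgb_bytes(48, 48)      # low-resolution patch
high = raw_rgb_bytes(192, 192)   # the same patch at 4x per side
print(low, high, high // low)    # 6912 110592 16
```

Storing the low-resolution version and upsampling on demand therefore saves a factor of 16 in raw storage at 4x magnification.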

Code

Quantitative Model Program

import argparse

import torch
from pytorch_nndct.apis import torch_quantizer

import models
import test_liif

# quantization and evaluation run on the CPU
device = torch.device("cpu")


def quantize(args):

    # data_dir = args.input_dir
    float_model = args.model_dir + '/rdn-liif.pth'  # trained floating-point weights
    quant_model = './quant_model'  # output directory for the exported xmodel
    quant_mode  = args.quant_mode
    batchsize = args.batchsize
    finetune = args.fast_finetune
    subset_len = args.subset_len
    deploy = args.deploy  # whether to export an xmodel

    if quant_mode != 'test' and deploy:
        deploy = False
        print(r'Warning: exporting xmodel needs to be done in quantization test mode, turning it off for this run!')

    if deploy and (batchsize != 1 or subset_len != 1):
        print(r'Warning: Exporting xmodel needs batch size to be 1 and only 1 iteration of inference, change them automatically!')
        batchsize = 1
        subset_len = 1
        # test_liif.test() reads these values from args, so write them back
        args.batchsize = batchsize
        args.subset_len = subset_len

    # load trained model
    model = models.make(torch.load(float_model, map_location='cpu')['model'], load_sd=True)

    # force to merge BN with
    optimize = 1

    # Dummy inputs that fix the traced shapes: a 48x48 RGB patch plus
    # 2304 (= 48 * 48) query coordinates and cell sizes, matching
    # inp_size and sample_q in the config
    inp = torch.randn([batchsize, 3, 48, 48])
    coord = torch.randn([batchsize, 2304, 2])
    cell = torch.randn([batchsize, 2304, 2])

    if quant_mode == 'float':
        # evaluate the unquantized model as a baseline
        quantized_model = model
    else:
        # using python pytorch quantizer api
        quantizer = torch_quantizer(quant_mode, model, (inp, coord, cell), device=device)
        quantized_model = quantizer.quant_model

    # evaluate
    test_liif.test(args, quantized_model, device)

    if quant_mode == 'calib':
        quantizer.export_quant_config()
    if deploy:
        quantizer.export_xmodel(deploy_check=False, output_dir=quant_model)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # vck5000_quantizer_config
    # parser.add_argument('--input_dir', default="/home/cc/Vitis-AI-1.4.1/liif-main/data",
    #                     help="Dataset directory, please input your picture dir.")
    parser.add_argument('--model_dir', default='./liif_model',
                        help="Model directory, please check your model dir")
    parser.add_argument('--subset_len', default=300, type=int, help='subset_len to evaluate model, using the whole validation dataset if it is not set')
    parser.add_argument('-q', '--quant_mode', type=str, default='calib', choices=['float', 'calib', 'test'],
                        help='Quantization mode (float, calib or test). Default is calib')
    parser.add_argument('--fast_finetune', dest='fast_finetune', action='store_true', help='fast finetune model before calibration')
    parser.add_argument('--deploy', dest='deploy', action='store_true', help='export xmodel for deployment')
    parser.add_argument('-b', '--batchsize', type=int, default=16,
                        help='Evaluation batch size, must be an integer, default is 16')

    # test_liif_config
    parser.add_argument('--config', type=str, default='configs/train-div2k/train_edsr-baseline-liif.yaml')
    # parser.add_argument('--resolution', help="Please input your H*D size of picture that you want.")
    # parser.add_argument('--output', default='output.png')

    args, _ = parser.parse_known_args()

    quantize(args)
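
To give some intuition for what quantization does to a model's weights, here is a generic sketch of symmetric 8-bit quantization in plain Python. It is not the Vitis AI implementation: the power-of-two scale and the names `choose_scale`, `quantize_values`, and `dequantize` are assumptions chosen for illustration.

```python
import math


def choose_scale(values, bits=8):
    """Smallest power-of-two step that fits max |value| in a signed range."""
    max_abs = max(abs(v) for v in values)
    q_max = 2 ** (bits - 1) - 1          # 127 for 8 bits
    if max_abs == 0:
        return 1.0
    return 2.0 ** math.ceil(math.log2(max_abs / q_max))


def quantize_values(values, scale, bits=8):
    """Round to the nearest integer step and clamp to the signed range."""
    q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [max(q_min, min(q_max, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map integer steps back to approximate floating-point values."""
    return [q * scale for q in q_values]


weights = [0.5, -0.25, 0.1, 1.0]
scale = choose_scale(weights)
restored = dequantize(quantize_values(weights, scale), scale)
```

Each restored value differs from the original by at most half a step. This rounding error is why the script evaluates the model again after quantization.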

Model testing

import random

import yaml
import torch
from torch.utils.data import DataLoader
from tqdm import tqdm

import datasets
import models
import utils


def batched_predict(model, inp, coord, cell, bsize):
    with torch.no_grad():
        model.gen_feat(inp)
        n = coord.shape[1]
        ql = 0
        preds = []
        while ql<n:
            qr = min(ql + bsize, n)
            pred = model.query_rgb(coord[:, ql: qr, :], cell[:, ql: qr, :])
            preds.append(pred)
            ql = qr
        pred = torch.cat(preds, dim=1)
    return pred


def eval_psnr(loader, model, config, device=None):
    # NOTE: despite its name, this function returns the mean L1 loss, not PSNR
    model.eval()
    model.to(device)
    loss_fn = torch.nn.L1Loss()
    avg_loss = utils.Averager()

    # normalization constants for the input image and the ground-truth RGB values
    data_norm = config['data_norm']
    t = data_norm['inp']
    inp_sub = torch.FloatTensor(t['sub']).view(1, -1, 1, 1).to(device)
    inp_div = torch.FloatTensor(t['div']).view(1, -1, 1, 1).to(device)
    t = data_norm['gt']
    gt_sub = torch.FloatTensor(t['sub']).view(1, 1, -1).to(device)
    gt_div = torch.FloatTensor(t['div']).view(1, 1, -1).to(device)

    # gradients are not needed for evaluation
    with torch.no_grad():
        for batch in tqdm(loader, leave=False, desc='eval'):
            # move every tensor in the batch to the target device
            for k, v in batch.items():
                batch[k] = v.to(device)

            inp = (batch['inp'] - inp_sub) / inp_div
            pred = model(inp, batch['coord'], batch['cell'])

            gt = (batch['gt'] - gt_sub) / gt_div
            loss = loss_fn(pred, gt)

            avg_loss.add(loss.item())
    return avg_loss.item()


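def loader_data(datasets, subset_len=None, batch_size=128, sample_method='random', distributed=False):
    # optionally restrict evaluation to subset_len samples
    if subset_len:
        subset_len = min(subset_len, len(datasets))
        if sample_method == 'random':
            indices = random.sample(range(len(datasets)), subset_len)
        else:
            indices = list(range(subset_len))
        datasets = torch.utils.data.Subset(datasets, indices)

    # always return a loader, even when no subset is requested
    return DataLoader(datasets, batch_size=batch_size, shuffle=True)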


def test(args, model, device):
    with open(args.config, 'r') as f:
        config = yaml.load(f, Loader=yaml.FullLoader)

    # build the evaluation dataset (the train_dataset entry of the config is used here)
    spec = config['train_dataset']
    dataset = datasets.make(spec['dataset'])
    dataset = datasets.make(spec['wrapper'], args={'dataset': dataset})
    # loader = DataLoader(dataset, batch_size=args.batchsize, pin_memory=True)
    loader = loader_data(datasets=dataset, subset_len=args.subset_len, batch_size=args.batchsize, sample_method='random')
    if args.fast_finetune:
        # loader intended for fast finetuning; the evaluation below does not use it
        ft_loader = loader_data(datasets=dataset, subset_len=1024, batch_size=args.batchsize, sample_method=None)

    res = eval_psnr(loader, model,
                    config=config,
                    device=device)
    print('result:{:.4f}'.format(res))
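
Although the evaluation function above is named eval_psnr, the value it returns is an average L1 loss. If a PSNR figure is wanted, it can be derived from the mean squared error. The sketch below assumes pixel values in [0, 1] and uses plain Python lists rather than tensors; the function name `psnr` is hypothetical.

```python
import math


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return float("inf")
    return 10 * math.log10(max_value ** 2 / mse)


target = [0.0, 0.5, 1.0, 0.5]
pred = [0.1, 0.5, 0.9, 0.5]
print(round(psnr(pred, target), 2))   # 23.01
```

Higher PSNR means the prediction is closer to the ground truth, so it moves in the opposite direction to the L1 loss.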

Data sets and parameter description files

train_dataset:
  dataset:
    name: image-folder
    args:
      root_path: ./load/div2k/DIV2K_train_HR
      repeat: 20
      cache: in_memory
  wrapper:
    name: sr-implicit-downsampled
    args:
      inp_size: 48
      scale_max: 4
      augment: true
      sample_q: 2304
  batch_size: 16

val_dataset:
  dataset:
    name: image-folder
    args:
      root_path: ./load/div2k/DIV2K_valid_HR
      first_k: 10
      repeat: 160
      cache: in_memory
  wrapper:
    name: sr-implicit-downsampled
    args:
      inp_size: 48
      scale_max: 4
      sample_q: 2304
  batch_size: 16

data_norm:
  inp: {sub: [0.5], div: [0.5]}
  gt: {sub: [0.5], div: [0.5]}

model:
  name: liif
  args:
    encoder_spec:
      name: rdn
      args:
        no_upsampling: true
    imnet_spec:
      name: mlp
      args:
        out_dim: 3
        hidden_list: [256, 256, 256, 256]

optimizer:
  name: adam
  args:
    lr: 1.e-4
epoch_max: 1000
multi_step_lr:
  milestones: [200, 400, 600, 800]
  gamma: 0.5

epoch_val: 1
epoch_save: 100
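
The data_norm entries in the configuration above describe a simple affine normalization: sub is subtracted from each value, and the result is divided by div. With sub = 0.5 and div = 0.5, pixel values in [0, 1] are mapped to [-1, 1], and a normalized value can be mapped back with the inverse. A minimal sketch, with hypothetical helper names:

```python
def normalize(x, sub=0.5, div=0.5):
    """Map a pixel value from [0, 1] to [-1, 1] with the default constants."""
    return (x - sub) / div


def denormalize(y, sub=0.5, div=0.5):
    """Inverse of normalize."""
    return y * div + sub


for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```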

Model loader

import copy


models = {}


def register(name):
    def decorator(cls):
        models[name] = cls
        return cls
    return decorator


def make(model_spec, args=None, load_sd=False):
    if args is not None:
        model_args = copy.deepcopy(model_spec['args'])
        model_args.update(args)
    else:
        model_args = model_spec['args']
    model = models[model_spec['name']](**model_args)
    if load_sd:
        model.load_state_dict(model_spec['sd'])
    return model
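
The loader above is a small registry: a decorator stores each class under a name, and make looks the class up and instantiates it from a spec dictionary. The following self-contained sketch shows the same pattern with a toy class. The names `registry`, `Scale`, and the `scale` spec are illustrative only and do not come from the project.

```python
import copy

registry = {}


def register(name):
    """Class decorator that records the class under the given name."""
    def decorator(cls):
        registry[name] = cls
        return cls
    return decorator


def make(spec, args=None):
    """Instantiate the class named in spec, with optional argument overrides."""
    # Copy before updating so the caller's spec is left untouched
    kwargs = copy.deepcopy(spec['args'])
    if args is not None:
        kwargs.update(args)
    return registry[spec['name']](**kwargs)


@register('scale')
class Scale:
    def __init__(self, factor=1):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor


spec = {'name': 'scale', 'args': {'factor': 2}}
double = make(spec)
triple = make(spec, args={'factor': 3})
print(double(10), triple(10))   # 20 30
```

This is why the YAML configuration can select an encoder or dataset by name: each name value is a key into such a registry.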

Dataset load file

import copy


datasets = {}


def register(name):
    def decorator(cls):
        datasets[name] = cls
        return cls
    return decorator


def make(dataset_spec, args=None):
    if args is not None:
        dataset_args = copy.deepcopy(dataset_spec['args'])
        dataset_args.update(args)
    else:
        dataset_args = dataset_spec['args']
    dataset = datasets[dataset_spec['name']](**dataset_args)
    return dataset

Credits

Cao_Chao

Thanks to Yinbo Chen, Sifei Liu, Xiaolong Wang.
