As people grow older, mobility and safety become constant worries. Declining agility and unsteadiness can make it dangerous for older citizens to leave the safety of their homes for their day-to-day activities.
This project introduces a futuristic concept, the digital twin, to help our very own older generation stay safe from the dangers they could face in the outside world.
A digital twin is a digital replica of a living or non-living physical entity. By bridging the physical and the virtual world, data is transmitted seamlessly, allowing the virtual entity to exist simultaneously with the physical entity. But what if we could go a step further and create a digital twin for living beings in the physical space, using advanced AI and robotics?
In other words, what we created was a humanoid robot that replicates the exact motions of a human who is inside the safety of their home. Those who have watched “Real Steel” might have a slight idea of what I’m talking about.
This project consists of three main sections:
1. Capturing the motions made by humans
2. Identifying and predicting the motion on the cloud and transmitting that data to the robot.
3. Controlling the robot to Replicate the motion
Now this seems very futuristic and complicated, and sounds like it would require a large number of sensors. But thanks to the advancements in artificial intelligence, it does not have to be like that anymore. In this project, motion capture is done with AI and computer vision, using a typical web camera connected to the computer. The captured video is evaluated by an open-source AI model, and all the joints in the body are identified.
Then their X-Y coordinates are transmitted to a Node.js based cloud server, which analyzes the received numerical data and predicts the motion of the person. The commands related to that particular motion are then transmitted to the robot's ESP8266 microcontroller over the internet. Based on the commands received, the ESP8266 communicates with the STM32 controller and moves the motors of the robot accordingly.
The complete high-level flow of the project is shown below.
The first stage of the project consists of capturing the video feed. To do that, a typical web camera was connected to a laptop and a script was written in JavaScript to open the camera and capture the footage.
const videoWidth = 600;
const videoHeight = 500;
const stats = new Stats();
async function setupCamera() {
if (!navigator.mediaDevices || !navigator.mediaDevices.getUserMedia) {
throw new Error(
'Browser API navigator.mediaDevices.getUserMedia not available');
}
const video = document.getElementById('video');
video.width = videoWidth;
video.height = videoHeight;
const mobile = isMobile();  // isMobile() is a helper from the PoseNet demo that detects mobile browsers
const stream = await navigator.mediaDevices.getUserMedia({
'audio': false,
'video': {
facingMode: 'user',
width: mobile ? undefined : videoWidth,
height: mobile ? undefined : videoHeight,
},
});
video.srcObject = stream;
return new Promise((resolve) => {
video.onloadedmetadata = () => {
resolve(video);
};
});
}
async function loadVideo() {
const video = await setupCamera();
video.play();
return video;
}

Analyzing the Video

In here, an open-source AI model called PoseNet was used. PoseNet is a machine learning model that allows real-time human pose estimation in the browser.
So what is pose estimation anyway? Pose estimation refers to computer vision techniques that detect human figures in images and video, so that one could determine, for example, where someone’s elbow shows up in an image. To be clear, this technology is not recognizing who is in an image — there is no personally identifiable information associated with pose detection. The algorithm is simply estimating where key body joints are.
At a high level pose estimation happens in two phases:
1. An input RGB image is fed through a convolutional neural network.
2. Either a single-pose or multi-pose decoding algorithm is used to decode poses, pose confidence scores, keypoint positions, and keypoint confidence scores from the model outputs.
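For orientation, a single pose returned by the model has roughly the following shape (a sketch based on the PoseNet documentation; the numeric values are illustrative only). It is these keypoint coordinates and confidence scores that the rest of the pipeline works with.

// Approximate shape of one detected pose (illustrative values, not real output)
const examplePose = {
  score: 0.92,                      // overall confidence for the detected person
  keypoints: [
    {part: 'nose',          score: 0.99, position: {x: 301.4, y: 122.8}},
    {part: 'leftShoulder',  score: 0.97, position: {x: 254.1, y: 201.3}},
    {part: 'leftElbow',     score: 0.93, position: {x: 231.0, y: 270.5}},
    {part: 'leftWrist',     score: 0.90, position: {x: 228.7, y: 335.2}},
    // ...17 keypoints in total: eyes, ears, shoulders, elbows, wrists, hips, knees, ankles
  ],
};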
Hence, the TensorFlow PoseNet libraries were imported into the code base.
import * as posenet from '@tensorflow-models/posenet';
import * as tf from '@tensorflow/tfjs';

Then the library functions were called to detect the poses in real time through the video input.
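Before detection can start, the PoseNet model itself has to be loaded. A minimal sketch of that step is shown here; the architecture, output stride, and multiplier values are assumptions, not necessarily the project's exact configuration.

// Load the PoseNet model once at startup (settings shown are common defaults)
async function loadPoseNet() {
  return await posenet.load({
    architecture: 'MobileNetV1',
    outputStride: 16,
    multiplier: 0.75,
  });
}

The object returned by posenet.load() is what the detection loop below accesses as guiState.net.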
function detectPoseInRealTime(video, net) {
const canvas = document.getElementById('output');
const ctx = canvas.getContext('2d');
const flipPoseHorizontal = true;
canvas.width = videoWidth;
canvas.height = videoHeight;
async function poseDetectionFrame() {
stats.begin();
let poses = [];
let minPoseConfidence;
let minPartConfidence;
switch (guiState.algorithm) {
case 'single-pose':
const pose = await guiState.net.estimatePoses(video, {
flipHorizontal: flipPoseHorizontal,
decodingMethod: 'single-person'
});
poses = poses.concat(pose);
minPoseConfidence = +guiState.singlePoseDetection.minPoseConfidence;
minPartConfidence = +guiState.singlePoseDetection.minPartConfidence;
break;
}
stats.end();
requestAnimationFrame(poseDetectionFrame);
}
poseDetectionFrame();
}

Transmitting the Key Points

Once the key points were identified through the model, they were transmitted to a Node-RED based web server running on AWS to analyze and predict the movements made by the user. To transmit the data, the MQTT protocol was used.
MQTT is a machine-to-machine (M2M)/"Internet of Things" connectivity protocol which was designed as an extremely lightweight publish/subscribe messaging transport.
import {connectMQTT} from './MQTTData';
function initialize() {
connectMQTT();
}

When connecting to MQTT, you need to specify the MQTT parameters as shown below.
// The Paho MQTT JavaScript client (paho-mqtt) is assumed to be loaded as the global Paho
let websocket = 'your_aws_host_ip';
let port = 8080;
let user = 'your_user_name';
let pass = 'your_password';
let clientname = 'your_client_id';
let client;

Then an exported function was used to publish the data to the server as follows.
export function connectMQTT() {
console.log('Connecting to MQTT');
client = new Paho.MQTT.Client(websocket, port, clientname);
client.connect({userName: user, password: pass});
}

export function sendMessage(message, topic) {
var mqttMessage = new Paho.MQTT.Message(message);
mqttMessage.destinationName = 'hr1/' + topic;
client.send(mqttMessage);
}

Predicting Motions

Data was captured on Node-RED using MQTT input nodes, one per topic. The received messages were then formatted into JSON and evaluated. When evaluating, the arms and legs were considered independently and their movements were predicted.
Based on the predictions, the corresponding commands were sent over UDP through the internet to the ESP8266 microcontroller on the robot.
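A minimal sketch of how such a prediction could be made inside a Node-RED function node is shown below. The keypoint field names and the RAISE_LEFT_ARM / LOWER_LEFT_ARM command strings are illustrative assumptions rather than the project's exact implementation; the node's output would be wired to a UDP output node pointed at the ESP8266.

// Node-RED function node (sketch): classify the left arm from incoming keypoints.
// msg.payload is assumed to carry the keypoints published from the browser, e.g.
// { leftShoulder: {x: 254, y: 201}, leftElbow: {x: 231, y: 270}, leftWrist: {x: 228, y: 335} }
const kp = (typeof msg.payload === 'string') ? JSON.parse(msg.payload) : msg.payload;

// Image coordinates grow downwards, so a wrist above the shoulder means the arm is raised.
let command;
if (kp.leftWrist.y < kp.leftShoulder.y) {
    command = 'RAISE_LEFT_ARM';
} else {
    command = 'LOWER_LEFT_ARM';
}

// The UDP output node that follows sends msg.payload to the robot.
msg.payload = command;
return msg;

The right arm and the legs would be handled the same way on their own MQTT topics, matching the independent evaluation of limbs described above.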
Robot Control

The robot consisted of two main microcontrollers, an ESP8266 and an STM32, connected over a serial interface for command and control. The ESP8266 microcontroller was connected to the internet over WiFi and was responsible for receiving the data related to the motion predictions.
void UDPConnect() {
Serial.println("Connecting to %s", ssid);
WiFi.config(ip, gateway, subnet);
WiFi.begin(ssid, password);
while (WiFi.status() != WL_CONNECTED) {
delay(500);
}
Serial.println("connected");
Udp.begin(localUdpPort);
}
void UDP() {
int packetSize = Udp.parsePacket();
if (packetSize) {
int len = Udp.read(incomingPacket, 255);
if (len > 0) {
incomingPacket[len] = 0;
}
input = incomingPacket;
Serial.println(input);
Udp.beginPacket(Udp.remoteIP(), Udp.remotePort());
Udp.write(replyPacket);
Udp.endPacket();
}
}

Based on the commands captured through the UDP connection, the relevant servo control commands were transmitted to the STM32 controller through the serial interface.
// Servo IDs and target positions for one pose (values specific to this robot's 16 bus servos)
servos[0].ID = 6; servos[0].Position = 1500;
servos[1].ID = 7; servos[1].Position = 2500;
servos[2].ID = 15; servos[2].Position = 500;
servos[3].ID = 14; servos[3].Position = 1500;
servos[4].ID = 1; servos[4].Position = 1500;
servos[5].ID = 5; servos[5].Position = 1500;
servos[6].ID = 13; servos[6].Position = 1500;
servos[7].ID = 9; servos[7].Position = 1500;
servos[8].ID = 8; servos[8].Position = 1900;
servos[9].ID = 16; servos[9].Position = 1100;
servos[10].ID = 4; servos[10].Position = 1750;
servos[11].ID = 12; servos[11].Position = 1250;
servos[12].ID = 3; servos[12].Position = 1500;
servos[13].ID = 11; servos[13].Position = 1500;
servos[14].ID = 2; servos[14].Position = 1200;
servos[15].ID = 10; servos[15].Position = 1800;
myse.moveServos(servos, 16, 300);  // move all 16 servos to the listed positions

CONNECTIVITY

This use case was tested on a pilot 5G connection, which achieved real-time motion replication even when the human and the robot were in two different locations.
DEMONSTRATIONS

Initial Testing - Unoptimized
Final Demonstrations - Optimized
The project was demonstrated as an advanced use case of both machine learning and 5G connectivity at various exhibitions and forums in Sri Lanka, and generated a great deal of interest in the community about these technological advancements and what to expect in the near future.
CONCLUSION

The technology described in this project will cater to a wide spectrum of verticals in the near future. Among them, one of the major use cases will be the creation of a digital twin for human beings in the physical space, in the form of a robot that replicates their exact movements. Since all connectivity happens over the internet, the robot can be controlled from anywhere in the world, almost instantaneously. This will act as a core innovation that benefits our very own older generation, allowing them to perform various activities from within the safety of their homes without being exposed to the dangers of the outside world.
On a completely different vertical, technologies like this would enable advanced use cases such as remote surgery. It is not practical for a highly skilled surgeon in one part of the world to travel everywhere to perform operations; using this technology, doctors would be able to operate on patients on the other side of the globe through a robot that replicates their motions. It would also minimize the risks humans have to take in activities such as bomb disposal, where a person could control the robot from a safe remote location while it replicates the movements needed to dispose of the bomb safely. On the verge of Industrial Revolution 4.0, this technology will revolutionize the world by bringing things we could only have imagined into reality in the very near future.