As people grow older, mobility and safety become constant worries. Declining agility and unsteadiness can make it dangerous for older citizens to leave the safety of their homes for their day-to-day activities.
This project introduces a futuristic concept, the digital twin, to help our very own older generation stay safe from the dangers they could face in the outside world.
A digital twin is a digital replica of a living or non-living physical entity. By bridging the physical and the virtual world, data is transmitted seamlessly, allowing the virtual entity to exist simultaneously with the physical entity. But what if we could go a step further and create a digital twin for living beings in the physical space, using advanced AI and robotics?
In other words, what we created was a humanoid robot that replicates the exact motions of a human who is inside the safety of their home. Those who have watched “Real Steel” might have a slight idea of what I’m talking about.
This project consists of three main sections:
1. Capturing the motions made by humans
2. Identifying and predicting the motion on the cloud and transmitting that data to the robot.
3. Controlling the robot to Replicate the motion
Now this seems very futuristic and complicated, and sounds like it would require a large number of sensors. But thanks to the advancements in artificial intelligence, it does not have to be like that anymore. In this project, motion capture is done with AI and computer vision, using a typical web camera connected to the computer. The captured video is evaluated by an open-source AI model, and all the joints in the body are identified.
Then their X-Y coordinates are transmitted to a Node.js based cloud server, which analyzes the received numerical data and predicts the motion of the person. The commands related to that particular motion are then transmitted to the robot's ESP8266 microcontroller over the internet. Based on the commands received, the ESP8266 communicates with the STM32 controller and moves the motors of the robot accordingly.
The complete high-level flow of the project is shown below.
The first stage of the project consists of capturing the video feed. To do that, a typical web camera was connected to a laptop and a script was written in JavaScript to open the camera and capture the footage.
const videoWidth = 600;
const videoHeight = 500;
const stats = new Stats();
async function setupCamera() {
if (!navigator.mediaDevices || !navigator.mediaDevices.getUserMedia) {
throw new Error(
'Browser API navigator.mediaDevices.getUserMedia not available');
}
const video = document.getElementById('video');
video.width = videoWidth;
video.height = videoHeight;
const mobile = isMobile();  // isMobile() is a helper from the PoseNet demo that detects mobile browsers
const stream = await navigator.mediaDevices.getUserMedia({
'audio': false,
'video': {
facingMode: 'user',
width: mobile ? undefined : videoWidth,
height: mobile ? undefined : videoHeight,
},
});
video.srcObject = stream;
return new Promise((resolve) => {
video.onloadedmetadata = () => {
resolve(video);
};
});
}
async function loadVideo() {
const video = await setupCamera();
video.play();
return video;
}

Analyzing the Video

In here, an open-source AI model called PoseNet was used. PoseNet is a machine learning model that allows real-time human pose estimation in the browser.
So what is pose estimation anyway? Pose estimation refers to computer vision techniques that detect human figures in images and video, so that one could determine, for example, where someone’s elbow shows up in an image. To be clear, this technology is not recognizing who is in an image — there is no personally identifiable information associated with pose detection. The algorithm is simply estimating where key body joints are.
At a high level pose estimation happens in two phases:
1. An input RGB image is fed through a convolutional neural network.
2. Either a single-pose or multi-pose decoding algorithm is used to decode poses, pose confidence scores, keypoint positions, and keypoint confidence scores from the model outputs.
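For orientation, a single pose returned by the model has roughly the following shape (a sketch based on the PoseNet documentation; the numeric values are illustrative only). It is these keypoint coordinates and confidence scores that the rest of the pipeline works with.

// Approximate shape of one detected pose (illustrative values, not real output)
const examplePose = {
  score: 0.92,                      // overall confidence for the detected person
  keypoints: [
    {part: 'nose',          score: 0.99, position: {x: 301.4, y: 122.8}},
    {part: 'leftShoulder',  score: 0.97, position: {x: 254.1, y: 201.3}},
    {part: 'leftElbow',     score: 0.93, position: {x: 231.0, y: 270.5}},
    {part: 'leftWrist',     score: 0.90, position: {x: 228.7, y: 335.2}},
    // ...17 keypoints in total: eyes, ears, shoulders, elbows, wrists, hips, knees, ankles
  ],
};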
Hence, the TensorFlow PoseNet libraries were imported into the code base.
import * as posenet from '@tensorflow-models/posenet';
import * as tf from '@tensorflow/tfjs';

Then the library functions were called to detect the poses in real time through the video input.
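Before detection can start, the PoseNet model itself has to be loaded. A minimal sketch of that step is shown here; the architecture, output stride, and multiplier values are assumptions, not necessarily the project's exact configuration.

// Load the PoseNet model once at startup (settings shown are common defaults)
async function loadPoseNet() {
  return await posenet.load({
    architecture: 'MobileNetV1',
    outputStride: 16,
    multiplier: 0.75,
  });
}

The object returned by posenet.load() is what the detection loop below accesses as guiState.net.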
function detectPoseInRealTime(video, net) {
const canvas = document.getElementById('output');
const ctx = canvas.getContext('2d');
const flipPoseHorizontal = true;
canvas.width = videoWidth;
canvas.height = videoHeight;
async function poseDetectionFrame() {
stats.begin();
let poses = [];
let minPoseConfidence;
let minPartConfidence;
switch (guiState.algorithm) {
case 'single-pose':
const pose = await guiState.net.estimatePoses(video, {
flipHorizontal: flipPoseHorizontal,
decodingMethod: 'single-person'
});
poses = poses.concat(pose);
minPoseConfidence = +guiState.singlePoseDetection.minPoseConfidence;
minPartConfidence = +guiState.singlePoseDetection.minPartConfidence;
break;
}
stats.end();
requestAnimationFrame(poseDetectionFrame);
}
poseDetectionFrame();
}

Transmitting the Key Points

Once the key points were identified through the model, they were transmitted to a Node-RED based web server running on AWS to analyze and predict the movements made by the user. To transmit the data, the MQTT protocol was used.
MQTT is a machine-to-machine (M2M)/"Internet of Things" connectivity protocol which was designed as an extremely lightweight publish/subscribe messaging transport.
import {connectMQTT} from './MQTTData';
function initialize() {
connectMQTT();
}

When connecting to MQTT, you need to specify the MQTT parameters as shown below.
// The Paho MQTT JavaScript client (paho-mqtt) is assumed to be loaded as the global Paho
let websocket = 'your_aws_host_ip';
let port = 8080;
let user = 'your_user_name';
let pass = 'your_password';
let clientname = 'your_client_id';
let client;

Then an exported function was used to publish the data to the server as follows.
export function connectMQTT() {
console.log('Connecting to MQTT');
client = new Paho.MQTT.Client(websocket, port, clientname);
client.connect({userName: user, password: pass});
}

export function sendMessage(message, topic) {
var mqttMessage = new Paho.MQTT.Message(message);
mqttMessage.destinationName = 'hr1/' + topic;
client.send(mqttMessage);
}

Predicting Motions

Data was captured on Node-RED using MQTT input nodes, one per topic. The received messages were then formatted into JSON and evaluated. When evaluating, the arms and legs were considered independently and their movements were predicted.
Based on the predictions, the corresponding commands were sent over UDP through the internet to the ESP8266 microcontroller on the robot.
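A minimal sketch of how such a prediction could be made inside a Node-RED function node is shown below. The keypoint field names and the RAISE_LEFT_ARM / LOWER_LEFT_ARM command strings are illustrative assumptions rather than the project's exact implementation; the node's output would be wired to a UDP output node pointed at the ESP8266.

// Node-RED function node (sketch): classify the left arm from incoming keypoints.
// msg.payload is assumed to carry the keypoints published from the browser, e.g.
// { leftShoulder: {x: 254, y: 201}, leftElbow: {x: 231, y: 270}, leftWrist: {x: 228, y: 335} }
const kp = (typeof msg.payload === 'string') ? JSON.parse(msg.payload) : msg.payload;

// Image coordinates grow downwards, so a wrist above the shoulder means the arm is raised.
let command;
if (kp.leftWrist.y < kp.leftShoulder.y) {
    command = 'RAISE_LEFT_ARM';
} else {
    command = 'LOWER_LEFT_ARM';
}

// The UDP output node that follows sends msg.payload to the robot.
msg.payload = command;
return msg;

The right arm and the legs would be handled the same way on their own MQTT topics, matching the independent evaluation of limbs described above.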
Robot Control

The robot consisted of two main microcontrollers, an ESP8266 and an STM32, connected over a serial interface for command and control. The ESP8266 microcontroller was connected to the internet over WiFi and was responsible for receiving the data related to the motion predictions.
void UDPConnect() {
Serial.println("Connecting to %s", ssid);
WiFi.config(ip, gateway, subnet);
WiFi.begin(ssid, password);
while (WiFi.status() != WL_CONNECTED) {
delay(500);
}
Serial.println("connected");
Udp.begin(localUdpPort);
}
void UDP() {
int packetSize = Udp.parsePacket();
if (packetSize) {
int len = Udp.read(incomingPacket, 255);
if (len > 0) {
incomingPacket[len] = 0;
}
input = incomingPacket;
Serial.println(input);
Udp.beginPacket(Udp.remoteIP(), Udp.remotePort());
Udp.write(replyPacket);
Udp.endPacket();
}
}

Based on the commands captured through the UDP connection, the relevant servo control commands were transmitted to the STM32 controller through the serial interface.
// Servo IDs and target positions for one pose (values specific to this robot's 16 bus servos)
servos[0].ID = 6; servos[0].Position = 1500;
servos[1].ID = 7; servos[1].Position = 2500;
servos[2].ID = 15; servos[2].Position = 500;
servos[3].ID = 14; servos[3].Position = 1500;
servos[4].ID = 1; servos[4].Position = 1500;
servos[5].ID = 5; servos[5].Position = 1500;
servos[6].ID = 13; servos[6].Position = 1500;
servos[7].ID = 9; servos[7].Position = 1500;
servos[8].ID = 8; servos[8].Position = 1900;
servos[9].ID = 16; servos[9].Position = 1100;
servos[10].ID = 4; servos[10].Position = 1750;
servos[11].ID = 12; servos[11].Position = 1250;
servos[12].ID = 3; servos[12].Position = 1500;
servos[13].ID = 11; servos[13].Position = 1500;
servos[14].ID = 2; servos[14].Position = 1200;
servos[15].ID = 10; servos[15].Position = 1800;
myse.moveServos(servos, 16, 300);  // move all 16 servos to the listed positions

CONNECTIVITY

This use case was tested on a pilot 5G connection, which achieved real-time motion replication even when the human and the robot were in two different locations.
DEMONSTRATIONS

Initial Testing - Unoptimized
Final Demonstrations - Optimized
The project was demonstrated as an advanced use case of both machine learning and 5G connectivity at various exhibitions and forums in Sri Lanka, and generated a great deal of interest in the community about these technological advancements and what to expect in the near future.
CONCLUSION

The technology described in this project will cater to a wide spectrum of verticals in the near future. Among them, one of the major use cases will be the creation of a digital twin for human beings in the physical space, in the form of a robot that replicates their exact movements. Since all connectivity happens over the internet, the robot can be controlled from anywhere in the world, almost instantaneously. This will act as a core innovation that benefits our very own older generation, allowing them to perform various activities from within the safety of their homes without being exposed to the dangers of the outside world.
On a completely different vertical, technologies like this would enable advanced use cases such as remote surgery. It is not practical for a highly skilled surgeon in one part of the world to travel everywhere to perform operations; using this technology, doctors would be able to operate on patients on the other side of the globe through a robot that replicates their motions. It would also minimize the risks humans have to take in activities such as bomb disposal, where a person could control the robot from a safe remote location while it replicates the movements needed to dispose of the bomb safely. On the verge of Industrial Revolution 4.0, this technology will revolutionize the world by bringing things we could only have imagined into reality in the very near future.