Project Runu interacts with the Microsoft Cognitive APIs to analyze an image and tell the user the details about it.
Motivation
The motivation for Project Runu came from a Microsoft Cognitive API demo, which showed a blind person leveraging the power of the Cognitive API to learn about his surroundings.
This led me to think of alternate solutions for security cameras, face detection, badge swapping machines, tourist helpers, etc. I will mention these ideas later.
Preparation Time
I am a great fan of Node.js. With Node.js you can easily build a lightweight JavaScript application for your system. The increasing popularity of Node.js led the Technical Machine team to build a unique IoT device, which we know as Tessel.
Coding for IoT is no longer a challenge for a Node.js enthusiast like myself, as deploying code on an IoT device has never been so easy. In this project I am using the Tessel 2, which has support for io.js. The Tessel 2 comes with USB support, so any USB-supported camera can easily pair with it.
Action Time
To start, I explored the npm modules for the Tessel camera and found this resource. Tessel-AV helps you interact with image, video, and audio streams easily.
The next step was to connect this with the Microsoft Cognitive API. For that, one needs to register with Microsoft Cognitive Services. On registering, you receive an API key; for our project, this API key is all we need.
Here is the Node.js script, which interacts with a USB-supported camera, captures a picture, sends it to the Cognitive API, and comes back with the analysis:
var fs = require('fs');
var av = require('tessel-av');
var os = require('os');
var http = require('http');
var https = require('https');
var url = require('url');

var port = 8082;
var camera = new av.Camera();

// Create an HTTP server and capture the image from a
// Tessel-supported USB camera
http.createServer((request, response) => {
  response.writeHead(200, { 'Content-Type': 'image/jpg' });
  var capture = camera.capture();
  capture.pipe(response);

  // Collect the image chunks, then save and analyze the full image
  var chunks = [];
  capture.on('data', function (data) {
    chunks.push(data);
  });
  capture.on('end', function () {
    saveimage(Buffer.concat(chunks));
  });
}).listen(port, () => console.log(`http://${os.hostname()}.local:${port}`));

// Save the captured image to disk, then trigger the analysis
function saveimage(data) {
  fs.writeFile(__dirname + '/captured-image.jpg', data, function (err) {
    if (err) {
      console.log(err);
    } else {
      analyzeimage();
    }
  });
}

// Analyze the captured image by sending it to the cognitive
// API for a description
function analyzeimage() {
  fs.readFile(__dirname + '/captured-image.jpg', function (err, data) {
    if (err) {
      console.log(err);
      return;
    }
    var parsedURL = url.parse('https://api.projectoxford.ai/vision/v1.0/analyze?visualFeatures=Description');
    var options = {
      host: parsedURL.host,
      path: '/vision/v1.0/analyze?visualFeatures=Description,Tags',
      headers: {
        'Content-Type': 'application/octet-stream',
        'Content-Length': data.length,
        'Ocp-Apim-Subscription-Key': '<>'
      },
      method: 'POST'
    };
    var callback = function (response) {
      var str = '';
      response.on('data', function (chunk) {
        str += chunk;
      });
      response.on('end', function () {
        var obj = JSON.parse(str);
        // Print out the caption and its confidence score
        console.log('I think the caption of this object should be ' +
          obj.description.captions[0].text + '. I am ' +
          Math.round(obj.description.captions[0].confidence * 100) + '% sure.');
      });
    };
    // The endpoint is HTTPS, so use the https module for the request
    var post_req = https.request(options, callback);
    post_req.write(data);
    post_req.end();
  });
}
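For reference, here is a minimal sketch of the caption-extraction step on its own. The sample response object below is illustrative only (it mimics the shape the script above parses, not real API output), and the function name is my own:

```javascript
// Illustrative response body; a real one comes back from the analyze endpoint
var sampleResponse = JSON.stringify({
  description: {
    tags: ['flower', 'vase', 'table'],
    captions: [{ text: 'a vase of flowers on a table', confidence: 0.87 }]
  }
});

// Turn a JSON response body into the spoken-style sentence the script prints
function describeCaption(body) {
  var obj = JSON.parse(body);
  var caption = obj.description.captions[0];
  return 'I think the caption of this object should be ' + caption.text +
    '. I am ' + Math.round(caption.confidence * 100) + '% sure.';
}

console.log(describeCaption(sampleResponse));
// I think the caption of this object should be a vase of flowers on a table. I am 87% sure.
```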
This simple program of roughly 100 lines helped me create a low-cost solution which can act as a mobile image identification system.
Here is the demo POC, where I analyze a vase of flowers and a table clock. The technology not only gives the correct observation but also considers the environment, such as the table clock sitting on a desk.
Proof of Concept

Conclusion
This technology can prove to have immense potential in a few fields. Some of the ideas are as follows:
1. Security cameras: Security cameras can talk to the Cognitive APIs and send them the video stream. As soon as any kind of violence is detected by the API, a security alarm can be raised in order to prevent it. The detection alarm can work on captions which contain text like gun, knife, bomb, etc. A captured image is sent to the officials, who can decide whether or not to respond to the alert. Face detection is another option here.
2. Badge swapping systems: Currently, badge surfing is quite common. There needs to be another system which detects the person's face and only then allows him/her in. To strengthen the system, we can use this technology for verification.
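The alert logic in idea 1 could be sketched roughly as follows. The keyword list and function name here are my own hypothetical choices, not part of any existing system:

```javascript
// Hypothetical filter: raise an alert when the caption or tags
// returned by the vision API mention a dangerous object
var ALERT_KEYWORDS = ['gun', 'knife', 'bomb'];

function shouldAlert(caption, tags) {
  var text = (caption + ' ' + tags.join(' ')).toLowerCase();
  return ALERT_KEYWORDS.some(function (word) {
    return text.indexOf(word) !== -1;
  });
}

console.log(shouldAlert('a man holding a knife', ['person', 'indoor'])); // true
console.log(shouldAlert('a vase of flowers on a table', ['flower', 'vase'])); // false
```

In a real deployment this check would run on every analyzed frame, with the matching frame forwarded to the officials for review.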
These are a few of my ideas where I see this technology being useful, apart from a support system for the visually impaired. If you have any such ideas, let us know.
Thanks for reading through this!