The smart home is one of the most exciting developments going on right now, but it has yet to really hit a home run. Most of the devices are cumbersome to configure and use, and often require an app to be installed. For instance, a phone app is one of the most roundabout and slow ways to turn on a light bulb, and it doesn't improve on the ordinary light switch. I can get up from the couch and flip the switch faster than I can unlock my phone, find and launch the app, and wait for a connection, just to turn off the light or change the brightness. Now, there are other advantages like scheduling, but we're giving up an awful lot of simplicity in the process. I believe that lights and other ordinary devices that we are putting "smarts" into need to add value without taking away the value they already have.
But even the good old light switch isn't the perfect solution. How often have you come into a room you haven't been in before and tried to figure out which switch turns on which light? Perhaps there's a better, more natural way, by taking some cues from Natural User Interfaces.
As a comparison, think of how we interact with people when we ask them to do something. The most natural way is to look at a person and ask, for instance, "Could you please pass the salt?" The voice is the action, and the "who" is based on who you're looking at. It's a completely natural way for us to interact. A switch is a very indirect way: you move a lever, and perhaps it connects some wires that ultimately run through a bulb somewhere else in the room. Wouldn't it be much better if we could just look at a light and tell it to turn on or change its brightness, the way we're used to interacting with each other? Anyone entering a room would know exactly how to turn on any device: simply look at it and say "on". No need to guess at light switches, or find the app that works with the particular bulb. Far-fetched? Not really.
With augmented reality starting to become a reality (no pun intended), these types of scenarios are becoming possible. We can use augmented reality to show you which things in the smart home can be interacted with and what their status is by overlaying them with information (for instance, the currently playing song on a speaker), but also to understand what you're looking at and use that as the context for voice commands. This makes voice control much more natural and can greatly improve how well we interpret the user's intent.
In this project we'll be using Microsoft's HoloLens to augment real objects in the smart home with information and make them controllable with gestures and voice commands.
Now, I just said that launching an app on a phone is a slow, cumbersome way to control a light. Surely putting on a HoloLens, wearing it around the house and launching apps to flip light switches isn't much better, and I agree that this is the state today. However, it's safe to assume these types of devices will shrink down to a practical wearable. It's not far-fetched to think that in the near future we would rather wear a pair of lightweight smart glasses than carry our phones around in our pockets all day. So keep an open mind and think of this project as what could be in the near future, but at the same time marvel that we can already use existing technology today to make interacting with the devices in your home truly natural.
Windows 10 and the HoloLens' sensors provide all the pieces we need to build this type of app. Here's a description of each piece we'll use to put it all together.
1. Voice
Windows 10 comes with built-in voice command APIs. We can use these to register a set of voice commands we can interpret, like "Turn On", "Set Color [color]", etc. Voice control is covered in this tutorial: https://developer.microsoft.com/en-us/windows/holographic/holograms_212
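To make the idea of a fixed command set a bit more concrete, here is a minimal sketch in Python of turning recognized text into an action. It is not the Windows 10 speech API (the tutorial above covers that); the function name `parse_command`, the color list and the returned action names are all made up for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical color vocabulary; a real app would list whatever its bulbs support.
COLORS = {"red", "green", "blue", "white"}


def parse_command(utterance: str) -> Optional[Tuple[str, Optional[str]]]:
    """Map recognized text to (action, argument), or None if it isn't a known command."""
    # Normalize case and collapse repeated whitespace before matching.
    text = " ".join(utterance.lower().split())
    if text in ("turn on", "on"):
        return ("on", None)
    if text in ("turn off", "off"):
        return ("off", None)
    # "Set Color [color]" carries one argument that must be a known color.
    match = re.fullmatch(r"set color (\w+)", text)
    if match and match.group(1) in COLORS:
        return ("set_color", match.group(1))
    return None
```

Anything outside the registered phrases returns None, which is the point of a small grammar: the app only has to react to commands it actually understands.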
2. Gestures
The HoloLens has sensors that track your hands and can detect when your hand performs a tap or a drag. In addition to voice, we can use these gestures to control the devices: for instance, tap to turn on, or drag to change brightness. Gestures are covered in this tutorial: https://developer.microsoft.com/en-us/windows/holographic/holograms_211
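As a rough illustration of "tap to turn on, drag to change brightness", the sketch below maps the two gestures onto a simple light state. The `LightState` type, the use of vertical hand movement in meters and the `sensitivity` factor are assumptions for the example, not something taken from the HoloLens gesture API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LightState:
    on: bool = False
    brightness: float = 0.5  # 0.0 (dark) to 1.0 (full)


def apply_tap(state: LightState) -> LightState:
    """A tap toggles the light and leaves the brightness alone."""
    return LightState(on=not state.on, brightness=state.brightness)


def apply_drag(state: LightState, vertical_delta_m: float,
               sensitivity: float = 2.0) -> LightState:
    """A drag raises or lowers brightness in proportion to hand movement.

    vertical_delta_m is the hand's upward movement in meters (an assumed
    unit); the result is clamped to the 0.0..1.0 range.
    """
    level = state.brightness + vertical_delta_m * sensitivity
    return LightState(on=state.on, brightness=min(1.0, max(0.0, level)))
```

With the default sensitivity, moving the hand 12.5 cm upwards raises the brightness by a quarter of the full range.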
3. Spatial Mapping
The HoloLens' sensors constantly scan the room it is in and build a 3D model of it. We can use this to understand where, for instance, a lamp is. Spatial mapping is covered in this tutorial: https://developer.microsoft.com/en-us/windows/holographic/holograms_230
4. Gaze
The HoloLens defines the concept of "Gaze" as what you're directly looking at. It's done simply by drawing a line from the HoloLens directly forward and intersecting it with the spatially mapped room or with any holograms rendered in the room. Gaze is covered in this tutorial: https://developer.microsoft.com/en-us/windows/holographic/holograms_210
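The geometry behind gaze is an ordinary ray cast. As a sketch, assume the thing being looked at is a hologram that can be approximated by a sphere; the function below (the name `gaze_hit_distance` and its parameters are illustrative, and a real app would let the engine's raycast do this) returns how far along the gaze ray the sphere is hit.

```python
import math
from typing import Optional, Sequence


def gaze_hit_distance(origin: Sequence[float], direction: Sequence[float],
                      center: Sequence[float], radius: float) -> Optional[float]:
    """Distance from the head to where the gaze ray enters the sphere, or None on a miss."""
    length = math.sqrt(sum(c * c for c in direction))
    if length == 0:
        raise ValueError("gaze direction must be non-zero")
    d = [c / length for c in direction]          # unit-length gaze direction
    oc = [o - c for o, c in zip(origin, center)]  # sphere center -> head
    # Solve |oc + t*d|^2 = radius^2 for t (quadratic with leading coefficient 1).
    b = sum(x * y for x, y in zip(oc, d))
    c = sum(x * x for x in oc) - radius * radius
    discriminant = b * b - c
    if discriminant < 0:
        return None                               # the ray misses the sphere
    root = math.sqrt(discriminant)
    t_near, t_far = -b - root, -b + root
    if t_near >= 0:
        return t_near                             # sphere is in front of the user
    if t_far >= 0:
        return 0.0                                # the user is standing inside the sphere
    return None                                   # sphere is entirely behind the user
```

For a user at the origin looking along +z at a sphere of radius 1 centered 5 units away, the hit distance is 4.0; looking sideways or in the opposite direction gives None.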
5. World Anchors
In addition to the spatial mapping, "anchors" serve as fixed locations in the room. The HoloLens has the ability to recognize rooms it's been in and restore their anchors. This means we can take the location of a gaze and save it, so the user can gaze at a lamp and "store" that location for later. Anchors can even be shared with other HoloLenses: if a "world anchor" service is running on the network, any HoloLens that enters a room for the first time will, after a few seconds of scanning, be able to recognize the location of a world anchor and know that this is, for instance, a lamp that can be interacted with. World anchors and how to share them are covered in this tutorial: https://developer.microsoft.com/en-us/windows/holographic/holograms_240
6. AllJoyn
Windows 10 ships with built-in support for AllJoyn, a home automation standard that helps abstract other automation standards. While this is not strictly a requirement, I believe that to make the smart home work, we can't have each device siloed into its own app; they should all talk the same language. With AllJoyn Device Service Bridges we can abstract these devices so they all talk the same language, and not worry about which protocol is used. I'll be using a LIFX bulb, which already "speaks" AllJoyn, and in addition use my "Philips Hue Device Service Bridge" to bridge Hue bulbs over to AllJoyn. That way one code base can control multiple types of light bulbs, and if a new protocol comes around, we just add another DSB and deploy it to, for instance, a Raspberry Pi on the network. You can follow my tutorial on how to control AllJoyn devices here, or use my AllJoyn Client library for simple, quick discovery and usage of common AllJoyn devices. I also cover the AllJoyn bridges in this video.
Together, the pieces above let us build a solution that discovers and interacts with lights simply by looking at them and talking to or tapping them, and it can easily be expanded to control more device types.
I've used the tutorials and my libraries referenced above to build an app that allows you to "place" a sphere around a lamp using gaze and spatial mapping. The sphere acts as a placeholder for the area you can use for interaction. When a lamp is placed, a spatial anchor is stored on the device and also sent to a service running on the network for other HoloLenses to pick up. Note: The rendering of the placeholder sphere is more for illustration purposes, and doesn't really need to happen.
Using gaze, we place a cursor in front of the user to signify what you're looking at. If the cursor is on the sphere, any voice command or gesture is applied to that device, and the device only responds to the gestures and commands that are valid for it. This reduces errors, so you don't turn on all devices in the house when saying "on".
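The routing rule described here can be sketched in a few lines of Python. This is not the code from my app; the `Device` class, the device ids and the command names are hypothetical and only show how gaze narrows a command down to one device and to the commands that device supports.

```python
from typing import Dict, List, Optional, Set


class Device:
    """Hypothetical device record: a name plus the commands it accepts."""

    def __init__(self, name: str, commands: Set[str]) -> None:
        self.name = name
        self.commands = commands
        self.log: List[str] = []  # commands actually delivered to the device

    def handle(self, command: str) -> None:
        self.log.append(command)


def dispatch(devices: Dict[str, Device], gazed_id: Optional[str],
             command: str) -> bool:
    """Deliver a command only to the device under the gaze cursor.

    Returns True if the command was delivered, False if nothing was being
    looked at or the command isn't valid for that device.
    """
    if gazed_id is None:
        return False
    device = devices.get(gazed_id)
    if device is None or command not in device.commands:
        return False
    device.handle(command)
    return True
```

Saying "on" while looking at the lamp reaches the lamp and nothing else, and a command the lamp doesn't know (say, "play") is simply dropped.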
The "Hue DSB" then handles converting my AllJoyn commands into Philips Hue commands, so that the HoloLens has no idea that this isn't actually an AllJoyn bulb.
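The bridging idea is essentially the adapter pattern. The sketch below is not the actual DSB code and does not use the AllJoyn or Hue APIs; the `Lamp` interface, both classes and the vendor message shape (an on/off flag and a 0 to 254 brightness value) are assumptions chosen to show how calling code can stay unaware of what sits behind the common interface.

```python
from typing import Dict, List, Protocol


class Lamp(Protocol):
    """The common 'language' the controlling app speaks (hypothetical)."""

    def set_on(self, on: bool) -> None: ...
    def set_brightness(self, level: float) -> None: ...


class NativeLamp:
    """Stand-in for a bulb that already speaks the common protocol."""

    def __init__(self) -> None:
        self.on = False
        self.brightness = 0.0

    def set_on(self, on: bool) -> None:
        self.on = on

    def set_brightness(self, level: float) -> None:
        self.brightness = min(1.0, max(0.0, level))


class BridgedLamp:
    """Stand-in for a bridge: translates common calls into vendor messages."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, object]] = []  # messages the bridge would forward

    def set_on(self, on: bool) -> None:
        self.sent.append({"on": on})

    def set_brightness(self, level: float) -> None:
        # Assumed vendor scale: an integer from 0 to 254 instead of 0.0 to 1.0.
        clamped = min(1.0, max(0.0, level))
        self.sent.append({"bri": round(clamped * 254)})


def turn_on_full(lamp: Lamp) -> None:
    """App-side code: identical no matter which kind of lamp it is given."""
    lamp.set_on(True)
    lamp.set_brightness(1.0)
```

Calling `turn_on_full` on either class works the same way from the app's point of view; only the bridge knows that its calls end up as vendor-specific messages.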
This all comes together in a prototype app with the promised simplicity of looking at a device and telling it what to do. See the recording below, which shows what I see through the HoloLens using my app.
Every single piece of this app has been built by putting together existing technology and libraries, which proves that already today we can create a more natural and smarter way of interacting with our home.