Future Colossal Blog
Virtual Reality Interfacing: Hand Tracking and Spatial Controllers
In Part 1, I discussed traditional interfaces. Part 2 focuses on hand tracking, but first let’s start with spatial controllers, the controller/hand tracking hybrids referenced in the last section. At Future Colossal, we first explored these with the Razer Hydra by Sixense (a tethered electromagnetic dual-controller solution). The initial limitation is obvious: having your hands tethered greatly limits positional tracking (especially 360-degree positional tracking). This is solved, or will be solved, by future iterations of the technology like Sixense’s wireless STEM system (whose release has been repeatedly delayed) and the Valve/HTC Vive controller.
The benefits over the gamepad are plentiful. Through hand tracking, we are able to create virtual hands within the experience. Through inverse kinematics, we are able to approximate arm positions. Combined, these start to define the user’s virtual body, which becomes an extremely powerful tool within the experience design. One of the main goals in creating Shadows of Isolation was to play with the sense of scale. For the most part we are doing this by shifting the scale of the environment around the user, but there are moments where we could benefit from giving the user a frame of reference. The hands become this frame of reference. We are used to seeing our hands in the forefront of our vision, so it makes sense that they would be one of our tools for understanding the world around us.
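To make the inverse kinematics step concrete, here is a minimal sketch of a two-bone solver that guesses an elbow position from a shoulder and a tracked hand. It is not the solver used in Shadows of Isolation. The function name, the planar (2D) simplification, the bone lengths, and the choice of which of the two mirror-image elbow solutions to keep are all assumptions made for illustration.

```python
import math

def solve_elbow(shoulder, hand, upper_len, fore_len):
    """Planar two-bone IK: estimate an elbow position from shoulder and hand points."""
    dx = hand[0] - shoulder[0]
    dy = hand[1] - shoulder[1]
    dist = math.hypot(dx, dy)
    # Clamp the target distance to what the two bones can actually span.
    max_reach = upper_len + fore_len
    min_reach = abs(upper_len - fore_len)
    d = min(max(dist, min_reach + 1e-9), max_reach)
    # Law of cosines: angle at the shoulder between the shoulder-to-hand
    # line and the upper arm.
    cos_a = (upper_len ** 2 + d ** 2 - fore_len ** 2) / (2 * upper_len * d)
    a = math.acos(max(-1.0, min(1.0, cos_a)))
    base = math.atan2(dy, dx)
    # Two mirror solutions exist; keeping the "elbow down" one is an
    # arbitrary convention assumed for this sketch.
    angle = base - a
    return (shoulder[0] + upper_len * math.cos(angle),
            shoulder[1] + upper_len * math.sin(angle))

# Hypothetical arm: 30 cm upper arm, 30 cm forearm, hand 50 cm from the shoulder.
elbow = solve_elbow((0.0, 0.0), (0.5, 0.0), 0.3, 0.3)
```

The arbitrary choice between the two elbow solutions hints at why approximated arms can look wrong: the tracked hand alone does not say where the real elbow is.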
In Shadows of Isolation there is a scene where two whales swim within feet of the user. At first I was interested in this as a more realistic demonstration of the scale of these whales, and we created them at a lifelike scale. Amazingly, they seemed small within the virtual environment. We ended up scaling them up to 4x or more of their actual size. This also accounted for the magnifying properties of vision underwater. Even with the increased scale, I’ve heard people say that the whales still feel much smaller than they would in real life. At times they already fill the viewer's full field of vision though, so scaling them up further is not a solution. The problem is that there is no scale of reference. As soon as we give the user virtual hands, they are able to tell just how large these creatures are and they become much more intimidating. The user feels more vulnerable once they sense a physical presence within the virtual world. Without the simulated hands there is no visual correlation with the user’s physical body. The user exists within the virtual world as a viewer, but doesn't affect or become affected by the virtual environment around them.
The next technology we worked with was the Leap Motion, mounted to the front of the HMD in order to track and translate hand movement into the virtual world. The advantage is that there are no cords limiting hand movement. Users can move their hands as they would in the physical world instead of holding and operating a controller in each hand. The disadvantage is a large loss in non-hand-related input. The hand tracking/controller hybrid allows for thumb joysticks for user-controlled movement as well as additional buttons for option and menu navigation. The option and menu navigation can be solved with smart game mechanics and user interfaces, but user movement needs to be solved with another hardware approach (see body in VR Inputs Part 3).
What struck me most when working with the Leap Motion was using it as an augmented reality camera and controller input. The Leap Motion is a stereo-based computer vision approach to hand tracking. This means it utilizes two cameras to approximate depth, along with a technique like Haar classifiers for object recognition and skeletal reconstruction of the hands for use in virtual interfacing. Sadly, the solution does not generate a true depth image and instead relies more on object recognition for tracking (more about this later).
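As a rough illustration of how two cameras can approximate depth, the sketch below uses the standard triangulation relation for a pair of parallel cameras: depth = focal length × baseline / disparity. This is the general stereo principle only, not the Leap Motion's actual pipeline, which as noted does not produce a true depth image. The function name and all of the numbers are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Return depth in metres for one matched point seen by two parallel cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive; zero means the point is at infinity")
    return focal_px * baseline_m / disparity_px

# Hypothetical numbers: 400 px focal length, cameras 4 cm apart.
near = depth_from_disparity(400.0, 0.04, 80.0)  # large shift between views: close to the sensor
far = depth_from_disparity(400.0, 0.04, 20.0)   # small shift between views: farther away
```

The hard part in practice is finding which pixel in one view matches which pixel in the other; the formula only applies once that match is known.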
When we display the Leap Motion camera feed and overlay virtual objects that our hands can interact with, it creates an exceptional experience. This is leaps (pardon the pun) above the experience of the tracked hands in a fully virtual environment. The reason: tracked and translated avatar hands are uncanny. It is not that the rendering is an issue (our minds quickly adapt to this); it’s that we are acutely aware of our physical bodies even when we can’t see them. Presence is broken when hand tracking faults and your virtual hand closes when you know your hand is open. Even if the hand tracking is perfect, when using inverse kinematics to create arms (to avoid floating hands) the user can easily detect when the virtual arm movement doesn’t perfectly match that of their own.
By using the actual camera feed of the hands instead of avatar hands constructed from skeletal tracking, there are no discrepancies between the user’s physical and virtual hands/arms. The problem comes from the lack of the depth image mentioned earlier. Because of this, virtual objects cannot move under/behind the user’s hand and the use of a full virtual world is impossible. Depth sensing (made famous by the Kinect) is able to detect how far away objects are from the sensor. With this we are able to subtract irrelevant data (everything other than the hands/arms) and superimpose the user’s physical body into the virtual world.
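The subtraction and superimposition described above can be sketched as a per-pixel test, assuming a depth sensor that is registered to the camera feed. Everything here is hypothetical: the function name, the 0.6 m hand range, and the use of plain nested lists in place of real image buffers. A camera pixel is kept only if it is close enough to be a hand or arm and is nearer than the virtual surface rendered at that pixel.

```python
def composite_hands(camera, real_depth, virtual, virtual_depth, max_hand_depth_m=0.6):
    """Overlay near-range camera pixels (hands/arms) onto a rendered virtual frame."""
    out = []
    for cam_row, rd_row, virt_row, vd_row in zip(camera, real_depth, virtual, virtual_depth):
        row = []
        for cam_px, rd, virt_px, vd in zip(cam_row, rd_row, virt_row, vd_row):
            # A depth of 0 is treated as "no reading" (an assumption of this sketch).
            in_hand_range = 0.0 < rd <= max_hand_depth_m
            # The hand is only visible where it is nearer than the virtual surface,
            # which is what lets virtual objects pass in front of it.
            row.append(cam_px if in_hand_range and rd < vd else virt_px)
        out.append(row)
    return out

# One-row example: "H" marks camera pixels, "V" marks virtual pixels.
frame = composite_hands(
    camera=[["H", "H", "H"]],
    real_depth=[[0.3, 0.3, 2.0]],      # metres; the last pixel is background
    virtual=[["V", "V", "V"]],
    virtual_depth=[[1.0, 0.2, 1.0]],   # the middle virtual object sits in front of the hand
)
```

In this example the first pixel shows the hand, the second shows a virtual object occluding the hand, and the third drops the real background in favor of the virtual world.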
Future of Hand Tracking in VR
Oculus is keenly aware of the need to develop an input device designed specifically for the challenges of VR. They have recently acquired Nimble VR, a company that was developing a hand tracking solution similar to the Leap Motion. The difference is that Nimble VR uses a time-of-flight depth sensing solution similar to the Kinect V2, but for close range. This allows their system to be much more accurate and should ultimately allow for background/foreground separation and the ability to augment the virtual world with live feeds of the user’s body.
Meanwhile, the Vive HMD from the Valve and HTC partnership will come with a hand tracking/controller hybrid solution. This makes sense for Valve, which is focused almost solely on gaming. The benefit of this solution is that it is not tethered; in fact, both the user wearing the HMD and the controllers are able to move freely about their environment while being tracked. This definitely seems like the fastest approach to getting a quality product to market, but I feel it is a stopgap solution that will feel archaic in the years to come.
Magic Leap is another interesting company to discuss. While their product is still largely a mystery and we know it is more focused on being an AR solution, some of their recent patents lend themselves to interesting user interfaces based around hand tracking. Their idea is to create totems, or objects that users interact with using their hands in order to control the virtual world. These totems could be normal everyday objects that users “embed with powers” in the virtual world or specifically designed objects for particular experiences. While I’m not sure how effective this will be in actuality, I feel there are a lot of lessons to be learned for both AR and VR with this form of innovative interface.
HoloLens is Microsoft's approach to an augmented reality device that utilizes hand tracking and the ability to empower “dumb” objects. Details about much of the hardware have not been released, but it appears to use depth-sensor-based hand tracking. This is most likely the same technology as the Kinect but in a smaller form factor. With Kinect, Microsoft has spent several years developing a common gesture language. 3D gestures are not as intuitive as touch gestures, so having a unified language and utilizing perceived touch through virtual hands will become increasingly important.
Advantages and Challenges in Hand Tracking
One of the first things people do when they enter virtual reality is put their hands in front of their face. Our hands are a vital tool for how we perceive and interact with the world around us. Without hands in the virtual world we are bound to remain spectators. We can move through the worlds but don’t have a physical presence within them. Integrating hand tracking into VR not only allows for new forms of interaction and input, but also furthers the sense of presence.
While the advantages discussed above open up new and exciting possibilities for virtual reality, they also come with significant challenges. Any discrepancies between the physical and virtual hand movement break presence, and glitches in tracking become extremely frustrating. Leap Motion, the best hand tracking solution currently available for VR, is more of a proof of concept than a reliable product. The hand tracking routinely faults, making it unreliable for UI control, object interaction, or merely advanced visual presence. These are technical limitations though, and we can assume that they will be addressed in the near future.
The real challenges with hand tracking in virtual reality come once there is a fully capable tracking solution. At that point integrating hands into VR experiences will become the norm, but this means a significant amount of added work for game designers/developers. Once everyone has hands, every object will need to be interactive. Static objects become a thing of the past. The coffee cup on the corner of a table sitting in a dark hallway off the main game path now needs to be able to be picked up, thrown, dropped, placed on another table in a different scene, and more. What happens when this cup is thrown at an enemy? Can it distract them, hurt them, or change the gameplay in ways that the game creators may not have anticipated? These challenges can’t be avoided though, because if the user walks up to that cup and finds they cannot grab it, then presence is broken. The user is quickly reminded that their virtual hands are merely virtual, as are the world and objects around them.
The next challenge is with the virtual/physical feedback loop. In real life, if we try to reach through a wall our hands will be stopped. In the virtual world, when we try to reach through a wall our virtual hands may stop, but our physical body has no restrictions and will continue to move forward. This discrepancy breaks presence in the same way that faulty tracking does. As game/experience designers, do we risk breaking presence, or do we create a world without virtual collisions and in doing so have no ability to restrict the user’s movements?
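The first option, stopping the virtual hand at the wall, can be sketched in a few lines. This is a simplified illustration rather than code from any of our projects: the function name is hypothetical, the wall is assumed to be a flat plane at a fixed z value with the user on the lower-z side, and positions are (x, y, z) tuples in metres. The returned penetration value is exactly the physical/virtual discrepancy described above.

```python
def clamp_hand_to_wall(tracked, wall_z):
    """Stop the virtual hand at a wall plane and report how far the real hand overshot."""
    x, y, z = tracked
    # How far the physical hand has travelled past the virtual wall.
    penetration = max(0.0, z - wall_z)
    # The virtual hand is held at the wall surface while the real hand keeps moving.
    virtual = (x, y, min(z, wall_z))
    return virtual, penetration

# Hypothetical frame: the real hand is 15 cm "inside" a wall located at z = 0.75 m.
virtual_hand, mismatch = clamp_hand_to_wall((0.1, 1.2, 0.9), 0.75)
```

A designer could use the mismatch value to decide when the gap has grown large enough to do something about it, for example fading the hand out rather than leaving it pinned to the wall.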
In an interesting experiment in our studio, we utilized Leap Motion hand tracking to allow users to reach in front of them and grab hold of a floating cube. Oddly, users almost felt like they could feel the cube in their hands. One user described it as if they “were holding an object slightly heavier than air.” This perceived sense of touch shows how powerful presence can be when our mind accepts the virtual hands as an extension of our physical body. It may be that users train their brains to react to the virtual environment as if it is real. Users know they cannot reach through the virtual wall, so they automatically stop their physical hands when their virtual hands collide.
It is not hard to imagine furthering this perceived sense of touch with simple haptic-embedded gloves. Much study has been focused on advancing haptics to read textures derived from the normal maps that are used to skin traditional 3D game objects. This will empower a pseudo-tangible world within VR while not requiring new game or art assets to be designed. This could be furthered with hand, arm, or even body exoskeletons that have the ability to physically restrict body movements in relation to in-game collisions. However, I don’t foresee many people being willing to strap on a whole-body haptic suit, let alone a mechanical exoskeleton.
Undoubtedly new tools, both hardware and software, will be designed to empower the player and game designer/developer with virtual hands in a realistic and unified manner. This will be a long process though, and we can expect many years of experimentation, failures, and stopgap solutions like the hybrid positional game controllers. It will be interesting to see if it is the hardware/software that solves these challenges or if the true solution is a change in the way we, as users, perceive ourselves in the virtual space.
Continuing this discussion on virtual reality interfacing, the next installment will look at full body tracking and biometric input. I will conclude by discussing our custom control scheme designed for Shadows of Isolation and where I see our control schemes advancing in future experiences.
The Evolution of Arm HUD is an interesting article on the Arm HUD gif seen in the middle of this article. The Augmented Hand Series is the glitched hand artwork by Golan Levin, Chris Sugrue, and Kyle McDonald seen in the image towards the bottom of the page.