In September Microsoft presented their vision of future gaming, and it was a wild ride to read about, to be sure. Today, we take a look at what Sony has on their mind for advancing gaming from one of their more recent patent filings. While it's definitely a different vision from Microsoft's, it's equally fascinating in sheer scope. Sony envisions a day when your physical controller will be replaced or enhanced with a combination of a human gaze, a voice command, hand gesturing and even telekinetic-like powers. The latter, believe it or not, is based on using the gamer's own brainwaves that are picked up by an apparatus hidden on the inside of a new gaming headset. And if you're thinking that this will never happen, then think again. We have a video that proves it's already working in early testing.
Sony's Patent Background
Recently, the Kinect device sold by Microsoft was introduced, which allows users to control and interact with a computer game console without the need to use a game controller. The user interacts with the user interface using gestures and spoken commands via the Kinect device. Specifically, the Kinect device includes a video camera, a depth sensor and a microphone to track the user's gestures and spoken commands. The video camera and depth sensor are used together to create a 3-D model of the user. The Kinect device, however, only recognizes limited types of gestures. Users can point to control a cursor, but the Kinect device doesn't allow a user to click the cursor, requiring the user to hover over a selection for several seconds to make a selection.
Simple Overview of Sony's Gaming Patent
Embodiments of the Sony invention relate to user interface technology that provides feedback to the user based on the user's gaze and a secondary user input, such as a hand gesture. In one embodiment, a camera-based tracking system tracks the gaze direction of a user to detect which object displayed in the user interface is being viewed. The tracking system also recognizes hand or other body gestures to control the action or motion of that object, using, for example, a separate camera and/or sensor.
Exemplary gesture input can be used to simulate a mental or magical force that can pull, push, position or otherwise move or control the selected object. The user's interaction simulates a feeling in the user that their mind is controlling the object in the user interface--similar to telekinetic power, which users have seen simulated in movies (e.g., the Force in Star Wars).
Gaze Tracking + Gesture Input
Sony's patent FIG. 1 shown below schematically illustrates user interface technology that provides feedback based on gaze tracking and gesture input according to one embodiment of the invention. In FIG. 1, the user is schematically illustrated with the user's eye and hand. The user views a display which displays a user interface (e.g., a video game, an Internet browser window, word processing application window, etc.). The display includes a computing device or is coupled to a computing device, such as a video game console or computer.
In Sony's patent FIG. 1, a camera (124) is shown positioned over the display with the lens of the camera pointed generally in the direction of the user. In one embodiment, the camera uses infrared illumination to track the user's gaze that is illustrated as patent point #128. The computing device analyzes the input from at least one camera with infrared illumination to determine the area of the display where the user is looking, and then determines the specific object (140) that the user is looking at. Alternatively, the camera may include a processor that determines the user's gaze.
Sony goes on to state that the same camera or a separate camera (not shown in FIG. 1) may be used to track hand gestures (i.e., movements made by the user's hand, shown as patent point #112). In embodiments in which a separate camera is used, the camera alone, a camera in combination with a near-infrared sensor, or a camera in combination with another depth sensor may be used to track hand gestures.
Alternatively, it will be appreciated that a controller or inertial sensor may be used to track the user's hand gestures. For example, the hand gesture may be a flick of an inertial sensor or controller that includes an accelerometer. The computing device then correlates the input from the gesture camera (or other gesture tracking device) to a movement of the object (shown on the patent figures as patent #144) or a command relating to the object displayed in the user interface (i.e., movement of the object 140 in the direction of arrow 144). Alternatively, the gesture sensor may include a processor that determines the user's gesture.
In use, the eye gaze is used to select the object #140 displayed in the user interface, and the hand gesture or body movement is used to control or move the object. It will be appreciated that these steps may occur in any order.
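To make the select-then-move idea concrete, here is a minimal sketch of how a gaze point might pick an on-screen object and a gesture might then move it. This is our own illustration, not code from Sony's filing: the UIObject class, the select_by_gaze and apply_gesture functions, and the simple rectangle hit-test are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UIObject:
    """Hypothetical on-screen object with a rectangular bounding box."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        # True when the point lies inside (or on the edge of) the box.
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def select_by_gaze(objects: List[UIObject],
                   gaze_x: float, gaze_y: float) -> Optional[UIObject]:
    """Return the first object whose box contains the gaze point, if any."""
    for obj in objects:
        if obj.contains(gaze_x, gaze_y):
            return obj
    return None


def apply_gesture(obj: UIObject, dx: float, dy: float) -> None:
    """Move the selected object by the displacement read from the gesture."""
    obj.x += dx
    obj.y += dy


objects = [UIObject("crate", 100, 100, 50, 50),
           UIObject("door", 300, 80, 60, 120)]

# Assumed inputs: the gaze tracker reports (120, 130) and the gesture
# tracker reports a push of 40 units to the right.
selected = select_by_gaze(objects, 120, 130)
if selected is not None:
    apply_gesture(selected, 40, 0)
```

A real system would have to cope with noisy gaze data, for example by smoothing the gaze point over several frames before hit-testing; the sketch skips that step.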
Various Gaming Examples
It will be appreciated that there are many different applications for user interface technology that provides user feedback based on the combination of eye gaze tracking and hand gesture tracking. For example, the hand gesture may launch a spell at a character on the user interface based on the character that the user is looking at. Another exemplary hand gesture may be a trigger (e.g. shooting action) in a shooting game.
The gaze and gestures may also be used to select virtual buttons by simulating the action of pressing a button (e.g., pointing a finger and moving the finger forward while the user's gaze is focused on the button). In another example, the gaze and user gesture may be used to zoom in or out of a particular portion of the user interface (e.g., zoom in to a particular portion of a map).
In still another example, a forward flick of a pointing hand could start an interaction with the object being watched by the user as detected by the gaze tracker. In yet another example, a beckoning gesture may be used to make the object the user is looking at move closer to the user in the user interface; similarly, a waving gesture could make the object recede.
Gaze tracking is advantageous because, to the user, it feels like a natural or even unconscious way to indicate an intent to interact with an object displayed in the user interface. Hand gestures are advantageous because the power of hand movement can be used to affect the power of the action on the screen, and hand gestures are a natural way to interact with the selected objects to communicate a desired motion or to directly control motion.
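The idea that the power of a hand movement can drive the power of the on-screen action can be sketched as a simple speed-to-strength mapping. The gesture_power function, its parameters and the 2.0 units-per-second cap below are hypothetical choices of ours; the patent summary does not specify any formula.

```python
import math
from typing import Sequence


def gesture_power(start: Sequence[float], end: Sequence[float],
                  duration_s: float, max_speed: float = 2.0) -> float:
    """Map average hand speed to a power value between 0.0 and 1.0.

    start and end are hand positions (any consistent unit), duration_s is
    the time the gesture took, and max_speed is the assumed speed at which
    the action reaches full power.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    speed = math.dist(start, end) / duration_s
    # Clamp so that very fast gestures never exceed full power.
    return min(speed / max_speed, 1.0)


# A hand that travels 0.5 units in half a second moves at 1.0 unit/s,
# which is half of the assumed full-power speed.
power = gesture_power((0.0, 0.0, 0.0), (0.0, 0.5, 0.0), 0.5)
```

A game could multiply this value by a spell's damage or a throw's distance, so that a lazy wave and a sharp flick produce visibly different results.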
To some extent this kind of gesturing was discussed at Intel's recent IDF conference in San Francisco. We covered this briefly in our report titled "Intel's Next Wave: Transparent Computing" under the topic of "Advancing Hand Gesturing." The technology behind this next generation gesturing came from SoftKinetic.
Getting back to this current patent, Sony states that although their invention has been described in terms of using hand gesturing, the fact is that other user gestures, such as foot gestures (i.e., movement of the user's foot) or facial gestures (i.e., movement of the user's head or movement of certain features of the user's face) may be used to interact with the user interface. For example, a foot gesture, such as swinging the user's foot, may be used to simulate kicking a ball in a video soccer game. In particular, a user may simulate a shot on goal (similar to a shot on goal in a real soccer game) by changing their gaze just prior to kicking the ball to trick the goalie--the ball is kicked in the direction of the user's gaze--and (hopefully) score a goal.
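The soccer example, where the ball travels in the direction of the user's gaze, comes down to a little vector arithmetic. The sketch below is only our illustration of that idea: kick_velocity, its 2-D screen coordinates and the kick_strength parameter are assumptions, not details from the filing.

```python
import math
from typing import Tuple


def kick_velocity(ball_pos: Tuple[float, float],
                  gaze_pos: Tuple[float, float],
                  kick_strength: float) -> Tuple[float, float]:
    """Return a velocity that points from the ball toward the gaze point.

    kick_strength is the speed of the ball, which a game might derive
    from how hard the foot gesture was.
    """
    dx = gaze_pos[0] - ball_pos[0]
    dy = gaze_pos[1] - ball_pos[1]
    length = math.hypot(dx, dy)
    if length == 0:
        # The user is looking straight at the ball: no direction to kick in.
        return (0.0, 0.0)
    return (kick_strength * dx / length, kick_strength * dy / length)


# The ball is at the origin and the user glances at (3, 4) just before
# swinging a foot with strength 10.
velocity = kick_velocity((0, 0), (3, 4), 10)
```

Because only the gaze at the moment of the kick matters, a player could look at one corner of the goal and then switch to the other at the last instant, which is the trick the patent describes.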
Sony's Technology Could Apply Equally to a Future Gaming Console, Computer or Smart TV
According to Sony's patent filing, the computing device 204 noted below in patent FIG. 2 may be a gaming console, a personal computer, a game kiosk, a smart-television that includes a computer processor, or other computing system. The system includes a computing device coupled to a display, a gaze sensor 212 and a gesture sensor 216.
More on the Gaze & Gesture Sensors
According to Sony, the gaze sensor tracks the user's eye. The gaze sensor may include a light source, such as near infrared illumination diodes, to illuminate the eye, and, in particular, the retina, causing visible reflections, and a camera that captures an image of the eye showing the reflections. The image is then analyzed by the computing device to identify the reflection of the light and calculate the gaze direction. Alternatively, the gaze sensor itself may analyze the data to calculate the gaze direction.
In one embodiment, the gaze sensor is the camera and light source and is positioned near the display, such as the TOBII X60 and X120 eye trackers. In another embodiment, the gaze sensor is integrated into the display (i.e., the camera and light source are included in the display housing), such as the TOBII T60, T120 or T60 XL eye trackers. In yet another embodiment, the gaze sensor is a pair of glasses worn by the user that includes the camera and light source, such as the TOBII GLASSES eye tracker as noted in the graphic above. These are not 3D glasses as much as they are eye tracking glasses.
Sony notes that it should be appreciated that these are merely exemplary and other sensors and devices for tracking gaze may be used. In addition, it will be appreciated that multiple cameras and light sources may be used to determine the user's gaze.
Playing Angry Birds with the Tobii X120 Eye Tracker
Demo of Hands-Free 3D Angry Birds game prototype: This game uses the Tobii X120 Eye Tracker to control the bird and blink detection to launch the bird from the slingshot.
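For readers wondering how blink detection could act as a trigger, here is one simple way to do it. We don't know how the demo actually detects blinks, and this is not Tobii's API: the detect_blinks function, the per-frame open/closed samples and the 100 to 400 millisecond window are all assumptions made for illustration.

```python
from typing import List, Sequence


def detect_blinks(eye_open: Sequence[bool], frame_ms: float,
                  min_ms: float = 100, max_ms: float = 400) -> List[int]:
    """Return the frame indices at which a deliberate blink ended.

    eye_open holds one sample per frame (True when the eye is visible),
    and frame_ms is the time between frames. Closures shorter than min_ms
    are treated as tracking noise and closures longer than max_ms as the
    user looking away, so neither fires the trigger.
    """
    blinks = []
    closed_frames = 0
    for i, is_open in enumerate(eye_open):
        if not is_open:
            closed_frames += 1
            continue
        # The eye is open on this frame: check the closure that just ended.
        duration = closed_frames * frame_ms
        if closed_frames and min_ms <= duration <= max_ms:
            blinks.append(i)
        closed_frames = 0
    return blinks


# Assumed 20 ms frames: a 160 ms closure, a 40 ms flicker, then a long
# 600 ms closure. Only the first should launch the bird.
samples = ([True] * 5 + [False] * 8 + [True] * 3 + [False] * 2
           + [True] * 2 + [False] * 30 + [True])
launch_frames = detect_blinks(samples, frame_ms=20)
```

The duration window is the important design choice here, since people blink involuntarily and a trigger that fired on every flicker would be unplayable.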
Sony continues by stating that the gesture sensor may be a standard or 3-D video camera. It will be appreciated that multiple cameras and/or depth sensors may be used to determine the user's gestures. Sony states that "In one particular embodiment, the gesture sensor is the Kinect device or a similar Kinect-like device," which is why Sony listed the Tobii products that we linked to above. In February we noted in one of our reports that Microsoft was in talks with Sony to license Kinect.
In Sony's opening summary they stated that "Exemplary gesture input can be used to simulate a mental or magical force that can pull, push, position or otherwise move or control the selected object." Well, this is the section covering that aspect of the invention. Sony states that "in yet another embodiment, the secondary input may be brainwaves and/or user emotions. In this example, the secondary sensor may be a sensor (or plurality of sensors) that measures and produces graphs of brainwaves, such as electroencephalogram (EEG). For example, several pairs of electrodes or other sensors may be provided on the user's head using a headset, such as, for example, the Emotiv EPOC headset." We covered the Emotiv EPOC headset back in our January 2011 report titled "Next Generation Interfaces Give New Meaning to Mind Control."
Tan Le is the co-founder and president of Emotiv Systems, a firm that's working on a new form of remote control that uses brainwaves to control digital devices and digital media. It's long been a dream to bypass the mechanical (mouse, keyboard and clicker) and have our digital devices respond directly to what we think. Emotiv's recently released EPOC headset uses 16 sensors to listen to activity across the entire brain. Software "learns" what each user's brain activity looks like when one, for instance, imagines a left turn or a jump. To really get a feel for this technology, you should take a few moments and check out the video below.
Sony states that the headset may also be used to detect facial expressions. The brainwaves and/or facial expressions data collected may be correlated into object actions such as lifting and dropping an object, moving an object, rotating an object and the like, into emotions such as excitement, tension, boredom, immersion, meditation and frustration, and into character actions, such as winking, laughing, crossing eyes, appearing shocked, smiling, getting angry, smirking, grimacing and the like.
For example, a user may gaze at an object that the user wants to move, and the user may use his brainwaves to move the object – just as was shown in the video. In another example, a user may gaze at a character, and control the user's facial expressions, emotions and/or actions using the headset sensor system.
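As a rough illustration of how a detected mental command might be turned into an action on the object the user is gazing at, consider the sketch below. It assumes the headset software hands the game an action label and a confidence score; that interface, the apply_mental_command function, the dictionary layout and the 0.6 threshold are all hypothetical and are not taken from the Emotiv SDK or from Sony's filing.

```python
from typing import Dict


def apply_mental_command(obj: Dict[str, float], action: str,
                         confidence: float, threshold: float = 0.6) -> bool:
    """Apply a detected mental command to the gazed-at object.

    Returns True when the object was changed. Detections below the
    confidence threshold are ignored so that stray signals do not move
    things around the screen.
    """
    if confidence < threshold:
        return False
    if action == "lift":
        obj["height"] += 1.0
    elif action == "drop":
        obj["height"] = 0.0
    elif action == "rotate":
        obj["angle"] = (obj["angle"] + 15.0) % 360.0
    else:
        # Unknown label: leave the object untouched.
        return False
    return True


# The object the user is currently looking at, as chosen by the gaze tracker.
boulder = {"height": 0.0, "angle": 0.0}

apply_mental_command(boulder, "lift", confidence=0.8)    # applied
apply_mental_command(boulder, "rotate", confidence=0.3)  # ignored, too weak
```

The gaze decides which object is affected and the brainwave input decides what happens to it, which mirrors the split between primary and secondary input that Sony describes.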
It will be appreciated that gaze tracking may be used with various combinations of gesture input, voice input, brainwave input and emotion input. For example, gaze tracking may be used with each of gesture input, voice input, brainwave input and emotion input. In another example, gaze tracking may be used with voice input, brainwave input and emotion input. In another example, gaze tracking may be used with voice input and brainwave input.
For example, received voice data may be analyzed to determine a user command (e.g., "scroll down", "scroll up", "zoom in", "zoom out", "cast spell", etc.), and then modify the user interface based on the command (e.g., by scrolling down, scrolling up, zooming in, zooming out, casting the spell, etc.).
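The voice path described above, where a recognized phrase becomes a change to the user interface, can be sketched as a small lookup table of handlers. We assume a speech recognizer has already turned the audio into text; the apply_voice_command function, the COMMANDS table and the state dictionary are our own illustrative names.

```python
from typing import Callable, Dict

UIState = Dict[str, float]


def _scroll_down(state: UIState) -> None:
    state["scroll"] += 1


def _scroll_up(state: UIState) -> None:
    state["scroll"] -= 1


def _zoom_in(state: UIState) -> None:
    state["zoom"] *= 2.0


def _zoom_out(state: UIState) -> None:
    state["zoom"] /= 2.0


# Hypothetical table mapping recognized phrases to interface changes.
COMMANDS: Dict[str, Callable[[UIState], None]] = {
    "scroll down": _scroll_down,
    "scroll up": _scroll_up,
    "zoom in": _zoom_in,
    "zoom out": _zoom_out,
}


def apply_voice_command(state: UIState, transcript: str) -> bool:
    """Run the handler for a recognized phrase; return False if unknown."""
    handler = COMMANDS.get(transcript.strip().lower())
    if handler is None:
        return False
    handler(state)
    return True


state = {"scroll": 0, "zoom": 1.0}
apply_voice_command(state, "Zoom In")      # zoom doubles
apply_voice_command(state, "scroll down")  # scroll moves down one step
```

In the combined system Sony describes, a command such as "cast spell" would also need the gaze target as an argument, so the handlers would take the selected object as well as the interface state.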
At the End of the Day
At the end of the day, the race is on to take gaming to the next level of experiences. In September we presented Microsoft's vision for gaming in our report titled "Microsoft Envisions Where Gaming is going, and it's Wild!" Today we got to see what Sony is envisioning. It's about replacing and/or enhancing the physical controller with a simple gaze, a hand gesture, a voice command or even with brainwave controls.
As you were able to see in the videos that we provided, the concepts outlined today are actually producing decent results. Sony is determined to accelerate these developments and bring them to market as soon as they can. In some ways, Sony and Microsoft may in fact be collaborating so as to advance Kinect for the good of gaming. Collaboration to advance Kinect on an accelerated timetable would be beneficial for all gamers and developers. Whether this loosely knit coalition will actually succeed at accomplishing this goal is much too early to judge. Yet I think that it's a safe bet to say that gamers will be hoping that it will. For now, sit back and do a little dreaming.
Sony's patent application was originally filed in Q2 2011.
Note that technological revelations revealed in Intellectual Property filings are not to be interpreted as rumor. Furthermore, fictitious rumor site timetables should be dismissed.
NOTICE: The Patent Bolt blog presents a detailed summary of patent applications with associated graphics for journalistic news purposes as each such patent application is revealed by the U.S. Patent & Trade Office. Readers are cautioned that the full text of any patent application should be read in its entirety for full and accurate details.