Microsoft has filed a patent application pertaining to multi-modal user interfaces for electronic computing devices, and in particular to the use of gestures to trigger different input modalities associated with functions implemented on a smartphone. Specifically, Microsoft reveals "Tilt to Talk" and "Point to Scan" smartphone gestures that initiate, respectively, a voice search and a search based on photos taken in real time.
Basic Introduction to Microsoft's Newly Proposed Gesturing Commands
Microsoft states that, due to the size and mobility of today's smartphones, it would be both natural and feasible to use hand, wrist, or even arm gestures to communicate commands, as if the device were an extension of the user's hand. Some user gestures are detectable by electro-mechanical motion sensors within the circuitry of the smartphone. The sensors can sense a user gesture by detecting a physical change associated with the device, such as motion of the device or a change in orientation. In response, voice-based or image-based input modes can be triggered based on the gesture. The methods and devices described in the invention provide a way to select from among different input modes to a device feature, such as a search, without reliance on manual selection.
Today's patent report focuses on two of Microsoft's newly proposed gesturing commands.
Microsoft Introduces the "Tilt to Talk" Gesture
The first new gesture is described as "Tilt to Talk." Microsoft states that a smartphone, as shown in patent FIG. 5, can be configured with a microphone at the proximal end (bottom) of the phone and a camera lens at the distal end (top). With such a configuration, detecting elevation of the bottom end of the phone indicates the user's intention to initiate a search using voice input to the search engine ("Tilt to Talk").
In Microsoft's patent FIG. 6 we see a smartphone in a pair of sequential snapshot frames, (692) and (694). Upon sensing that the bottom end of the phone is elevated above the distal or top end, the smartphone recognizes that it is in an "inverse tilt" orientation, and the gestural interface triggers initiation of a search application wherein the input mode is voice input.
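The patent describes such gestures as detectable by the phone's motion sensors. As a rough, hypothetical illustration (not code from the patent), the phone's pitch can be estimated from a 3-axis accelerometer reading; a sufficiently negative pitch would correspond to the "inverse tilt" orientation. The axis convention below (y toward the top of the phone, z out of the screen) is an assumption.

```python
import math

# Hypothetical sketch: estimate the phone's pitch from a 3-axis accelerometer
# reading taken while the device is held still. Assumed axis convention
# (similar to Android's): y points toward the top (camera) end of the phone,
# z points out of the screen, so a phone lying flat and face-up reads
# roughly (0, 0, +9.81).

def pitch_from_accelerometer(ax: float, ay: float, az: float) -> float:
    """Return pitch in degrees: ~0 when flat, positive when the top end
    is raised, negative when the bottom end is raised ("inverse tilt")."""
    return math.degrees(math.atan2(ay, math.hypot(ax, az)))

# A negative pitch beyond some threshold would mark the "Tilt to Talk" gesture.
print(pitch_from_accelerometer(0.0, -9.81, 0.0))  # bottom end straight up: -90.0
```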
Microsoft Introduces the "Point to Scan" Gesture
The second new gesture is described as "Point to Scan." Microsoft states that detecting elevation of the top end of the phone indicates the user's intention to initiate a search using camera images as input to the search engine ("Point to Scan"). Once the search engine has received the input, it is activated to perform a search, and the results can be received and displayed on the screen of the smartphone. If a different type of phone motion is detected, the gestural interface can be programmed to execute a feature other than a search.
In Microsoft's patent FIG. 7 we see the smartphone in a series of three sequential snapshot frames, (792), (793), and (794), that demonstrate the new "Point to Scan" gesture.
As the user's hand tilts backward and upward, from the user's point of view, the orientation of the smartphone changes from a substantially horizontal position to a substantially vertical position, exposing the camera lens located at the top of the smartphone. By pointing the smartphone, a user is able to aim the camera lens and scan a particular target scene. Upon sensing a change in orientation such that the distal or top end of the phone is elevated above the proximal or bottom end by a predetermined threshold angle (a motion consistent with pointing the camera lens at a target scene), the gestural interface interprets the motion as a pointing gesture.
The predetermined threshold angle can take on any desired value; typical values fall between 45 and 90 degrees. The gestural interface then responds to the pointing gesture by triggering initiation of a camera-based search application wherein the input mode is a camera image, or a "scan" of the scene in the direction the mobile device (700) is currently aimed. Alternatively, the gestural interface can respond to the pointing gesture by triggering a camera application or another camera-related feature.
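The mode-selection logic described above can be sketched as follows. This is illustrative only: the specific threshold values, mode names, and pitch convention (positive pitch = top end raised) are assumptions, not taken from the patent.

```python
from typing import Optional

SCAN_THRESHOLD_DEG = 45.0   # patent suggests a value between 45 and 90 degrees
TALK_THRESHOLD_DEG = 20.0   # hypothetical threshold for the "inverse tilt"

def select_input_mode(pitch_deg: float) -> Optional[str]:
    """Map a pitch angle (degrees; positive = top end raised) to an input mode."""
    if pitch_deg >= SCAN_THRESHOLD_DEG:
        return "camera_image"   # "Point to Scan": camera-based search
    if pitch_deg <= -TALK_THRESHOLD_DEG:
        return "voice"          # "Tilt to Talk": voice-based search
    return None                 # no gesture recognized; fall back to manual input

print(select_input_mode(60.0))   # camera_image
print(select_input_mode(-30.0))  # voice
```

In this sketch, any orientation between the two thresholds triggers nothing, which matches the patent's idea that only a deliberate change past a predetermined threshold angle counts as a gesture.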
When the sensors detect a backward and upward motion of the user's hand, a camera mode is triggered. In response, a search function is activated, for which the camera lens provides input data. The words "traffic," "movies," and "restaurants" then appear on the display, and the background scene is updated from the previous scene shown in screenshot (799a) to the current scene shown in screenshot (799b). Using GPS mapping data, the identification function can deduce that the current location is Manhattan, and using a combination of GPS and image recognition of buildings, the location can be narrowed down to Times Square.
Microsoft's patent application was originally filed in Q3 2011 and recently revealed by the US Patent and Trademark Office.
A Note for Tech Sites covering our Report: We ask tech sites covering our report to kindly limit the use of our graphics to one image. Thanking you in advance for your cooperation.
Patent Bolt presents a detailed summary of patent applications with associated graphics for journalistic news purposes as each such patent application is revealed by the U.S. Patent & Trademark Office. Readers are cautioned that the full text of any patent application should be read in its entirety for full and accurate details. Revelations found in patent applications shouldn't be interpreted as rumor or fast-tracked according to rumor timetables. About Comments: Patent Bolt reserves the right to post, dismiss or edit comments.