Last Thursday the US Patent & Trademark Office published a patent application from Google regarding Project Glass. Project Glass is Google's marketing brand for what is known in the industry as a "Head Mounted Display" or HMD. Google's design patent for Project Glass surfaced in May of this year, which put the project into higher gear. This week's patent application is more focused on the HMD's functionality, multimodal features, apps and noteworthy patent claims, which will go a long way in protecting Google's Project Glass device in the future.
Key to Google's Patent: The Multimode Input Field
Google's Project Glass patent focuses on a multimode input field that may be incorporated as a graphical element in the display of a wearable computer which is widely known as a head-mountable display or HMD. Google's patent FIG. 1 which is illustrated below is a simplified illustration of an HMD that is displaying a multimode input field in patent point #14.
In an exemplary embodiment, Google states that the multimode input field is configured to accept and display a number of different modalities of content. The displayed content may be generated based on input data that is received from a number of different input sources. The input data on which the displayed content is based may include a number of different modalities of data. As such, the multimode input field may be configured to receive typed text, text corresponding to speech in an audio signal, images from various sources, and/or video from various sources, and to provide various functions such as text-based searches, image-based searches, face detection and/or face recognition, contact look-up, and/or application-based functions such as composing an email or editing a document. In a further aspect, the multimode input field may be moveable, resizable, and otherwise adjustable to accommodate various modalities of content from the various different input sources.
Project Glass: Drag and Drop Actions
One of the core features that Project Glass is likely to debut with is the classic drag and drop feature modified for an HMD. According to Google, input content for the multimode input field may be specified via a drag and drop instruction, which corresponds to a drag and drop action by the user. Google's patent FIG. 7 shown below is a simplified illustration of an HMD during an exemplary drag and drop action.
In particular, patent FIG. 7 illustrates a display (701) of an HMD, which is displaying a multimode input field as well as an application (704). As shown, the user may perform a drag and drop action by selecting content (703) from the application and then dragging and dropping the content in the multimode input field.
Google further states that a drag and drop action may be performed on a touchpad which controls the movement of a selection icon within the display of the HMD. You can see the touchpad Google describes in patent FIG. 2 above. A drag and drop instruction may correspond to various gestures or combinations of gestures on a touchpad. For example, the selection icon may be moved over content. A single-tap, double-tap, or tap and hold gesture could then be used to select the content before a swiping gesture is used to drag and drop the content in the multimode input field.
Google also notes that it's very possible that a drag and drop instruction may correspond to other kinds of gestures, and/or may correspond to actions on other input sources (e.g., keystrokes on a keyboard, a voice command, and/or hand gestures detected in a video feed, among others).
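The select-then-swipe sequence described above can be pictured as a tiny state tracker. What follows is only a rough sketch and not Google's implementation: the class name `DragDropTracker`, the gesture strings and the `photo.jpg` content are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Gestures the filing lists as possible ways to select content.
# The string names themselves are assumptions made for this sketch.
SELECT_GESTURES = {"single_tap", "double_tap", "tap_and_hold"}


@dataclass
class DragDropTracker:
    """Hypothetical tracker that turns touchpad gestures into a drag and drop."""
    selected: Optional[str] = None                    # content currently picked up
    input_field: list = field(default_factory=list)   # content dropped so far

    def on_gesture(self, gesture: str, content_under_icon: Optional[str] = None) -> None:
        if gesture in SELECT_GESTURES and content_under_icon is not None:
            # The selection icon is over some content: pick it up.
            self.selected = content_under_icon
        elif gesture == "swipe" and self.selected is not None:
            # A swipe after a selection drops the content into the field.
            self.input_field.append(self.selected)
            self.selected = None
        # Any other gesture is ignored in this sketch.


tracker = DragDropTracker()
tracker.on_gesture("swipe")                     # nothing selected yet, so ignored
tracker.on_gesture("double_tap", "photo.jpg")   # select content in the application
tracker.on_gesture("swipe")                     # drag and drop it into the field
print(tracker.input_field)  # ['photo.jpg']
```

The same tracker could be fed by a keyboard, voice command or camera-detected hand gesture, since it only sees gesture names and not the input source that produced them.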
Displaying Content in the Multimode Input Field
Google notes that patent FIG. 8A presented above is an illustration of an HMD displaying a multimode input field that is in full screen mode. Google's patent FIG. 8B is an illustration of an HMD displaying an image within the multimode input field that is in a reduced screen mode. Google further points out that certain gestures on a touchpad, or a command received via another input source, may allow a user to switch between the modes that were noted above.
Moreover, in some scenarios, text may be overlaid on video or on an image that is displayed in the multimode input field. In such a scenario, Google states that the text may be obtained from a keyboard or via conversion of speech in audio data from a microphone or another audio source. It should be noted that Google never describes the keyboard in any noteworthy manner and we're forced to assume that the user will be able to wirelessly connect their Project Glass HMD to an Android smartphone in order to use its keyboard for input into the glasses.
In Viewfinder Mode an Image is Moveable and Resizable via Gestures or Touchpad
Another noteworthy feature to be associated with the coming Project Glass device is its unique Viewfinder Mode. In Google's patent FIG. 8C shown below we're able to see an additional illustration of an HMD displaying a multimode input field that encloses a portion of a displayed image. In this new viewfinder mode, the multimode input field may be movable and/or resizable in response to certain adjustment instructions, in order to allow a user to identify a specific portion of an image for an action to be taken on.
As you can clearly see below, Google illustrates multi-touch gestures used on a touchpad in the form of "pinch" or "reverse-pinch" gestures that may be mapped to adjustment instructions for resizing the multimode input field. Further, a single-tap, a double-tap, a tap and hold gesture, or another type of gesture may then be used to select the multimode input field, before a swiping gesture is used to move the multimode input field to a new location in the display.
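To make the resize and move instructions concrete, here is a minimal sketch of how a pinch might map to a scale factor. The filing does not specify any of this: the `InputField` class, the pixel coordinates, the choice to scale about the center and the ratio-of-finger-distances rule are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class InputField:
    """Hypothetical rectangle for the multimode input field, in display pixels."""
    x: float
    y: float
    width: float
    height: float

    def resize(self, scale: float) -> None:
        # Scale about the center so the field stays over the same object.
        cx, cy = self.x + self.width / 2, self.y + self.height / 2
        self.width *= scale
        self.height *= scale
        self.x, self.y = cx - self.width / 2, cy - self.height / 2

    def move(self, dx: float, dy: float) -> None:
        # A swipe shifts the field by the swipe's displacement.
        self.x += dx
        self.y += dy


def pinch_scale(start_distance: float, end_distance: float) -> float:
    """Assumed mapping: ratio of finger separation after and before the gesture.

    A pinch gives a value below 1 (shrink); a reverse-pinch gives a value
    above 1 (enlarge).
    """
    return end_distance / start_distance


box = InputField(x=100, y=100, width=200, height=100)
box.resize(pinch_scale(80, 40))   # pinch: fingers move together, field halves
box.move(30, -10)                 # swipe: shift right and up
print(box)  # InputField(x=180.0, y=115.0, width=100.0, height=50.0)
```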
Image Searches
This being Google, we can expect another feature involving image search to debut with Project Glass. Google states that once a user has identified an object by enclosing it with the multimode input field, such as the building shown in the input field above, various actions may be taken in association with the object. For example, Project Glass may initiate an image-based search in response to an instruction to do so. The wearable computer may be configured to respond to a double-tap gesture within the multimode input field by: (i) applying an object recognition technique to the enclosed portion of the image, in order to identify the enclosed object, and/or (ii) initiating an image-based search on the enclosed object.
Face Recognition Searches
In a further aspect, Google states that the viewfinder mode may additionally or alternatively allow a user to identify the face of a person in an image. In such an embodiment, the wearable computer may be configured to respond to a double-tap gesture within the multimode input field, or another predefined instruction, by initiating a face-detection function on the enclosed portion of the video or image.
If that feature ever went "live" in identifying faces, Google would have a killer app on its hands for the enterprise. It's a feature that would be great for salesmen or other professionals who have to remember hundreds of contacts' faces and names. Bumping into a client at an event and not remembering their name is embarrassing. If Project Glass ever went live with its face recognition, the scenario noted above would simply be a thing of the past.
Exemplary Functionality for Video Content
In another aspect of Project Glass, Google states that if a video is being displayed, the wearable computer may allow a snapshot of the video to be taken. In an exemplary embodiment, the wearable computer may allow a snapshot to be taken with an image capture instruction received from one of the various input sources. The snapshot may be of the entire video frame, or only of the portion of the video frame enclosed by the multimode input field.
Application-Based Functionality
In another aspect of Project Glass, Google states that their wearable computer may be further configured to run various applications in the multimode input field. For example, the multimode input field may serve as an application window for various types of applications such as word processing applications, web-browsing applications, and/or e-mail applications, among others.
When an application is open, a wearable computer may be configured to use incoming data from the selected input source as content for the application. For example, when a word-processing application is open in the multimode input field, the multimode input field may display a document. In such an embodiment, a wearable computer may be configured to use text received from a keyboard, or possibly text produced by speech-to-text conversion of speech in audio data, as input for the document.
Further, when speech by a wearer is detected it may be analyzed for information that may imply certain content might be desirable. For instance, when a wearer says a person's name, an exemplary system may search various sources for the named person's contact information or other information related to the named person. The system may do so when, for example, the person's name is stated in the midst of a conversation, and the user is not explicitly requesting the person's contact information.
If contact information for the named person is located, the contact information may be displayed in the multimode input field. Furthermore, the contact information may be displayed in various forms. For example, the multimode input field may display phone numbers, an email, an address, a photograph of the contact, or possibly even the contact's profile on a social network, among other types of contact information.
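As a rough illustration of the name-triggered lookup described above, the sketch below scans a transcript for known contact names. It assumes speech has already been converted to text, and it uses naive substring matching; the `CONTACTS` table, its entries and the function name are hypothetical and not taken from the filing.

```python
# Hypothetical address book; the names and details are made up for this sketch.
CONTACTS = {
    "jane doe": {"phone": "555-0100", "email": "jane@example.com"},
    "sam lee": {"phone": "555-0101", "email": "sam@example.com"},
}


def find_mentioned_contacts(transcript: str, contacts: dict) -> dict:
    """Return contact info for every known name that appears in the transcript.

    Matching is a plain case-insensitive substring test, which is only an
    assumption for illustration; a real system would need something sturdier.
    """
    text = " ".join(transcript.lower().split())  # normalize case and spacing
    return {name: info for name, info in contacts.items() if name in text}


hits = find_mentioned_contacts("I ran into Jane Doe at the conference", CONTACTS)
print(hits)  # {'jane doe': {'phone': '555-0100', 'email': 'jane@example.com'}}
```

Note that the wearer never asks for the contact here; the lookup fires simply because a known name shows up in conversation, which is the behavior the filing describes.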
The Project Glass Patent Briefly Highlights Possible Built-in Apps or Features
At one point in its patent application, Google presents a running list of features or apps that it's considering for Project Glass over time. Here are just a few of the noteworthy ones:
A Built-in Airline App: In one example Project Glass' input selection module may detect a data pattern in incoming audio data that is characteristic of announcements during a commercial airline flight (e.g., flight-attendant safety briefings). The input selection module may interpret this as an indication that the wearer is on a commercial airline flight, and responsively display a map with flight progress and/or flight status information.
A Built-in Music App: As yet another example, the input selection module may detect a data pattern in incoming audio data that is characteristic of music. The input selection module may interpret this as an indication that the wearer is listening to a song, and may responsively send the incoming audio data to a song-recognition application, which may output information such as the name of the song, the performing artist or artists, the name of an album that includes the song, an image associated with the song or artist (e.g., an album cover), and information for purchasing and/or downloading the identified song. This information may then be displayed in the multimode input field.
Additionally or alternatively, the input selection module may search a library of song files associated with the system and/or the wearer, and determine whether the library includes the song. If the song is found in the library, then various actions may be taken. For instance, a prompt to play the song may be displayed in the multimode input field. Various alternative actions are possible if the song is not found in the library. For example, information for purchasing and/or downloading the song may be displayed in the multimode input field.
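The airline and music examples share one pattern: classify the incoming audio, then pick what to display. The sketch below assumes some upstream classifier has already labeled the audio and, for music, identified the song; the label strings, the function name `select_display` and the returned messages are all hypothetical.

```python
from typing import Optional


def select_display(audio_label: str,
                   song_title: Optional[str] = None,
                   library: frozenset = frozenset()) -> str:
    """Hypothetical stand-in for the input selection module's decision step."""
    if audio_label == "flight_announcement":
        # Audio sounds like an in-flight announcement: assume the wearer is flying.
        return "show flight progress map"
    if audio_label == "music" and song_title is not None:
        if song_title in library:
            # The song is already in the wearer's library: offer to play it.
            return f"prompt to play '{song_title}'"
        # Otherwise fall back to purchase and download information.
        return f"show purchase info for '{song_title}'"
    return "no change"


my_library = frozenset({"Song A", "Song B"})
print(select_display("flight_announcement"))            # show flight progress map
print(select_display("music", "Song A", my_library))    # prompt to play 'Song A'
print(select_display("music", "Song C", my_library))    # show purchase info for 'Song C'
```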
A Built-in Movie App: Google describes a scenario where the user chooses a movie to play and the goggles automatically turn the glass lenses into "Dark" mode so that you only see the movie and nothing in the periphery.
Other possible apps and/or features noted in Google's filing include interactive gaming, augmented reality and more. While some of the apps and/or features noted in the filing were interesting to note, it also had its share of less than practical ones as well.
One of the noted apps involved drivers. In that scenario, the Project Glass user would simply look at their garage door while entering the driveway and voila, the door would automatically open. Yet in most short driveways, it would take longer to open the garage door that way than to use the practical garage clicker in your car, which opens the garage door from halfway down the street so that it's already open by the time you turn in. I'm sure that Google's engineers will figure out how to get that door open quicker over time.
Another lame app involved automatically launching Google Maps when the glasses recognized that the user was in their car. The app would launch when the glasses recognized (via a built-in microphone) that the user's car motor was running. Yes, please, get users used to staring at a map while they're driving. That's all we need on the road right now, more distracted drivers. I'm sure that Google's approach to this app will be modified over time so as to only present maps to a driver when the vehicle is at a full stop.
Some of Google's key Patent Claims are quite Extensive
Reading about Google's many interesting applications and features goes a long way in helping its fan base understand what they can come to expect from Project Glass in the years ahead. Legally speaking, though, it's the patent claims that indicate which ideas Google is actually trying to protect. Google's latest Project Glass patent really nails down some very important aspects of its future head mounted system.
Yet one must never forget Apple on this front. Yes, the Verge posted a report in July stating that one of Apple's video glass patents contained very weak patent claims. But that was in July, and since then a new patent application from Apple, published by the US Patent and Trademark Office in December, revealed video glasses with a segment on using the device for communications in connection with an iPhone.
Apple's patent claims are stronger in patent application 20120310391. In many ways Apple's patent is more practical in the short term in that Apple's glasses would work in conjunction with an iPhone. Though in the end, I'm sure that many patents from both Apple and Google on this front are still to be published, and each time one of them surfaces, we'll be certain to pay closer attention to those patent claims.
In the end, the next generation of smart devices in the form of glasses may very well spark a Glass War, as some have predicted. In a recent report we pointed to Canadian Steve Mann, the forefather of wearable computing, who recently stated that "Yes. There will be Apple Glass, and Google Glass, and RIM Glass. These companies are all working on glass. I think everyone is going to be making glass. I think we're also going to have a glass war instead of a smartphone war."
For consumers, product wars spell nothing but good news. They make these companies fight for our business by providing us with cooler devices, apps and associated services. There's a lot at stake for these leading tech companies. For Google, it's an opportunity to get the jump on Apple and provide its Android partners and fan base with something to really brag about. For Apple, it's going to be a way to extend its iDevice universe and prevent Google from running away with this next-wave device category.
In the end: may the best device win.
Google's patent application was originally filed under serial number 529957 in June 2012. A 2011 patent acts as a priority patent which is incorporated into the current patent.
NOTICE: The Patent Bolt blog presents a detailed summary of patent applications with associated graphics for journalistic news purposes as each such patent application is revealed by the U.S. Patent & Trademark Office. Readers are cautioned that the full text of any patent application should be read in its entirety for full and accurate details. Revelations found in patent applications shouldn't be interpreted as rumor or fast-tracked according to rumor timetables. About Comments: Patent Bolt reserves the right to post, dismiss or edit comments.