As mentioned in the article on enterprise AR display technologies, there are many ways for users to control and interact with AR-assisted applications. Touch or tactile interfaces are now ubiquitous on smartphones and tablets. A user can tap on the screen at the place where an augmentation appears to see more details. Alternatively, a user can manipulate the digital object with pinching and dragging motions on the screen.
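Tap selection usually amounts to a hit test: the tap coordinates are checked against the screen-space bounds of each augmentation. A minimal sketch of that idea, assuming each augmentation has already been projected to a 2D bounding box (all names here are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Augmentation:
    name: str
    x: float       # left edge of the projected bounding box, in pixels
    y: float       # top edge, in pixels
    width: float
    height: float

def hit_test(augmentations, tap_x, tap_y):
    """Return the augmentation whose projected bounds contain the tap, if any."""
    for aug in augmentations:
        if aug.x <= tap_x <= aug.x + aug.width and aug.y <= tap_y <= aug.y + aug.height:
            return aug
    return None

# Example: a tap at (210, 130) falls inside the valve overlay's bounds.
scene = [Augmentation("valve_overlay", 180, 90, 120, 80)]
selected = hit_test(scene, 210, 130)
if selected:
    print(f"Show details for {selected.name}")
```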
Speech Recognition
Some hands-free systems support speech recognition. The user's voice is captured by a microphone and converted to a signal that is analyzed by a speech recognition library. The recognized command is compared with the options available in the system's control and interaction vocabulary, and matched commands are then executed by the system. Speech recognition works well as an interaction technology in enterprise environments that have uniform acoustics and low interference from other sources of noise, and where users' commands are overheard neither by other users nor by their systems.
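The comparison step can be as simple as looking up the recognized transcript in the command vocabulary and dispatching a handler. A minimal sketch, assuming the speech recognition library has already produced a text transcript (the recognizer itself is out of scope here, and the command phrases are hypothetical):

```python
# Command vocabulary: recognized phrase -> handler to execute.
COMMANDS = {
    "next step": lambda: print("Advancing to the next instruction"),
    "previous step": lambda: print("Returning to the previous instruction"),
    "show manual": lambda: print("Opening the reference manual"),
}

def execute_command(transcript: str) -> bool:
    """Compare the recognized transcript with the vocabulary; execute if matched."""
    handler = COMMANDS.get(transcript.strip().lower())
    if handler is None:
        return False  # Not in the interaction vocabulary; ignore or prompt the user.
    handler()
    return True

execute_command("Next step")    # in the vocabulary -> executed
execute_command("hello there")  # not in the vocabulary -> ignored
```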
Gaze Tracking
Gaze tracking is another type of interaction technology for hands-free selection and interaction with an AR system. Users focus on a target and, through a series of menus selected via eye focus, can choose commands or interact with augmentations. Gaze tracking itself can be a way for a user to interact with digital content that is not registered on or with the real world (e.g., instructions in a manual may be read without being overlaid on the object). Gaze tracking is highly suitable as an enterprise AR interaction technology when the interactions are simple and menu driven.
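Menu selection by gaze is commonly implemented with a dwell timer: a command fires only after the gaze has rested on an item for a set duration, which prevents accidental selections as the eyes sweep the menu. A minimal sketch, with hypothetical gaze samples (a timestamp plus whichever menu item is currently under the gaze point):

```python
DWELL_SECONDS = 1.0  # gaze must rest on an item this long to select it

def select_by_dwell(gaze_samples):
    """gaze_samples: iterable of (timestamp_seconds, item_or_None) pairs."""
    current_item, dwell_start = None, None
    for t, item in gaze_samples:
        if item != current_item:
            current_item, dwell_start = item, t  # gaze moved; restart the timer
        elif item is not None and t - dwell_start >= DWELL_SECONDS:
            return item  # held long enough: treat as a selection
    return None

samples = [(0.0, "open_manual"), (0.4, "open_manual"),
           (0.8, "open_manual"), (1.1, "open_manual")]
print(select_by_dwell(samples))  # -> "open_manual"
```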
Gesture Recognition
Gestures include all movements the user can perform with their hands, head and other parts of their body. Movements of the head or other body parts are easily detected by appropriately mounted sensors (e.g., an accelerometer and gyroscope). The choice of hardware, the mounts and their proper calibration determine how accurately the signals are captured. These signals are translated into commands and, if recognized, the commands are executed by the AR software.
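As an illustration, a head nod can be detected from a head-mounted gyroscope by looking for a pitch rotation that exceeds a threshold in one direction and then reverses. A minimal sketch; the threshold value is hypothetical and would need calibration against the actual hardware:

```python
NOD_THRESHOLD = 1.5  # rad/s of pitch angular velocity; hypothetical, needs calibration

def detect_nod(pitch_rates):
    """Detect a nod: a strong downward pitch swing followed by a strong upward one.

    pitch_rates: sequence of gyroscope pitch angular velocities (rad/s),
    positive meaning the head is tilting down.
    """
    went_down = False
    for rate in pitch_rates:
        if rate > NOD_THRESHOLD:
            went_down = True
        elif went_down and rate < -NOD_THRESHOLD:
            return True  # down-then-up swing: interpret as a nod
    return False

stream = [0.1, 0.3, 2.0, 1.8, 0.2, -1.9, -0.4]
if detect_nod(stream):
    print("Nod recognized: executing the 'confirm' command")
```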
Sensors for gesture recognition can be integrated into a user's garments or safety wear, but care must be taken with power management, with the radio or other communication interfaces to the processing logic, and with the weight of these additional systems.
The most commonly used gesture recognition in Augmented Reality systems involves hand movements. Depth-sensing and hand-tracking technologies send the captured signals to a gesture-recognition library for recognition and translation into commands. The candidate commands are compared with the set defined for the AR experience and, if matched, executed, in the same manner as for speech control and interaction.
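A common hand-gesture primitive is the pinch, recognized when the tracked thumb tip and index fingertip come closer than a threshold. A minimal sketch, assuming the hand-tracking library delivers 3D fingertip positions in metres (the landmark names and the threshold are hypothetical):

```python
import math

PINCH_THRESHOLD_M = 0.02  # 2 cm between fingertips; hypothetical, tune per device

def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def recognize_gesture(landmarks):
    """landmarks: dict of landmark name -> (x, y, z) in metres, from the tracker."""
    if distance(landmarks["thumb_tip"], landmarks["index_tip"]) < PINCH_THRESHOLD_M:
        return "pinch"  # mapped by the AR software to, e.g., a "select" command
    return None

frame = {"thumb_tip": (0.10, 0.20, 0.30), "index_tip": (0.11, 0.20, 0.30)}
if recognize_gesture(frame) == "pinch":
    print("Pinch recognized: executing the 'select' command")
```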
Gesture recognition is highly intuitive for users but, as with speech and gaze recognition, requires some training. Also, in use cases where the user's hands are needed for performing an action (e.g., using a tool), gesture recognition involving the hands may not be suitable, or should be considered in combination with other interaction technologies.
In Virtual Reality, users commonly have the option to use gestures for interacting with virtual objects. Gloves worn by the user carry markers that are more easily and reliably detected than bare hands. Some gloves also include pressure-sensing technologies: the user sends commands to the system, and receives feedback from it, through changes in pressure.
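One way such a glove could be read is by thresholding per-finger pressure values to derive a command. A minimal sketch with hypothetical sensor readings (normalized to the range 0 to 1):

```python
GRIP_THRESHOLD = 0.6  # normalized pressure; hypothetical, depends on the glove

def detect_grip(finger_pressures):
    """finger_pressures: dict of finger name -> normalized pressure reading (0-1).

    Treat a simultaneous squeeze on thumb and index finger as a 'grab' command.
    """
    return (finger_pressures.get("thumb", 0.0) > GRIP_THRESHOLD and
            finger_pressures.get("index", 0.0) > GRIP_THRESHOLD)

reading = {"thumb": 0.8, "index": 0.7, "middle": 0.1}
if detect_grip(reading):
    print("Grab recognized: attaching the virtual object to the hand")
```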