Detection and Recognition

As discussed in the Authoring Technologies section, planning the extraction of features of interest from real-world targets is part of AR project creation. Once the project is implemented and available on the user's platform, the same features need to be detected in the real world. To detect and then recognize a target, these technologies are most often coded into the final application using libraries provided in an SDK.

An Augmented Reality developer building an application from "scratch" (without a commercial SDK) may need to use separate libraries. Even when using a commercial SDK, developers may want to integrate more advanced libraries, or to substitute libraries better suited to the targets, the conditions in the environment, the user's platform or other constraints defined as part of the project.

For example, if the experience's targets are curved objects, then the optimal detection and recognition algorithm will differ from the one used in a project whose targets are 2D images or faces.
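
To make the library-level work concrete, below is a minimal sketch of planar (2D image) target detection using OpenCV's Python bindings and ORB features; the target image file and the match threshold are hypothetical, and commercial SDKs wrap comparable steps behind their own APIs. The homography step is precisely what makes this approach suited to flat targets and unsuited to curved objects.

    import cv2
    import numpy as np

    orb = cv2.ORB_create(nfeatures=1000)

    # Features of the known target are extracted once, at authoring time.
    target = cv2.imread("target.png", cv2.IMREAD_GRAYSCALE)  # hypothetical asset
    kp_t, des_t = orb.detectAndCompute(target, None)

    def detect_target(frame_gray):
        """Return the target's corner points in the camera frame, or None."""
        kp_f, des_f = orb.detectAndCompute(frame_gray, None)
        if des_f is None:
            return None
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(des_t, des_f), key=lambda m: m.distance)
        if len(matches) < 10:      # hypothetical minimum for a stable estimate
            return None
        src = np.float32([kp_t[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([kp_f[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        # RANSAC rejects outlier matches; H maps the flat target into the frame.
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        if H is None:
            return None
        h, w = target.shape
        corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
        return cv2.perspectiveTransform(corners, H)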

In addition, some devices with detection and recognition technologies well suited to AR use cases are becoming available (e.g., Google's Project Tango). The trend of integrating environmental detection into mobile platforms will only increase. Performance will improve as providers (e.g., Samsung, Apple, Google) optimize their hardware and software to work with these sensors. Since providers of advanced detection and recognition technologies also want to maintain as much control over their added-value components as possible, they may not offer open APIs. Without documented and/or open APIs, developers may be forced to use the platform's technologies without being able to adapt or control them.

Depth Sensing

Important depth sensing technologies include those based on electromagnetic radiation detection, mechanical sensors and beacon-and-receiver systems. An enterprise AR developer seeking highly reliable and precise depth sensing should be familiar with the strengths and weaknesses of each. A tutorial on these technologies is outside the scope of this article, but an article in Embedded provides a good overview of their principles, strengths and weaknesses.

At this time, a great deal of development activity in AR platforms is focused on technologies based on infrared (IR) detection. For best results, the detected IR energy is combined with observations captured by a natural light camera. Metaio's Thermal Touch technology uses IR and natural light to register the heat signature left by a person's finger touching a surface, enabling user input in AR applications. Another example is the Leap Motion system, which tracks the position of hands, fingers and finger-like tools.

This is the same approach that was well proven for tracking body motion and gestures with the Microsoft Kinect. The Kinect's IR technology, developed by PrimeSense, has since been reduced in size and optimized (PrimeSense was acquired by Apple Inc. in 2013). Google is embedding technologies based on the same principles in its Advanced Technology and Projects (ATAP) devices, such as the Project Tango smartphone and tablet.
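
As a simple illustration of combining a registered depth map with a color image, the sketch below masks out everything beyond arm's reach, the kind of depth segmentation that underlies hand and gesture trackers. It assumes the depth frame is a 16-bit array in millimeters already aligned to the color image (Kinect-class SDKs provide such registration); the function name and thresholds are hypothetical.

    import numpy as np

    def segment_near_object(rgb, depth_mm, near=200, far=900):
        """Keep RGB pixels whose depth falls in [near, far] mm, e.g. a hand."""
        mask = (depth_mm > near) & (depth_mm < far)   # 0 readings = no IR return
        out = np.zeros_like(rgb)
        out[mask] = rgb[mask]                         # background suppressed
        return out, mask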

There are smaller companies in this space as well. For example, the Structure Sensor, sold by Occipital, is an iPad accessory that provides IR-based depth sensing.

Even a single camera can be used for depth sensing. GestureTek offers software that can be used with any webcam on a PC to detect the position of the user’s head, torso and hands.
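
GestureTek's software is proprietary, but the general single-camera approach can be illustrated with OpenCV's bundled Haar cascade, which locates the user's face in an ordinary webcam frame. This is a sketch of the principle, not GestureTek's method.

    import cv2

    # Haar cascade shipped with the opencv-python package.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    cap = cv2.VideoCapture(0)            # default webcam
    ok, frame = cap.read()
    if ok:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for (x, y, w, h) in faces:
            print("head at", x, y, "size", w, h)
    cap.release()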

Stereo cameras (stereoscopic vision) using visible light are another feature of Kinect and could be attractive to enterprise AR architects working in controlled environments.
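
The principle of stereoscopic depth recovery can be sketched with OpenCV's block matcher, assuming a calibrated and rectified image pair (the file names here are placeholders):

    import cv2

    left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical pair
    right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = stereo.compute(left, right)   # fixed-point, scaled by 16 in OpenCV
    # Larger disparity means a closer object; depth is inversely proportional
    # to disparity: Z = f * B / d, with focal length f and camera baseline B.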

In addition, there are camera array (multi-camera) technologies for depth sensing. Camera array providers include Heptagon and Pelican Imaging.

Object and Target Recognition

An object in an enterprise setting can be recognized reliably and quickly by sensing its depth, distance from the user and size. However, the results of sensing must be compared with a database of known objects before the object's true identity is available to the AR application and can serve as the basis for an experience. An enterprise AR platform may need to include recognition algorithms that are highly "tuned" to an enterprise's needs. For example, recognition and tracking of automotive components needs to take into account reflections from lights in the environment. Recognition and tracking of a patient's gait needs algorithms that account for joint movements as well as the terrain over which the patient is moving. Facial recognition algorithms fall into yet another category entirely.
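
The comparison against a database of known objects can be sketched as descriptor matching, assuming each enterprise object's ORB descriptors were extracted offline with the same detector used at runtime; the ratio-test constant and the match-count threshold are hypothetical tuning parameters of the kind an enterprise would adjust.

    import cv2

    def recognize(query_descriptors, database):
        """database: dict mapping object name -> stored ORB descriptors."""
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
        best_name, best_score = None, 0
        for name, stored in database.items():
            # Lowe's ratio test filters ambiguous matches.
            pairs = matcher.knnMatch(query_descriptors, stored, k=2)
            good = [p[0] for p in pairs
                    if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
            if len(good) > best_score:
                best_name, best_score = name, len(good)
        return best_name if best_score >= 25 else None   # hypothetical threshold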

Geolocation

The user's context may be defined and recognized entirely on the basis of geospatially-referenced position. The user's position on the earth, or in a known environment, could also be used to reduce the search field for images and objects. The enterprise AR developer may elect to use one or more of the available positioning technologies.

The most common positioning technology used for AR experiences on mobile platforms is based on GPS and a compass, with supplemental information from a gyroscope and accelerometers. Such sensors are widely implemented in mobile phones and tablets as well as in wearable technologies. Assisted GPS and other GPS-based technologies with even greater precision are being built into professional devices, such as surveying tools, and could be useful for supplementing mass-market devices.
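
Reducing the search field by position can be as simple as keeping only the geospatially-referenced anchors within some radius of the GPS fix, as in this sketch (the anchor list format and radius are hypothetical):

    import math

    def haversine_m(lat1, lon1, lat2, lon2):
        """Great-circle distance between two points, in meters."""
        R = 6371000.0                     # mean earth radius
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp = math.radians(lat2 - lat1)
        dl = math.radians(lon2 - lon1)
        a = (math.sin(dp / 2) ** 2
             + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
        return 2 * R * math.asin(math.sqrt(a))

    def nearby_anchors(user_lat, user_lon, anchors, radius_m=50.0):
        """anchors: list of (name, lat, lon) geospatially-referenced targets."""
        return [a for a in anchors
                if haversine_m(user_lat, user_lon, a[1], a[2]) <= radius_m]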

Indoor Positioning

Indoor positioning technologies are rapidly advancing and are valuable for providing precise indoor AR experiences. These include tracking movement from a known "anchor point" in a space, and signal triangulation using WiFi or other beacon technologies that work on the same principles. Indoor positioning can also use a camera that detects stationary features (markers) in the environment.
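
The triangulation principle behind beacon-based positioning can be sketched as trilateration: ranges estimated from several beacons at known positions (for example, from signal strength via a propagation model) are combined by least squares. This is a minimal sketch, not any vendor's implementation.

    import numpy as np

    def trilaterate(beacons, ranges):
        """beacons: Nx2 array of known (x, y) positions; ranges: N distances.
        Requires at least three beacons for a 2D fix. Linearizes the circle
        equations against the first beacon and solves the resulting
        overdetermined system for the user's (x, y)."""
        b = np.asarray(beacons, dtype=float)
        r = np.asarray(ranges, dtype=float)
        A = 2 * (b[1:] - b[0])
        c = (r[0] ** 2 - r[1:] ** 2
             + np.sum(b[1:] ** 2, axis=1) - np.sum(b[0] ** 2))
        pos, *_ = np.linalg.lstsq(A, c, rcond=None)
        return pos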
