kl3m3n's blog


Using Microsoft Kinect to visualize 3D objects with texture in LabVIEW in real-time

Klemen

The acquisition of depth and texture (RGB) information is based on the PrimeSense technology (http://www.primesense.com/) and the OpenNI framework (http://www.openni.org/), using the Point Cloud Library (PCL, http://pointclouds.org/) to interface with the Kinect sensor. Additionally, the CLNUI drivers (http://codelaboratories.com/nui) are used to control the Kinect's tilt motor, since the PrimeSense drivers do not support this feature (as far as I know). The C++ code for point cloud and texture acquisition was built as a dynamic link library using Microsoft Visual Studio 2010, which allows the library to be called from within the LabVIEW environment. Similarly, a separate dynamic link library is used to control the Kinect motor.
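To give an idea of what such a library boundary looks like, here is a minimal sketch of an exported C function that LabVIEW's Call Library Function Node could bind to. The function name, signature and error convention are my assumptions for illustration (the post does not show the actual exports), and a synthetic gradient stands in for the real PCL/OpenNI frame grab:

```cpp
#include <cstddef>

// Hypothetical export for LabVIEW's Call Library Function Node.
// On Windows this would carry __declspec(dllexport); the macro keeps
// the sketch portable.
#if defined(_WIN32)
  #define KINECT_API extern "C" __declspec(dllexport)
#else
  #define KINECT_API extern "C"
#endif

// Fills caller-allocated buffers with one frame: xyz holds width*height
// (X, Y, Z) triplets, rgb holds width*height (R, G, B) triplets. The real
// body would pull a frame from the PCL grabber; here a synthetic gradient
// stands in so the interface can be exercised without hardware.
KINECT_API int grab_frame(float* xyz, unsigned char* rgb,
                          int width, int height) {
    if (!xyz || !rgb || width <= 0 || height <= 0) return -1; // bad arguments
    for (int i = 0; i < width * height; ++i) {
        xyz[3 * i + 0] = static_cast<float>(i % width);   // X (column index)
        xyz[3 * i + 1] = static_cast<float>(i / width);   // Y (row index)
        xyz[3 * i + 2] = 1.0f;                            // Z (placeholder depth)
        rgb[3 * i + 0] = rgb[3 * i + 1] = rgb[3 * i + 2] = 128; // gray texture
    }
    return 0; // success
}
```

Keeping the interface to flat C arrays and an integer return code is what makes the DLL easy to wire up from a LabVIEW diagram.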

The Point Cloud Library can acquire and process data from the Kinect sensor. This means that we can obtain a calibrated point cloud (X, Y, Z) and a texture image for each frame. Furthermore, the texture image can be directly overlaid on the point cloud, as shown in Figure 1.

Figure 1. 3D point cloud with overlaid texture acquired from a single Kinect sensor.

After calling the dynamic link library in LabVIEW, the point cloud is reshaped into a 2D array for each corresponding dimension. Next, a smoothing filter is applied to the depth image (Z dimension) to reduce the deviation of the reconstructed surface (noise removal). Simultaneously, the texture image is also stored in a 2D array. Since the depth image and the texture image are aligned, it is trivial to extract the spatial coordinates of the desired object features from the texture image. It is less trivial to detect these features prior to extraction.

There are many known algorithms for object detection and tracking, most of them out of my league (I wish they weren't). The functions and VIs included in LabVIEW's NI Vision Module are stable, fast and, most importantly, they perform well (sincere thanks to the developers for this). I just wish the Vision Module would include other popular computer vision algorithms for feature detection, object tracking, segmentation, etc. by default (SIFT, MSER, SURF, HOG, GraphCut, GrowCut, RANSAC, the Kalman filter…). Of course, one could write such algorithms oneself, but as I said, this is not so trivial for me, since a lot of background in mathematics and computer science is needed. MATLAB, for example, offers many computer vision algorithms from various developers. Since LabVIEW includes MathScript (support for .m files and custom functions), it would be interesting to try to implement some of the mentioned algorithms there.
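The reshape-and-smooth step above can be sketched in a few lines of C++. The 3x3 mean kernel is my assumption; the post does not name the exact smoothing filter, and any small low-pass kernel serves the same purpose of reducing depth noise before surface reconstruction:

```cpp
#include <vector>

// Reshape a flat depth buffer (row-major, width*height values) into rows
// and apply a 3x3 box (mean) filter to the interior pixels. Border pixels
// are left untouched for simplicity.
std::vector<std::vector<float>> smooth_depth(const std::vector<float>& flat,
                                             int width, int height) {
    // Reshape the 1D buffer from the DLL into a 2D depth image.
    std::vector<std::vector<float>> img(height, std::vector<float>(width));
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            img[y][x] = flat[y * width + x];

    // Smooth: each interior pixel becomes the mean of its 3x3 neighbourhood.
    std::vector<std::vector<float>> out = img;
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            float sum = 0.0f;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    sum += img[y + dy][x + dx];
            out[y][x] = sum / 9.0f;
        }
    }
    return out;
}
```

Because the smoothed depth array stays aligned with the texture array, a pixel coordinate found in the texture indexes directly into the depth image.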

Ok, back to the main thread (got a little lost there)!

Here is one feature detection algorithm that I do understand, and that works very reliably when used in the right circumstances: image correlation. And here LabVIEW shows its power; in my experience, its pattern matching algorithms are accurate and fast. It all comes down to choosing the right algorithm for the task. An example of 3D (actually, it's 2D) feature tracking is shown in Figure 2. The texture contains four circular red objects that are tracked using the normalized cross-correlation algorithm (more precisely, the centers of the reference templates are tracked) on the extracted green plane of the image. Each object has its own region of interest, which is updated for each subsequent frame (green square). This also defines the measuring area (yellow rectangle).
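For readers unfamiliar with the method, the textbook form of normalized cross-correlation can be sketched as below. This is the standard formula, not NI Vision's implementation (which is proprietary); in a tracker, the score would be evaluated over each region of interest and the location of the maximum taken as the template center:

```cpp
#include <cmath>
#include <vector>

// Normalized cross-correlation of a template against the image window whose
// top-left corner is (u, v). Scores lie in [-1, 1]; 1 means a perfect match
// up to brightness and contrast changes.
double ncc_at(const std::vector<std::vector<double>>& img,
              const std::vector<std::vector<double>>& tpl,
              int u, int v) {
    const int th = static_cast<int>(tpl.size());
    const int tw = static_cast<int>(tpl[0].size());

    // Means of the image window and of the template.
    double mi = 0.0, mt = 0.0;
    for (int y = 0; y < th; ++y)
        for (int x = 0; x < tw; ++x) {
            mi += img[v + y][u + x];
            mt += tpl[y][x];
        }
    mi /= th * tw;
    mt /= th * tw;

    // Zero-mean correlation, normalized by both standard deviations.
    double num = 0.0, di = 0.0, dt = 0.0;
    for (int y = 0; y < th; ++y)
        for (int x = 0; x < tw; ++x) {
            const double a = img[v + y][u + x] - mi;
            const double b = tpl[y][x] - mt;
            num += a * b;
            di += a * a;
            dt += b * b;
        }
    // A flat window or flat template has no defined correlation; return 0.
    return (di == 0.0 || dt == 0.0) ? 0.0 : num / std::sqrt(di * dt);
}
```

Restricting the search to a small, per-object region of interest is what keeps the per-frame cost low enough for the millisecond-scale timing reported below.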

Figure 2. The depth image (left) and the overlaid texture image (middle) form the reconstructed 3D shape with texture (right).

The tracking algorithm works really well (it takes about 1 ms to detect all four objects); the only problem is that this specific algorithm is not scale-invariant (it tolerates roughly ±5% scaling according to the NI Vision Concepts manual). In this case, scale invariance is not so important, but again, it all comes down to the nature of your application. It is necessary to define your constraints and conditions prior to constructing the application/algorithm.

To summarize: LabVIEW (mostly the NI Vision Module) is used to perform 2D object tracking with simultaneous 3D shape acquisition by interfacing with the Microsoft Kinect sensor via a dynamic link library. The algorithm acquires, processes and displays the results in real-time at 30 Hz (320x240 pixels).

Thanks for reading.

Be creative.


https://decibel.ni.com/content/blogs/kl3m3n


