Equation 6. Representation of point q that is located on the calibration plane
Equation 7. Vector equation of calibration plane that is satisfied by point q
Note that the vector difference between the point q and a known point on the calibration plane defines a translation vector that lies on the plane. Since the normal vector of the plane and this translation vector must be perpendicular, the plane equation may be defined by setting their dot product equal to zero. Expanding this result yields Equation 8.
Equation 8. Scalar equation of calibration plane
The calibration plane equation, Equation 8, provides a necessary constraint on the 3D metric points that lie on the plane. This constraint is later used to solve for the scale parameter of the rays that define the corresponding 3D metric points.
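For reference, a plausible written-out form of this constraint is sketched below, assuming q denotes a point on the plane (as in Equation 6), q0 a known point on the plane, and n̂ the unit plane normal; these symbol names are chosen here for illustration and are not taken verbatim from the original equations.

```latex
\hat{n} \cdot (\mathbf{q} - \mathbf{q}_0) = 0
\quad\Longrightarrow\quad
n_x X + n_y Y + n_z Z = \hat{n} \cdot \mathbf{q}_0 ,
\qquad \mathbf{q} = [X,\; Y,\; Z]^{\mathsf{T}}
```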
3.3.2 LOCALIZATION OF 3D POINTS
The 2D image locations of each projected calibration target corner are extracted from the RGB image using a corner extraction algorithm. It is now possible to define the 3D rays that are projected from the RGB camera optical center through each detected 2D corner in the image plane. Equation 9 defines the RGB camera projection of a 2D image point to its corresponding 3D metric point counterpart. The corresponding 3D metric position of each projected corner on the actual calibration plane depends on the scale parameter, λ, which is unknown.
Equation 9. RGB camera projection of 2D image point to its 3D metric point
Using the previous development of the calibration plane equation, the scale parameter, λ, can be solved for, thus defining the true 3D metric locations of the projected corner points. Specifically, the intersection between the calibration plane and the 3D rays determines the 3D positions of the projected corners, so the correct scale parameter, λ, is the one that satisfies the equation of the calibration plane. Substituting the Cartesian components of the 3D rays, the equation of the calibration plane may be expressed as Equation 10.
Equation 10. Intersection of 3D rays and calibration plane
At this point all parameters of the plane (the normal vector n̂ and a known point on the plane) are known, the Cartesian components of the 3D ray are also known, and the only unknown is the scale parameter, λ. So there is one equation, Equation 11, and one unknown, allowing for a simple solution.
Equation 11. Expression for the scale parameter, λ, in the RGB camera frame
Knowing the scale parameter, the 3D metric position of each projected corner can now be determined, according to Equation 12. This 3D position is computed for each detected projected corner found in the RGB image, and for each calibration plane configuration.
Equation 12. Application of the scale parameter, λ, to define 3D metric points in the RGB camera reference frame
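As an illustrative sketch of this sequence, assuming K denotes the RGB camera intrinsic matrix, (u, v) the detected corner pixel, and n̂, q0 the plane normal and a known point on the plane as above (symbols again chosen for illustration), the ray, the scale parameter, and the resulting 3D point can be written as:

```latex
\mathbf{d} = K^{-1}\begin{bmatrix} u \\ v \\ 1 \end{bmatrix},
\qquad
\hat{n} \cdot (\lambda \mathbf{d} - \mathbf{q}_0) = 0
\;\;\Longrightarrow\;\;
\lambda = \frac{\hat{n} \cdot \mathbf{q}_0}{\hat{n} \cdot \mathbf{d}},
\qquad
\mathbf{Q} = \lambda\,\mathbf{d}
```

Here Q is the 3D metric corner position on the calibration plane, expressed in the RGB camera frame.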
The 2D image point and 3D metric point correspondences are now established between the projector's digital image and the real-world projected image displayed on the calibration plane. This allows Zhang's method to be used to calibrate the projector, determining its intrinsic parameters and the extrinsic parameters between the projector and the calibration plane.
3.3.3 PROJECTOR CALIBRATION RESULTS
The previously detailed calibration procedure is implemented in the developed software to allow auto-calibration for different SAR system settings. OpenCV's calibrateCamera() function is used to execute Zhang's method on the determined 2D image point and 3D metric point correspondences for the projector. The OpenCV calibration function does not report the uncertainty associated with each intrinsic parameter. Also, the reported RMS pixel re-projection error is expressed as the square root of the sum of the squared RMS pixel re-projection errors for the two image axes.
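A minimal sketch of this calibration call is given below; the function and variable names are illustrative and do not reflect the actual implementation in the developed software.

```cpp
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

// objectPoints: the recovered 3D metric corner positions on the calibration plane,
//               one vector per plane pose, expressed in the RGB camera frame.
// imagePoints:  the corresponding 2D corners in the projector's digital image.
// The return value is the overall RMS pixel re-projection error reported by OpenCV.
double calibrateProjector(const std::vector<std::vector<cv::Point3f>>& objectPoints,
                          const std::vector<std::vector<cv::Point2f>>& imagePoints,
                          cv::Size projectorResolution,   // the projector's native pixel size
                          cv::Mat& K,                     // projector intrinsic matrix (output)
                          cv::Mat& distCoeffs,            // radial/tangential distortion (output)
                          std::vector<cv::Mat>& rvecs,    // per-pose rotations (output)
                          std::vector<cv::Mat>& tvecs)    // per-pose translations (output)
{
    // Zhang's method on the 2D-3D correspondences, treating the projector as an inverse camera.
    return cv::calibrateCamera(objectPoints, imagePoints, projectorResolution,
                               K, distCoeffs, rvecs, tvecs);
}
```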
Unlike a camera, the intrinsic parameters of the projector can change for various projector configurations. This is because the projector has an adjustable focal length that can be physically manipulated by the user to adjust focus on the imaging plane. The projector is also prone to keystone distortion, which occurs when the projector is aligned non-perpendicularly to the projection screen, or when the projection screen has an angled surface. The image that results from one of these misalignments will look trapezoidal rather than rectangular. Figure 17 visualizes the possible configurations of the projector. Note that the 'on-axis' case displays an ideal configuration in which the principal axis of the projector is perpendicularly aligned with the projection screen. In this case there is no keystone distortion and only the projector lens distortions affect the displayed image.
Figure 17. Possible projector configurations that affect keystone distortions
The utilized projector incorporates manual keystone distortion correction. If this setting is changed, the vertical principal point position of the projector changes as well to account for the off-axis projection onto the display screen. When the principal point of the projector is changed, the image is projected through either the upper or lower part of the projector lens. As a result, this also affects the intrinsic radial lens distortion parameters. However, the tangential lens distortion parameters are not affected, because tangential distortion arises when the projector lens is not perfectly parallel to the imaging plane. During keystone adjustment, the projector lens orientation is not changed with respect to the imaging plane; only the vertical principal point offset is adjusted.
Due to all of the previously mentioned changes in the projector's intrinsic parameters, the projector has to be fully calibrated for each new SAR system setup configuration. Table 3 summarizes the intrinsic calibration results obtained for a unique projector configuration using a set of ten images.
Intrinsic Parameter                   Parameter Value
Focal Length                          [1772.61112  1869.60390]
Principal Point Offset                [375.22251  446.22731]
Radial Distortion (k1, k2, k3)        [0.77017  -20.44847  171.57636]
Tangential Distortion (p1, p2)        [-0.00230  0.00204]
Pixel Re-projection Error             [0.582298]
Table 3. Projector camera intrinsic parameters
For the given SAR system, using more than ten images per calibration set tends only to increase the pixel re-projection error and decrease the accuracy of both the intrinsic and extrinsic parameters. The extrinsic parameters found for this calibration setup are given in Equation 13, where the translation vector is expressed in millimeters.
Equation 13. Extrinsic parameter result between RGB camera and projector
The extrinsic parameters between the RGB camera and projector will change for every unique configuration of the Kinect sensor and projector. This means that if the Kinect sensor is moved to a different position and orientation with respect to the projector, the pre-calibrated extrinsic results are no longer valid and need to be solved for again. If the focal length and keystone distortion of the projector are not changed during the re-positioning of the SAR system, the intrinsic parameters will remain the same and only the extrinsic parameters will change.
CHAPTER IV
USER INTERACTION WITH SAR SYSTEM
The described SAR system is designed primarily to support IR stylus user input; however, it can also obtain depth information through the Kinect's structured light stereo pair, which consists of the IR camera and an IR projector. Unfortunately, both inputs cannot be obtained at the same time due to Kinect hardware limitations that allow it to stream either IR or depth data, but not both, at any given time. Even if the Kinect could stream both at once, the Kinect's IR projector would add its projection pattern to each IR image, greatly increasing image noise. This would hinder any computer vision application applied to the IR camera data stream. Figure 18 displays the Kinect's IR projector pattern on a planar target.
Figure 18. Kinect IR projector pattern
As previously stated, IR stylus input is the primary method of user and system interaction. The IR stylus is used to write or draw content on the 'digital whiteboard' region, and the projector displays the traced movement of the stylus that characterizes the user-defined contour. In the case of depth interaction, the system displays an image of a button on the calibration plane using the projector, and the user can physically displace depth in the button region by 'pressing' and thereby activating the button.
For both of these types of user interaction, various transformations must be applied to the sensor inputs for the system to provide coherent feedback. The following sections explain the required transformations and their corresponding mathematical representations.
4.1 PROJECTIVE TRANSFORMATIONS
In order to display the IR stylus movement on the calibration plane using the projector, a correspondence problem must be solved that maps the detected IR stylus centroid position in the IR image to the corresponding 2D image point that will be displayed by the projector, resulting in a 3D metric point projected onto the calibration plane. Since the IR camera of the Kinect can also function as a depth sensor, the presented transformations apply to both IR and depth data streams.
The IR camera is mainly sensitive to light wavelengths above the visible range, so it does not register the light output of the projector. This prevents a direct extrinsic calibration between the two devices, and thus presents a problem for a direct mapping from the IR camera to the projector. For this reason, the RGB camera is used as an intermediate sensor to which both the IR camera and the projector are calibrated. Consequently, the SAR system coordinate reference frame is placed at the optical center of the RGB camera to facilitate perspective transformations. This way a transformation may be defined from the IR camera to the RGB camera, and then another transformation from the RGB camera to the projector, to solve the presented correspondence problem. The figure below visualizes these transformations. The solid orange lines represent projection transformations of a 2D image point to a 3D metric point, whereas the dashed orange lines represent back-projection transformations from a 3D metric point back to a 2D image point in an alternate reference frame. In essence, Figure 19 visualizes all the transformations applied to each point acquired by the IR camera.
Figure 19. SAR system with visualized transformations
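In symbols, the two stages of this chain can be sketched as follows; the subscripted names for the rotations R, translations t, and intrinsic matrices K are illustrative and simply label the calibrated parameters of the indicated device pairs:

```latex
s_{1}\begin{bmatrix} u_{rgb} \\ v_{rgb} \\ 1 \end{bmatrix}
  = K_{rgb}\left( R_{ir \to rgb}\,\mathbf{X}_{ir} + \mathbf{t}_{ir \to rgb} \right),
\qquad
s_{2}\begin{bmatrix} u_{proj} \\ v_{proj} \\ 1 \end{bmatrix}
  = K_{proj}\left( R_{rgb \to proj}\,\mathbf{X}_{rgb} + \mathbf{t}_{rgb \to proj} \right)
```

Here X_ir and X_rgb are the 3D metric points recovered from the IR and RGB image points by intersecting their viewing rays with the calibration plane, as detailed in the following subsections.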
4.1.1 IR CAMERA TO RGB CAMERA TRANSFORMATION
The following transformations take a 2D image point of the IR camera and map it to the corresponding 2D image point of the RGB camera. In order to accomplish this, first the 2D image point of the IR camera must be back-projected to a metric 3D world point. This is accomplished by the following transformation expressed in Equation 14.
Equation 14. IR camera projection of 2D image point to corresponding 3D metric point
The 3D metric point corresponding to the back-projected 2D point of the IR image is expressed in homogeneous coordinates, which are defined up to a scale factor, λ. The scale factor can be obtained from the extrinsic parameters between the IR camera and the calibration plane, similar to how the RGB camera computed the 3D metric points of the detected projector pattern during projector calibration. The direct expression for the scale parameter is given in Equation 15.
Equation 15. Expression for the scale parameter, λ, in the IR camera frame
Knowing the scale parameter, the 3D metric position of each projected corner can now be determined by Equation 16. This 3D position is computed for each detected projected corner found in the IR image, and for each calibration plane configuration.
Equation 16. Application of the scale parameter, λ, to define the 3D metric point in the IR camera reference frame
It is important to note that the obtained 3D metric points are represented in the IR camera coordinate system. In order to obtain the corresponding pixel positions on the image plane of the RGB camera, the 3D points need to be converted to the RGB camera coordinate system using the extrinsic parameters between the two cameras obtained from the previous calibration. These transformed 3D points can then be projected onto the image plane of the RGB camera using the intrinsic parameters of the RGB camera. This transformation is expressed in Equation 17.
Equation 17. Projection of 3D metric point expressed in IR camera reference frame to the corresponding 2D image point in the image plane of the RGB camera
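A compact sketch of this mapping is shown below. The plane parameters, intrinsics, and extrinsics are assumed to come from the calibration described earlier; the function and variable names are illustrative, and lens distortion is neglected for clarity.

```cpp
#include <opencv2/core.hpp>

// Map a 2D IR-camera pixel to the corresponding 2D RGB-camera pixel by
// (1) intersecting its viewing ray with the calibration plane,
// (2) transforming the resulting 3D point into the RGB camera frame,
// (3) projecting it with the RGB camera intrinsics.
cv::Point2d irPixelToRgbPixel(const cv::Point2d& irPixel,
                              const cv::Matx33d& K_ir,      // IR camera intrinsics
                              const cv::Matx33d& K_rgb,     // RGB camera intrinsics
                              const cv::Matx33d& R_ir2rgb,  // IR -> RGB rotation
                              const cv::Vec3d&   t_ir2rgb,  // IR -> RGB translation
                              const cv::Vec3d&   n_ir,      // plane normal, IR frame
                              const cv::Vec3d&   q0_ir)     // known point on plane, IR frame
{
    // Viewing ray direction in the IR camera frame.
    cv::Vec3d d = K_ir.inv() * cv::Vec3d(irPixel.x, irPixel.y, 1.0);

    // Scale parameter from the ray/plane intersection (cf. Equation 15).
    double lambda = n_ir.dot(q0_ir) / n_ir.dot(d);

    // 3D metric point in the IR camera frame (cf. Equation 16).
    cv::Vec3d X_ir = lambda * d;

    // Transform into the RGB camera frame using the stereo extrinsics.
    cv::Vec3d X_rgb = R_ir2rgb * X_ir + t_ir2rgb;

    // Project with the RGB intrinsics (cf. Equation 17) and dehomogenize.
    cv::Vec3d p = K_rgb * X_rgb;
    return cv::Point2d(p[0] / p[2], p[1] / p[2]);
}
```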
4.1.2 RGB CAMERA TO PROJECTOR TRANSFORMATION
The following transformations take a 2D image point of the RGB camera and map it to the corresponding 2D image point of the projector. The methodology is identical to the previous mapping of a 2D image point of the IR camera to a 2D image point of the RGB camera, except that the intrinsic and extrinsic parameters corresponding to the RGB camera and the projector must be used. Accordingly, first the 2D image point of the RGB camera must be back-projected to a metric 3D world point. This is accomplished by the transformation expressed in Equation 18.
Equation 18. RGB camera projection of 2D image point to corresponding 3D metric point
The 3D metric point corresponding to the back-projected 2D point of the RGB image is expressed in homogeneous coordinates, which are defined up to a scale factor, λ. The scale factor can be obtained from the extrinsic parameters between the RGB camera and the calibration plane. The direct expression for the scale parameter is given in Equation 19.
Equation 19. Expression for the scale parameter, λ, in the RGB camera frame
Knowing the scale parameter, the 3D metric position of each projected corner can now be determined by Equation 20. This 3D position is computed for each detected projected corner found in the RGB image, and for each calibration plane configuration.
Equation 20. Application of the scale parameter, λ, to define the 3D metric point in the RGB camera reference frame
It is important to note that the obtained 3D metric points are represented in the RGB camera coordinate system. In order to obtain the corresponding pixel positions on the image plane of the projector, the 3D points need to be converted to the projector coordinate system using the extrinsic parameters between the RGB camera and the projector obtained from the previous calibration. These transformed 3D points can then be projected onto the image plane of the projector using the intrinsic parameters of the projector. This transformation is expressed in Equation 21.
Equation 21. Projection of 3D metric point expressed in RGB reference frame to the corresponding 2D image point in the image plane of the Projector
4.2 IR STYLUS AND DEPTH DETECTION
IR stylus detection is performed by applying digital image processing to each frame acquired by the IR camera. This involves thresholding the IR digital image at a given high intensity value, such as 250, since the IR stylus return will be very bright in the acquired image. All pixels below this threshold are zeroed out to the lowest intensity, and all pixels at or above this threshold are assigned the highest possible intensity. Performing this operation yields a binary black-and-white image. A blob detection algorithm from the cvBlobsLib software library is then executed on the binary image. This provides a contour of the detected IR stylus blob in the IR camera image. Afterward, the detected blob is centroided to give its sub-pixel position. The result is the floating point coordinates of the center of the detected blob corresponding to the IR stylus return. This procedure defines the detection of the IR stylus.
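A minimal sketch of these steps is shown below, using OpenCV's contour and moment functions in place of cvBlobsLib (which the actual software uses); the threshold value is the example given above and an 8-bit IR frame is assumed.

```cpp
#include <opencv2/imgproc.hpp>
#include <opencv2/core.hpp>
#include <vector>

// Detect the IR stylus in a single 8-bit IR frame and return its sub-pixel centroid.
// Returns false when no sufficiently bright blob is present.
bool detectStylus(const cv::Mat& irFrame, cv::Point2f& centroid)
{
    // Threshold at a high intensity (250): pixels below go to 0, the rest to 255.
    cv::Mat binary;
    cv::threshold(irFrame, binary, 249, 255, cv::THRESH_BINARY);

    // Find connected blobs in the binary image.
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(binary, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty())
        return false;

    // Keep the largest blob and centroid it via image moments for a sub-pixel position.
    size_t best = 0;
    for (size_t i = 1; i < contours.size(); ++i)
        if (cv::contourArea(contours[i]) > cv::contourArea(contours[best]))
            best = i;

    cv::Moments m = cv::moments(contours[best]);
    if (m.m00 == 0.0)
        return false;
    centroid = cv::Point2f(static_cast<float>(m.m10 / m.m00),
                           static_cast<float>(m.m01 / m.m00));
    return true;
}
```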
Depth information is registered by the IR camera when coupled with the IR projector. Interaction is detected simply as a displacement in depth. For the case of the projected button, a region of interest is selected in the IR camera that corresponds to the observed region of the projected depth button. The program monitors the change in depth of this region, and once the depth has changed past a certain threshold, the button is considered to be pressed by the user and is activated.
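A minimal sketch of this button-press test is given below; the region of interest, baseline depth, and threshold value are placeholders, and a 16-bit depth image in millimeters is assumed.

```cpp
#include <opencv2/core.hpp>

// Returns true when the mean depth inside the button's region of interest has
// moved closer to the sensor than the recorded baseline by more than 'thresholdMm'.
bool isButtonPressed(const cv::Mat& depthFrameMm,   // CV_16UC1 depth image, millimeters
                     const cv::Rect& buttonRoi,     // region observing the projected button
                     double baselineDepthMm,        // mean ROI depth with no hand present
                     double thresholdMm = 40.0)     // required depth displacement (placeholder)
{
    // Average only valid (non-zero) depth readings inside the ROI.
    cv::Mat roi = depthFrameMm(buttonRoi);
    double meanDepth = cv::mean(roi, roi > 0)[0];

    // A hand in front of the surface reduces the observed depth.
    return (baselineDepthMm - meanDepth) > thresholdMm;
}
```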
4.3 IR STYLUS INPUT VISUALIZATION
Having detected the IR stylus sub-pixel position in a set of consecutive IR frames, it is possible to connect these points and estimate a contour that represents the user's IR stylus input. This detected contour needs to be transformed to the appropriate image coordinates of the projector's displayed image in order to coherently visualize the contour on the calibration/whiteboard surface.
Given a dense set of neighboring points, it is possible to simply connect adjacent points with small line segments, producing a contour that defines the IR stylus movement. However, this can lead to noisy contours, and it may not be possible to acquire the IR points densely enough due to the computational burden placed on the computer. For fast multi-core central processing units (CPUs) with frequencies above 3.0 GHz this is not a problem; however, it becomes apparent on older hardware with slower CPUs. In this case the resulting contour appears jagged and displeasing to the eye. This can be corrected by applying the 'Smooth Irregular Curves' algorithm developed by Junkins [23]. The following section details its application.
4.3.1 PARAMETRIC CURVE FITTING
The 'Smooth Irregular Curves' algorithm developed by Junkins considers a set of 2D points, much like those attained from IR stylus detection, and aims at estimating the best smooth and accurate contour passing through these input points by sequential processing. The arc length of the curve is chosen as the independent variable for interpolation because it increases monotonically along the curve at every discrete point of the data set. The method also treats the x and y coordinates independently, allowing for a parametric estimation of each individual variable. It considers a local subset of six points and attempts to fit the best smooth curve passing through all of these points by approximating the best-fit cubic polynomial for both the x and y coordinates in between each data point segment. This is performed by dividing the six-point data set into a left, middle, and right region, where the data point position is considered along with the slope at each point. The contour solution is constrained at the data points that bound each region, with the point position and the slope at that position used as constraint parameters. As a sequential algorithm, it marches from point to point, reassigning the left, middle, and right regions and using their constraint information to provide the best smooth cubic polynomial approximation over the middle segment. Figure 20 visualizes the output of the algorithm applied to a discrete set of detected IR stylus points.
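The sketch below is not Junkins' formulation; it is only an illustrative stand-in that shares the two ideas named above, arc-length parameterization and independent per-coordinate cubic segments, using simple cubic Hermite interpolation with finite-difference slopes.

```cpp
#include <opencv2/core.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

// Parameterize the detected stylus points by cumulative arc length s, then
// interpolate x(s) and y(s) independently with piecewise cubic Hermite segments.
std::vector<cv::Point2f> smoothContour(const std::vector<cv::Point2f>& pts,
                                       int samplesPerSegment = 8)
{
    const size_t n = pts.size();
    if (n < 2)
        return pts;

    // Cumulative arc length at each detected stylus point.
    std::vector<double> s(n, 0.0);
    for (size_t i = 1; i < n; ++i)
        s[i] = s[i - 1] + std::hypot(pts[i].x - pts[i - 1].x,
                                     pts[i].y - pts[i - 1].y);

    // Finite-difference estimates of dx/ds and dy/ds at each point.
    std::vector<cv::Point2d> m(n);
    for (size_t i = 0; i < n; ++i) {
        size_t a = (i == 0) ? 0 : i - 1;
        size_t b = (i == n - 1) ? n - 1 : i + 1;
        double ds = std::max(s[b] - s[a], 1e-9);
        m[i] = cv::Point2d((pts[b].x - pts[a].x) / ds,
                           (pts[b].y - pts[a].y) / ds);
    }

    // Evaluate each cubic Hermite segment at evenly spaced arc-length values.
    std::vector<cv::Point2f> out;
    for (size_t i = 0; i + 1 < n; ++i) {
        double h = std::max(s[i + 1] - s[i], 1e-9);
        for (int k = 0; k < samplesPerSegment; ++k) {
            double t = static_cast<double>(k) / samplesPerSegment;
            double h00 = (1.0 + 2.0 * t) * (1.0 - t) * (1.0 - t);
            double h10 = t * (1.0 - t) * (1.0 - t);
            double h01 = t * t * (3.0 - 2.0 * t);
            double h11 = t * t * (t - 1.0);
            cv::Point2d p = h00 * cv::Point2d(pts[i]) + (h10 * h) * m[i]
                          + h01 * cv::Point2d(pts[i + 1]) + (h11 * h) * m[i + 1];
            out.push_back(cv::Point2f(static_cast<float>(p.x),
                                      static_cast<float>(p.y)));
        }
    }
    out.push_back(pts.back());
    return out;
}
```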