Module 3: Camera/Image Calibration
Camera calibration is the procedure of extracting 3D information about the real world from its 2D images. The need for camera calibration arises as a means to support and enhance the interaction and coordination between the user and the actual environment. In this framework, the camera calibration matrix serves to correlate the control actions or commands from the user interface with the actual operating environment.
3.1 PROCEDURES
The camera calibration procedure can be solved through an understanding of the coordinate systems relating the actual environment, the image and the camera itself, as depicted in Figure 13.
Assuming a pinhole camera with its origin at the centre of projection C, a coordinate frame for the camera can then be established. It should be noted that the camera's coordinate frame need not be the world coordinate frame; it can be chosen arbitrarily, provided a mapping from the world coordinate frame is maintained.
From the camera’s coordinate frame, one could then define the optical axis as the z-axis of the coordinate frame. With this assumption, the 2D image of the environment can be projected with its image origin c to be the intersection of the optical axis and image plane, usually with a distance of separation between the camera’s centre of projection C and the image plane which is termed the focal length f.
On top of that, the uv and xy coordinate frames of the image plane can also be defined to be parallel to the X and Y axes of the camera's coordinate frame. Thus, the image plane is always well-defined by a mapping from the camera's coordinate frame, and vice versa.
It should be noted that the following are the camera parameters: the focal length f, the pixel width, the pixel height, the u pixel coordinate of the optical centre u_c and the v pixel coordinate of the optical centre v_c.
Figure 13: The coordinate systems involved in camera calibration [taken from Peter Kovesi's lecture materials on Camera Calibration]
When an object in the real world is pictured in a 2D image, a point M = (X, Y, Z) on the object is imaged to a point m = (x, y) in the image plane. From Figure 13, a set of equations relating the two coordinate frames can be determined as follows:
$$x = \frac{fX}{Z} \qquad (1)$$
$$y = \frac{fY}{Z} \qquad (2)$$
$$u = \frac{x}{p_w} + u_c \qquad (3)$$
$$v = \frac{y}{p_h} + v_c \qquad (4)$$
where $p_w$ and $p_h$ are the pixel width and pixel height, $u_c$ is the u pixel coordinate of the optical centre and $v_c$ is the v pixel coordinate of the optical centre.
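As a minimal sketch of Equations 1 to 4, the following Python snippet projects a 3D point in the camera frame to pixel coordinates. All parameter values (focal length, pixel size, optical centre) are assumed purely for illustration:

```python
# Assumed camera parameters, for illustration only
f = 0.008                 # focal length: 8 mm
p_w = p_h = 1e-5          # pixel width and height: 10 micrometres
u_c, v_c = 320.0, 240.0   # optical centre in pixel coordinates

def project_point(X, Y, Z):
    """Apply Equations 1-4: perspective projection, then pixel conversion."""
    x = f * X / Z         # Equation 1
    y = f * Y / Z         # Equation 2
    u = x / p_w + u_c     # Equation 3
    v = y / p_h + v_c     # Equation 4
    return u, v

u, v = project_point(0.1, 0.05, 2.0)   # a point 2 m in front of the camera
```

Note that a point farther along the optical axis (larger Z) maps closer to the optical centre, as the division by Z in Equations 1 and 2 dictates.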
Rewriting Equations 1 and 2 linearly in homogeneous coordinates,
$$s \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \qquad (5)$$
where $s$ is a scaling factor and $s \neq 0$.
The transformation from 3D world coordinates to image pixel coordinates can thus be expressed using a 3 x 4 matrix called the camera calibration matrix, which will be presented later on.
Firstly, substituting Equations 1 and 2 into Equations 3 and 4,
$$u = \frac{f}{p_w}\,\frac{X}{Z} + u_c \qquad (6)$$
$$v = \frac{f}{p_h}\,\frac{Y}{Z} + v_c \qquad (7)$$
and rewriting Equations 6 and 7 in matrix representation,
$$s \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = \begin{pmatrix} f/p_w & 0 & u_c & 0 \\ 0 & f/p_h & v_c & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \qquad (8)$$
where the scaling factor $s$ now has the value $Z$.
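The matrix form of Equation 8 can be sketched in Python with NumPy; the camera parameter values here are assumed for illustration. Dividing the product by its third component recovers the pixel coordinates and confirms that the scaling factor equals Z:

```python
import numpy as np

# Assumed camera parameters, for illustration only
f, p_w, p_h = 0.008, 1e-5, 1e-5
u_c, v_c = 320.0, 240.0

# The 3 x 4 matrix of Equation 8
P = np.array([[f / p_w, 0.0,     u_c, 0.0],
              [0.0,     f / p_h, v_c, 0.0],
              [0.0,     0.0,     1.0, 0.0]])

M = np.array([0.1, 0.05, 2.0, 1.0])  # homogeneous point (X, Y, Z, 1)
sm = P @ M                           # equals s * (u, v, 1)
s = sm[2]                            # the scaling factor, equal to Z
u, v = sm[0] / s, sm[1] / s          # recovered pixel coordinates
```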
or in shorthand notation,
$$s\,\tilde{m} = P\,K\,\tilde{M} \qquad (9)$$
where $\tilde{m}$ represents the homogeneous vector of image pixel coordinates, $P$ is the perspective projection matrix or camera matrix, $\tilde{M}$ is the homogeneous vector of world coordinates and $K$ is a 4 x 4 homogeneous transformation matrix.
The matrix K acts as a change of coordinates from the world frame into the camera frame, i.e. the frame whose origin is at the centre of projection and whose z-axis lies along the optical axis.
The matrix K contains a 3 x 3 rotation matrix R and a translation vector t, and is defined as follows:
$$K = \begin{pmatrix} R & t \\ 0^T & 1 \end{pmatrix} \qquad (10)$$
The rotation matrix R encodes the camera orientation with respect to a given world frame, while the vector t captures the camera displacement from the world frame origin. Thus, the matrix K has 6 degrees of freedom: 3 for orientation and 3 for translation of the camera. These parameters are known as the extrinsic camera parameters.
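The construction of K from R and t in Equation 10 can be sketched as follows; the particular rotation and translation values are assumed for illustration:

```python
import numpy as np

# Assumed extrinsics: a 90 degree rotation about the z-axis plus a translation
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.5, -0.2, 1.0])

# The 4 x 4 homogeneous transform K of Equation 10
K = np.eye(4)
K[:3, :3] = R     # top-left 3 x 3 block: rotation
K[:3, 3] = t      # last column: translation

# Map a homogeneous world point into the camera frame
M_world = np.array([1.0, 0.0, 0.0, 1.0])
M_cam = K @ M_world
```

Because the bottom row of K is (0, 0, 0, 1), the homogeneous coordinate of the point is preserved through the transform.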
Extrinsic parameters are simply parameters that depend on the position and orientation of the camera, i.e. rotation matrix R and translation vector t. On the other hand, there are also intrinsic camera parameters.
Intrinsic camera parameters do not depend on the position and orientation of the camera in space. These parameters relate to the internal geometric and optical characteristics of the lens and the imaging device: the optical centre coordinates $u_c$ and $v_c$, and the scale factors $\alpha_u$ and $\alpha_v$, defined as follows:
$$\alpha_u = \frac{f}{p_w} \qquad (11)$$
$$\alpha_v = \frac{f}{p_h} \qquad (12)$$
Thus, the 3 x 4 camera matrix P and the 4 x 4 homogeneous transform K combine to form a single 3 x 4 matrix called the camera calibration matrix C. The camera calibration matrix can be written as a function of the intrinsic and extrinsic parameters as follows:
$$C = P\,K = \begin{pmatrix} \alpha_u r_1^T + u_c r_3^T & \alpha_u t_x + u_c t_z \\ \alpha_v r_2^T + v_c r_3^T & \alpha_v t_y + v_c t_z \\ r_3^T & t_z \end{pmatrix} \qquad (13)$$
where the vectors $r_1^T$, $r_2^T$ and $r_3^T$ are the row vectors of the matrix $R$ and $t = (t_x, t_y, t_z)^T$.
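A short check of this composition can be sketched in Python: build C as the product P K and compare each block against the row-by-row form of Equation 13. All intrinsic and extrinsic values are assumed for illustration:

```python
import numpy as np

# Assumed intrinsic parameters (in pixels) and extrinsics
alpha_u, alpha_v = 800.0, 750.0
u_c, v_c = 320.0, 240.0
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.5, -0.2, 2.0])

# Perspective projection matrix P and homogeneous transform K
P = np.array([[alpha_u, 0.0,     u_c, 0.0],
              [0.0,     alpha_v, v_c, 0.0],
              [0.0,     0.0,     1.0, 0.0]])
K = np.eye(4)
K[:3, :3] = R
K[:3, 3] = t

# Camera calibration matrix C = P K, checked against Equation 13 row by row
C = P @ K
r1, r2, r3 = R                 # row vectors of R
t_x, t_y, t_z = t
assert np.allclose(C[0, :3], alpha_u * r1 + u_c * r3)
assert np.allclose(C[1, :3], alpha_v * r2 + v_c * r3)
assert np.allclose(C[2, :3], r3)
assert np.allclose(C[:, 3], [alpha_u * t_x + u_c * t_z,
                             alpha_v * t_y + v_c * t_z,
                             t_z])
```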
In order to solve for the camera calibration matrix C, there are generally two approaches, i.e. linear or nonlinear [1,5]. Only the linear method will be considered here.
Firstly, recalling Equation 8 as,
$$s \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = C \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}, \quad C = (c_{ij}) \qquad (14)$$
From Equation 14, the following set of equations can be derived as such,
$$u = \frac{c_{11}X + c_{12}Y + c_{13}Z + c_{14}}{c_{31}X + c_{32}Y + c_{33}Z + c_{34}} \qquad (15)$$
$$v = \frac{c_{21}X + c_{22}Y + c_{23}Z + c_{24}}{c_{31}X + c_{32}Y + c_{33}Z + c_{34}} \qquad (16)$$
Using the given structure of C, fixing the overall scale by setting $c_{34} = 1$ (which leaves 11 unknowns), and given a set of n 3D world points and their image coordinates, the following system can be obtained and solved.
$$\begin{pmatrix} X_i & Y_i & Z_i & 1 & 0 & 0 & 0 & 0 & -u_i X_i & -u_i Y_i & -u_i Z_i \\ 0 & 0 & 0 & 0 & X_i & Y_i & Z_i & 1 & -v_i X_i & -v_i Y_i & -v_i Z_i \end{pmatrix}_{i=1,\dots,n} \begin{pmatrix} c_{11} \\ c_{12} \\ \vdots \\ c_{33} \end{pmatrix} = \begin{pmatrix} u_i \\ v_i \end{pmatrix}_{i=1,\dots,n} \qquad (17)$$
where the matrix of knowns is 2n x 11.
Representing Equation 17 in shorthand notation as:
$$A\,c = b \qquad (18)$$
where the vector $c$ of unknown entries of the camera calibration matrix C can be solved using the pseudo-inverse:
$$c = (A^T A)^{-1} A^T b \qquad (19)$$
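The linear method above can be sketched end to end on synthetic data: generate exact image projections from an assumed ground-truth matrix, build the 2n x 11 system of Equation 17, and recover the matrix with a least-squares (pseudo-inverse) solve. All numeric values are assumptions for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth calibration matrix (assumed values, for synthetic data only),
# rescaled so that c34 = 1 to match the 11-unknown formulation.
C_true = np.array([[800.0,   0.0, 320.0, 100.0],
                   [  0.0, 800.0, 240.0,  50.0],
                   [  0.0,   0.0,   1.0,   2.0]])
C_true = C_true / C_true[2, 3]

# n known world points (placed in front of the camera) and their projections
n = 20
XYZ = np.column_stack([rng.uniform(-1, 1, n),
                       rng.uniform(-1, 1, n),
                       rng.uniform(2, 4, n)])
M = np.column_stack([XYZ, np.ones(n)])
proj = (C_true @ M.T).T
u = proj[:, 0] / proj[:, 2]
v = proj[:, 1] / proj[:, 2]

# Build the 2n x 11 matrix of knowns and right-hand side (Equation 17)
A = np.zeros((2 * n, 11))
b = np.zeros(2 * n)
for i in range(n):
    X, Y, Z = XYZ[i]
    A[2 * i]     = [X, Y, Z, 1, 0, 0, 0, 0, -u[i] * X, -u[i] * Y, -u[i] * Z]
    A[2 * i + 1] = [0, 0, 0, 0, X, Y, Z, 1, -v[i] * X, -v[i] * Y, -v[i] * Z]
    b[2 * i], b[2 * i + 1] = u[i], v[i]

# Pseudo-inverse solution of Equation 19, computed via least squares
c = np.linalg.lstsq(A, b, rcond=None)[0]
C_est = np.append(c, 1.0).reshape(3, 4)
```

With noise-free correspondences the recovered matrix matches the ground truth; with real, noisy measurements the least-squares solve minimises the algebraic reprojection residual instead.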
However, there are two conditions that must be satisfied for the calibration matrix C to be represented as in Equation 13. These conditions are:
$$\| c_3 \| = 1 \qquad (20)$$
$$(c_1 \times c_3) \cdot (c_2 \times c_3) = 0 \qquad (21)$$
where $c_i = (c_{i1}, c_{i2}, c_{i3})^T$ for $i$ = 1, 2 and 3.
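As a sketch, the two conditions can be verified numerically on a matrix assembled from valid intrinsic and extrinsic parameters (the particular values are assumed for illustration); such a C should satisfy both:

```python
import numpy as np

# Assumed intrinsic and extrinsic parameters
alpha_u, alpha_v, u_c, v_c = 800.0, 750.0, 320.0, 240.0
theta = 0.4
R = np.array([[1.0, 0.0,            0.0],
              [0.0, np.cos(theta), -np.sin(theta)],
              [0.0, np.sin(theta),  np.cos(theta)]])
t = np.array([0.3, 0.1, 2.0])
P = np.array([[alpha_u, 0.0,     u_c, 0.0],
              [0.0,     alpha_v, v_c, 0.0],
              [0.0,     0.0,     1.0, 0.0]])
K = np.eye(4)
K[:3, :3] = R
K[:3, 3] = t
C = P @ K

# Rows c_i = (c_i1, c_i2, c_i3) of the left 3 x 3 block of C
c1, c2, c3 = C[0, :3], C[1, :3], C[2, :3]
cond1 = np.isclose(np.linalg.norm(c3), 1.0)                  # Equation 20
cond2 = np.isclose(np.dot(np.cross(c1, c3),
                          np.cross(c2, c3)), 0.0, atol=1e-6) # Equation 21
```

The first condition holds because the third row of the 3 x 3 block is the unit row vector $r_3^T$ of the rotation matrix; the second follows from the orthonormality of the rows of R.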