III. THEORETICAL BASIS
In this section we introduce some important theoretical concepts needed to better understand the following sections. Of course, we do not pretend to summarize the whole theory of vision systems, color spaces and image processing techniques in the following pages. For a complete treatment, please refer to the cited references.
A. Vision Systems
We have already introduced in section I the concept of a Vision System as "the component of a robot aimed at the extraction of some features from a scene and their translation into some kind of high-level information".
Of course, a vision system and its internal organization are deeply application dependent. Nevertheless, it is possible to identify some main features that are almost independent of the specific implementation and that are present in virtually every CV system. In particular, it is possible to identify the following stages:
1) Image acquisition;
2) Pre-processing;
3) Feature extraction;
4) Detection/Segmentation;
5) High-level processing.
In the Image acquisition stage, a digital image is produced by one or more image sensors, such as cameras. Depending on the type of sensor, the image data can be two or three dimensional. The value of each pixel may represent the intensity of the light in one or several spectral bands (gray-scale or color images) or may be related to other physical measures, such as depth, absorption or reflectance of sonic or electromagnetic waves, or nuclear magnetic resonance. In the Pre-processing stage, the image grabbed during the previous step is processed (e.g., re-sampling, noise reduction, contrast enhancement) to ensure that it satisfies the assumptions implied by the CV method adopted.
In the Feature extraction step, features such as lines, edges, blobs or points are extracted from the image data. The Detection of relevant features and the Segmentation of the image into sub-images constitute the fourth step, and allow relevant features to be distinguished from irrelevant ones for the next stage.
High-level processing is the last step of the computer vision system. Its input is usually a small set of data describing the objects detected in the image. The task of this stage is to extract high-level information about the real objects (such as object type, position and size) from that data.
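As an illustration of how these stages compose, consider the following minimal C++ sketch. It is only a sketch under assumed choices: the function names, the specific OpenCV operations (Gaussian blur, Canny edges, contour extraction) and all parameter values are illustrative assumptions, not the code of our actual system.

    // A minimal sketch of the five-stage pipeline described above.
    // All processing choices here are assumptions for illustration only.
    #include <opencv2/opencv.hpp>
    #include <string>
    #include <vector>

    struct DetectedObject { std::string type; cv::Rect box; };

    std::vector<DetectedObject> processFrame(cv::VideoCapture& cam) {
        cv::Mat frame, gray, edges;
        cam >> frame;                                     // 1) image acquisition
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::GaussianBlur(gray, gray, cv::Size(5, 5), 0);  // 2) pre-processing: noise reduction
        cv::Canny(gray, edges, 50, 150);                  // 3) feature extraction: edges
        std::vector<std::vector<cv::Point>> contours;     // 4) detection/segmentation
        cv::findContours(edges, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
        std::vector<DetectedObject> objects;              // 5) high-level processing:
        for (const auto& c : contours)                    //    object type, position and size
            objects.push_back({"unknown", cv::boundingRect(c)});
        return objects;
    }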
We will analyze the above stages in detail in the following sections, while presenting the algorithm used by our robot's vision system.
B. The CIE L*a*b* Color Space
The algorithm we propose is closely connected with the concepts of color space, color model and human color perception [7, 8].
Numerous studies of color perception have shown that the human eye has photoreceptors that catch short (S), middle (M) and long (L) wavelengths, better known as the blue, green and red photoreceptors (the well-known RGB). In other words, human color sensation is produced by an appropriate combination of these three stimuli. These concepts were first formalized by the International Commission on Illumination (CIE) in 1931, with the mathematical definition of the first color space: the CIE XYZ color space [9]. One of the variants of the CIE XYZ color space is the so-called CIELAB (more precisely, the CIE L*a*b*) color space, currently considered the most complete color model for describing the full set of colors visible to the human eye.
Let us now examine the structure of this color space and the reasons that led us to adopt it in our vision system.
Unlike the "standard" (and probably better known) RGB color space, in the CIELAB the three parameters represent:
lightness (L*);
position between magenta and green (a*);
position between yellow and blue (b*).
In detail, lightness ranges between 0 and 100 (L* = 0 yields black, L* = 100 indicates white). Negative values of the a* parameter represent a movement along the magenta-green axis towards green, while positive values of a* indicate a movement towards red along the same axis. In the same way, negative values of the b* parameter represent a movement along the yellow-blue axis towards blue, while positive values of b* indicate a movement towards yellow along the same axis. For more clarity, please refer to figures 3 and 4, which show the LAB color space for two different values of lightness.
Fig. 3. The LAB color space at 25% lightness
Fig. 4. The LAB color space at 50% lightness
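For completeness, we also report how the three coordinates are computed from the CIE XYZ tristimulus values of the stimulus and of the reference white (X_n, Y_n, Z_n). This is the standard CIE 1976 definition, given here only for reference:

    L^* = 116 f(Y/Y_n) - 16
    a^* = 500 [f(X/X_n) - f(Y/Y_n)]
    b^* = 200 [f(Y/Y_n) - f(Z/Z_n)]

where f(t) = t^{1/3} if t > (6/29)^3 and f(t) = (1/3)(29/6)^2 t + 4/29 otherwise. Note that the chromatic information is carried entirely by the differences appearing in a* and b*, while the luminance Y alone determines L*: this is precisely the separation exploited in the following.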
Unlike other color spaces, such as RGB or CMYK, the CIE L*a*b* color space is an absolute color space. It has been developed to serve as a device-independent model to be used as a reference for the other color spaces. Notice that the LAB model is three dimensional and can only be properly represented in a three dimensional space. Of course, converting images from one color space to another introduces some error. However, according to the results of the tests performed by Dan Margulis [10], the loss can be considered completely negligible. But why use such a color space in our algorithm?
The answer to this question lies in the nature of the LAB model itself. To better understand this concept, let us have a look at figure 5.
Fig. 5. How many colors can you distinguish in the picture?
If someone asked you how many colors are present in the previous image, in a way that is almost independent of the lighting information, it would be easy for you to answer that there are essentially three: pink, blue and white. The CIE L*a*b* color space essentially has this capability: it can separate the color information from the lighting information, reducing the color parameters from three (R, G and B) to two (a* and b*). Not bad at all!
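To make the idea concrete, the following minimal C++ sketch converts an image to the Lab space with OpenCV and classifies pixels by their a* and b* coordinates alone, accepting any lightness. The "pink" bounds below are assumed values chosen for the example, not the thresholds of our actual system:

    // Minimal sketch: chromaticity-based segmentation in L*a*b*.
    // The "pink" bounds are illustrative assumptions only.
    #include <opencv2/opencv.hpp>

    cv::Mat segmentPink(const cv::Mat& bgr) {
        cv::Mat lab, mask;
        cv::cvtColor(bgr, lab, cv::COLOR_BGR2Lab);
        // For 8-bit images OpenCV rescales L to [0, 255] and shifts
        // a* and b* by +128, so the neutral (gray) point is (128, 128).
        // The L bounds span the whole range: lighting is ignored and
        // only the a* and b* chromatic coordinates are tested.
        cv::inRange(lab,
                    cv::Scalar(0,   150, 115),   // lower L, a*, b* bounds
                    cv::Scalar(255, 200, 160),   // upper L, a*, b* bounds
                    mask);
        return mask;                             // 255 where "pink", 0 elsewhere
    }

Because the L channel is left unconstrained, the same two chromatic thresholds keep working when the scene gets brighter or darker, which is exactly the property motivating our choice.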
The reasons that moved us towards the adoption of this color space in our project are related to the Eurobot 2006 experience [11]. In the previous edition of the competition, in fact, the vision system we designed based its color analysis simply on the analysis of the R, G and B channels. The system worked correctly, but it was necessary to set the correct thresholds for the analysis operation for each different lighting condition. With the new approach we completely overcome this problem.
C. The OpenCV Libraries
OpenCV is a set of open source computer vision libraries originally developed by Intel. The libraries are cross-platform and mainly aimed at real-time image processing. They have shown great results in a huge number of application areas, ranging from human-computer interfaces to robotics, passing through biometrics and information security. For all of these reasons we decided to use the OpenCV libraries in our application. We will comment on the specific OpenCV functions used in section IV, while discussing the proposed algorithm.
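As a small taste of the libraries' real-time orientation, the sketch below grabs and displays camera frames in a loop using the current C++ API. The camera index 0 and the window name are assumptions made for the example, and the libraries of that era exposed the equivalent functionality through a C interface:

    // Minimal real-time capture loop using the OpenCV C++ API.
    // Camera index 0 and the window name are example assumptions.
    #include <opencv2/opencv.hpp>

    int main() {
        cv::VideoCapture cam(0);             // open the first camera
        if (!cam.isOpened()) return 1;
        cv::Mat frame;
        while (cam.read(frame)) {            // grab the next frame
            cv::imshow("camera", frame);     // display it
            if (cv::waitKey(10) >= 0) break; // stop on any key press
        }
        return 0;
    }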