function. The comb function comprises our sampling grid which is conveniently nonzero
only at integral (x,y) coordinates. Therefore, gs(x,y) is now a discrete-continuous image
with intensity values defined only over integral indices of x and y.
Figure 2.9: Comb function.
Even after sampling, the intensity values continue to retain infinite precision. Since
computers have finite memory, each sampled point must be quantized. Quantization is a
point process that satisfies a nonlinear function of the form shown in Fig. 2.10. It reflects
the fact that accuracy is limited by the system's resolution.
Figure 2.10: Quantization function. (a) Uniform; (b) Nonuniform.
The horizontal plateaus in Fig. 2.10a are due to the fact that the continuous input is
truncated to a fixed number of bits, e.g., N bits. Consequently, all input ranges that share
the first N bits become indistinguishable and are assigned the same output value. This
form of quantization is known as uniform quantization. The difference q between successive
output values shrinks as N grows: for a fixed input range, q is proportional to 1/2^N, so
each additional bit halves it. That is, as the precision rises, the increments between
successive numbers grow smaller. In practice, quantization is intimately coupled with the
precision of the image pickup device in the imaging system.
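As a rough illustration of uniform quantization, the following Python sketch truncates an intensity to N bits. The function names and the normalized input range [0, 1) are assumptions made for this example; they do not come from the text.

```python
def quantize_uniform(value, n_bits):
    """Truncate a continuous intensity in [0, 1) to one of 2**n_bits levels."""
    if not 0.0 <= value < 1.0:
        raise ValueError("value must lie in [0, 1)")
    levels = 2 ** n_bits
    # Truncation: every input on the same plateau maps to the same integer.
    return int(value * levels)


def step_size(n_bits):
    """Spacing q between successive output levels, in input units."""
    return 1.0 / (2 ** n_bits)


if __name__ == "__main__":
    for v in (0.10, 0.12, 0.13, 0.49, 0.99):
        print(v, "->", quantize_uniform(v, 3))
    print("q for 3 bits:", step_size(3), " q for 8 bits:", step_size(8))
```

With 3 bits, the inputs 0.10 and 0.12 fall on the same plateau and become indistinguishable, while 0.13 lands on the next one. Under the assumed unit input range, the spacing is 1/2^N.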
Quantization is not restricted to be uniform. Figure 2.10b depicts nonuniform
quantization for functions that do not require equispaced plateau intervals. This permits
us to incorporate properties of the imaged scene and the imaging sensor when assigning
discrete values to the input. For instance, it is generally known that the human visual
system has greater acuity for low intensities. In that case, it is reasonable to assign more
quantization levels in the low intensity range at the expense of accuracy in the high intensity
range where the visual system is less sensitive anyway. Such a nonuniform quantization
scheme is depicted in Fig. 2.10b. Notice that the nonuniformity appears in both the
increments between successive levels and the extent of these intervals. This is
equivalent to performing a nonlinear point transformation prior to performing uniform
quantization.
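The equivalence noted above (a nonlinear point transformation followed by uniform quantization) can be sketched in Python as follows. The power-law curve, the parameter name gamma, and the normalized input range are illustrative assumptions, not the transformation used in Fig. 2.10b. For simplicity the sketch emits evenly spaced integer codes, so only the plateau widths are nonuniform here.

```python
def quantize_nonuniform(value, n_bits, gamma=0.5):
    """Nonlinear point transform (assumed power law), then uniform quantization.

    With gamma < 1 the low intensities are stretched, so more of the
    2**n_bits output levels are spent on the dark end of the range.
    """
    if not 0.0 <= value < 1.0:
        raise ValueError("value must lie in [0, 1)")
    levels = 2 ** n_bits
    warped = value ** gamma                 # nonlinear point transformation
    # Uniform quantization of the warped value; clamp guards against rounding.
    return min(int(warped * levels), levels - 1)


def plateau_edges(n_bits, gamma=0.5):
    """Input values at which the output level changes (unequal spacing)."""
    levels = 2 ** n_bits
    return [(k / levels) ** (1.0 / gamma) for k in range(levels + 1)]


if __name__ == "__main__":
    print(plateau_edges(2))            # narrow plateaus near 0, wide near 1
    print(quantize_nonuniform(0.05, 3), quantize_nonuniform(0.49, 3))
```

In this sketch with 3 bits, inputs below 0.5 occupy six of the eight levels, leaving only two for the bright half of the range.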
Returning to Fig. 2.8, we see that gs(x,y) passes through a quantizer to yield the
discrete-discrete (digital) image gd(x,y). The actual quantization is achieved through the
use of an analog-to-digital converter. Together, sampling and quantization comprise the
process known as digitization. Note that sampling actually refers to spatial quantization
(e.g., only a discrete set of spatial positions are defined) while the term quantization is
typically left to refer to the discretization of image values.
A digital image is an approximation to a continuous image f(x,y). It is usually
stored in a computer as an N x M array of equally spaced discrete samples:
\[
f(x,y) =
\begin{bmatrix}
f(0,0) & f(0,1) & \cdots & f(0,M-1) \\
f(1,0) & f(1,1) & \cdots & f(1,M-1) \\
\vdots & \vdots & & \vdots \\
f(N-1,0) & f(N-1,1) & \cdots & f(N-1,M-1)
\end{bmatrix}
\tag{2.2.4}
\]
Each sample is referred to as an image element, picture element, pixel, or pel, with
the last two names being commonly used abbreviations of "picture elements." Collec-
tively, they comprise the 2-D array of pixels that serve as input to subsequent computer
processing. Each pixel can be thought of as a finite-sized rectangular region on the
screen, much like a tile in a mosaic. Many applications typically elect N = M = 512
with 8 bits per pixel (per channel). In digital image processing, it is common practice to
let the number of samples and quantization levels be integer powers of two. These standards
are derived from hardware and software considerations. For example, even if only
6-bit pixels are required, an entire 8-bit byte is devoted to each pixel because packing 6-bit
quantities in multiples of 8-bit memory locations is impractical.
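The storage convention described above can be sketched in Python as a flat array with one byte per pixel. The class name GrayImage and the row-by-row layout are assumptions for this example; the indexing follows Eq. (2.2.4), with the first index selecting the row.

```python
class GrayImage:
    """N x M image of 8-bit samples stored row by row, one byte per pixel."""

    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols
        self.data = bytearray(rows * cols)   # zero-initialized storage

    def _index(self, x, y):
        if not (0 <= x < self.rows and 0 <= y < self.cols):
            raise IndexError("pixel coordinates out of range")
        return x * self.cols + y             # row-major offset of f(x, y)

    def get(self, x, y):
        return self.data[self._index(x, y)]

    def set(self, x, y, value):
        # bytearray itself rejects values outside 0..255.
        self.data[self._index(x, y)] = value


if __name__ == "__main__":
    image = GrayImage(512, 512)
    image.set(1, 2, 63)        # a 6-bit value still occupies a whole byte
    print(len(image.data))     # 262144 bytes for N = M = 512
```

Storing each sample in its own byte wastes two bits when only 6-bit pixels are needed, but it keeps the address computation to a single multiply and add.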
Digital images are the product of both spatial sampling and intensity quantization.
As stated earlier, sampling can actually be considered to be a form of spatial quantiza-
tion, although it is normally treated as the product of the continuous input image with a
sampling grid. Intensity quantization is the result of discretizing pixel values to a finite
number of bits. Note that these two forms of quantization apply to the image indices and
values, respectively. A tradeoff exists between sampling rate and quantization levels.
An interesting review of work in this area, as well as related work in image coding, is
described in [Netravali 80, 88]. Finally, a recent analysis on the tradeoff between sam-
pling and quantization can be found in [Lee 87].
2.3. IMAGING SYSTEMS
A continuous image is generally presented to a digitization system in the form of
analog voltage or current. This is usually the output of a transducer that transforms light
into an electrical signal that represents brightness. This electrical signal is then digitized
by an analog-to-digital (A/D) converter to produce a discrete representation that is suit-
able for computer processing. In this section, we shall examine several imaging systems
that produce an analog signal from scene radiance.
There are three broad categories of imaging systems: electronic, solid-state, and
mechanical. They comprise some of the most commonly used input devices, including
vidicon cameras, CCD cameras, film scanners, flat-bed scanners, microdensitometers,
and image dissectors. The imaging sensors in these devices are essentially transducers
that convert optical signals into electrical voltages.
The primary distinction between these systems is the imaging and scanning
mechanisms. Electronic scanners use an electron beam to measure light falling on a pho-
tosensitive surface. Solid-state imaging systems use arrays of photosensitive cells to
sense incident light. In these two classes, the scanned material and sensors are stationary.
Mechanical scanners are characterized by a moving assembly that transports the
scanned material and sensors past one another. Note that either electronic or solid-state
sensors can be used here. We now describe each of these three categories of digital
image acquisition systems in more detail.
2.3.1. Electronic Scanners
The name flying spot scanner is given to a class of electronic scanners that operate
on the principle of focusing an electron beam on a photodetector. The photodetector is a
surface coated with photosensitive material that responds to incident light projected from
an image. In this assembly, the image and photodetector remain stationary. Scanning is
accomplished with a "flying spot," which is a moving point of light on the face of a
cathode-ray tube (CRT), or a laser beam directed by mirrors. The motion of the point is
controlled electronically, usually through deflections induced by electromagnets or elec-
trostatics. This permits high scanning speeds and flexible control of the scanning pattern.
2.3.1.1. Vidicon Systems
Among the most frequently used imaging devices in this class are vidicon
systems, shown in Fig. 2.11. These devices have traditionally been used in TV cameras
to generate analog video signals. The main component is a glass vidicon tube containing
a scanning electron beam mechanism at one end and a photosensitive surface at
the other. An image is focused on the front (outer) side of the photosensitive surface,
producing a charge depletion on the back (inner) side that is proportional to the incident
light. This yields a charge distribution with a high density of electrons in the dark image
regions and a low electron density in the lighter regions. This is an electrical analog to
the photographic process that produces a negative image.
Figure 2.11: Vidicon tube [Ballard 82].
The charge distribution is "read" through the use of a scanning electron beam. The
beam, emanating from the cathode at the rear of the tube, is made to scan the charge dis-
tribution in raster order, i.e., row by row. Upon contact with the photosensitive surface,
it replaces the electron charge in the regions where the charge was depleted by exposure
to the light. This charge neutralization process generates fluctuations in the electron
beam current, generating the analog video signal. In this manner, the intensity values
across an image are encoded as analog currents or voltages with fluctuations that are pro-
portional to the incident light. Once a physical image has been converted to an analog
signal, it is sampled and digitized to produce a 2-D array of integers that becomes avail-
able for computer processing.
The spatial resolution of the acquired image is determined by the spatial scanning
frequency and the sampling rate: higher rates produce more samples. Sampling rates also
have an impact on the choice of photosensitive material used. Slower scan rates require
photosensitive material that decays slowly. This can introduce several artifacts. First,
high retention capabilities may cause incomplete readout of the charge distribution due to
the sluggish response. Second, slowly decaying charge gives rise to temporal blurring in
time-varying images whereby charge distributions of several images may get merged
together. This problem can be alleviated by saturating the surface with electrical charge
between exposures in order to reduce any residual images.
Vidicon systems often suffer from geometric distortions. This is caused by several
factors. First, the scanning electron beam often does not precisely retain positional
Clyde N. Herrick, TELEVISION THEORY AND SERVICING: Black/White and Color, 2/e,
©1976, p. 43. Reprinted by permission of Prentice Hall, Inc., Englewood Cliffs, New Jersey.
linearity across the full face of the surface. Second, the electron beam can be deflected
off course by high contrast charge (image) boundaries. This is particularly troublesome
because it is an image-dependent artifact. Third, the photosensitive material may be
defective with uneven charge retention due to nonuniform coatings. Several related sys-
tems offer more stable performance, including those using image orthicon, plumbicon,
and saticon tubes. Orthicon tubes have the additional advantage of accommodating flexi-
ble scan patterns.
2.3.1.2. Image Dissectors
Video signals can also be generated by using image dissectors. As with vidicon
cameras, an image is focused directly onto a cathode coated with a photosensitive layer.
This time, however, the cathode emits electrons in proportion to the incident light. This
produces an electron beam whose cross section is roughly the same as the geometry of
the tube surface. The beam is accelerated toward a target by the anode. The target is an
electron multiplier covered by a small aperture, or pinhole, which allows only a small
part of the electron beam emitted by the cathode to reach the target. Focusing coils focus
the beam, and deflection coils then scan it past the target aperture, where the electron
multiplier produces a varying voltage representing the video signal. The name "dissec-
tor" is derived from the manner in which the image is scanned past the target. Figure
2.12 shows a schematic diagram.
Figure 2.12: Image dissector [Ballard 82].
Image dissectors differ from vidicon systems in that dissectors are based on the
principle of photoemission, whereas vidicon tubes are based on the principle of photo-
conductivity. This manifests itself in the manner in which these devices sense the image.
In vidicon tubes, a narrow beam emanates from the cathode and is deflected across the
photosensitive surface to sense each point. In image dissectors, a wide electron beam is
Figure 2.21 of Computer Vision, edited by Dana Ballard and Christopher Brown, 1982. Copyright
©1982 by Prentice Hall, Inc., Englewood Cliffs, New Jersey. Reprinted courtesy of Michel
produced by the photosensitive cathode, and each point is sensed by deflecting the entire
beam past a pinhole onto some pickup device. This method facilitates noise reduction by
integrating the emission of each input point over a specified time interval. Although the
slow response of photoemissive materials limits the speed of image dissectors, the
integration capability makes image dissectors attractive in applications requiring high
signal-to-noise ratios for stationary images.
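The benefit of integrating each point over a time interval can be illustrated with a small Python simulation. The additive Gaussian noise model, the function names, and the numeric values are assumptions made for this sketch; they are not a model of any particular image dissector.

```python
import random
import statistics


def integrate_point(true_value, noise_sigma, n_readings, rng):
    """Average n_readings noisy measurements of a single image point."""
    total = 0.0
    for _ in range(n_readings):
        total += true_value + rng.gauss(0.0, noise_sigma)
    return total / n_readings


def residual_noise(n_readings, noise_sigma=5.0, trials=500, seed=0):
    """Standard deviation of the integrated value over repeated trials."""
    rng = random.Random(seed)
    results = [integrate_point(100.0, noise_sigma, n_readings, rng)
               for _ in range(trials)]
    return statistics.pstdev(results)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, "readings -> residual noise", round(residual_noise(n), 3))
```

Under this noise model the residual noise falls roughly as the square root of the number of readings, which is why longer integration suits stationary images.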
2.3.2. Solid-State Sensors
The most recent developments in image acquisition have come from solid-state
imaging sensors, known as charge transfer devices (CTDs). There are two main classes
of CTDs: charge-coupled devices (CCDs) and charge-injection devices (CIDs). They
differ primarily in the way in which information is read out.
2.3.2.1. CCD Cameras
A CCD is a monolithic array of closely spaced MOS (metal-oxide semiconductor)
capacitors on a small rectangular solid-state surface. Each capacitor is often referred to
as a photosite, or potential well, storing charge in response to the incident light intensity.
An image is acquired by exposing the array to the desired scene. The exposure creates a
distribution of electric potential throughout all the capacitors. The sampled analog, or
discrete-continuous, video signal is generated by reading each well sequentially. This
signal is then digitized to produce a digital image.
The electric potential is read from the CCD in a process known as bucket brigade
due to its resemblance to shift registers in computer logic circuits. The first potential
well on each line is read out. Then, the electric potential along each line is shifted by one
position. Note that connections between capacitors along a line permit charge to shift
from element to element along a row. The read-shift cycle is then repeated until all the
potential wells have been shifted out of the monolithic array. This process is depicted in
Fig. 2.13.
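The read-shift cycle described above can be mimicked in Python with ordinary lists. This is only a sketch of the shift-register analogy: the function name is hypothetical, the wells are assumed to form a rectangular array, and real CCD transfer architectures differ in detail.

```python
def bucket_brigade_readout(wells):
    """Read a 2-D list of well charges by repeated read-shift cycles.

    Returns one list per cycle, holding the value read from each line.
    """
    lines = [list(row) for row in wells]    # copy; leave the input intact
    width = len(lines[0]) if lines else 0
    cycles = []
    for _ in range(width):
        # Read the first potential well on every line.
        cycles.append([line[0] for line in lines])
        # Shift each line by one position; an empty well enters at the end.
        for line in lines:
            line.pop(0)
            line.append(0)
    return cycles


if __name__ == "__main__":
    charges = [[1, 2, 3],
               [4, 5, 6]]
    print(bucket_brigade_readout(charges))   # [[1, 4], [2, 5], [3, 6]]
```

After as many cycles as there are wells per line, every charge has passed through the first position and the array is empty.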
CCD arrays are packaged as either line sensors or area sensors. Line sensors consist
of a scanline of photosites and produce a 2-D image by relative motion with the scene.
This is usually integrated as part of a mechanical scanner (more on this later) whereby
some mechanical assembly moves the line sensor across the entire physical image. Area
sensors are composed of a 2-D matrix of photosites.
CCDs have several advantages over vidicon systems. The chief benefits are derived
from the extremely linear radiometric (intensity) response and increased sensitivity.
Unlike vidicon systems that can yield no more than 8 bits of precision because of analog
noise, a CCD can easily provide 12 bits of precision. Furthermore, the fixed position of
each photosite yields high geometric precision. The devices are small, portable, reliable,
cheap, operate at low voltage, consume little power, are not damaged by intense light,
and can provide images of up to 2000 x 2000 samples. As a result, they have made their
way into virtually all modern TV cameras and camcorders. CCD cameras also offer
superior performance in low lighting and low temperature conditions. As a result, they
are even utilized in the NASA Space Telescope project and are found aboard the Galileo
spacecraft that is due to orbit Jupiter in the early 1990s. Interested readers are referred to
[Janesick 87] for a thorough treatment of CCD technology.
Figure 2.13: CCD readout mechanism [Green 89].
2.3.2.2. CID Cameras
Charge-injection devices resemble charge-coupled devices except that the readout,
or sensing, process is different. Instead of behaving like a shift register during sensing,
the charges are confined to the photosites where they were generated. They are read by
using a row-column addressing technique similar to that used in conventional computer
memories. Basically, the stored charge is "injected" into the substrate and the resulting
displacement current is detected to create the video signal. CIDs are better than CCDs in
the following respects: they offer wider spectral and dynamic range, increased tolerance
to processing defects, simple mechanization, avoidance of charge transfer losses, and
minimized blooming. They are, however, not superior to CCD cameras in low light or
low temperature settings.
2.3.3. Mechanical Scanners
A mechanical scanner is an imaging device that operates by mechanically passing
the photosensors and images past one another. This is in contrast to electronic and
solid-state scanners in which the image and photodetector both remain stationary. However,
it is important to note that either of these two classes of systems can be used in a
mechanical scanner.
There are three primary types of mechanical scanners: flat-bed, drum, and scanning
cameras. In flat-bed scanners, a film or photograph is laid on a flat surface over which
the light source and the sensor are transported in a raster fashion. In a drum digitizer, the
image is mounted on a rotating drum, while the light beam moves along the drum
Digital Image Processing, by W.B. Green, ©1989 Van Nostrand Reinhold. Reprinted by
permission of the Publisher. All Rights Reserved.
parallel to its axis of rotation. Finally, scanning cameras embed a scanning mechanism
directly in the camera. In one manifestation, they use stationary line sensors with a mir-
ror to deflect the light from successive image rows onto the sensor. In a second manifes-
tation, the actual line sensor is physically moved inside the camera. These techniques
basically address the manner in which the image is presented to the photosensors. The
actual choice of sensors, however, can be taken from electronic scanners or solid-state
imaging devices. Furthermore, the light sources can be generated by a CRT, laser beam,
lamp, or light-emitting diodes (LEDs).
Microdensitometers are film scanners used for digitizing film transparencies or photographs
at spot sizes ranging down to one micron. These devices are usually flat-bed
scanners, requiring the scanned material to be mounted on a flat surface which is
translated in relation to a light beam. The light beam passes through the transparency, or
it is reflected from the surface of the photograph. In either case, a photodetector senses
the transmitted light intensity. Since microdensitometers are mechanically controlled,
they are slow image acquisition devices, but offer high geometric precision.
2.4. VIDEO DIGITIZERS
Many image acquisition systems generate television signals. These are analog
video signals that are acquired in a fixed format, according to one of the three color telev-
ision standards: National Television Systems Committee (NTSC), Sequential Couleur
Avec Memoire (SECAM, or sequential chrominance signal with memory), and Phase
Alternating Line (PAL). These systems establish format conventions and standards for
broadcast video transmission in different parts of the world. NTSC is used in North
America and Japan; SECAM is prevalent in France, Eastern Europe, the Soviet Union,
and the Middle East; and PAL is used in most of Western Europe, including West Ger-
many and the United Kingdom, as well as South America, Asia, and Africa.
The NTSC system requires the video signal to consist of a sequence of frames, with
525 lines per frame, and 30 frames per second. Each frame is a complete scan of the tar-
get. In order to reduce transmission bandwidth, a frame is composed of two interlaced
fields, each consisting of 262.5 lines. The first field contains all the odd lines and the
second field contains the even lines. To reduce flicker, alternate fields are sent at a rate
of 60 fields per second.
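The division of a frame into two interlaced fields can be sketched in Python as follows. The function names are hypothetical, and scanlines are treated as whole list entries numbered from 1. Because a list of whole scanlines cannot represent the half line of the analog format, the two fields here differ in length by one.

```python
def split_fields(frame):
    """Split a frame (list of scanlines) into its odd and even fields.

    Lines are numbered from 1, so the odd field holds lines 1, 3, 5, ...
    """
    odd_field = frame[0::2]
    even_field = frame[1::2]
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave two fields back into a single frame."""
    frame = []
    for i, line in enumerate(odd_field):
        frame.append(line)
        if i < len(even_field):
            frame.append(even_field[i])
    return frame


if __name__ == "__main__":
    frame = list(range(1, 526))            # 525 line numbers stand in for lines
    odd, even = split_fields(frame)
    print(len(odd), len(even))             # 263 262
    print(weave(odd, even) == frame)       # True
```

Sending the two fields alternately refreshes the screen twice per frame, which is how 30 frames per second yields 60 fields per second.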
The NTSC system further reduces transmission bandwidth by compressing chrominance
information. Colors are represented in the YIQ color space, a linear transformation
of RGB. The term Y refers to the monochrome intensity. This is the only signal that
is used in black-and-white televisions. Color televisions have receivers that make use of
I and Q, the in-phase and quadrature chrominance components, respectively. The conversion
between the RGB and YIQ color spaces is given in [Foley 90]. Somewhat better
quality is achieved with the SECAM and PAL systems. Although they also bandlimit
chrominance, they both use 625 lines per frame, 25 frames per second, and 2:1 line interlacing.
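For readers who want to experiment with the YIQ representation, Python's standard colorsys module offers an RGB-to-YIQ conversion. The sketch below uses it as is. Its coefficients may differ slightly from those tabulated in [Foley 90], and the wrapper function names are assumptions for this example.

```python
import colorsys


def to_yiq(r, g, b):
    """Convert RGB components in [0, 1] to (Y, I, Q) via the stdlib."""
    return colorsys.rgb_to_yiq(r, g, b)


def monochrome_signal(r, g, b):
    """Return only Y, the component a black-and-white receiver would use."""
    y, _i, _q = to_yiq(r, g, b)
    return y


if __name__ == "__main__":
    print(to_yiq(0.5, 0.5, 0.5))           # gray: I and Q are essentially zero
    print(monochrome_signal(1.0, 0.0, 0.0))
```

A neutral gray carries no chrominance, so only Y is nonzero. This matches the observation that monochrome receivers can ignore I and Q altogether.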
In recent years, many devices have been designed to digitize video signals. The
basic idea of video digitizers involves freezing a video frame and then digitizing it. Each
NTSC frame contains 482 visible lines with 640 samples per line. This is in accord with
the standard 4:3 aspect ratio of the screen. At 8 bits/pixel, this equates to roughly one
quarter of a megabyte for a monochrome image. Color images require three times this
amount. Even more memory is needed for high-definition television (HDTV) images.
Although no HDTV standard has yet been formally established, HDTV color images
with a resolution of, say, 1050 x 1024 require approximately 3 Mbytes of data! Most
general-purpose computers cannot handle the bandwidth necessary to transfer and pro-
cess this much information, especially at a rate of 30 frames per second. As a result,
some form of rate buffering is required.
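The memory figures quoted above follow from simple arithmetic, worked through in the Python sketch below. The helper name frame_bytes is hypothetical, and the sketch assumes each sample occupies a whole number of bytes.

```python
def frame_bytes(lines, samples_per_line, bits_per_sample=8, channels=1):
    """Bytes needed for one frame, rounding each sample up to whole bytes."""
    bytes_per_sample = (bits_per_sample + 7) // 8
    return lines * samples_per_line * channels * bytes_per_sample


if __name__ == "__main__":
    ntsc_mono = frame_bytes(482, 640)                  # 308480 bytes
    ntsc_color = frame_bytes(482, 640, channels=3)     # 925440 bytes
    hdtv_color = frame_bytes(1050, 1024, channels=3)   # 3225600 bytes
    print(ntsc_mono, ntsc_color, hdtv_color)
    # Sustained transfer rate for HDTV color at 30 frames per second.
    print(hdtv_color * 30, "bytes per second")         # 96768000
```

The monochrome NTSC frame comes to about 0.3 Mbytes, in line with the rough quarter-megabyte figure, and the HDTV example implies close to 100 Mbytes per second at 30 frames per second.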
Rate buffering is a process in which data arriving at a high rate are written to an intermediate
storage device as they are acquired and then read out from that storage at a
lower rate. The intermediate memory is known as a frame buffer or
frame store. Its single most distinguishing characteristic is that its contents can be writ-
ten or read at TV rates. In addition, it is sometimes enhanced with many memory-
addressing modes, including real-time zoom (pixel replication), scroll (vertical shifts),
and pan (horizontal shifts). Such video digitizers operate at frame rates, and are also
known as frame grabbers. Frame grabbers attached to CCD or vidicon cameras have
become popular digital image acquisition systems due to their low price, general-purpose