Digital image warping

Download 2.54 Mb.
Size2.54 Mb.
1   ...   12   13   14   15   16   17   18   19   ...   30

of the filters to a high contrast image with edges oriented in many directions. The

Madonna image is typical of many natural images with smoothly varying regions (skin),

high frequency regions (hair), and sharp transitions on curved boundaries (cheek). Also,

a human face (especially one as famous as this) comes with a significant amount of a

priori knowledge, which may affect the subjective evaluation of quality. Only mono-

chrome images are used here to avoid obscuring the results over three color channels.

In Fig. 5.21, a small 50x50 section was taken from the center of the Star image,

and magnified to 500 x 500 by using the following interpolation metbeds: nearest neigh-

bor, linear interpolation, cubic convolution (with A =-1), and cubic convolution (with

A =-.5). Figure 5.22 shows the same image magnified by the following interpolation

metheds: cubic spline, Lanczos2 windowed sine function, Hamming windowed sine, and

the exponential filter derived from the tanh function.

The algorithms are rated according to the passband and stopband performances of

their interpolation kernels. If an additional process is required to compute coefficients

used together with the kernel, its effect must be evaluated as well. In [Parker 83], the

authors failed to consider this when they erroneously concluded that cubic convolution is

superior to cubic spline interpolation. Their conclusion was based on an inappropriate

comparison of the cubic B-spline kernel with that of the cubic convolution. The fault lies

in neglecting the effect of computing the coefficients in Eq. (5.3.1). Had the data sam-

ples been directly convolved with the cubic B-spline kernel, then the analysis would have

been correct. However, in performing a matrix inversion to determine the coefficients, a

certain periodic filter must be multiplied together with the spectrum of the cubic B-spline

in order to preduce the interpolation kernel. The resulting kernel can be easily demon-

strated to be of infinite support and oscillatory, sharing the same properties as the Cardi-

nal spline (sine) kernel [Madand 88]. This is reasonable, considering the recursive

nature of the interpolation kernel. By a direct comparison, cubic spline interpolation per-

forms better than cubic convolution, albeit at slightly greater computational cost.

It is important to note that high quality interpolation algorithms are not always war-

ranted for adequate reconstruction. This is due to the natural relationship that exists

between the rate at which the input is sampled and the interpolation quality necessary for

accurate reconstruction. If a bandiimited input is densely sampled, then its replicating

spectra are spaced far apart. This diminishes the role of frequency leakage in the degra-

dation of the reconstructed signal. Consequently, we can relax the accuracy of the inter-

polation kernel in the stopband. Therefore, the stopband performance necessary for ade-

quate reconstruction can be made a function of the input sampling rate. Low sampling

rates require the complexity of the sine function, while high rates allow simpler algo-

rithms. Although this result is intuitively obvious, it is reassuring to arrive at the same


(a) (b)

(a) (b)


(c) (d)

Figure 5.21: Image reconstmction. (a) Nearest neighbor; (b) Linear interpolation; (c)

Cubic convolution (A =-1); (d) Cubic convolution (A =-.5).

conclusion from an interpretation in the frequency domain.

The above discussion has focused on reconstructing gray-scale (color) images.

Complications emerge when the attention is resMcted to bi-level (binary) images. In

[Abdou 82], the authors analyze several interpolation schemes for biqev½l image applica-

tions. This is of practical importance for the geometric transformation of images of

black-and-white documents. Subtleties are introduced due to the nonlinear elements that

(c) (d)

Figure 5.22: Image reconstruction. (a) Cubic spline; (b) Lanczos2 window; (c) Hamming

window; (d) Exponential filter.

enter into the imaging process: quantization and thresholding. Since binary signals are

not bandlimited and the nonlinear effects are difficult to analyze in the frequency

domain, the analysis is performed in the spatial domain. Their results confirm the con-

clusions already derived regarding interpolation kernels. In addition, they arrive at useful

results relating the errors introduced in the tradeoff between sampling rate and quantiza-




In this section, we present two methods to speed up the image resampling stage.

The first approach addresses the computational bottleneck of evaluating the interpolation

function at any desired position. These computed values are intended for use as weights

applied to the input. The second method we describe is a fast 1-D resampling algorithm

that combines image reconstraction with antialiasing to perform image resampling in

scanline order. This algorithm, as originally proposed, implements reconstruction using

linear interpolation, and implements antialiasing using a box filter. It is ideally suited for

hardware implementation for use in nonlinear image warping.

5.6.1. Interpolation with Coefficient Bins

When implementing image resampling, it is necessary to weigh the input samples

with appropriate values taken from the interpolation kernel. Depending on the inverse

mapping, the interpolation kernel may be centered anywhere in the input. The weights

applied to the neighboring input pixels must be evaluated by sampling the centered ker-

nel at positions coinciding with the input samples. By making some assumptions about

the allowable set of positions at which we can sample the interpolation kernel, we can

accelerate the resampling operation by precomputing the input weights and storing them

in lookup tables for fast access during convolution [Ward 89].

In [Ward 89], image resampling is done by mapping each output point back into the

input image, i.e., inverse mapping. The distance between input pixels is divided into a

number of intervals, or bins, each having a set of precomputed coefficients. The set of

coefficients for each bin corresponds to samples of the interpolation function positioned

at the center of the bin. Computing each new pixel then requires quantization of its input

position to the nearest bin and a table lookup to obtain the corresponding set of weights

that are applied to the respective input samples. The advantage of this method is that the

calculation of coefficients, which requires evalntion of the interpolation function at posi-

tions corresponding to the original samples, is replaced by a table lookup operation. A

mean-squared error analysis with this method shows that the quantization effects due to

the use of coefficient bins can be made the same as integer roundoff if 17 bins are used.

More coefficient bins yields a higher density of points for which the integ0olation func-

tion is accurately computed, yielding more precise output values. Ward shows that a

lookup table of 65 coefficient bins adds virtually no error to that due to roundoff.

This approach is demonstrated below for the special case of 1-D magnification. The

function magnify_ 1D takes IN, an input array of INlen pixels, and magnifies it to fill

OUTlen entries in OUT. For convenience, we will assume that the symmetric convolu-

tion kernel extends over seven pixels (three on each side), i.e., a 7-point kernel. The ker-

nel is oversampled at a rate of Oversample samples per pixel. Therefore, if we wish to

center the kernel at any of, say, 512 subpixel positions, then we would choose Oversam-

ple =512 and initialize kern with 7 x512 kernel samples. Note that the 7-point kernel in

the code is included to make the program more efficient and readable. A more general

version of the code would perafit kernels of arbitrary width.


#define KernShift 12

#define KernHaft (1 << (KernShift-I))

#define Oversample512

magnify_l D(IN, OUT, INlen, OUTlen, kern)

unsigned char *IN, *OUT;

int INlen, OUTlen, *kern;

12-bit kernel integers */

1/2 in kernel's notat[on */

subdivisions per pixel */

int x, i, ii, dii, ff, dff, len;

long val;

len = OUTlen;



ii = 0; F ii indexes into bin '/

ff = OUTlen / 2; /* ff is fractional remainder '/

x = INlen * Oversample;

dii = x / OUTten; /* dii is ii increment '/

dlf = x % OUTlen; /* dff is ff increment '/

/* compute all output pixels '/

for(x=0; x

/* compute convolution centered at current position '/

val = (long) IN[-2] * kern[2*Oversample + ii]

+ (long) IN[-1] * kern[l*Oversample + ii]

+ (long) IN[ 0] * kernill]

+ (long) IN[ 1] * kern[l*Overeample - ii]

+ (long) tN[ 2] * kem[2*Oversample - ii]

+ (long) IN[ 3] * kern[3*Overeample - ii];

if(ii == 0)

val += (long) IN[-3] * kern[3*Oversample + ii];

/* roundoff and restore into 8-bit number*/

val = (val + KernHalf) >> KernShift;

if(val < 0) val = 0; /* clip from below '/

if(val > 0xFF) val = 0xFF; /* clip from above '/

OUT[x] = val; /* save result '/

F Bresenham-like algorithm to recenter kernel */

if((ff += dff) >= OUTlen) { F check if fractional part overflows */

ff -= OUTlen; F normalize */

ii++; /* increment integer part */


if((ii += dii) >= Oversample) { F check if integer part overllows */

ii -= Oversample; F normalize */

iN++; F increment input pointer */





The function magnify_ 1D above operates exclusively using integer arithmetic. This

proves to be efficient for those applications in which a floating point accelerator is not

available. We choose to represent kernel samples as integers scaled by 4096 for 12-bit

accuracy. Note that the sum of 7 products, each of which is 20 bits long (8-bit intensity

and 12-bit kernel), can be stored in 23 bits. This leaves plenty of space after packing the

results into 32-bit integers, a common size used in most machines. Although more bits

can be devoted to the precision of the kernel samples, higher quality results must neces-

sarily use larger values of Oversample as well.

A second motivation for using integer arithmetic comes from the desire to circum-

vent division while recentering the kernel. The most tfimct way of computing where to

center the kernel in IN is by evaluating (x) (lNlen / OUTlen), where x is the index into

OUT. An alternate, and cheaper, approach is to compute this positional information

incrementally using rational numbers in mixed radix notation [Massalin 90]. We identify

three variables of interest: IN, ii, and if. IN is the current input pixel in which the kernel

center resides. It is subdivided into Oversample bins. The variable ii indexes into the

proper bin. Since the txue floating point precision has been quantized in this process, ffis

used to maintain the position within the bin. Therefore, as we scan the input, the new

positions can be determined by adding di to ii and dff to if. When doing so, ff may

overflow beyond OUTlen and ii may overflow beyond Oversample. In these cases, the

appropriate roundoff and norrealizations must be made. In particular, ii is incremented if

ffis found to exceed OUTlen and IN is incremented if ii is found to exceed Oversample.

It should be evident that ii and if, taken together, form a fractional component that is

added IN. Although only ii is needed to determine which coefficient bin to select, ff is

needed to prevent the accrual of error during the incremental computation. Together,

these three variables form the following pointer P into the input:

P = IN + ii +if/OUTlen (5.6.1)


where 0 < ii < Oversample and 0 -

A few additional remarks are in order here. Since the convolution kernel can extend

beyond the image boundary, we assume that IN has been padded with a 3-pixel border on

both sides. For minimal border artifacts, their values should taken to be that of the image

boundary, i.e., IN[0] and lN[INlen-1], respectively. After each output pixel is com-

puted, the convolution kernel is shifted by an amount (lNlen-1)/(OUTlen-1). The

value -1 enters into the calculation because this is necessary to guarantee that the boun-

dary values will remain fixed. That is, for resampling operations such as magnification,

we generally want OUT [0] =IN [0] and OUT [OUTlen -1] =IN [lNlen-1].

Finally, it should be mentioned that the approach taken above is a variant of the

Bresenham line-drawing algorithm. Division is made unnecessary by use of rational

arithmetic where the integer numerator and denominator am maintained exactly. In Eq.

(5.6.1), for instance, ii can be used directly to index in the kernel without any additional

arithmetic. A mixed ratfix notation is used, such that ii and ffcannot exceed Oversample

and OUTlen, respectively. These kinds of incremental algorithms can be viewed in terms


of mixed ratfix arithmetic. An intuitive example of this notation is our system for telling

time: the units of days, hours, minutes, and seconds am not mutually related by the same

factor. That is, 1 day = 24 hours, 1 hour = 60 minutes, etc. Performing arithmetic above

is similar to performing calculations involving time. A significant difference, however,

is that whereas the scales of time are fixed, the scales used in this magnification example

are derived from lNlen and OUTlen, two data-dependent parameters. A similar approach

to the algorithm described above can be taken to perform minification. This problem is

left as an exercise for the reader.

5.6.2. Fanifs Resampling Algorithm

The central benefit of separable algorithms is the reduction in complexity of 1-D

resampling algorithms. When the input is restricted to be one-dimensional, efficient

solutions are made possible for the image reconstxuction and antialiasing components of

resampling. Fant presents a detailed description of such an algorithm that is well-suited

for hardware implementation [Fant 86]. Related patents on this method include [Graf 87,

Fant 89].

The process treats the input and output as stxearns of pixels that are consumed and

generated at rates determined by the spatial mapping. The input is assumed to be

mapped onto the output along a single direction, i.e., with no folds. As each input pixel

arrives, it is weighted by its partial contribution to the current output pixel and integrated

into an accumulator. In terms of the input and output stxeams, one of three contritions is


1. The current input pixel is entirely consumed without completing an output pixel.

2. The input is entirely consumed while completing the output pixel.

3. The output pixel will be completed without entirely consuming the current input

pixel. In this case, a new input value is interpolated from the neighboring input pix-

els at the position where the input was no longer consumed. It is used as the next

element in the input stream.

If conditions (2) or (3) apply, the output computation is complete and the accumula-

tor value is stored into the output array. The accumulator is then reset to zero in order to

receive new input contributions for the next output pixel. Since the input is unidirec-

tional, a one-element accumulator is sufficient. The process continues to cycle until the

entire input stream is consumed.

The algorithm described in [Fant 86] is a principal 1-D resampling method used in

separable txansformations defined in terms of forward mapping functions. Like the

example given in the preceding section, this method can be shown to use a variant of the

Bresenham algorithm to step through the input and output streams. It is demonstrated in

the example below. Consider the input arrays shown in Fig. 5.23. The first array

specifies the values of Fv(U) for u=0, t ..... 4. These represent new x-coortfinates for

their respective input pixels. For instance, the leftmost pixel will start at x = 0.6 and ter-

minate at x=2.3. The next input pixel begins to influence the output at x=2.3 and

proceeds until x = 3.2. This continues until the last input pixel is consumed, filling the


output between x = 3.3 and x = 3.9.

The second array specifies the distribution range that each input pixel assumes in

the output. It is simply the difference between adjacent coordinates. Note that this

requires the first array to have an additional element to define the length of the last input

pixel Large values correspond to stretching, and small values reflect compression. They

determine the rate at which input is consumed to generate the output stream.

The input intensity values are given in the third array. Their contributions to the

output stream is marked by connecting segments. The output values are labeled A 0

through A 3 and are defined below. For clarity, the following notation is used: interpo-

lated input values are written within square brackets ([]), weights denoting contributions

to output pixels are written within an extra level of parentheses, and input intensity

values are printed in boldface.

Fv(u) .6 2.3 3.2 3.3 3.9

AFv(u ) 1.7 .9 .1 .6

put I00 90t

Figure 5.23: Resampilng example.

A0 = (100)((.4))=40


The algorithm demonstrates both image reconsWaction and antialiasing. When we

are not positioned at pixel boundaries in the input stream, linear interpolation is used to

reconstruct the discrete input. When more than one input element contributes to an out-

put pixel, the weighted results are integrated in an accumulator to achieve antialiasing.

These two cases are both represented in the above equations, as denoted by the expres-

sions between square brackets and double parentheses, respectively.

Fant presents several examples of this algorithm on images that undergo

magnification, minification, and a combination of these two operations. The resampling

algorithm is illustrated in Fig. 5.24. It makes references to the following variables.

SIZFAC is the multipilcative scale factor from the input to the output. For example,

SIZFAC =2 denotes two-fold magnification. INSFAC is 1/SIZFAC, or the inverse size

factor. It indicates how many input pixels contribute to each output pixel. Thus, in the

case of two-fold magnification, only one half of an input pixel is needed to fill an entire

output pixel. INSEG indicates how much of the current input pixel remains to contribute

to the next output pixel. This ranges from 0 (totally consumed) to 1 (entirely available).

Analogously, OUTSEG indicates how much of the current output pixel remains to be

filled. Finally, Pixel is the intensity value of the current input pixel.

[put Output

(;c n' [slZFAC$-nccamulator I

Figure 5.24: Fant's resampling algorithm [Fant 86].

The following C code implements the algorithm as described above. Input intensi-

ties are found in IN, an array containing INlen entries. The output is stored in OUT, an

array of OUTlen elements. In this example, we assume that a constant scale factor,

OUTlen/lNlen, applies to each pixel.

ReHinted from IEEE Computer Graphics and Application. v, Volume 6, Numar 1, J.nuay 1986,


resample(IN, OUT, INlen, OUTlen)

unsigned char 'IN, *OUT;

int INlen, OUTlen;

int u, x;

double ace, intensity, INSFAC, SIZFAC, INSEG, OUTSEG;

SIZFAC = (double) OUTlen/INlen;



INSEG = 1.0;

ace = 0.;

/* compute all output pixels */

for(x = u = O; x < OUTlen; ) {

/* scale factor */

/* inverse scale factor */

/* # input pixels that map onto 1 output pixel '/

/* entire input pixel is available */

/* clear accumulator */

/* use linear interpolation for reconstruction */

intensity = (INSEG * IN[u]) + ((1-1NSEG) * IN[u+l]);

/* INSEG < OUTSEG: input pixel is entirely consumed before output pixel */


ace += (intensity * INSEG); /* accumulate weighted contribution '/

OUTSEG -= INSEG; /* INSEG portion has been filled */

INSEG = 1.0; /* new input pixel will be available */

u++; /* index into next input pixel */


/* [NSEG >= OUTSEG: input pixel is not entirely consumed before output pixel */

else {

ace += (intensity * OUTSEG); /* accumulate weighted contribution */

OUT[x] = ace * SIZFAC; /* init output with normalized accumulator */

acc= 04 /* reset accumulator for next output pixel */

INSEG -= OUTSEG; /* OUTSEG portion of input has been used */

OUTSEG = INSFAC; /* restore OUTSEG */

x++; /* index into next output pixel */

The code given above is restricted to transformations characterized by a constant

scale factor, i.e., affine txansformations. It can be modified to handle nonlinear image

warping, where the scale factor is made to vary from pixel to pixel. The more sophisti-

cated mappings involved in this general case can be conveniently stored in a forward

mapping address table F. The elements of this table are point samples of the forward

mapping function -- it contains the output coordinates for each input pixel. Conse-

quently, F is made to have the same dimensions as IN, the array containing the input

image values.


msample_gen(F, IN, OUT, INlen, OUTlen)

double *F;

unsigned char 'IN, 'OUT;

int INlen, OUTlen;


int u, x;

double ace, intensity, inpos[2048], INSFAC, INSEG, OUTSEG;

/* precompute input index for each output pixel */

for(u = x = 0; x < OUTlen; x++) {

while(F[u+l] < x) u++;

inpos[x] = u + (double) (x-F[u]) / (F[u+l]-F[u]);


INSEG = 1.0; /* entire input pixel is available */

OUTSEG = i npos[1]; /* # input pixels that map onto 1 output pixel */

INSFAC = OUTSEG; /* inverse scale factor*/

ace = 0.; /* clear accumulator */

/* compute all output pixels '/

Directory: filedownload

Download 2.54 Mb.

Share with your friends:
1   ...   12   13   14   15   16   17   18   19   ...   30

The database is protected by copyright © 2024
send message

    Main page