LAB #8: Piecewise Polynomial Interpolation

Introduction
MATLAB Tips
Piecewise Parabolic Interpolation
Cubic Hermite Interpolation
Cubic Spline Interpolation
Parametric Interpolation
ASSIGNMENT

Introduction

Piecewise linear interpolation has many good properties. In particular, if there was a continuously differentiable function f(x) generating the data, the data points were suitably spread throughout the closed interval, then the interpolant converged to the function.

Now we'd like to consider some common attempts to improve the interpolant, increasing the rate of convergence, or the smoothness. Unfortunately, the formulas for these interpolants get complicated, so I'll give you most of the information you need.

We will also take a quick look at one way that interpolation can move into higher dimensions.

At the end of this lab is an assignment due by the beginning of the next lab session. The assignment involves a graph, and you simply have to show it to me.

MATLAB Tips

We're using the Runge function to examine the performance of our interpolation, which is fine. It means it's easy to increase the number of data points, get the derivative, and so on. However, the Runge function is pretty smooth, and easy to interpolate. We'll want to use some other functions to give the interpolation routines a few headaches.

A very interesting function is 1 at a single place and 0 elsewhere. We'll call this the tack function. Here's how you could define it in MATLAB:

function pval = tack ( x )
n = length ( x );
pval = ( [1:n] == floor(n/2) );

I need to tell you that FLOOR(X) returns the integer part of a real value, and that in MATLAB, a logical expression has the value 1 if true, and 0 if false.

Another function is 0 for a while and then 1. We'll call this the step function. Here's how you could define it in MATLAB:

function pval = step ( x )
n = length ( x );
pval = floor ( 2 * [0:n-1] / n );

The comb function is 0 and 1 and 0 and 1 and ...:

function pval = comb ( x )
n = length ( x );
pval = mod ( [1:n], 2 );

If we increase the divisor 2 to a larger value, we might call the result a rasp function.

These functions are artificial; they don't really correspond to a formula that applies for all x. But they do tend to show up the differences and defects of an interpolation method.

In the exercises below, I've asked you to look at the Runge function over and over again. I imagine that must get boring after a while. When you want to do something more interesting, try using one of these artificial functions instead. The tack function is interesting, because it shows how changing one data value affects the entire code.

Piecewise Parabolic Interpolation

The piecewise linear interpolant converges, but it doesn't look very nice. Can we try a similar interpolation using pieces of parabolas, a quadratic interpolant? If we do the right things, we'd expect the convergence rate to increase, and the interpolant to be a little more graceful over the interpolation intervals. We'll still have those unpleasant "kinks" in the derivatives at the interior abscissas, but we'll see if we can deal with them later.

To define a parabolic function, we need three data points. So each piece of the piecewise function will actually be defined over two consecutive data intervals.

To evaluate the piecewise quadratic interpolant at a point XVAL, we need to:

determine the interval LEFT;
shift LEFT down by 1 if it is even (why?).
determine the parabola which passes through the data points;
evaluate that parabola at XVAL.

If we want to evaluate the interpolant at XVAL, we need to do the usual "bracketing" business. If we call our bracket routine, it will return the value of the data point just to the left of XVAL. But the data points are now in groups of three, and we want to know the data point that begins the set of three. How do we do this?

Warmup Exercise: Define 11 data values XDATA evenly spaced from 0 to 100, and define 101 sample values X, also evenly spaced from 0 to 100. If we compute

y = bracket ( xdata, x )

we know we'll get bracket values suitable for linear interpolation. In particular, here are some:

        x 5 15 25 35 45 ...
        y 1  2  3  4  5 ...

For parabolic interpolation, we need to adjust the "2" to be a "1", the "4" to be a "3", and in general, all the even points need to decrease by 1. Can you think of a nice way to do this? You might want to use the MATLAB FLOOR function, which rounds a number to the nearest integer value. Can you see how this would work? Fill in the missing line, and then plot X versus Y. You should see a staircase with only 5 steps.

        xdata = linspace ( 0, 100, 11 );
        x = linspace ( 0, 100, 101 );
        y = bracket ( xdata, x );
        y = ???

The next piece of the puzzle is, if we have three data points (x0,y0), (x1,y1), and (x2,y2), what is the value at x of the parabola that goes through these points? You can figure this out using divided differences, but let me help you. The formula is:

        c = y0
        b = ( y1 - y0 ) / ( x1 - x0 )
        a = ( ( y2 - y1 ) / ( x2 - x1 ) - ( y1 - y0 ) / ( x1 - x0 ) ) / ( x2 - x0 )
        p = c + b * ( x - x0 ) + a ( x - x0 ) * ( x - x1 )

You'll need to convert these short names to the ones we're using in the program.

Exercise: write a MATLAB M file called paraval.m which has a declaration of the form:

function pval = paraval ( xdata, ydata, xval )

and which evaluates the piecewise parabolic interpolant. Here is a sketch of the routine:

        left = bracket ( xdata, xval )
        left = ???

        c = ???
        b = ???
        a = ???

        pval = ???

This routine should work for a vector of values if you are careful to include the proper operators.

Exercise: Use 5 equally spaced data points from -5 to 5 on the Runge function. Estimate the maximum interpolation error by computing the maximum absolute difference at 101 evenly spaced points. Repeat for 9, 17, and 33 data points. As the interval size decreases by a factor of 2, how is the maximum error behaving?

Cubic Hermite Interpolation

When we look at the results of the piecewise parabolic interpolation, there are a couple of things that might concern you. First, there's the tricky business that the number of data points must be odd, so that the number of intervals is even, so that we can take pairs of intervals to build the parabolas. Second, although the function is nice and smooth within each pair of intervals, there's an obvious "kink" in the derivative at the odd numbered data points, where the form of the parabola changes.

If we were trying to design, say, the shape of the sheet metal pattern for a car door, this kind of kink would not be acceptable. If that's a problem, then perhaps we could try to make sure that our interpolant is smoother and simpler by matching both the function value and the derivative at each point. (This assumes you can get the derivative value. That's not always the case!)

Suppose we have NDATA points XDATA, at each of which we have both the function value YDATA and YPDATA. If there really is a function f(x) lurking behind the data, then we may assume, for i from 1 to NDATA, that:

YDATA_i = f ( XDATA_i )
YPDATA_i = f' ( XDATA_i )

Now consider the situation in the I-th interval, [ XDATA_i, XDATA_i+1 ]. There is just one cubic function that can be defined on this interval, and which has the appropriate function and derivative values at the two endpoints. As before then, we need to determine the interval, (but no extra funny business like with paraval), and then evaluate the function defined by the data. The algebra is a little tricky, so I'll supply that for you.

Exercise: create a MATLAB M file rungep.m which computes the derivative of the Runge function:

function fp = rungep ( x )

I leave it to you to correctly differentiate the Runge function! Then copy from my web page the file herval.m. We are going to use this file to approximate the Runge function. To make sure things are OK,

define 5 data points XDATA equally spaced from -5 to 5;
define YDATA and YPDATA, the values of the Runge function and its derivative at the data points;
define 101 sampling points X equally spaced from -5 to 5;
compute the value P of the Hermite interpolant at 101 points;
compute the value Y of the Runge function at 101 points.
compare the plots of the (X,Y) and (X,P);
note the maximum norm of P - Y.

Of course, we really want to know if the approximation gets better, or goes crazy the way that the continuous polynomial interpolant did. So repeat this exercise for NDATA = 9 and 17.

Cubic Spline Interpolation

One disadvantage of the Hermite interpolation scheme is that you need to know the derivatives of your function. But it's very possible that you don't have any formula for your data, just the values at the data points. A second problem is the Hermite interpolant is smooth, but not smooth enough. The discontinuities in the second derivative at the data points can be noticeable.

Cubic splines are piecewise cubic interpolants which are "very smooth". They are continuous, with continuous first and second derivatives. Only the third derivative is allowed to jump at the "join" points.

The continuity and interpolation conditions are almost enough to specify the cubic. If we add one extra condition at each endpoint, such as the value of the derivative, then the spline is determined.

To use a cubic spline requires several steps. For a given set of data, you compute the spline coefficients. Then, to evaluate the spline at a point, you have to bracket the point with data abscissas (just as we did for the other piecewise functions), and then evaluate the appropriate polynomial.

Exercise: From my web page, copy the files cubset.m and cubval.m. Here is a suggestion on how these routines are used:

          ndata = ?
          xdata = linspace ( -5, 5, ndata );
          ydata = runge ( xdata )

          ypp = cubset ( xdata, ydata )

          x = linspace ( -5, 5, 101 )
          y = runge ( x )
          p = cubval ( xdata, ydata, ypp, x )

Start with 5 equally spaced data points from -5 to 5 on the Runge function. Estimate the maximum interpolation error by computing the maximum absolute difference at 101 evenly spaced points. Repeat for 9, 17, and 33 data points. As the interval size decreases by a factor of 2, how is the maximum error behaving?

Parametric Interpolation

There are many aspects of interpolation we can't cover, especially what happens in multiple dimensions. But let's at least look at a fun aspect, for a change. We're familiar with a parameterized curve, in which a special variable, typically called t, is used to draw objects for which the relationship between x and y is not functional. We are comfortable with the idea that a good definition of a circle is:

          x = cos ( 2 * pi * t )
          y = sin ( 2 * pi * t )

Well, suppose we wanted to do interpolation of a complicated curve? If we have a drawing of the curve, we could simply mark points along the curve that are roughly evenly spaced, and make tables of TDATA, XDATA and YDATA, where the values of t are just 1, 2, 3,...

Now if we want to compute intermediate points on the curve, that's really saying, for a given value of t, what would the corresponding values of x and y be? This takes a bit of thought, but just realize that t is the independent variable that we're used to calling x, and both x and y become dependent variables. If you can keep track of that in your mind, you can see how to modify our interpolation methods to apply them to the parameterized case.

Exercise. You don't believe me? Try this:

          ndata = 21;
          tdata = linspace ( 0, 2*pi, ndata );
          r = 0.5 + cos ( tdata );
          xdata = r .* cos ( tdata );
          ydata = r .* sin ( tdata );

          t = linspace ( 0, 2*pi, 5*ndata );
          x = lintval ( tdata, xdata, t ); 
          y = lintval ( tdata, ydata, t );

          plot ( x, y )

If you'd like a nicer picture, try it again with paraval!

Assignment

Assignment: on a sheet of graph paper, choose a box of about 10 by 10 cells, and write a large script capital letter that roughly fills the box. You'll want to pick a letter that can be drawn in a single curve. You might use one of your initials, or fancy script letters "E", "M" or "P". Mark about 20 equally spaced points along the curve, and record the XDATA and YDATA locations of these points. Now try the following commands:

          ndata = length ( xdata );
          tdata = [ 1 : ndata ];

          t = linspace ( 1, ndata, 3*(ndata-1) + 1 );
          x = lintval ( tdata, xdata, t ); 
          y = lintval ( tdata, ydata, t );

          plot ( x, y )

Now repeat this process, but use a "better" interpolation routine. When you are satisfied with your results, call me over and show me. That's all.

Back to the MATH2070 page.

Last revised on 16 November 1999.