FEM_2011
Notes for Friday, 2 September 2011


This lecture was prepared and presented by Miro Stoyanov.

Differential Equations

The finite element method is a procedure for solving certain problems that are typically defined by differential equations. A differential equation is a statement about a function and its derivatives. Examples might include

        u'(x) = u(x)
        d2u/dx2 = f(x)
        uxx + uyy = g(x,y)
      
In the last example, the function u depends on two variables, so this is an example of a partial differential equation.

We obviously expect the solution of a differential equation to be a function which is not merely continuous but continuously differentiable, perhaps to several orders. One motivation for the finite element method is that certain interesting equations do not have such classical solutions - and yet there is often a kind of solution, if we are willing to generalize what we will accept as a solution.

Depending on the situation, a differential equation that we are asked to solve will include right hand side data, such as the function f(x) in example 2, or g(x,y) in example 3, but also initial data (if the problem involves differentiation with respect to time), or boundary conditions, or other conditions that are imposed. Physically, these extra conditions provide information needed to specify the problem. Mathematically, they are thought of as the way to guarantee uniqueness of the solution. For instance, every function of the form u(x) = C e^x satisfies the first example; an initial condition such as u(0) = 1 picks out the single solution u(x) = e^x.

Pointwise Differences

Suppose we have a function g(x), and we define f(x) as the derivative of g(x). What is the effect of a change in the definition of g at a single point? In other words, suppose that g1(x) is equal to g(x) everywhere except at x=1, where we set g1(1) to 1000. Then applying the derivative operation to g1(x) gives a different result: in particular, g1(x) is no longer differentiable at x=1, so the new derivative function f1(x) is undefined there. The differentiation operator works on local differences, and is very sensitive to pointwise changes. A short numerical experiment below makes this concrete.
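As an illustration (added here; it is not part of the original notes), consider the following minimal Python sketch, which uses g(x) = x^2 as a stand-in for any smooth function. A one-sided difference quotient for g1 at x=1 blows up as the step size h shrinks, because the spiked value dominates the local difference:

        def g(x):
            return x * x                # a stand-in for any smooth function

        def g1(x):
            # identical to g, except at the single point x = 1
            return 1000.0 if x == 1.0 else g(x)

        # the difference quotient for g1 at x = 1 diverges as h -> 0,
        # behaving like (g(1+h) - 1000)/h, roughly -999/h
        for h in (0.1, 0.01, 0.001, 0.0001):
            print(h, (g1(1.0 + h) - g1(1.0)) / h)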

Now suppose we have a function f(x), and we define g(x) as the integral of f(x) from -oo to x. Suppose we have another function f1(x), which is the same as f(x) everywhere except at x=1, where f1(1) = 1000. If we define the function g1(x) as the integral of f1(x), what happens? The functions g(x) and g1(x) are identical everywhere! The integral operator is a "smoothing" operator. It works by summing and averaging. Changes to single point values are not noticed.
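Here is a corresponding numerical sketch (again an added illustration, assuming NumPy is available and using f(x) = sin(x) as a stand-in). We approximate the integrals of f and f1 by left-endpoint Riemann sums, deliberately forcing one sample to take the spiked value 1000; the spike contributes only (1000 - f(x_k)) * dx to the sum, which vanishes as the mesh is refined:

        import numpy as np

        def f(x):
            return np.sin(x)            # a stand-in for any integrand

        a, b = 0.0, 3.0
        for n in (10, 100, 1000, 10000):
            dx = (b - a) / n
            x = a + dx * np.arange(n)         # left endpoints
            fx = f(x)
            fx1 = fx.copy()
            k = np.argmin(np.abs(x - 1.0))    # the sample nearest x = 1
            fx1[k] = 1000.0                   # f1: spiked at a single sample
            s, s1 = fx.sum() * dx, fx1.sum() * dx
            print(n, s, s1, s1 - s)

The difference s1 - s shrinks like 1/n; in the limit, which is the true integral, the single modified value contributes nothing at all.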

This suggests that if you want to solve problems in which the data (the right hand side function, or a coefficient function, or a boundary condition) may be discontinuous, it might help to turn from the classical pointwise view of differential equations to a different way of looking at things, in which we integrate the objects we are interested in.

Equivalence Classes of Functions

Suppose, then, that we think about what we can say about a function if we only know its integral. Let's say we are working on the interval [a,b], and we have a collection of functions, for any one of which we can compute I(f), by which we mean the integral of f(x) from a to b.

What can we say, in particular, if I(|f|) = 0? To say that the integral of the absolute value of a function is zero is to make a strong statement. Certainly, this would be true if f(x) is the zero function. But are there other functions whose absolute values could integrate to 0?

Although we are used to integrating only continuous functions, the Riemann integral can handle simple discontinuities. Suppose g(x) is a function which is zero everywhere except at x=1, where it is 1000. You should easily convince yourself that the Riemann integral of the absolute value of this function is zero; in other words, the single nonzero value makes no difference whatsoever to the integral.

It is natural to assume that if the value of a function at a single point makes no difference to the integral, then we can change the value at any point without making a difference (true), and that therefore we can change all the values at all the points, arbitrarily, without making any difference. This last idea is incorrect, luckily! But how do we sort out this situation?

If you are familiar with Lebesgue integration, then you know the answer. Two functions will have the same integral if they differ on a set of measure zero. In particular, a function will integrate to zero if it is zero except on a set of measure zero. Sets that have measure zero include a single point, a finite number of points, or a countable number of points. If we are integrating over a 2D region, then a line has measure zero; in 3D, a surface would also have measure zero. Sets of measure zero are, in a sense, so small that they make no difference to the integral.

Now we can say that two functions f and g are "the same" if they differ only on a set of measure zero. Given a fixed function f(x), which might be f(x)=x^2, for instance, then f(x) is "the same" as any function g(x) which is equal to x^2 except at a single point, or at two points, or at countably many points. Any such function g(x) is a slight variation of f(x), and when we integrate |f(x)-g(x)|, we get 0. We say g(x) is equivalent to f(x).

If g(x) and f(x) are equivalent, we can't be sure that they are equal at any particular point x0, but we do know that they can't be unequal at too many places. If we have a particular function f(x) in mind, the set of all other functions g(x) for which I(|f(x)-g(x)|) is zero is called "the equivalence class of f(x)".

Any equivalence class can be "represented" by any one of the functions that belong to that class. Thus, it may be convenient to think of the set of all functions which are equivalent to x^2 as simply x^2; strictly speaking, this is not correct, since the objects in the set only become "one thing" after we apply the integral operation to them.

In finite elements, we often deal with ordinary functions with classical definitions. However, it is important to understand why, once the integral operation is involved, pointwise values do not matter, while the general shape of the function does.

A Linear Vector Space

The solutions to our differential equations are functions. But we would like to think about them abstractly, as another kind of vector. In order to do that, we must recall the properties of a linear vector space.

Briefly, a linear vector space V is a collection of elements v, with two operations defined, namely addition and scalar multiplication. Also, there is a special element z, the "zero vector".

If u and v are elements of V, and s is a scalar value, then it must be true that we can combine these elements in various ways and get new elements of V, namely:

        u + v must be an element of V;
        s * u must be an element of V;
        u + z = u (adding the zero vector changes nothing).

It should be clear that if we have a collection of functions, we can define a linear space as the collection of all possible sums and scalar multiples of our starting collection. Here, "addition" of functions would mean the pointwise sum:

        (u+v)(x) = u(x) + v(x)
      

If we wish to think of equivalence classes of functions as forming our linear vector space, then addition cannot be thought of in terms of point values. Instead, "u+v" is the equivalence class containing the pointwise sum of any representatives of u and v; the result does not depend on which representatives we pick, since different choices alter the sum only on a set of measure zero.
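For ordinary pointwise functions, which serve as representatives of their equivalence classes, these operations are easy to sketch in code (a minimal illustration added here, not part of the original notes):

        def add(u, v):
            # pointwise sum: (u+v)(x) = u(x) + v(x)
            return lambda x: u(x) + v(x)

        def scale(s, u):
            # scalar multiple: (s*u)(x) = s * u(x)
            return lambda x: s * u(x)

        zero = lambda x: 0.0              # the "zero vector" in this space

        u = lambda x: x * x
        v = lambda x: 1.0
        w = add(scale(3.0, u), v)         # w(x) = 3*x^2 + 1
        print(w(2.0))                     # prints 13.0

The design point is that the sum and scalar multiple of functions are themselves functions, so the collection is closed under the two vector space operations.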

Norms

Geometry is the study of size and distance. To give a size to the objects in our vector space, we need to come up with what is called a "norm". The norm of a vector f might be symbolized by ||f||. A norm can measure the size of a vector ||f||, but it can also measure the distance between two vectors: ||f-g||.

In finite dimensional vector spaces, three common norms of a vector (x,y) are:

        ||(x,y)||_1  = |x| + |y|
        ||(x,y)||_2  = sqrt ( x^2 + y^2 )
        ||(x,y)||_oo = max ( |x|, |y| )

These same norms are commonly used for functions defined on an interval [a,b]:

        ||f||_1  = integral from a to b of |f(x)| dx
        ||f||_2  = sqrt ( integral from a to b of f(x)^2 dx )
        ||f||_oo = ess sup of |f(x)| over [a,b]
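As a quick numerical check (an added illustration, assuming NumPy and SciPy are available), we can compute all three norms of f(x) = x^2 on [0,5]; the exact values are 125/3, 25, and 25 respectively:

        import numpy as np
        from scipy.integrate import quad

        a, b = 0.0, 5.0
        f = lambda x: x * x

        n1 = quad(lambda x: abs(f(x)), a, b)[0]           # ||f||_1  = 125/3
        n2 = np.sqrt(quad(lambda x: f(x) ** 2, a, b)[0])  # ||f||_2  = 25
        x = np.linspace(a, b, 100001)                     # f is continuous, so
        noo = np.abs(f(x)).max()                          # max approximates ess sup
        print(n1, n2, noo)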

The max norm is defined by the essential supremum because, remember, we are dealing with functions which don't really have pointwise values. If I want the max norm of x^2 over [0,5], I am really asking for the max norm over all functions equivalent to x^2, which means I can alter one, two, or many values, making the maximum value as large as I like. But I can't alter the essential supremum by such changes. The essential supremum of a function "should" be the largest function value, but it is allowed to ignore values on any set of measure zero. And that rules out the small modifications I can make to functions equivalent to x^2. So the essential supremum of any function equivalent to x^2 over [0,5] is 25.
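The following sketch (an added illustration, assuming NumPy) makes this vivid: spike x^2 at the single point x=1 and evaluate at uniformly random sample points. The samples almost surely never land exactly on the spike, so the observed maximum approaches the essential supremum 25, not the pointwise supremum 1000:

        import numpy as np

        rng = np.random.default_rng(1234)

        def g(x):
            # equivalent to x^2: modified only on the single point x = 1
            return np.where(x == 1.0, 1000.0, x * x)

        x = rng.uniform(0.0, 5.0, size=1_000_000)
        print(np.abs(g(x)).max())     # close to 25; the spike is never sampled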

The technical definition of the essential supremum of a function f(x) is:

        ess sup f = inf { s : m ( { x : f(x) > s } ) = 0 }
      
that is, it is the smallest number s which bounds the values of f(x) except on a set of measure zero. In particular, f(x) <= s for all x except for a "handful" of x values, that is, a set of measure zero, a set too small to affect any integral.

Given a finite interval [a,b], we define the following spaces:

        L1[a,b]  = the (equivalence classes of) functions f for which ||f||_1  is finite;
        L2[a,b]  = the (equivalence classes of) functions f for which ||f||_2  is finite;
        Loo[a,b] = the (equivalence classes of) functions f for which ||f||_oo is finite.

Homework Exercise

Show that Loo[a,b] is a subset of L2[a,b] which is a subset of L1[a,b].

You really need to show this for the case where f is an equivalence class; that is, if f is in L1[a,b], you should assume only that the (Lebesgue) integral of the absolute value of f over [a,b] is finite. If you find it too hard to think about these equivalence classes, you can start by working out the problem for the case when we are dealing with classical pointwise functions.


Last revised on 03 September 2011.