Directional Derivatives Part 3: Is it a max, min, or saddle point?!

May 18, 2010

Last time, we noticed that when \nabla f(x,y) = 0 at some point (x,y) we have that one of two things happens: either there’s a max or min, or there’s a saddle point.  So, I guess, that’s kind of three things.  Okay, three things can happen: max, min, or saddle point.  Let’s take a look at these.

Our first function is the function f(x,y) = x^2 + y^2 +xy.

[Figure graph2a: surface plot of f(x,y) = x^2 + y^2 + xy]

It kind of looks like a big hammock!  Okay, now, let’s note that \nabla f(x,y) = (2x + y, 2y + x), which is zero exactly when both of its components are zero.  That happens at the point (0,0), so the gradient is zero there.  This means we have to have something happening at this point!  Can you see what it is?  That’s right, kiddo, it’s a minimum.
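
In case the claim about the gradient went by too fast, here is the little system being solved when we set both components of the gradient to zero:

2x + y = 0, \qquad x + 2y = 0

The first equation gives y = -2x; plugging that into the second gives x - 4x = -3x = 0, so x = 0, and then y = 0.  So (0,0) is the only point where the gradient vanishes.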

What’s happening here?  Well, let’s take a hint from single-variable calculus: what happens at a point where the derivative is zero and the second derivative is positive (that is, the graph is concave up)?  That’s right, it’s a minimum.  Can we do the same sort of thing here?  Well, maybe!  But we have a number of second derivatives.  Specifically, we have these:

\frac{\partial^2}{\partial x\partial y}, \frac{\partial^2}{\partial y\partial x}, \frac{\partial^2}{\partial x^2}, \frac{\partial^2}{\partial y^2}

which look like what they are: we begin with the first, which is the second derivative with respect to y and then x.  To take this partial, we first take the partial with respect to y, and then we take the partial of that derivative with respect to x.  The second is the same thing in the opposite order: first x, then y.  The last two are with respect to x and then x again, and with respect to y and then y again.  So, what do these mean?

Well, if we pick out just one (let’s say, for the sake of it, we pick out \frac{\partial^2}{\partial x^2}), then we have the sort of ordinary concavity interpretation: in the x-direction, the graph is concave up if this is positive and concave down if it is negative.  Similarly for the second partial \frac{\partial^2}{\partial y^2}.  The only confusing one should be the one where we take the derivative with respect to x and then with respect to y.  For this, there is not an easy interpretation (or, at least, one does not come to me immediately), but this will turn out not to matter so much.

In fact, we need to remember only three second partial derivatives.  It is a relatively well-known theorem that for nice functions the second mixed partials (that is, \frac{\partial^2}{\partial x\partial y} and \frac{\partial^2}{\partial y\partial x}) are equal!  Kind of neat, right?  I may write more on this theorem later, but for now let’s assume it.
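
As a quick sanity check on our function f(x,y) = x^2 + y^2 + xy, here are the two mixed partials worked out in both orders.  Taking the y-partial first and then the x-partial:

\frac{\partial f}{\partial y} = 2y + x, \qquad \frac{\partial}{\partial x}\left(2y + x\right) = 1

And taking the x-partial first and then the y-partial:

\frac{\partial f}{\partial x} = 2x + y, \qquad \frac{\partial}{\partial y}\left(2x + y\right) = 1

Same answer both ways, just as the theorem promises.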

Now, let’s look at our function above.  What can we say about the second derivatives at the point (0,0)?  Let’s compute them!

\frac{\partial^2 f}{\partial x^2} = 2

\frac{\partial^2 f}{\partial y^2} = 2

\frac{\partial^2 f}{\partial x\partial y} = 1

Compute for yourself that these are correct, and that the mixed partials are actually equal.  In this case the second partials are all constants, so there is nowhere to plug in the point (0,0), but in general there will be.
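
If you would rather have a computer double-check the second partials, here is a minimal sketch using central finite differences.  The helper name second_partials and the step size h are my own illustrative choices, and this is only a numerical estimate, not a symbolic computation.

```python
# Numerical estimates of the second partials of f(x, y) = x^2 + y^2 + xy.
# The helper name and the step size h are illustrative assumptions.

def f(x, y):
    return x**2 + y**2 + x*y

def second_partials(func, x, y, h=1e-3):
    """Return central-difference estimates (f_xx, f_yy, f_xy) at (x, y)."""
    fxx = (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4 * h**2)
    return fxx, fyy, fxy

fxx, fyy, fxy = second_partials(f, 0.0, 0.0)
print(round(fxx, 6), round(fyy, 6), round(fxy, 6))
```

The three printed numbers should agree with the values 2, 2, and 1 listed above, up to rounding.  Swapping in a different function for f lets you check other examples the same way.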

We notice a few things: at the point (0,0), the graph is concave up in both the x and y directions.  We can intuitively see what this means: the surface will look like a right-side-up bowl at (0,0), and, in fact, that’s what it kind of looks like!

Awesome.  Let’s look at another function.  This time, it’s going to be f(x,y) = x^2 - y^2 -xy.

[Figure graph2b: surface plot of f(x,y) = x^2 - y^2 - xy]

In this case, we have \nabla f(x,y) = (2x -y, -2y-x), and so the point where this is 0 is going to be (0,0).  Okay, let’s take second derivatives!

\frac{\partial^2 f}{\partial x^2} = 2

\frac{\partial^2 f}{\partial y^2} = -2

\frac{\partial^2 f}{\partial x\partial y} = -1

Now, think about it: the second partials in the x-direction tell us the graph is concave up, but the second partials in the y-direction tell us the graph is concave down.  So what should we expect?  Think about it for a second, but you should come up with a picture that looks like a saddle.  This is where the term comes from!  Look at the picture and see if you can see the concavity working.
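
If the picture is hard to read, one way to see the concavity working is to slice the surface along each axis.  The sketch below (the sample points are just an arbitrary choice of mine) evaluates f(x,y) = x^2 - y^2 - xy along the x-axis and along the y-axis:

```python
# Slices of f(x, y) = x^2 - y^2 - xy through the critical point (0, 0).
# The sample points in ts are an arbitrary illustrative choice.

def f(x, y):
    return x**2 - y**2 - x*y

ts = [-1.0, -0.5, 0.0, 0.5, 1.0]

along_x = [f(t, 0.0) for t in ts]  # y = 0: the slice is t^2, opening upward
along_y = [f(0.0, t) for t in ts]  # x = 0: the slice is -t^2, opening downward

print(along_x)
print(along_y)
```

Along the x-axis the values rise away from (0,0), and along the y-axis they fall away from it, which is exactly the saddle shape.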

The actual theorem for this is not so much different.  First, let’s make up a formal thing that we can put our second partial derivatives in!  It’ll kind of be like a little storage box.  We’ll call this the hessian matrix after a mathematician who was pretty cool.  The hessian matrix looks like this:

\left( \begin{array}{cc} \frac{\partial^2}{\partial x^2}&\frac{\partial^2}{\partial x\partial y}\\ \frac{\partial^2}{\partial x\partial y} &\frac{\partial^2}{\partial y^2} \end{array}\right)

Obviously, this is just a formal object with no meaning if we have no function to apply the derivatives to, but this is an easy way to show what the hessian looks like.

Theorem: At points where \nabla f(x,y) = 0, if the determinant of the hessian of the function is positive, then there is either a maximum or minimum at that point.  If the determinant of the hessian is negative, then it is a saddle point.

Without going into too much detail of the proof, let’s just note why it works; and, in fact, we’ve already kind of seen why it works!  Let f_{xy} mean “second partial with respect to x and then y”, and f_{xx} mean “second partial with respect to x and then x”, and so on.  The hessian’s determinant is then f_{xx} f_{yy} - (f_{xy})^2.  Take the determinant of it yourself if you don’t believe me!
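
Plugging in the second partials we computed for the two functions above, the determinant for f(x,y) = x^2 + y^2 + xy is

2 \cdot 2 - 1^2 = 3 > 0

and the determinant for f(x,y) = x^2 - y^2 - xy is

2 \cdot (-2) - (-1)^2 = -5 < 0

The first is positive (our minimum) and the second is negative (our saddle), which matches the pictures.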

Now, let’s note that (f_{xy})^2 is always non-negative (positive or zero).  Suppose that one of f_{xx} and f_{yy} is negative and the other is positive; then their product is negative, so the entire determinant is negative (do you see why?) and therefore we have a saddle point.  This was that thing that we talked about above on the second function: one direction is concave up, the other is concave down.  If both have the same sign, then the graph is either concave up in both the x and y directions or concave down in both, and the product f_{xx} f_{yy} is positive.  This does not guarantee a positive determinant by itself, though: the determinant is positive only when the product f_{xx} f_{yy} is larger than the square of the mixed partial, and a large enough mixed partial can still produce a saddle.  If anyone has a nice example of this, please comment!
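
Here is one quick check of the same-sign case, using a made-up function chosen just so that the mixed partial is large: k(x,y) = x^2 + y^2 + 3xy.  Its gradient (2x + 3y, 2y + 3x) is zero at (0,0), and its second partials are

k_{xx} = 2, \qquad k_{yy} = 2, \qquad k_{xy} = 3

so the determinant is

k_{xx} k_{yy} - (k_{xy})^2 = 4 - 9 = -5 < 0

Both non-mixed partials are positive, yet the determinant is negative and (0,0) is a saddle point.  You can see it directly: along the line y = x we get k = 5x^2, which goes up, while along the line y = -x we get k = -x^2, which goes down.  So having the same sign is not enough; the product really does have to beat the square of the mixed partial.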

So say we get a positive determinant for our hessian.  We know that this is either a max or a min, so how do we know which it is?  Well, it’s easy!  We did it for our first problem: if both second non-mixed partials are positive (that is, concave up) then there is a minimum at this point.  If they are both negative (that is, concave down), then there is a maximum at this point!

So, to sum up, we need to just do a few things to check for relative max and mins and saddle points on a surface:

  1. Find out where \nabla f(x,y) = 0.  These correspond to “critical points.”
  2. Construct the hessian matrix and take its determinant at each critical point.
  • If the determinant is negative, the graph has a saddle point there.
  • If the determinant is positive, then the graph has a max or min there.  If f_{xx} and f_{yy} are both positive, then the surface has a relative minimum.  If they are both negative, then the surface has a relative maximum.

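Note that if the determinant of the hessian is equal to 0, this test can’t say very much about the graph at that point in terms of max, min, and saddle points.  In that case, graphical analysis will tell us a little more information.  (If both of the non-mixed partials are equal to 0, the determinant is just -(f_{xy})^2, so we get a saddle point unless the mixed partial is 0 too.)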
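
The checklist above can be sketched as a small function.  This is only an illustration: the function name and the returned labels are my own, and it assumes you have already found a critical point and computed the three second partials there.

```python
# A minimal sketch of the second derivative test at a critical point.
# The function name and the returned strings are illustrative choices.

def classify_critical_point(fxx, fyy, fxy):
    """Classify a point where the gradient is zero from its second partials."""
    det = fxx * fyy - fxy**2  # determinant of the hessian
    if det < 0:
        return "saddle point"
    if det > 0:
        # A positive determinant forces fxx and fyy to share a sign,
        # so checking the sign of fxx alone is enough here.
        return "relative minimum" if fxx > 0 else "relative maximum"
    return "inconclusive"

print(classify_critical_point(2, 2, 1))    # x^2 + y^2 + xy at (0, 0)
print(classify_critical_point(2, -2, -1))  # x^2 - y^2 - xy at (0, 0)
```

The two calls use the second partials of the two functions from this post, and they report a relative minimum and a saddle point respectively.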

There is a little bit more, though.  Remember in calculus how in order to see if something was a max or a min on a certain domain, we’d also need to try the endpoints?  Well, in this case, there could be many “endpoints” — say, for example, we want to check out our surfaces above, but only at points that are on the inside or the boundary of the unit cylinder; then there are many points to check on the boundary of the cylinder!  Fortunately, there’s a really clever way to do this that we’ll talk about next time.  For now, let’s do some exercises.

Exercises:

  1. Let f(x,y) = x^2 + x + y^2.  Find all relative max, min, and saddle points.
  2. Let g(x,y) = x^2 - y^2.  Find all relative max, min, and saddle points.
  3. Let h(x,y) = x^2 - x - y^2.  Find all relative max, min, and saddle points.

 

[Note. For some reason, in the previous version of this post I listed the matrix above as the “Jacobian” — this was not correct, though the jacobian is another kind of matrix that will come in handy eventually!]


4 Responses to “Directional Derivatives Part 3: Is it a max, min, or saddle point?!”

  1. Anonymous said

    dude that was so clear.

  2. Anonymous said

    You probably meant Hessian Matrix and not Jacobian, no?

  3. Anonymous said

    Your statement of a negative determinant of a matrix for a saddle point is only valid for 2×2 matrices.

    It follows directly from the theorem that for a saddle point at x*, y*, a matrix cannot be either positive definite or negative definite. Hence, the 2nd order principal minor of the matrix has to be non-zero and further for a 2×2 matrix always it follows that it is negative.

  4. Greg said

    I have a counter example for your fxx*fyy > fxy^2:
    Try it out on f(x,y) = xy
    Your Hessian is [0 1; 1 0] s.t. |H| = -1
    What is missing in the proof is that the Hessian is positive definite. You can start with the higher dimension Taylor polynomial, use the gradient equal to zero at our critical point and that t is sufficiently small to show that the gradient being positive definite forces f(x*) to be a local minimum.
