Hilbert’s Nullstellensatz.

October 15, 2010

This serious-sounding title is apt for this post, because the Nullstellensatz is the big time.  This is one of the "big" results in algebraic geometry.  Before we dive into the theorem, though, let’s motivate it a little bit. 

First, let’s consider some field k.  Let’s expand this a bit and consider the space k^{n} = k \oplus k \oplus \cdots \oplus k; so our elements look like (a_{1}, a_{2}, \dots, a_{n}), where each a_{i}\in k.

If we have a space like this, a lot of times it’s nice to consider the polynomial functions over it.  For example, whenever we have {\mathbb R}^{2}, we like to consider polynomials like y = 2x^{5} + 3x^2 + 34.  So, for our k^{n}, we’d like to associate a bunch of polynomials to it — specifically, we want to associate k[x_{1}, x_{2}, \dots, x_{n}], the polynomial ring in n indeterminates with coefficients in k. 

We’ll associate k^{n} with k[x_{1}, \dots, x_{n}] in the following way: for each point a = (a_{1}, \dots, a_{n})\in k^{n}, we’ll construct a map e_{a}: k[x_{1}, \dots, x_{n}] \rightarrow k, called the "evaluation map," defined by f(x_{1}, \dots, x_{n}) \mapsto f(a_{1}, \dots, a_{n}).  In other words, we’re "plugging a into f" the way we’re used to doing in algebra or calculus.  Notice that because all of the coefficients are in k and every a_{i} is in k, the evaluation also lands in k.  Well, that’s nice.  But let’s think back to algebra for a second…what did we always try to find out about when we had a polynomial function?
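To make this concrete, here is a minimal sketch of the evaluation map in Python (my own illustration, not from the post), taking k to be the rationals and modeling polynomials as plain functions:

```python
# A toy model of the evaluation map e_a: k[x_1, ..., x_n] -> k,
# with k = Q (Python's Fraction) and polynomials modeled as callables.
from fractions import Fraction

def evaluation_map(a):
    """Given a point a = (a_1, ..., a_n) in k^n, return e_a: f -> f(a)."""
    def e_a(f):
        return f(*a)
    return e_a

# f(x, y) = x^2 + 3xy + 2, an element of k[x, y]
f = lambda x, y: x**2 + 3*x*y + 2
e = evaluation_map((Fraction(1), Fraction(2)))
print(e(f))  # f(1, 2) = 1 + 6 + 2 = 9
```

Since the coefficients and the coordinates of a all live in k, the output of e_a is again an element of k, exactly as the paragraph above notes.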


Yes, that’s right: the ROOTS of a polynomial.  In other words, we always like to find where f(x) = 0.  As we learned more mathematics, we realized that the roots of a polynomial (almost!) uniquely determine it: each polynomial’s roots are like a fingerprint, telling us what the graph will look like and where everything is "forced" to go.  In fact, given the roots, when we factor everything out into linear terms (assuming the field is algebraically closed), the only thing we can really change is the constant in front.  For example, the following are graphs of (x - 2)(x + 1), 3(x-2)(x+1), -5(x-2)(x+1), and -(x-2)(x+1).


Notice that, on the whole, these graphs are quite different, but they all pass through the same zeros.  This is more or less the idea here: these are all the same "sort" of graph, differing only by a leading coefficient.  Moreover, because we are working over a field, the leading coefficient is a unit.  If we were only considering the zeros of functions, all of these graphs would be considered "the same," since their roots are identical.

When we begin adding more complicated functions to the mix, they can still have many zeros in common.  For example, in the previous picture, all of the graphs shared the roots x = -1 and x = 2.  But look at the following example, with the functions f(x,y) = 2(x^{2} - y^2) and g(x,y) = -3x(x^{2} - y^2).  What’s perhaps not so clear in the picture is that these functions have nearly the same zero set, except that g also vanishes whenever x = 0.


(I’ve also included the plane h(x,y) = 0 in the picture above.  If you graph this in Wolfram Mathematica, you can spin it around and see all the little aspects this 2D picture misses.)

Their "common zero set" has to do with the fact that x^{2} - y^{2} = (x - y)(x+y) divides both of these things.  Therefore, if we made a function h(x,y) = (x^{2} - y^{2})(x + y)(2x + 5y^3), we’d (correctly!) guess that it would share zeros with the first two graphs, as well as have some new zeros of its own.

So we’d like something that tells us where a common set of zeros is.  For now, let’s take a set of functions F, and let V(F) denote its set of common zeros (which may be empty).  The astute reader will note (by the arguments above) that multiplying an element of F by any element of k[x_{1}, \dots, x_{n}] can only add zeros, never remove any; so if we throw all such products, and all finite sums of such products, into F, the common zero set stays the same (check this!).  But closing F up under these operations is exactly forming the ideal of k[x_{1}, \dots, x_{n}] generated by F.  Let’s do this, and let’s call the resulting ideal J: the ideal of functions in k[x_{1}, \dots, x_{n}] generated by our set F.
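To see V(F) in action computationally, here is a brute-force sketch over a finite field F_p (my own illustration; over an infinite field like C you would need symbolic root-finding instead of exhaustive search):

```python
# Brute-force V(F) over the finite field F_p: test every point of (F_p)^n.
from itertools import product

def V(polys, p, n):
    """Common zero set of the functions in `polys` over (F_p)^n."""
    return {pt for pt in product(range(p), repeat=n)
            if all(f(*pt) % p == 0 for f in polys)}

# F = {x^2 - y^2} over F_5: the zeros are exactly the points with x = +/- y.
zeros = V([lambda x, y: x**2 - y**2], p=5, n=2)
print(sorted(zeros))  # 9 points: (0, 0), (1, 1), (1, 4), (2, 2), ...
```

Note that throwing in a product like x * (x^2 - y^2), as in the discussion above, leaves the common zero set unchanged: V({f}) = V({f, x*f}), since any zero of f is automatically a zero of x*f.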


A Detour for Four Examples.

Let’s do four quick examples.  Let’s let F = \{x^{2} + 3x + 2, 3x^{2} + 9x + 6\}.  The graph below details these two functions.


Note that they both have common zeros at x = -2, -1, and so we have that V(F) = V(J) = \{-2, -1\}, where J is the ideal generated by F as we detailed above. 
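These common roots are easy to confirm by brute force over a small integer window (a sketch of my own; note the second polynomial is just 3 times the first, so of course they share all their roots):

```python
# F = {x^2 + 3x + 2, 3x^2 + 9x + 6}; the second is 3 times the first,
# so they have identical zero sets. Search a small integer window.
F = [lambda x: x**2 + 3*x + 2,
     lambda x: 3*x**2 + 9*x + 6]
VF = sorted(x for x in range(-10, 11) if all(f(x) == 0 for f in F))
print(VF)  # [-2, -1]
```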

Now, take F = \{x^{2} + 3x + 2, x + 2\}.  Again, let J be its associated ideal. 


What happened?  We only have one common zero.  Thus, V(F) = V(J) = \{-2\}.

What happens if we take F = \{x + 2, x - 2\} and let J be its associated ideal? 


We have no zeros in common!  Upsetting.  So V(F) = V(J) = \emptyset.

One last thing: what happens if we have something like F = \{0\}?  Then every point is a zero, trivially, and so we have that V(F) = V(J) = k^{n}.


Back To Our Main Topic.

Good, okay.  So now you should have some basic idea of what V(J) does; namely, it takes the functions in J and finds their common zeros.  Said another way, V(J) is the set of common roots of all of the functions in J.

Now, a similar idea.  What if I have some points and I want to know which functions vanish on those points; in other words, I want the set of functions that have those points as zeros.  For a concrete example, suppose I took the points x = -2, 0, 2, and asked you what kinds of things vanish at those points.  It’s relatively easy to see, in one variable, that any function f divisible by x(x - 2)(x + 2) will be in this set.  Can anything else be in there besides polynomial multiples of x(x - 2)(x + 2)?  No: any polynomial vanishing at -2, 0, and 2 is divisible by each of the linear factors x + 2, x, and x - 2 (divide by each in turn and note that the remainder must vanish), and hence by their product. 
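For a finite set of points in one variable, this "smallest" vanishing polynomial is just the product of the corresponding linear factors.  Here is a small sketch (my own helper, not from the post) that builds its coefficient list:

```python
def generator_of_I(points):
    """Coefficients (lowest degree first) of prod_{a in points} (x - a),
    the monic generator of I(points) for a finite set of points in one
    variable."""
    coeffs = [1]                                  # start with the constant 1
    for a in points:
        shifted = [0] + coeffs                    # x * current polynomial
        scaled = [-a * c for c in coeffs] + [0]   # -a * current polynomial
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

# X = {-2, 0, 2}: the generator is x(x - 2)(x + 2) = x^3 - 4x.
print(generator_of_I([-2, 0, 2]))  # [0, -4, 0, 1]
```

Every polynomial vanishing on X is a polynomial multiple of this generator, which is exactly the claim in the paragraph above.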

Let’s give a name to this concept.  Say we have some set of points X.  Define I(X) to be the set of all functions which vanish at every point of X.  Notice that I(X) is actually an ideal (why?). 

I remember the difference between these two with the following (stupid) mnemonic: "I Function Very Poorly."  I say that "I" corresponds to "Functions" (as in, it is an ideal of functions) and "V" corresponds to "Points" (as in, V is a set of points, and "poorly" starts with the same letter as "points").  I never said it was a good mnemonic.


Quick Examples with I(X).

Let’s let X = \{-2, 0\} in {\mathbb R}.  What is I(X)?  It’s the ideal of every polynomial which is divisible by both x + 2 and x.  We denote this by (x^2 + 2x), where the parentheses mean "take the ideal generated by what’s inside."  Note that in this case the real numbers were fine.  In the cases we actually want to consider, we want all the roots to be obtainable, and so we will work over the complexes, which gives us many more functions to put in I(X).

Now let’s let X = \emptyset.  Which functions vanish at every point of the empty set?  All of them, vacuously: there is no point at which a function could fail to vanish.  So I(\emptyset) is the entire ring k[x_{1}, \dots, x_{n}].

Let’s let X = {\mathbb C}.  These are all functions f(z) that vanish on all of {\mathbb C}.  It doesn’t take much thought to convince yourself that the only such function is the zero function, so I({\mathbb C}) = (0).


Back Again.

Good.  One last thing we need to go over before we state the Nullstellensatz.

What happens when we take a function like x^{3} and make an ideal J out of it?  J consists of all elements which are divisible by x^{3}; in other words, those functions with no constant, linear, or quadratic term.  When we take V(J), what do we get?  Well, x^{3} is zero only at the origin, and so V(J) = \{0\}.  Fair enough, but what about x?  What about x^2?  These are both also zero only at the origin, but J just "skips over" them.  What happens if we take I(V(J))?  This is a much deeper question than it seems.  We not only get J back, but we also get x and x^2, as well as everything they generate, since I(V(J)) is asking, in this case, "which functions vanish at 0?"

Let’s try once more.  Suppose our function is f(x) = (x - 2)^{4}(x+1)^{3}.  What are the zeros?  x = 2, -1, and so when we take V(J) we get exactly those two points.  But J skips over some of the lower powers of the factors of f; namely, if we take I(V(J)), we get the ideal generated by (x-2)(x+1).  Stare at this until you get it, because it’s ridiculously important.  We get this "lower-power" function because we only care about the roots x = 2, -1, and (x-2)(x+1) is the "smallest" function (the function of least degree) which has those roots.  In other words, every function which has those roots is divisible by (x-2)(x+1).

So what happened here?  It’s like this: if we put in a function with repeated roots, our ideal J skips over the functions that have the same roots but to lower powers.  So we’d like to say that I(V(J)) is "everything that has the same kind of factorization as our functions, but possibly to lower powers."  This idea is captured by the radical of an ideal, most likely so named because it’s so radical.

Specifically, the radical rad(I) of an ideal I is defined to be \{f\ | \ f^{n}\in I, \mbox{ for some } n\in {\mathbb N}\}.  In this way, we get "all the powers" we need. 

Take the previous example, f(x) = (x - 2)^{4}(x+1)^{3}.  If we consider J = (f), the ideal generated by f, then rad(J) = ((x - 2)(x + 1)), the ideal generated by (x-2)(x+1).  Note also that we can make f out of (x-2)(x+1) by raising it to the third power to get (x-2)^{3}(x+1)^{3} and then multiplying by (x - 2), which is an element of the polynomial ring.  But the important thing to note is that (x-2)(x+1) is the minimal polynomial that gives us the correct roots, minimal in the sense that every other polynomial which has those roots must be divisible by it.
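Over an algebraically closed field, a one-variable f splits into linear factors, and rad((f)) is generated by the same product with every exponent dropped to 1.  A minimal sketch of this (my own representation: f as a dictionary of root multiplicities):

```python
def radical_generator(factored):
    """Given f as {root: multiplicity}, return the squarefree product of
    the same linear factors, which generates rad((f)) over an
    algebraically closed field."""
    return {root: 1 for root in factored}

f = {2: 4, -1: 3}            # f(x) = (x - 2)^4 (x + 1)^3
print(radical_generator(f))  # {2: 1, -1: 1}, i.e. (x - 2)(x + 1)
```

This is exactly the "lower-power" phenomenon above: the radical keeps the roots and forgets the multiplicities.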

Considering the example above and giving it a bit more thought, note that we always have J \subseteq rad(J).  Thus, rad(J) is either the same as J or slightly bigger.  If rad(J) = J, then we call J a radical ideal.  Pretty sweet, no?

Now, why did we introduce this?  Because when we were taking I(V(J)) we kept getting things that had the same roots as everything in J, but to lower powers.  This is exactly where rad(J) comes in.  What we’ve been working towards is (surprise!) the statement of the Nullstellensatz.


Theorem (Nullstellensatz): Let k be an algebraically closed field and let J be an ideal of functions in k[x_{1}, \dots, x_{n}].  Then we have that I(V(J)) = rad(J).


The Nullstellensatz has a few clever proofs, but I’m going to leave them off for now, because this post was mainly to build intuition for why it should be true.  The Nullstellensatz essentially says that if f is a function which vanishes at every root common to all elements of J, then some power of f (possibly the first power) is in J.  This idea should jibe with what we’ve said before: there are "minimal" polynomials which have the same roots as things in J but may not, sadly, have been included in J.  The Nullstellensatz allows us to fix this by just taking I(V(J)).

Note that if J is a radical ideal, then the statement reduces to I(V(J)) = J.  This is why we’re going to like radical ideals a lot.

Let’s do a quick example.  Let k = {\mathbb C}, which is algebraically closed.  Let J be the ideal generated by f(z) = z^{2}.  What’s I(V(J))?  V(J) is the set of common zeros, which in this case are just the zeros of f; thus V(J) = \{0\}.  Then I(V(J)) is the set of all functions which vanish at z = 0.  A little thinking will convince you that these are exactly the functions with no constant term; in other words, the functions which are divisible by z.  Therefore, I(V(J)) = (z), the ideal generated by z.  In fact, if we compute rad(J) and use the Nullstellensatz, we come around to the same answer, since rad((z^2)) is just (z).  Prove this to yourself!
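The "some power of f lands in J" phenomenon can be checked by hand for J = (z^2).  Here is a sketch (my own, using coefficient lists, lowest degree first): any f vanishing at 0 has zero constant term, so f^2 has zero constant and linear terms and is therefore divisible by z^2:

```python
def polymul(a, b):
    """Multiply two polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def in_z_squared(p):
    """p lies in the ideal (z^2) iff its constant and linear terms vanish."""
    return all(c == 0 for c in p[:2])

f = [0, 1, 5]        # f(z) = z + 5z^2: vanishes at 0, but f is NOT in (z^2)
f2 = polymul(f, f)   # f^2 = z^2 + 10z^3 + 25z^4, which IS in (z^2)
print(in_z_squared(f), in_z_squared(f2))  # False True
```

So f itself lies in rad(J) = (z) but not in J, while f^2 lies in J, just as the theorem promises.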


Next time we’ll look at some of the corollaries of the Nullstellensatz and we’ll ask ourselves why we should care about it.
