Algebraic Number Theory

6 Geometry of numbers

In this chapter we will develop the machinery necessary for proving Theorem 4.0.3.

Definition 6.0.1

A lattice \(\Lambda \subset \mathbb {R}^n\) is a subgroup (under addition) generated by \(n\) linearly independent vectors.

Remark 6.0.2

If \(\Lambda \) is a lattice in \(\mathbb {R}^n\) then

\[ \Lambda = \mathbb {Z}e_1 \oplus \cdots \oplus \mathbb {Z}e_n \]

where the \(e_i\) are linearly independent over \(\mathbb {R}\), i.e. there do not exist \(r_i \in \mathbb {R}\), not all zero, such that \(\sum _i r_ie_i =0\). It is not enough for the \(e_i\) to be independent over \(\mathbb {Q}\): for example, \((1,0)\) and \((\pi ,0)\) are independent over \(\mathbb {Q}\) but not over \(\mathbb {R}\), so they do not generate a lattice in \(\mathbb {R}^2\).

Definition 6.0.3

If \(\Lambda \subset \mathbb {R}^n\) is a lattice generated by \(e_i\) then

\[ P(\Lambda )=\{ x \in \mathbb {R}^n \mid x=\sum _i r_ie_i, 0 \leq r_i {\lt} 1 \} \]

is called the fundamental domain of \(\Lambda \).

Note that if we take \(\lambda \in \Lambda \) and let \(P(\Lambda )+\lambda =\{ x+ \lambda \mid x \in P(\Lambda )\} \) then

\[ \mathbb {R}^n= \bigcup _{\lambda \in \Lambda } P(\Lambda ) +\lambda . \]

Lemma 6.0.4

Let \(\Lambda \subset \mathbb {R}^n\) be a lattice. Then the volume of \(P(\Lambda )\) does not depend on the choice of basis of \(\Lambda \). Moreover, if \(\{ e_i\} \) is a basis of \(\Lambda \), then

\[ \operatorname{Vol}(P(\Lambda ))=|\det (e_1,e_2,\dots ,e_n)| \]

(here the right hand side is the determinant of the matrix whose columns are given by the \(e_i\)).

Proof

The second statement is just linear algebra, so we will only prove the first. Let \(f_i\) denote a second basis of \(\Lambda \) and let \(M(e_i), M(f_i)\) denote the matrices whose columns are given by \(e_i\) and \(f_i\) respectively. Then

\[ M(e_i)=M(f_i)A \]

where \(A\) is an \(n \times n\) matrix with entries in \(\mathbb {Z}\). Similarly,

\[ M(f_i)=M(e_i)B. \]

Therefore,

\[ M(e_i)=M(e_i)BA \]

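Now, since \(M(e_i)\) is non-degenerate we have \(BA=I\), so \(\det (B)\det (A)=1\). As \(A\) and \(B\) have integer entries, both determinants are integers, and therefore \(\det (A)=\pm 1\). Hence \(|\det M(e_i)|=|\det M(f_i)|\,|\det (A)|=|\det M(f_i)|\), which by the second statement shows that the volume does not depend on the choice of basis.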
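The basis independence can be checked numerically. The following is a minimal sketch, not part of the notes: the basis `M_e` and the unimodular matrix `B` are arbitrary illustrative choices, and `det` and `matmul` are small hypothetical helpers written here so that the arithmetic stays exact.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Columns are the basis vectors e_1 = (2, 0), e_2 = (1, 3) (illustrative choice).
M_e = [[2, 1], [0, 3]]
# An integer matrix with determinant 1 (illustrative choice).
B = [[1, 4], [1, 5]]
# The columns of M_f form a second basis of the same lattice.
M_f = matmul(M_e, B)

vol_e = abs(det(M_e))
vol_f = abs(det(M_f))
print(vol_e, vol_f)
```

Both determinants have the same absolute value because `B` has determinant \(\pm 1\), exactly as in the proof.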

Lemma 6.0.5

Let \(S \subset \mathbb {R}^n\) be a measurable set (i.e. \(\operatorname{Vol}(S)=|\idotsint _S dx_1\cdots dx_n|\) exists) and let \(\Lambda \) be a lattice. If \(\operatorname{Vol}(S) {\gt} \operatorname{Vol}(P(\Lambda ))\), then there exist \(x,y \in S\) with \(x \neq y\) such that \(x-y \in \Lambda \).

Furthermore, if \(S\) is compact, then the same conclusion holds if \(\operatorname{Vol}(S) \geq \operatorname{Vol}(P(\Lambda ))\).

Proof

We begin by writing

\[ \mathbb {R}^n = \bigcup _{\lambda \in \Lambda } \left(P(\Lambda )+\lambda \right)\qquad \text{(as a disjoint union)} \]

therefore

\[ S= \mathbb {R}^n \cap S= \bigcup _{\lambda \in \Lambda } \left(P(\Lambda )+\lambda \right) \cap S \qquad \text{(as a disjoint union)}. \]

From this it follows that

\[ \operatorname{Vol}(S)=\sum _{\lambda \in \Lambda } \operatorname{Vol}\left((P(\Lambda )+\lambda ) \cap S\right) =\sum _{\lambda \in \Lambda } \operatorname{Vol}\left(P(\Lambda ) \cap (S-\lambda )\right). \]

Now, if the sets \(P(\Lambda ) \cap (S -\lambda )\) were pairwise disjoint then, since they are all contained in \(P(\Lambda )\), the sum of their volumes would be \(\leq \operatorname{Vol}(P(\Lambda ))\), contradicting our assumption that \(\operatorname{Vol}(S){\gt} \operatorname{Vol}(P(\Lambda ))\). Therefore two of these sets meet, say \(P(\Lambda ) \cap (S -\lambda )\) and \(P(\Lambda ) \cap (S -\mu )\) (with \(\lambda \neq \mu \)), so there are \(x,y \in S\) with \(x-\lambda =y-\mu \), giving \(x-y=\lambda -\mu \in \Lambda \). Moreover \(x \neq y\) because \(\lambda \neq \mu \).

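For the second part, suppose \(S\) is compact with \(\operatorname{Vol}(S) \geq \operatorname{Vol}(P(\Lambda ))\). For \(\epsilon {\gt} 0\) let \(S_\epsilon =(1+\epsilon )S\), so that \(\operatorname{Vol}(S_\epsilon )=(1+\epsilon )^n\operatorname{Vol}(S){\gt} \operatorname{Vol}(P(\Lambda ))\). Then by the above, we can find \(x,y \in S_\epsilon \) with \(x \neq y\) such that \(x-y \in \Lambda \). Let \(\Lambda _\epsilon \) denote the set of nonzero \(\lambda \in \Lambda \) of the form \(\lambda =x-y\) with \(x,y \in S_\epsilon \).

Each \(\Lambda _\epsilon \) is non-empty by the above and finite, since it consists of lattice points in a bounded set. Note that if \(\epsilon ' \leq \epsilon \) then \(\Lambda _{\epsilon '} \subset \Lambda _\epsilon \) (here we use that \((1+\epsilon ')S \subset (1+\epsilon )S\), which holds when \(S\) is star-shaped about the origin, as is the case for the convex, centrally symmetric sets to which we apply the lemma). A decreasing family of non-empty finite sets has non-empty intersection, so \(\cap _{\epsilon {\gt}0} \Lambda _\epsilon \neq \emptyset \). So let \(\lambda \in \cap _{\epsilon {\gt}0} \Lambda _\epsilon \). We claim that \(\lambda =x-y\) for some \(x,y \in S\). For each integer \(k \geq 1\) take \(\epsilon =1/k\) and write \(\lambda =x_k-y_k\) with \(x_k,y_k \in (1+1/k)S\). The sequence \((x_k,y_k)\) is bounded because \(S\) is bounded, so there is a subsequence that converges to a point \((x,y)\). Since \(x_k/(1+1/k) \in S\) for all \(k\) and \(S\) is closed, the limit \(x\) lies in \(S\), and similarly \(y \in S\). Since \(\lambda =x_k-y_k\) for all \(k\) we see that in the limit \(\lambda =x-y\), and \(x \neq y\) because \(\lambda \neq 0\). This gives the result.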

Definition 6.0.6

A subset \(S \subset \mathbb {R}^n\) is called:

  1. Convex if whenever \(x,y \in S\) then the line segment joining \(x\) and \(y\) is also contained in \(S\).

  2. Centrally symmetric if whenever \(x \in S\) then \(-x \in S\).

Lemma 6.0.7 Minkowski’s convex body lemma

Let \(S\) be a compact, convex and centrally symmetric subset of \(\mathbb {R}^n\) and \(\Lambda \) a lattice. If

\[ \operatorname{Vol}(S) \geq 2^n \operatorname{Vol}(P(\Lambda )) \]

then \(S\) contains a nonzero point of \(\Lambda \).

Proof

Consider the set

\[ \frac{1}{2}S=\{ \frac{1}{2} x \mid x \in S\} . \]

Then \(\frac{1}{2}S\) is compact and \(\operatorname{Vol}(\frac{1}{2}S) = 2^{-n}\operatorname{Vol}(S) \geq \operatorname{Vol}(P(\Lambda ))\). So by Lemma 6.0.5 there exist \(x,y \in \frac{1}{2}S\) with \(x \neq y\) such that \(x-y \in \Lambda \). We claim that \(x-y \in S\).

Note that \(2x,2y \in S\). Now, since \(S\) is centrally symmetric, we have \(-2y \in S\). Furthermore, since \(S\) is convex,

\[ \frac{1}{2} (2x-2y) \in S. \]

Thus \(x-y\) is a nonzero point of \(\Lambda \) contained in \(S\).
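To see the lemma in action, here is a small brute-force sketch for \(n=2\). It is only an illustration under stated assumptions: the lattice basis `e1`, `e2` is an arbitrary choice, \(S\) is taken to be the square \([-c,c]^2\) with area exactly \(2^2 \operatorname{Vol}(P(\Lambda ))\), and `find_lattice_point_in_box` with its `search` radius is a hypothetical helper, not an efficient algorithm.

```python
import itertools
import math


def find_lattice_point_in_box(e1, e2, half_side, search=10):
    """Return a nonzero point a*e1 + b*e2 in [-half_side, half_side]^2, or None.

    Only coefficients with |a|, |b| <= search are tried (an assumption that
    is adequate for small examples like the one below).
    """
    for a, b in itertools.product(range(-search, search + 1), repeat=2):
        if (a, b) == (0, 0):
            continue  # skip the zero vector
        x = a * e1[0] + b * e2[0]
        y = a * e1[1] + b * e2[1]
        if abs(x) <= half_side and abs(y) <= half_side:
            return (x, y)
    return None


e1, e2 = (2, 1), (1, 3)                          # illustrative basis
covolume = abs(e1[0] * e2[1] - e1[1] * e2[0])    # Vol(P(Lambda)) = |det|
half_side = math.sqrt(covolume)                  # (2c)^2 = 4 * covolume
point = find_lattice_point_in_box(e1, e2, half_side)
print(covolume, point)
```

The square is compact, convex and centrally symmetric, so the lemma guarantees that the search succeeds once the coefficient range is large enough.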

Let us now apply this to number theory. Let \(K\) be a number field with \([K:\mathbb {Q}]=n\). Then there are \(n\) embeddings \(K \hookrightarrow \mathbb {C}\), and if we let \(r_1\) be the number of real embeddings and \(r_2\) the number of pairs of complex conjugate embeddings, then \(n=r_1+2r_2\).

Definition 6.0.8

Let \(K\) be a number field with \(r_1\) real embeddings and \(r_2\) complex conjugate pairs of embeddings, then the canonical embedding is

\[ \Theta : K \longrightarrow \mathbb {R}^{r_1} \times \mathbb {C}^{r_2} \overset {\sim }{\longrightarrow } \mathbb {R}^n \]

given by

\begin{align*} x \mapsto & (\sigma _1(x),\dots ,\sigma _{r_1}(x),\sigma _{r_1+1}(x),\dots ,\sigma _{r_1+r_2}(x))\\ & \mapsto (\sigma _1(x),\dots ,\sigma _{r_1}(x),\Re \sigma _{r_1+1}(x),\Im \sigma _{r_1+1}(x), \dots ,\Re \sigma _{r_1+r_2}(x), \Im \sigma _{r_1+r_2}(x)) \end{align*}

where the first \(r_1\) of the \(\sigma _i\) are the real embeddings, the rest are the complex ones (one from each conjugate pair), and \(\Re ,\Im \) denote real and imaginary parts.

Example 6.0.9
  1. Let \(K=\mathbb {Q}(\sqrt{-d})\) with \(d\) a square-free positive integer. Then the embedding (taking \(\sigma (\sqrt{-d})=i\sqrt{d}\)) is given by sending \(x+y\sqrt{-d}\) to

    \[ (x,y\sqrt{d}) \in \mathbb {R}^2. \]
  2. Let \(K=\mathbb {Q}(\sqrt{d})\) with \(d {\gt} 1\) a square-free integer. Then the embedding is given by sending \(x+y\sqrt{d}\) to

    \[ (x+y\sqrt{d},x-y\sqrt{d}) \in \mathbb {R}^2. \]

Proposition 6.0.10

Let \(K\) be a number field with \([K:\mathbb {Q}]=n\) and let \(\Theta : K \to \mathbb {R}^n\) be the canonical embedding. Then \(\Theta (\mathcal{O}_K)\) is a lattice in \(\mathbb {R}^n\) and if \(P=P(\Theta (\mathcal{O}_K))\) then

\[ \operatorname{Vol}(P)=2^{-r_2}\sqrt{|\Delta (\mathcal{O}_K)|} \]

where \(r_2\) is the number of complex conjugate pairs of embeddings.

Furthermore, if \(\mathfrak {a}\subset \mathcal{O}_K\) is an ideal, then \(\Lambda _\mathfrak {a}:=\Theta (\mathfrak {a})\) is a sublattice of \(\Theta (\mathcal{O}_K)\). Moreover,

\[ \operatorname{Vol}(P(\Lambda _\mathfrak {a}))=2^{-r_2}\sqrt{|\Delta (\mathcal{O}_K)|}N(\mathfrak {a}) \]

Proof

Let \(\{ e_i\} \) be an integral basis of \(\mathcal{O}_K\). Then

\[ \Theta (\mathcal{O}_K)=\left\{ \sum _i \lambda _i \Theta (e_i) \mid \lambda _i \in \mathbb {Z}\right\} . \]

We want to compute

\[ |\det (M(\Theta ))|:=|\det (\Theta (e_1),\dots ,\Theta (e_n))| . \]

By definition

\[ \Theta (e_i)=(\sigma _1(e_i),\dots ,\sigma _{r_1}(e_i),\Re \sigma _{r_1+1}(e_i),\Im \sigma _{r_1+1}(e_i), \dots ,\Re \sigma _{r_1+r_2}(e_i), \Im \sigma _{r_1+r_2}(e_i) )^T. \]

Now, note that

\[ \Re \sigma _{r_1+1}(e_i)=\frac{1}{2}(\sigma _{r_1+1}(e_i) + \overline{\sigma }_{r_1+1}(e_i)) \qquad \Im \sigma _{r_1+1}(e_i)=\frac{1}{2\sqrt{-1}}(\sigma _{r_1+1}(e_i) - \overline{\sigma }_{r_1+1}(e_i)). \]

Using this we have

\[ |\det (M(\Theta ))|= \left(\frac{1}{2} \right)^{2r_2} \left| \det \left(\begin{matrix} \sigma _1(e_1) & \cdots & \sigma _1(e_n) \\ \vdots & & \vdots \\ \sigma _{r_1}(e_1) & \cdots & \sigma _{r_1}(e_n) \\ (\sigma _{r_1+1}(e_1) + \overline{\sigma }_{r_1+1}(e_1)) & \cdots & (\sigma _{r_1+1}(e_n) + \overline{\sigma }_{r_1+1}(e_n)) \\ (\sigma _{r_1+1}(e_1) - \overline{\sigma }_{r_1+1}(e_1)) & \cdots & (\sigma _{r_1+1}(e_n) - \overline{\sigma }_{r_1+1}(e_n)) \\ \vdots & & \vdots \end{matrix} \right) \right| \]

Doing simple row operations we can transform this into

\[ \left(\frac{1}{2} \right)^{2r_2} \left| \det \left(\begin{matrix} \sigma _1(e_1) & \cdots & \sigma _1(e_n) \\ \vdots & & \vdots \\ \sigma _{r_1}(e_1) & \cdots & \sigma _{r_1}(e_n) \\ (\sigma _{r_1+1}(e_1) + \overline{\sigma }_{r_1+1}(e_1)) & \cdots & (\sigma _{r_1+1}(e_n) + \overline{\sigma }_{r_1+1}(e_n)) \\ 2\sigma _{r_1+1}(e_1) & \cdots & 2\sigma _{r_1+1}(e_n) \\ \vdots & & \vdots \end{matrix} \right) \right| \]

and then to

\[ \left(\frac{1}{2} \right)^{r_2} \left| \det \left(\begin{matrix} \sigma _1(e_1) & \cdots & \sigma _1(e_n) \\ \vdots & & \vdots \\ \sigma _{r_1}(e_1) & \cdots & \sigma _{r_1}(e_n) \\ \overline{\sigma }_{r_1+1}(e_1) & \cdots & \overline{\sigma }_{r_1+1}(e_n) \\ \sigma _{r_1+1}(e_1) & \cdots & \sigma _{r_1+1}(e_n) \\ \vdots & & \vdots \end{matrix} \right) \right| \]

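Notice that the power of \(1/2\) has changed: a factor of \(2\) has been pulled out of each of the \(r_2\) rows of the form \(2\sigma _{r_1+j}(e_i)\). But now, if we look back at Proposition 2.2.16 we see that this is simply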

\[ 2^{-r_2} \sqrt{|\Delta (\mathcal{O}_K)|}. \]

Let us now look at the sublattice \(\Lambda _\mathfrak {a}\). Note that as additive abelian groups we have \(\mathcal{O}_K \cong \mathbb {Z}^n\) and \(\mathfrak {a}\) is a subgroup of index \(N(\mathfrak {a})\). Since \(\Theta \) is injective, \(\Lambda _\mathfrak {a}\) is a subgroup of \(\Theta (\mathcal{O}_K)\) of index \(N(\mathfrak {a})\). From this it follows that

\[ \operatorname{Vol}(P(\Lambda _{\mathfrak {a}}))=N(\mathfrak {a})\operatorname{Vol}(P(\Theta (\mathcal{O}_K))) \]

(compare this with the proof of Proposition 3.4.5) from which the result follows.
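The volume formulas can be sanity-checked on small examples. The sketch below assumes the standard facts that \(\mathbb {Q}(i)\) has integral basis \(1,i\) and discriminant \(-4\), that \(\mathbb {Q}(\sqrt{2})\) has integral basis \(1,\sqrt{2}\) and discriminant \(8\), and that the ideal \((1+i)\) of \(\mathbb {Z}[i]\) has norm \(2\) with \(\mathbb {Z}\)-basis \(1+i,\,-1+i\); none of these examples appear in the text above, and `covolume_2d` is a hypothetical helper.

```python
import math


def covolume_2d(v1, v2):
    """Absolute value of the 2x2 determinant with columns v1 and v2."""
    return abs(v1[0] * v2[1] - v1[1] * v2[0])


# K = Q(i): r_1 = 0, r_2 = 1, and Theta(x) = (Re x, Im x).
vol_gauss = covolume_2d((1.0, 0.0), (0.0, 1.0))        # Theta(1), Theta(i)
expected_gauss = 2 ** -1 * math.sqrt(abs(-4))

# K = Q(sqrt 2): r_1 = 2, r_2 = 0, and Theta(x) = (sigma_1(x), sigma_2(x)).
s = math.sqrt(2)
vol_real = covolume_2d((1.0, 1.0), (s, -s))            # Theta(1), Theta(sqrt 2)
expected_real = 2 ** 0 * math.sqrt(8)

# Ideal a = (1 + i) in Z[i], of norm 2, with Z-basis 1 + i and -1 + i.
vol_ideal = covolume_2d((1.0, 1.0), (-1.0, 1.0))
expected_ideal = expected_gauss * 2

print(vol_gauss, vol_real, vol_ideal)
```

In each case the determinant computed directly from the embedded basis agrees with \(2^{-r_2}\sqrt{|\Delta (\mathcal{O}_K)|}\), multiplied by \(N(\mathfrak {a})\) for the ideal.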

Lemma 6.0.11

Let \(S_t \subset \mathbb {R}^{r_1} \times \mathbb {C}^{r_2} \cong \mathbb {R}^n\) be the subset of points \((y_i,z_j) \in \mathbb {R}^{r_1} \times \mathbb {C}^{r_2}\) such that

\[ \sum _i |y_i|+ 2\sum _j |z_j| \leq t. \]

Then \(S_t\) is compact, convex and centrally symmetric and moreover

\[ \operatorname{Vol}(S_t)=\frac{2^{r_1}t^n}{n!} \left( \frac{\pi }{2}\right)^{r_2}. \]

Proof

\(S_t\) is closed and bounded and therefore is compact. \(S_t\) is also clearly centrally symmetric. For \((y_i,z_j),(y_i',z_j') \in S_t\) and \(\lambda \in (0,1)\) we have

\begin{align*} \sum _i |\lambda y_i+(1-\lambda )y_i'|+ 2 \sum _j |\lambda z_j+(1-\lambda )z_j'| & \leq \sum _i \left(|\lambda y_i|+|(1-\lambda )y_i'|\right)+2 \sum _j \left(|\lambda z_j|+|(1-\lambda )z_j'|\right)\\ & = \lambda \Big(\sum _i | y_i|+2\sum _j |z_j|\Big) + (1-\lambda )\Big(\sum _i | y_i'|+2\sum _j |z_j'|\Big) \\ & \leq \lambda t+(1-\lambda )t=t. \end{align*}

From this it follows that \(S_t\) is also convex.

Note that if \(r_1=1\) and \(r_2=0\) then \(S_t=[-t,t]\), which has volume (in this case length) \(2t\). Similarly, if \(r_1=0\) and \(r_2=1\) then \(S_t\) is just a disc in \(\mathbb {C}\) of radius \(\frac{t}{2}\), so has volume (in this case area) \(\frac{\pi t^2}{4}\). We will prove the formula for the volume by induction on \((r_1,r_2)\).

Assume we know the formula for \((r_1,r_2)\), where \(n=r_1+2r_2\). Let us look at the \((r_1+1,r_2)\) case. Here the set is given by points such that

\[ \sum _{i=1}^{r_1+1}|y_i|+ 2\sum _{j=1}^{r_2} |z_j| \leq t \]

which can be rewritten as

\[ \sum _{i=1}^{r_1}|y_i|+ 2\sum _{j=1}^{r_2} |z_j| \leq t-|y| \]

where \(y=y_{r_1+1}\). For fixed \(y\) with \(|y| \leq t\) this is a set of the \((r_1,r_2)\) type with \(t\) replaced by \(t-|y|\), so integrating over \(y\) we see that the set has volume

\[ \int _{-t}^{t} \frac{2^{r_1}}{n!} \left( \frac{\pi }{2}\right)^{r_2} (t-|y|)^n dy= \frac{2^{r_1+1}}{n!} \left( \frac{\pi }{2}\right)^{r_2} \int _{0}^{t}(t-y)^n dy=\frac{2^{r_1+1}t^{n+1}}{(n+1)!} \left( \frac{\pi }{2}\right)^{r_2}. \]

A slightly more involved, yet still elementary, computation gives the \((r_1,r_2+1)\) case, and thus the result.
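For completeness, here is a sketch of how the \((r_1,r_2+1)\) step can go. It is not taken from the notes; it uses the weight \(2\) on the complex coordinates as in the displayed inequalities of this proof, and the notation \(V(t)\), \(V'(t)\), \(C\), \(\rho \) and \(u\) is introduced only for this computation. Write the inductive hypothesis as \(V(t)=C\,t^n/n!\), where \(n=r_1+2r_2\) and \(C\) does not depend on \(t\). Adding a complex coordinate \(z\) with \(|z| \leq t/2\), and using polar coordinates \(\rho =|z|\) followed by the substitution \(u=t-2\rho \), the new volume is

\[ V'(t)=\int _{|z| \leq t/2} V(t-2|z|)\, dA(z)=\int _0^{t/2} \frac{C\,(t-2\rho )^n}{n!}\, 2\pi \rho \, d\rho =\frac{\pi }{2}\cdot \frac{C}{n!}\int _0^t u^n(t-u)\, du. \]

Since \(\int _0^t u^n(t-u)\, du=\frac{t^{n+2}}{n+1}-\frac{t^{n+2}}{n+2}=\frac{t^{n+2}}{(n+1)(n+2)}\), this gives

\[ V'(t)=\frac{\pi }{2}\cdot \frac{C\, t^{n+2}}{(n+2)!}, \]

so each additional complex coordinate raises the power of \(t\) by \(2\) and multiplies the constant by \(\frac{\pi }{2}\).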

With this we can finally prove Theorem 4.0.3.

Theorem

Let \(K\) be a number field with \(r_1\) real embeddings and \(r_2\) conjugate pairs of complex embeddings. Let \([K:\mathbb {Q}]=n\) and let \(\mathfrak {a}\) be an ideal of \(\mathcal{O}_K\). Then there is an element \(a \in \mathfrak {a}\) such that

\[ |N_{K/\mathbb {Q}}(a)| \leq \frac{n!}{n^n} \left( \frac{4}{\pi } \right)^{r_2} |\Delta (\mathcal{O}_K)|^{1/2} N(\mathfrak {a}) \]

Proof

Let \(S_t\) be as in Lemma 6.0.11 and pick \(t\) such that

\[ \operatorname{Vol}(S_t)=2^n \operatorname{Vol}(P(\Lambda _\mathfrak {a})) \]

i.e. such that

\[ t^n=2^{n-r_1}\pi ^{-r_2}n! \sqrt{|\Delta (\mathcal{O}_K)|} N(\mathfrak {a}). \tag {1} \]

Then by Lemma 6.0.7, applied to the lattice \(\Lambda _\mathfrak {a}=\Theta (\mathfrak {a})\), there is a nonzero \(a \in \mathfrak {a}\) such that \(\Theta (a) \in S_t\). Then we have

\[ |N_{K/\mathbb {Q}}(a)|= \prod _{i=1}^{r_1} |\sigma _i(a)| \prod _{j=r_1+1}^{r_1+r_2} |\sigma _j(a)|^2 \]

(here we use Proposition 1.7.6). Now, using the arithmetic-geometric mean inequality

\[ \sqrt[n]{z_1\dots z_n} \leq \frac{1}{n} \sum _i z_i \]

for \(z_i\) positive real numbers, we have

\[ |N_{K/\mathbb {Q}}(a)| \leq \left( \frac{1}{n}\sum _{i=1}^{r_1} |\sigma _i(a)|+ \frac{2}{n}\sum _{j=r_1+1}^{r_1+r_2} |\sigma _j(a)| \right)^n \leq \frac{t^n}{n^n} \]

by definition of \(S_t\). Using (1) then gives the result.
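The bound in the theorem is easy to evaluate. The following sketch computes the right-hand side for two imaginary quadratic fields; the function name `minkowski_bound` and its parameters are illustrative choices, and the discriminants \(-4\) for \(\mathbb {Q}(i)\) and \(-20\) for \(\mathbb {Q}(\sqrt{-5})\) are standard values assumed here rather than taken from the text.

```python
import math


def minkowski_bound(n, r2, discriminant, ideal_norm=1):
    """Evaluate (n!/n^n) * (4/pi)^r2 * sqrt(|disc|) * N(a).

    n is the degree [K:Q], r2 the number of pairs of complex embeddings,
    discriminant the discriminant of O_K, ideal_norm the norm N(a).
    """
    return (math.factorial(n) / n ** n
            * (4 / math.pi) ** r2
            * math.sqrt(abs(discriminant))
            * ideal_norm)


bound_gauss = minkowski_bound(n=2, r2=1, discriminant=-4)      # Q(i)
bound_sqrt_m5 = minkowski_bound(n=2, r2=1, discriminant=-20)   # Q(sqrt(-5))
print(bound_gauss, bound_sqrt_m5)
```

With \(\mathfrak {a}=\mathcal{O}_K\) the bound for \(\mathbb {Q}(i)\) is \(4/\pi {\lt} 2\), and for \(\mathbb {Q}(\sqrt{-5})\) it is below \(3\), so in each case the theorem produces a nonzero element of very small norm.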