Calderon-Zygmund theory of singular integrals.

1. Calderon-Zygmund decomposition

The Calderon-Zygmund decomposition is a key step in the real variable analysis of singular integrals. The idea behind this decomposition is that it is often useful to split an arbitrary integrable function into its “small” and “large” parts, and then use different techniques to analyze each part.

The scheme is roughly as follows. Given a function { f} and an altitude { \alpha}, we write { f=g+b}, where { |g|} is pointwise bounded by a constant multiple of {\alpha}. While { b} is large, it does enjoy two redeeming features: it is supported in a set of reasonably small measure, and its mean value is zero on each of the balls that constitute its support. To obtain the decomposition { f=g+b}, one might be tempted to “cut” { f} at the height { \alpha}; however, this is not what works. Instead, one bases the decomposition on the set where the maximal function of { f} exceeds the height { \alpha}.

Theorem 1 (Calderon-Zygmund decomposition)

Suppose we are given a function { f\in L^1} and a positive number { \alpha}, with {\alpha>\frac{1}{\mu(R^n)}\int_{R^n}|f|d\mu}. Then there exists a decomposition of { f}, {f=g+b}, with { b=\sum_{k}b_k}, and a sequence of balls {\{B_k^*\}}, so that

  1. { |g(x)|\leq c\alpha}, for a.e. { x}.
  2. Each { b_k} is supported in {B_k^*}, { \int|b_k(x)|d\mu(x)\leq c\alpha\mu(B_k^*)}, and { \int b_k(x)d\mu(x)=0}.
  3. { \sum_k\mu(B_k^*)\leq \frac{c}{\alpha}\int|f(x)|d\mu(x)}.


Before proving this theorem, let me first explain the geometric intuition for why it should be true. Roughly speaking, we cut the function into two parts, the part with high altitude and the part with low altitude, and then enlarge the region carrying the high-altitude part so that the enlarged pieces satisfy conditions 2 and 3.

Proof (sketch): This decomposition has a good geometric explanation: we take the set {\{x: |f(x)|>\alpha\}} and enlarge it carefully so that it behaves like a union of balls on which the special conditions 2 and 3 hold. \Box
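To make the three conditions concrete, here is a minimal Python sketch of a decomposition of this kind on the interval [0,1). It is only an illustration under stated assumptions: it uses the dyadic stopping-time variant (dyadic intervals instead of the balls and the maximal function of Theorem 1), the function is a step function given by a list of 2^m values, and the name cz_decompose is hypothetical. In this variant the constant in condition 1 is 2.

```python
def cz_decompose(f, alpha):
    """Dyadic Calderon-Zygmund decomposition of a step function on [0, 1).

    f is a list of length 2**m; f[i] is the value on the i-th cell of
    length 1/len(f). Assumes the average of |f| over [0, 1) is <= alpha.
    Returns (g, b, bad), where bad lists the selected dyadic intervals
    as index ranges (lo, hi).
    """
    n = len(f)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    assert sum(abs(v) for v in f) / n <= alpha, "alpha must dominate the mean of |f|"
    g = list(f)          # on the good set, g = f
    b = [0.0] * n        # b is supported on the selected intervals
    bad = []

    def visit(lo, hi):
        avg_abs = sum(abs(v) for v in f[lo:hi]) / (hi - lo)
        if avg_abs > alpha:
            # Stop here: the parent had average <= alpha, so this one has
            # average <= 2 * alpha. Replace f by its mean to define g.
            mean = sum(f[lo:hi]) / (hi - lo)
            for i in range(lo, hi):
                g[i] = mean
                b[i] = f[i] - mean   # b has mean zero on the interval
            bad.append((lo, hi))
        elif hi - lo > 1:
            mid = (lo + hi) // 2
            visit(lo, mid)
            visit(mid, hi)

    visit(0, n)
    return g, b, bad


f = [0.0, 0.0, 0.0, 8.0, 1.0, 0.0, 0.0, 0.0]
g, b, bad = cz_decompose(f, alpha=2.0)
print(bad)  # [(2, 4)]
print(g)    # [0.0, 0.0, 4.0, 4.0, 1.0, 0.0, 0.0, 0.0]
```

In this example the single selected interval has measure 1/4, which is below the bound \|f\|_1/\alpha = (9/8)/2 of condition 3, and b sums to zero on it.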

Remark 1 A Calderon-Zygmund decomposition for {L^p} functions was done in Charlie Fefferman’s thesis; see Section II of […]. One can also find this in Loukas Grafakos’s Classical Fourier Analysis, page 303, exercise 4.3.8. The question is broken up into parts that should be easy to handle.

Several people have considered this question. An excellent paper that comes to mind is Anthony Carbery’s Variants of the Calderon–Zygmund theory for { L^p}-spaces, which appeared in Revista Matematica Iberoamericana, Volume 2, Number 4 in 1986. There are also several useful references that appear in Carbery’s paper.

Remark 2 We could also consider variants of the Calderon-Zygmund decomposition, for example one equipped with a nontrivial weight function { w}, or find a different way to decompose for some special purpose.

Remark 3 Choosing a suitable decomposition of the physical space, or even of both the physical space and the frequency space, in order to gain some reasonable estimate is a fundamental philosophy in harmonic analysis. Besides the Calderon-Zygmund decomposition, there are:

Whitney decomposition. This is an important tool in the proof of the Fefferman-Stein restriction theorem and in differential topology.

Wave packet decomposition. This decomposition underlies the proof of Carleson’s theorem (this is more explicit in Fefferman’s proof than in Carleson’s original proof), Lacey and Thiele’s proof of the boundedness of the bilinear Hilbert transform, as well as a host of follow-up work in multilinear harmonic analysis. The idea of the wave packet decomposition is to decompose a function/operator in terms of an overdetermined basis. This allows one to preserve symmetries (such as modulation symmetries) that aren’t preserved by a classical Calderon-Zygmund decomposition (which endows the frequency with a distinguished role). One might consider using a wave packet decomposition if one is working with an operator that has a modulation symmetry. This is discussed in more detail in Tao’s blog post on the trilinear Hilbert transform.

Polynomial decomposition. The application of polynomial decomposition to harmonic analysis is more recent, and its full potential still seems unclear. Applications include Dvir’s proof of the finite field Kakeya conjecture, Guth’s proof of the endpoint multilinear Kakeya conjecture (and, indirectly, the Bourgain-Guth restriction theorems), Katz and Guth’s proof of the joints problem and Erdos distance problem, among many other results. Generally, the idea behind the polynomial decomposition is to partition a subset of a vector space over a field into a finite number of cells each of which contains roughly the same fraction of the original set. One further wishes that no low degree algebraic variety can intersect too many of the cells. In Euclidean space, the polynomial ham sandwich decomposition does exactly this. This allows one to, for instance, control linear (or, more generally, “low algebraic degree”) interactions between points in distinct cells. This has so far proven the most useful in incidence-type problems, but many problems in harmonic analysis, thanks to the translation symmetry of the Fourier transform, are inextricably linked with such incidence-type problems. See (again) Tao’s survey of this topic for a more detailed account.


2. Singular integrals

With the Calderon-Zygmund decomposition in hand, we now prove a conditional boundedness result for singular integrals.

The singular integrals one is interested in are operators { T}, expressible in the form

\displaystyle (Tf)(x)=\int_{R^n}K(x,y)f(y)d\mu(y) \ \ \ \ \ (1)


where the kernel { K} is singular near { x=y}, and so the expression is meaningful only if { K} is treated as a distribution or in some limiting sense. Which particular regularization of { (Tf)(x)} is appropriate depends much on the context, and a complete treatment of the issues thereby raised would take us quite far afield.

Let us limit ourselves to two closely related ways of dealing with the questions concerning the definability of the operator. One is to prove estimates on the (dense) subspace where the operator is initially defined. The other is to regularize the given operator by replacing it with a suitable family, and to prove uniform estimates for this family. A similar idea occurs in spectral geometry: when we wish to investigate the spectrum of some operator, we consider a deformation of it and are reduced to controlling the spectrum of a series of parametrices, for example by considering the wave kernel or heat kernel rather than the Poisson kernel itself. Common to both methods is an a priori approach: we assume some additional properties of the kernel, but then prove estimates that are independent of these “regularity” properties.

We now carry out the first approach in detail. There will be two kinds of assumptions made about the operator. The first is quantitative: we assume that we are given a bound { A}, so that the operator { T} is defined and bounded on { L^q} with norm { A}; that is,

\displaystyle \|T(f)\|_q\leq A\|f\|_q, \quad \forall f\in L^q \ \ \ \ \ (2)


Moreover, we assume that there is associated to { T} a measurable function { K} (that plays the role of its kernel), so that for the same constant { A} and some constant { c>1},

\displaystyle \int_{R^n-B(y,c\delta)}|K(x,y)-K(x,\bar y)|d\mu(x)\leq A, \forall \bar y\in B(y,\delta)  \ \ \ \ \ (3)


for all { y\in R^n, \delta>0}.
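As a sanity check of what condition (3) asks for, the following Python sketch evaluates the integral in (3) for one specific kernel. Everything specific here is an assumption made for illustration and is not taken from the text above: the kernel is the one-dimensional kernel K(x,y)=1/(x-y) with Lebesgue measure, we take y=0 and \bar y=h with 0\leq h\leq\delta (by translation invariance and symmetry this loses nothing), and the function names are hypothetical. A direct computation gives the value \log((c\delta+h)/(c\delta-h)), which is at most \log((c+1)/(c-1)), so for this kernel (3) holds with A=\log 3 when c=2.

```python
import math


def hormander_integral_exact(h, delta, c=2.0):
    """Closed form of the integral of |1/x - 1/(x - h)| over |x| > c*delta."""
    r = c * delta
    return math.log((r + h) / (r - h))


def hormander_integral_numeric(h, delta, c=2.0, steps=20000):
    """Midpoint-rule check of the same integral.

    The substitution x = r/t (and x = -r/t) with r = c*delta maps the
    region |x| > r to t in (0, 1) and turns the integrand
    h / (|x| |x - h|) into h/(r - h*t) + h/(r + h*t), which is smooth.
    """
    assert 0 <= h <= delta and c > 1
    r = c * delta
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        total += h / (r - h * t) + h / (r + h * t)
    return total / steps


print(hormander_integral_exact(1.0, 1.0))    # equals log(3) for h = delta, c = 2
print(hormander_integral_numeric(1.0, 1.0))  # agrees to high accuracy
```

The point of the example is that the bound does not depend on \delta, only on the ratio h/\delta\leq 1 and on c.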

The further regularity assumption on the kernel { K} is that for each { f} in {L^q} that has compact support, the integral (1) converges absolutely for almost all { x } in the complement of the support of { f}, and that equality (1) holds for these { x}.

Theorem 2 (Conditional boundedness of singular integrals)

Under the conditions (2) and (3) made above on { K}, the operator { T} is bounded in { L^p} norm on { L^p\cap L^q}, when { 1<p<q}. More precisely,

\displaystyle \|T(f)\|_p\leq A_p\|f\|_p

for { f\in L^p\cap L^q} with { 1<p<q}, where the bound { A_p} depends only on the constant { A} appearing in (2) and (3) and on { p}, but not on the assumed regularity of { K}, or on { f}.



Now let us begin to prove the conditional theorem. The key point is to exploit the assumption that {T} is already a bounded operator from {L^q} to {L^q}; that is, there exists {A>0} such that for all {f\in L^q} we have {\|T(f)\|_q\leq A\|f\|_q}. Now let us look at the singular integral expression:

\displaystyle (Tf)(x)=\int_{R^n}K(x,y)f(y)d\mu(y). \ \ \ \ \ (4)



The key point is to prove that the mapping {f\rightarrow T(f)} is of weak-type {1-1}; that is,

\displaystyle \mu\{x:|Tf(x)|>\alpha\}\leq \frac{A'}{\alpha}\int |f|d\mu. \ \ \ \ \ (5)


Once we establish (5), the theorem follows by the Marcinkiewicz interpolation theorem, interpolating between the weak-type 1-1 bound (5) and the {L^q} bound (2). Now we apply Theorem 1 to {f} and get {f=g+b}. We claim that {g,b \in L^q}. Indeed, write {R^n= A\amalg B} with {B=\cup_{k}B_k}, so that {g=\chi_A g+\chi_{B}g} and {b=\chi_A b+\chi_{B} b}. By the triangle inequality and {f=g+b}, to prove {g,b \in L^q} we only need to prove {\chi_A g, \chi_B g, \chi_A b, \chi_B b\in L^q}, which is easy.

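Now that we know {g,b\in L^q}, we split the difficulty of establishing the weak 1-1 bound for {f} into that of establishing weak 1-1 bounds for {g} and {b}. Since {Tf=Tg+Tb}, we have {|Tf|>\alpha} only if {|Tg|>\alpha/2} or {|Tb|>\alpha/2}, i.e.

\displaystyle \mu\{x:|Tf(x)|>\alpha\}\leq \mu \{x:|Tg(x)|>\alpha/2\} +\mu\{x:|Tb(x)|>\alpha/2\} \ \ \ \ \ (6)

Replacing {\alpha} by {\alpha/2} only changes the constant {A'}, so below we estimate the two sets at height {\alpha}.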


For {g}, suppose the weak 1-1 bound fails, so that

\displaystyle \mu \{x:|Tg(x)|>\alpha\}> \frac{A'}{\alpha}\int |g|d\mu \ \ \ \ \ (7)


On the other hand, since {|g|\leq c\alpha} a.e., we have the trivial estimate {\|g\|_q^q \leq (c\alpha)^{q-1}\|g\|_1}. Combining these two estimates with the {L^q} bound (2) and Chebyshev's inequality, we have:

\displaystyle (c\alpha)^{q-1}\|g\|_1\geq \|g\|^q_q\geq A^{-q}\|Tg\|^q_q \geq A^{-q}\alpha^{q}\mu\{x:|Tg(x)|>\alpha\} > A^{-q}A'\alpha^{q-1} \|g\|_1 \ \ \ \ \ (8)


Comparing the left and the right sides of (8) leads to a contradiction as soon as {A'\geq A^qc^{q-1}}, so (7) cannot hold and the weak 1-1 bound for {g} follows.

For {b}, things are more complicated and really involve the structure of the singular integral kernel. The key point is to control {K(x,y)} near the diagonal. We warm up with a more refined decomposition {b=\sum_k b_k}, where {b_k=b\cdot \chi_{B_k}} for all {k}. For a large constant {c>>1} to be chosen later, define {B^*_k=c B_k}. We know {b\in L^q}, but the real difficulty is how to combine the following 5 conditions to reach a contradiction:

  1. {\|Tb\|_q\leq A\|b\|_q}.
  2. The properties coming from the Calderon-Zygmund decomposition: {\int_{B_k}|b|\leq c\alpha \mu(B_k),\forall k} and {\int_{B_k}b=0}.
  3. The Hormander condition (3): {\int_{R^n-B(y,c\delta)}|K(x,y)-K(x,\bar y)|d\mu(x)\leq A, \forall \bar y\in B(y,\delta)}.
  4. The negation of the weak 1-1 bound for {b}: {\mu\{x:|Tb(x)|>\alpha\}> \frac{A'}{\alpha}\|b\|_1}.
  5. The structure {Tb(x)=\int_{R^n} K(x,y)b(y)dy}.

The first step is to break {b} into the {b_k} and reduce the case of several balls to the case of only one ball. This can be done by the triangle inequality, or perhaps more directly; either way it is not difficult.

Then things become interesting. We focus on {b_1}, which is supported in {B_1}, and take {B_1^*=cB_1} with {c} the constant in (3). We split {Tb_1=\chi_{B_1^*}Tb_1+\chi_{{\mathbb R}^n-B_1^*}Tb_1} and write {T_{{\mathbb R}^n-B_1^*}b_1} for the second piece. Thanks to the Hormander condition (3) we have good control on this piece; in fact we can prove a weak 1-1 bound for it,

\displaystyle \mu\{x:|T_{{\mathbb R}^n-B_1^*}b_1|>\alpha\}< \frac{A'}{\alpha}\|b_1\|_1 \ \ \ \ \ (9)

Let {\bar y} be the center of {B_1}. For {x\notin B_1^*} we have

\displaystyle \begin{array}{rcl} Tb_1(x) & = & \int_{B_1}K(x,y)b_1(y)dy\\ & = & \int_{B_1}[K(x,y)-K(x,\bar y)]b_1(y)dy+K(x,\bar y)\int_{B_1}b_1(y)dy\\ & = & \int_{B_1}[K(x,y)-K(x,\bar y)]b_1(y)dy \end{array}

So we conclude, applying (3) with center {\bar y} and the point {y\in B_1},

\displaystyle \begin{array}{rcl} \|T_{{\mathbb R}^n-B_1^*}b_1\|_1 & = & \int_{{\mathbb R}^n-B_1^*}|\int_{B_1}[K(x,y)-K(x,\bar y)]b_1(y)dy|dx\\ & \leq & \int_{B_1}|b_1(y)|(\int_{{\mathbb R}^n-B_1^*}|K(x,y)-K(x,\bar y)|dx)dy\\ & \leq & A\int_{B_1}|b_1(y)|dy=A\|b_1\|_1 \end{array}

The last equality in the first display used the condition {\int b_1=0}. The bound (9) then follows from Chebyshev's inequality for any {A'>A}.




The large sieve and the Bombieri-Vinogradov theorem


The large sieve is a philosophy, embodied in a large family of inequalities, that is very effective for controlling linear sums or square sums of correlations of arithmetic functions. Some of its ideas may have originated in harmonic analysis, and it relies mainly on almost orthogonality.

One fundamental example is the estimate of the quantity

\sum_{n\leq x}|\Lambda(n)\overline{\chi(n)}|

One naive idea for controlling this quantity is to use the Cauchy-Schwarz inequality. But a careless application gives something even worse than the trivial estimate. In fact, by the triangle inequality and the trivial estimate we get the trivial bound \sum_{n\leq x}|\Lambda(n)\overline{\chi(n)}|\ll x. But a careless application of Cauchy-Schwarz gives the following,

\sum_{n\leq x}|\Lambda(n)\overline{\chi(n)}|\leq ((\sum_{n\leq x}|\Lambda(n)|^2)(\sum_{n\leq x}|\chi(n)|^2))^{\frac{1}{2}}\ll x\log^{\frac{1}{2}}x
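The loss can be seen numerically. The sketch below is only an illustration under a simplifying assumption: it takes |\chi(n)|=1 for every n (ignoring the n not coprime to the modulus), so the left-hand side is just \sum_{n\leq x}\Lambda(n). The function names are hypothetical.

```python
import math


def von_mangoldt_table(x):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= x."""
    lam = [0.0] * (x + 1)
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(2 * p, x + 1, p):
                is_prime[m] = False
            # Lambda(p^k) = log p on prime powers, 0 elsewhere.
            pk = p
            while pk <= x:
                lam[pk] = math.log(p)
                pk *= p
    return lam


def compare_bounds(x):
    """Return (direct sum, Cauchy-Schwarz bound), assuming |chi(n)| = 1."""
    lam = von_mangoldt_table(x)
    direct = sum(lam[1:])
    cauchy_schwarz = math.sqrt(sum(v * v for v in lam[1:]) * x)
    return direct, cauchy_schwarz


direct, cs = compare_bounds(10000)
print(direct / 10000)  # close to 1: the trivial bound is of the right size
print(cs / direct)     # noticeably larger than 1, growing like sqrt(log x)
```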

But this does not mean Cauchy-Schwarz is useless for handling this quantity. Let us look carefully at the inequality and try to understand why the bound gets worse. Every time we use Cauchy-Schwarz successfully, two things happen: first, we lower the complexity of the quantity we wish to bound; second, we lose almost nothing. So we should reformulate the quantity in a way that lowers its complexity and is compatible with the equality case of Cauchy-Schwarz. For example we have the following identity,

\sum_{n\leq x}|\Lambda(n)\overline{\chi(n)}|=\sqrt{ \sum_{n\leq x}|\Lambda(n)\overline{\chi(n)}| \sum_{m\leq x}|\Lambda(m)\overline{\chi(m)}|}=\sqrt{ \sum_{k_1,k_2\in \mathbb F_p^{\times}}\sum_{n',m'\leq \frac{x}{p}}|\Lambda(pn'+k_1)\Lambda(pm'+k_2)\overline{\chi(k_1)\chi(k_2)}| }

So we could understand this quantity as the variation of primes in the arithmetic progressions \{pn+b| b\in\{1,2,...,p-1\}\}. But this is still difficult to estimate, mainly because we need to control the variation of the convolution of \Lambda with itself on \mathbb F_p^{\times}\simeq \{pn+b| b\in\{1,2,...,p-1\}\}.

Now we change our perspective and recall a variant of the Cauchy-Schwarz inequality, called the Bessel inequality:

Bessel inequality

Let {g_1,\dots,g_J: {\bf N} \rightarrow {\bf C}} be finitely supported functions obeying the orthonormality relationship,

\displaystyle \sum_n g_j(n) \overline{g_{j'}(n)} = 1_{j=j'}

for all {1 \leq j,j' \leq J}. Then for any function {f: {\bf N} \rightarrow {\bf C}}, we have,

\displaystyle (\sum_{j=1}^J |\sum_{n} f(n) \overline{g_j(n)}|^2)^{1/2} \leq (\sum_n |f(n)|^2)^{1/2}.

Pf: The proof is not very difficult; we just need to keep an orthogonal picture in mind. The functions \{g_{j}\}, 1\leq j\leq J form an orthonormal system in l^2(\mathbb N), the left-hand side is the norm of the orthogonal projection of f onto their span, and the inequality follows.
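The orthogonal picture can be checked numerically. In the sketch below the orthonormal family is chosen for illustration only and is an assumption, not something from the text: the normalized additive characters g_j(n)=e^{2\pi i jn/N}/\sqrt N on \{0,\dots,N-1\}. The function names are hypothetical. With J<N the inequality is typically strict, and with J=N the family is a full orthonormal basis and the two sides agree.

```python
import cmath
import math


def fourier_family(N, J):
    """First J normalized characters on {0, ..., N-1}; orthonormal for J <= N."""
    return [
        [cmath.exp(2j * math.pi * j * n / N) / math.sqrt(N) for n in range(N)]
        for j in range(J)
    ]


def bessel_sides(f, gs):
    """Return (lhs, rhs) of the Bessel inequality for f and the family gs."""
    lhs = math.sqrt(
        sum(abs(sum(fn * gn.conjugate() for fn, gn in zip(f, g))) ** 2 for g in gs)
    )
    rhs = math.sqrt(sum(abs(fn) ** 2 for fn in f))
    return lhs, rhs


f = [complex((n * n) % 5 - 2, n % 3) for n in range(16)]
print(bessel_sides(f, fourier_family(16, 5)))   # lhs <= rhs
print(bessel_sides(f, fourier_family(16, 16)))  # equality up to rounding
```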

With this inequality in mind, and by the standard argument for passing from an orthogonal statement to an almost orthogonal one, which was explained in the previous note, we can imagine that the following almost orthogonal variant of the “Bessel inequality” is true:

Generalised Bessel inequality

Let {g_1,\dots,g_J: {\bf N} \rightarrow {\bf C}} be finitely supported functions, and let {\nu: {\bf N} \rightarrow {\bf R}^+} be a non-negative function. Let {f: {\bf N} \rightarrow {\bf C}} be such that {f} vanishes whenever {\nu} vanishes. Then we have

\displaystyle (\sum_{j=1}^J |\sum_{n} f(n) \overline{g_j(n)}|^2)^{1/2} \leq (\sum_n |f(n)|^2 / \nu(n))^{1/2} \times ( \sum_{j=1}^J \sum_{j'=1}^J c_j \overline{c_{j'}} \sum_n \nu(n) g_j(n) \overline{g_{j'}(n)} )^{1/2}

for some sequence {c_1,\dots,c_J} of complex numbers with {\sum_{j=1}^J |c_j|^2 = 1}, with the convention that {|f(n)|^2/\nu(n)} vanishes whenever {f(n), \nu(n)} both vanish.
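A small numerical check of the generalised inequality follows. It is a sketch under assumptions: the statement only asserts that some unit vector (c_j) works, and the code uses the choice c_j=S_j/(\sum_j|S_j|^2)^{1/2} with S_j=\sum_n f(n)\overline{g_j(n)}, which is what the usual duality plus Cauchy-Schwarz argument produces. The function name and the test data are hypothetical.

```python
import math


def generalised_bessel_sides(f, gs, nu):
    """Return (lhs, rhs) of the generalised Bessel inequality.

    Uses c_j = S_j / ||S|| with S_j = sum_n f(n) * conj(g_j(n)); this
    requires that not all S_j vanish. Terms with nu(n) = 0 are dropped
    from the weighted sum, following the stated convention.
    """
    assert all(v > 0 or fn == 0 for fn, v in zip(f, nu)), "f must vanish where nu does"
    S = [sum(fn * gn.conjugate() for fn, gn in zip(f, g)) for g in gs]
    lhs = math.sqrt(sum(abs(s) ** 2 for s in S))
    c = [s / lhs for s in S]
    weighted = sum(abs(fn) ** 2 / v for fn, v in zip(f, nu) if v > 0)
    # sum_n nu(n) |sum_j c_j g_j(n)|^2 equals the double sum over j, j'.
    quad = sum(
        v * abs(sum(cj * g[n] for cj, g in zip(c, gs))) ** 2
        for n, v in enumerate(nu)
    )
    return lhs, math.sqrt(weighted) * math.sqrt(quad)


f = [1, 2, 0, -1, 3]
nu = [1.0, 2.0, 0.0, 0.5, 4.0]
gs = [[1, 1, 0, 0, 0], [0, 1, 1, 1, 0], [0, 0, 0, 1, 1]]  # overlapping, not orthogonal
lhs, rhs = generalised_bessel_sides(f, gs, nu)
print(lhs <= rhs)  # True
```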


Linear metric on F_2, the free group with two generators.


I may have made a stupid mistake, but if not, we could construct a metric by pulling back a metric on a suitable normed linear space H which we carefully construct. Denote the generators of the free group F_2 by a,b.

Step 1:

Construct the normed linear space H. The space H is spanned by a basis \Lambda=\Lambda_a \coprod \Lambda_b, where \Lambda_a, \Lambda_b are defined by looking at the Cayley graph of F_2. There are many vertical vectors and horizontal vectors in the Cayley graph. The vertical vectors fall into countably many vertical levels (for example, a,a^2,a^{-5} are in the same vertical level, bab^{-1}, ba^{10}b^{-1} are in the same vertical level, but bab^{-1},a are not in the same vertical level). We put one basis element in \Lambda_a for every vertical level, which completes the construction of \Lambda_a. We do the same for \Lambda_b, replacing vertical levels with horizontal levels. This completes the construction of \Lambda. Taking the span with coefficients in \mathbb Z we get a linear space V. By Zorn’s lemma there exists a norm on this space; fixing one norm \|\cdot\|, we have constructed H=(V,\|\cdot\|).

Step 2:

Pull back the norm \|\cdot\| on H to the free group F_2. In fact there is a natural injective map T: F_2\to H, given as follows. On the Cayley graph (imagine it embedded in \mathbb R^2), the identity 1 of the group F_2 corresponds to the origin, and more generally every element of F_2 is identified with exactly one point of the Cayley graph, thanks to there being no relation between a,b. There are of course infinitely many paths from the origin to that point, but there is only one shortest path, thanks to there being no loop in the Cayley graph. We identify each element of F_2 with the shortest path to the corresponding point of the Cayley graph. Now we can explain why the path lies in H. The path crosses only finitely many vertical and horizontal levels, and on every level it takes only finitely many steps; this already gives a representation \sum_{e_i\in \Lambda}c_i\cdot e_i, c_i\in \mathbb Z, and the key point is that only finitely many c_i\neq 0. So we have defined the map T:F_2\to H, and we can use it to pull back the norm on H to a norm on F_2.
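The map T can be made concrete with a short Python sketch. The encoding is an assumption made for illustration: words are strings over a, A, b, B with capital letters standing for inverses, and a “level” is read as a maximal line of a-edges (or of b-edges) in the Cayley graph, labelled by the pair (generator, reduced prefix at which the shortest path enters that line). The function names are hypothetical.

```python
def reduce_word(word):
    """Freely reduce a word over a, A (= a^-1), b, B (= b^-1)."""
    out = []
    for ch in word:
        if out and out[-1] == ch.swapcase():
            out.pop()            # cancel x x^-1
        else:
            out.append(ch)
    return "".join(out)


def level_vector(word):
    """Coefficients c_i of the shortest path to `word`, indexed by level.

    Each maximal run of one letter in the reduced word stays on a single
    level; its coefficient is the signed number of steps taken there.
    """
    w = reduce_word(word)
    coeffs = {}
    i = 0
    while i < len(w):
        j = i
        while j < len(w) and w[j] == w[i]:
            j += 1
        sign = 1 if w[i].islower() else -1
        coeffs[(w[i].lower(), w[:i])] = sign * (j - i)
        i = j
    return coeffs


print(level_vector("a"))      # {('a', ''): 1}
print(level_vector("AAAAA"))  # {('a', ''): -5}, the same level as a
print(level_vector("baB"))    # {('b', ''): 1, ('a', 'b'): 1, ('b', 'ba'): -1}
print(level_vector("ab"))     # {('a', ''): 1, ('b', 'a'): 1}
```

With this encoding only finitely many coefficients are nonzero for each word. Note also that the vector of ab is not the sum of the vectors of a and b, since the b-step of ab lies on the level entered at a rather than at the origin.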

Step 3:

Now we begin to prove that the norm we get by pullback satisfies the conditions we need. We need only prove linear growth and the triangle inequality; conjugation invariance follows automatically from linear growth, by the comments of Tobias Fritz. The triangle inequality would be automatic if T preserved the group structure, that is, if the product of elements x_1,x_2 \in F_2 could be viewed as putting the two paths together. But this is not true, because the group operation is not commutative while addition in H is.

The space we should consider is the path space equipped with the composition operation. I imagine there exists a “big space” such that the natural metric on the “big space”, restricted to the embedded image of F_2, is a linear growth metric.