Discrete harmonic function in Z^n

There is still a gap. In fact, I can improve half of the argument of Discrete harmonic function (the pdf version is Discrete harmonic function in Z^n), but I still have a gap to deal with in the remaining half…


1. The statement of result

First of all, we give the definition of a discrete harmonic function.

Definition 1 (Discrete harmonic function) We say a function {f: {\mathbb Z}^n \rightarrow {\mathbb R}} is a discrete harmonic function on {{\mathbb Z}^n} if and only if for any {(x_1,...,x_n)\in {\mathbb Z}^n}, we have:

\displaystyle f(x_1,...,x_n)=\frac{1}{2^n}\sum_{(\delta_1,...,\delta_n )\in \{-1,1\}^n}f(x_1+\delta_1,...,x_n+\delta_n ) \ \ \ \ \ (1)


In dimension 2, the definition reduces to:

Definition 2 (Discrete harmonic function in {{\mathbb Z}^2}) We say a function {f: {\mathbb Z}^2 \rightarrow {\mathbb R}} is a discrete harmonic function on {{\mathbb Z}^2} if and only if for any {(x_1,x_2)\in {\mathbb Z}^2}, we have:

\displaystyle f(x_1,x_2)=\frac{1}{4}\sum_{(\delta_1,\delta_2)\in \{-1,1\}^2}f(x_1+\delta_1,x_2+\delta_2 ) \ \ \ \ \ (2)
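As a quick sanity check of Definition 2, here is a worked example with the test function {f(x_1,x_2)=x_1x_2}, which is chosen here for illustration and is not taken from \cite{paper}. Expanding the product and using that {\delta_1}, {\delta_2} and {\delta_1\delta_2} each sum to zero over {\{-1,1\}^2}:

\displaystyle \begin{array}{rcl} \frac{1}{4}\sum_{(\delta_1,\delta_2)\in \{-1,1\}^2}(x_1+\delta_1)(x_2+\delta_2) & = & x_1x_2+\frac{x_2}{4}\sum \delta_1+\frac{x_1}{4}\sum \delta_2+\frac{1}{4}\sum \delta_1\delta_2 \\ & = & x_1x_2 \end{array}

So {x_1x_2} satisfies (2) and is a non-constant discrete harmonic function. It is unbounded, and {|x_1x_2|<c} only holds near the coordinate axes, so it does not satisfy a density hypothesis of the kind considered below.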


The result established in \cite{paper} is the following:

Theorem 3 (Liouville theorem for discrete harmonic functions in {{\mathbb Z}^2}) Let {c>0}. There exists a constant {\epsilon>0} depending on {c} such that the following holds: if {f} is a discrete harmonic function on {{\mathbb Z}^2} and, for every ball {B_R(x_0)} with radius {R>R_0}, at least a {1-\epsilon} portion of the points {x\in B_R(x_0)} satisfy {|f(x)|<c}, then {f} is constant on {{\mathbb Z}^2}.

Remark 1 This type of result runs against intuition; at least, there is no such result in {{\mathbb C}}. For example, the existence of the Poisson kernel and the example given in \cite{paper} explain the issue.

Remark 2 There are reasons why such a result can hold in {{\mathbb Z}^2} but not in {{\mathbb C}}:

  1. The first reason is that for every radius {R} there are only {O(R^2)} lattice points in {B_R(x)} in {{\mathbb Z}^2}, so the mass cannot concentrate very much in this setting.
  2. The second is that {{\mathbb Z}^2} does not have infinitely many scales, while {{\mathbb C}} does.
  3. The third is that a function on {{\mathbb Z}^2} is automatically locally integrable.


The generalization is the following:

Theorem 4 (Liouville theorem for discrete harmonic functions in {{\mathbb Z}^n}) Let {c>0,n\in {\mathbb N}}. There exists a constant {\epsilon>0} depending on {n,c} such that the following holds: if {f} is a discrete harmonic function on {{\mathbb Z}^n} and, for every ball {B_R(x_0)} with radius {R>R_0}, at least a {1-\epsilon} portion of the points {x\in B_R(x_0)} satisfy {|f(x)|<c}, then {f} is constant on {{\mathbb Z}^n}.

In this note, I give a proof of Theorem 4 and explicitly calculate a constant {\epsilon_n>0} satisfying the condition in Theorem 3; the same method also yields a constant {\epsilon_n} satisfying Theorem 4. I also point out that the constant calculated in this way is not optimal, both in high dimensions and in dimension 2.

2. Some elementary properties of discrete harmonic functions

We warm up with some simple properties of discrete harmonic functions. The behaviour of the bad points can be controlled: using only the isoperimetric inequality and the maximum principle, we have the following result.

Definition 5 (Bad points) We divide the points of {{\mathbb Z}^n} into a good part and a bad part. The good part {I} consists of all points {x} such that {|f(x)|<c}, and the bad part {J} is the remainder. So {I\amalg J={\mathbb Z}^n}.

For all {B_R(0)}, we define {J_R:=J\cap B_R(0)}, {I_R:=I\cap B_R(0)} for convenience.

Theorem 6 (The distribution of bad points) The set of bad points {J_R} in {B_R(0)} decomposes into connected components, i.e.

\displaystyle J_R=\amalg_{i\in S_R}A_i \ \ \ \ \ (3)

and every component {A_i} satisfies {A_i\cap \partial B_R(0)\neq \emptyset}.

Remark 3 We say {A} is connected in {{\mathbb Z}^n} iff for all {x,y\in A} there is a path in {A} connecting {x} to {y}.

Remark 4 In other words, every connected component of bad points reaches the boundary, so the bad points behave like the tree structure shown in the graph.

Proof: A very simple observation is that, for every finite connected domain {\Omega\subset {\mathbb Z}^n}, there is a function

\displaystyle \lambda_{\Omega}: \Omega\times \partial \Omega \longrightarrow {\mathbb R} \ \ \ \ \ (4)

such that {\lambda_{\Omega}(x,y)\geq 0, \forall (x,y)\in \Omega \times \partial\Omega}, and {\sum_{y\in\partial \Omega}\lambda_{\Omega}(x,y)=1} (take {f\equiv 1} below). And we have:

\displaystyle f(x)=\sum_{y\in\partial \Omega}\lambda_{\Omega}(x,y)f(y) \ \ \ \ \ (5)


This can be proved by induction on the diameter of {\Omega}. Now suppose some connected component of the bad points contradicts Theorem 6; for simplicity, call this component {\Omega}. Then by formula (5) we know

\displaystyle \begin{array}{rcl} \sup_{x\in \Omega}|f(x)| & = & \sup_{x\in \Omega}|\sum_{y\in\partial \Omega}\lambda_{\Omega}(x,y)f(y)| \\ & \leq & \sup_{y\in\partial \Omega}|f(y)| \\ & < & c \end{array}

The last line holds because {\Omega} does not reach {\partial B_R(0)}, so {\partial \Omega} is a finite set of good points. But this leads to {|f(x)|<c} for all {x\in \Omega}, which contradicts the definition of {\Omega}. This completes the proof. \Box

Now we turn to another observation: the freedom to extend a discrete harmonic function in {{\mathbb Z}^n} is limited.

Theorem 7 We can say something about the structure of the space of harmonic functions on {{\mathbb Z}^n}: on the cube, as you will see, once one more value is added every value is determined, i.e. we know a generating set for {{\mathbb Z}^n}.

Proof: In the two-dimensional case, the proof follows directly from the graph. The {n}-dimensional case is similar. \Box

Remark 5 The generating set is well controlled. In fact, in the {n}-dimensional case it looks just like {n} lines in orthogonal directions.

3. Sketch of the proof of Theorem 4


The proof goes as follows: we consider two lemmas, established in two different ways, and derive a contradiction from them.

\paragraph{First lemma}

Lemma 8 (Discrete Poisson kernel) There is a discrete Poisson kernel in {{\mathbb Z}^n}, given by:

\displaystyle f(x)=\sum_{y\in \partial B_R(z)}\lambda_{B_R(z)}(x,y)f(y) \ \ \ \ \ (6)

And the following properties hold:

  1. {\lambda_{B_R(z+h)}(x+h,y+h)=\lambda_{B_R(z)}(x,y)} , {\forall x\in B_R(z), h\in {\mathbb Z}^n}.
  2. \displaystyle \lambda_{B_R(z)}(x,y)\rightarrow \rho_R(x,y) \ \ \ \ \ (7)

Remark 6 The proof can be established using the central limit theorem and Brownian motion; see the material in the book of Stein \cite{stein}. The reason Lemma 8 is useful for the proof is that the identity (6) holds for every {x\in B_R(0)}, so we gain a lot of identities. These identities carry information which is contradicted by another argument.

\paragraph{Second lemma} The exponential decrease of mass.

Lemma 9 The mass decreases at least at an exponential rate.

Remark 7 The proof reduces to a random walk result and a careful look at level sets, reducing to the worst case by the Brunn-Minkowski inequality or the isoperimetric inequality.

\paragraph{Final argument} Comparing Lemma 8 and Lemma 9, we get a contradiction in the following way. On the one hand, the value of {f} on {\partial B_R(0)} increases too fast, exponentially by Lemma 9. On the other hand, it enters the integral expression involving the Poisson kernel, but the perturbation of the Poisson kernel is slow, in fact of polynomial rate…



\bibitem{stein} Functional analysis



Uncertainty principle

The pdf version is Uncertainty principle. A nice note of Terence Tao seems to give a nice answer to the problem below.

1. Introduction

Is there a Brunn-Minkowski inequality approach to the phenomenon governed by the uncertainty principle? More precisely, is it possible to say that the Gaussian distribution

\displaystyle G(x)=e^{-|x|^2} \ \ \ \ \ (1)


is the best choice, in the sense that {\|\hat G-G\|_2} attains its minimum?

Remark 1 Or use some other suitable distance on a space of reasonable functions (maybe some Gromov-Hausdorff distance?). Anyway, the aim is to say that the Gaussian distribution is the function that best resists the influence of the uncertainty principle.

I do not know the answer to this problem, but it is an instance of a universal philosophy, namely the uncertainty principle. Heuristically:

It is not possible for both a function {f} and its Fourier transform {\hat f} to be localized on small sets.

Now let me give an intuitive approach to explain why the phenomenon of “uncertainty principle” occurs.

The approach is based on:

  1. Level set decomposition.
  2. The area formula (or coarea formula), or in any case some kind of change of variables formula.
  3. Integration by parts.
  4. A basic understanding of exponential sums.

Let our function {f\in S}, the Schwartz space. We begin with an intuitive (not very rigorous) calculation:

\displaystyle \begin{array}{rcl} \hat f(\xi) & = & \int e^{2\pi i<\xi, x>}f(x)dx\\ & \overset{integration \ by \ parts}= &\int \frac{-e^{2\pi i<\xi,x>}}{2\pi i\xi}\nabla f(x)dx\\ & \overset{coarea}= & \int_{\inf |f|}^{\max |f|}\int_{A(t)} \frac{-e^{2\pi i<\xi,x>}}{2\pi i\xi}\nabla f(x)dH^{n-1}(x)dt \end{array}

Now we try to understand the result of this calculation, namely

\displaystyle \hat f(\xi) =\int_{\inf |f|}^{\max |f|}\int_{A(t)} \frac{-e^{2\pi i<\xi,x>}}{2\pi i\xi}\nabla f(x)dH^{n-1}(x)dt \ \ \ \ \ (2)

where {A(t)} is the level set at height {t}, and we write the inner integral as

\displaystyle \pounds(A(t),\xi)=\int_{A(t)} \frac{-e^{2\pi i<\xi,x>}}{2\pi i\xi}\nabla f(x)dH^{n-1}(x) \ \ \ \ \ (3)


The calculation is wrong, but not very far from the truth; the key point is that an exponential sum is now involved. We can use polar coordinates in frequency space and get a very rough intuition of why the uncertainty principle occurs.

Remark 2 Why do we consider the level set decomposition? Because the integral is a linear combination of the integrals over the level sets, so the shape of the level sets is the key point.

The factor {\frac{-e^{2\pi i<\xi,x>}}{2\pi i\xi}} in (2) is a rotation on the level set, a wave correlated with the characteristic function {\chi_{A(t)}} of the level set {A(t)} in the whole space; this is of course an exponential sum.

Now we can begin the final intuitive explanation of the phenomenon of the uncertainty principle. If the density of the function {f} is concentrated on some small part of physical space, then so are the level sets of {f}. But we can say something about the exponential sum {\pounds(A(t),\xi)} in (3) related to the level set, perhaps by a very simple argument with the Hardy-Littlewood circle method or the Parseval identity. Anyway, something similar to this argument should make sense: if the diameter of the level set is small, then we cannot get a decay estimate for {\pounds(A(t),\xi)} when {\xi\rightarrow \infty} along one direction in frequency space. In fact we can say the converse, i.e. it cannot decay very fast.

2. Bernstein’s bound and Heisenberg uncertainty principle



2.1. Motivation and Bernstein’s bound

There are two different forms of Bernstein’s bound; we discuss the first as motivation, and prove the second rigorously. \paragraph{Form 1} Let {A} be an invertible affine map; then for a ball {B}, {A(B)=\epsilon} is an ellipsoid.

\displaystyle \epsilon=\{x\in {\mathbb R}^d|\sum_{j=1}^{d}r_j^{-2}(x_j-y_j)^2\leq 1\} \ \ \ \ \ (4)


By an orthogonal transformation we can make {A} a diagonal matrix, i.e. {A=diag(r_1,...,r_d)}. For {f\in S}, or {f} a smooth bump function, set {f_A=f\circ A^{-1}}, so we have,

\displaystyle \hat f_A(\xi)=\int e^{2\pi i<x,\xi>}\cdot f\circ A^{-1}(x)dx \ \ \ \ \ (5)

We define dual of {\epsilon}, {\epsilon^*:=\{\xi\in {\mathbb R}^d| \sum_{j=1}^d\xi_j^2r_j^2\leq 1\}}.

Remark 3 Why do we use the metric {\xi_j^2r_j^2\leq 1} here, and not the standard inner product {<\xi,x>}? How should we understand this choice?

Proposition 1 We have the following properties:

  1. {f_A\in L^{\infty}\Longrightarrow \|\hat f_A\|_{1}< +\infty}.
  2. {|\hat f_A(\xi)|\leq c_N|\epsilon|(1+|\xi|^2_{\epsilon^*})^{-N}}, where {|\epsilon|} is the volume of the ellipsoid.


Remark 4

\displaystyle |\xi|^2_{\epsilon^*}=\sum_{j=1}^d\xi_j^2r_j^2

This is the square of a norm on {{\mathbb R}^d} related to {\epsilon^*}.

Proof: It suffices to prove 2. Changing variables {x=Ay} (the translation part of {A} only contributes a factor of modulus one),

\displaystyle \begin{array}{rcl} |\hat f_A(\xi)| & = & |\int_{{\mathbb R}^d}e^{2\pi i<x,\xi>}f_A(x)dx|\\ & = & r_1\cdots r_d\,|\hat f(r_1\xi_1,...,r_d\xi_d)|\\ & \overset{integration \ by \ parts}\leq & c_N\, r_1\cdots r_d\,(1+|\xi|_{\epsilon^*}^2)^{-N} \end{array}

and {r_1\cdots r_d} is a constant multiple of {|\epsilon|}.


More quantitatively, we have a rigorous form: \paragraph{Form 2} If {f\in L^2({\mathbb R}^d)}, {supp f\subset B(0,R)}, then it is not possible for {\hat f} to be concentrated on a scale much less than {R^{-1}}.

Proposition 2 (Bernstein’s bound) Suppose {f\in L^2({\mathbb R}^d)}, {supp f\subset B_R(0)}. Then,

\displaystyle \|\partial^{\alpha}\hat f\|_2\leq (2\pi R)^{|\alpha|}\|f\|_2, \forall \alpha. \ \ \ \ \ (6)

Proof: The case {\alpha=0} is trivial by the Parseval identity, which says that on {L^2({\mathbb R}^d)} the Fourier transform is an isometry, {\|f\|_2=\|\hat f\|_2}. For the general case, differentiate under the integral sign, so that {\partial^{\alpha}\hat f} is, up to a factor of modulus one, the Fourier transform of {(2\pi x)^{\alpha}f}, and use the trivial estimate {|x^{\alpha}|\leq R^{|\alpha|}} on {supp f},

\displaystyle \begin{array}{rcl} \|\partial^{\alpha}\hat f\|_2 & \overset{Parseval}= & (2\pi)^{|\alpha|}\|x^{\alpha} f\|_2\\ & \leq & (2\pi R)^{|\alpha|}\|f\|_2 \end{array}


2.2. Heisenberg inequality

Theorem 3 (Heisenberg uncertainty principle) Let {f\in L^2({\mathbb R}^d)}, so {\hat f\in L^2({\mathbb R}^d)} and {\|f\|_2=\|\hat f\|_2}. Then for any {x_0,\xi_0\in {\mathbb R}^d}, in every direction, we have

\displaystyle \|f\|_2^2=\|f\|_2\|\hat f\|_2\leq 4\pi\|(x-x_0)f\|_2\|(\xi-\xi_0)\hat f\|_2 \ \ \ \ \ (7)


Remark 5 We can understand the inequality in the following way. It suffices to prove it for {f\in S} and then use an approximation argument. We have {f\otimes \hat f\in S({\mathbb R}^d\times {\mathbb R}^d)}; define {\|f\otimes \hat f\|_2:= \|f\|_{L^2({\mathbb R}^d)}\cdot \|\hat f\|_{L^2({\mathbb R}^d)}}. Then we have the following:

\displaystyle \|f\otimes \hat f\|_{L^2}\leq 4\pi \|xf\otimes \xi\hat f\|_{L^2} \ \ \ \ \ (8)

Remark 6 The inequality is sharp, the extremizers being precisely the modulated Gaussians, with arbitrary {c}, {\delta>0}, {x_0}, {\xi_0}:

\displaystyle f(x)= c e^{2\pi i\xi_0x}e^{-\pi \delta(x-x_0)^2} \ \ \ \ \ (9)
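As a check on the constant {4\pi} in (8), here is the computation for the simplest extremizer, under the assumptions {d=1}, {c=\delta=1}, {x_0=\xi_0=0}, i.e. {f(x)=e^{-\pi x^2}}, for which {\hat f=f}. It uses only {\int e^{-ax^2}dx=\sqrt{\pi/a}} and {\int x^2e^{-ax^2}dx=\frac{1}{2a}\sqrt{\pi/a}} with {a=2\pi}:

\displaystyle \begin{array}{rcl} \|f\|_2^2 & = & \int e^{-2\pi x^2}dx = \frac{1}{\sqrt 2}\\ \|xf\|_2^2 & = & \int x^2e^{-2\pi x^2}dx = \frac{1}{4\pi}\cdot\frac{1}{\sqrt 2}\\ 4\pi\|xf\|_2\|\xi\hat f\|_2 & = & 4\pi\|xf\|_2^2 = \frac{1}{\sqrt 2} = \|f\|_2^2 \end{array}

So equality holds in this case, which is consistent with the claim that Gaussians are extremizers.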


There are two proof strategies I have tried. I worked on them for several hours without reaching a satisfying answer. There is also a more involved method, which I explained in section 1; I have not tried it yet, and I will try it later. With both strategies I face some difficulties, and I explain why I cannot turn them into a proof.

\paragraph{Strategy 1} First, we can of course work with {f\in S}, by approximation. Then, by Parseval, {\|f\|_2=\|\hat f\|_2} and {\|\partial_x f\|_2=2\pi\|\xi \hat f\|_2} are both true. Then we use Cauchy-Schwarz in our favourite way. The difficulty is that we cannot use an integration by parts argument directly, even after restricting ourselves to monotone radially symmetric functions by a rearrangement inequality argument; this restriction seems reasonable because rearrangement decreases the kinetic energy, as explained in Lieb’s book. But even in the monotone case one ends up with some complicated expression. One can try to use Fubini’s theorem to change the order of integration; it may be possible to work it out this way, but I do not know how. Here is a calculation along these lines,

\displaystyle \begin{array}{rcl} \|xf\|_2\|\xi \hat f\|_2 & \overset{Cauchy-Schwarz}\geq & \frac{1}{2\pi}|\int xf\cdot \partial_x f|\\ & \sim & \int f^2 \end{array}

but at this point we have {\partial_x(xf)=f+x\partial_x f\neq f}. The reasonable calculation is the following,

\displaystyle \partial_x(xf)=f+x\partial_x f \ \ \ \ \ (10)

We want {\partial_x P(x,f) =f}. Then

\displaystyle \begin{array}{rcl} \partial_x P(x,f) & = & f\\ & = & \partial_x(xf)-x\partial_x f\\ & = & \partial_x(xf)-\partial_x(\frac{1}{2}x^2\partial_x f)+\frac{1}{2}x^2\partial_{x}^2f\\ & = & \partial_x(xf)-\partial_x(\frac{1}{2}x^2\partial_x f)+\partial_x(\frac{1}{6}x^3\partial_{x}^2f)-\frac{1}{6}x^3\partial_{x}^3f\\ & ... & \\ & = &\partial_x(\sum_{i=1}^{k}(-1)^{i+1}\frac{x^i}{i!}\partial_{x}^{i-1}f)+(-1)^{k}\frac{x^k}{k!}\partial_{x}^{k}f \end{array}

Seems to be {f=\partial_x(ln(f))}… I do not know.
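For comparison, the Cauchy-Schwarz computation at the start of Strategy 1 does close in the simplest setting. The following is only a sketch, under extra assumptions not made in the text above: {d=1}, {f\in S} real-valued, {x_0=\xi_0=0}. It uses {f\partial_x f=\frac{1}{2}\partial_x(f^2)}, one integration by parts (the boundary term vanishes since {f\in S}), and {\|\partial_x f\|_2=2\pi\|\xi\hat f\|_2}:

\displaystyle \begin{array}{rcl} \|f\|_2^2 & = & \int f^2dx\\ & = & -\int x\partial_x(f^2)dx\\ & = & -2\int xf\cdot \partial_x fdx\\ & \overset{Cauchy-Schwarz}\leq & 2\|xf\|_2\|\partial_x f\|_2\\ & = & 4\pi \|xf\|_2\|\xi\hat f\|_2 \end{array}

General {x_0,\xi_0} would then follow by applying this to a translated and modulated copy of {f}; complex-valued {f} and higher dimensions need a little more care, which is not attempted here.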

\paragraph{Strategy 2} The second strategy is this: in the quantity {\|xf\|_2\|\xi \hat f\|_2} we lose two cones very near {x_0,\xi_0}, and we need to use something extra to make up for them. Maybe an effective argument comes from some geometric inequality.

3. The Amrein-Berthier theorem

Next we investigate the following problem: if {E,F\subset {\mathbb R}^d} are of finite measure, can there be a nonzero {f\in L^2({\mathbb R}^d)} with {supp (f)\subset E} and {supp(\hat f)\subset F}? An argument goes as follows. Observe that:

\displaystyle \chi_{F}\hat f=\hat f \Longrightarrow \chi_{E}(\chi_F \hat f)^{\vee}=f. \ \ \ \ \ (11)

Define {Tf:=\chi_{E}(\chi_F \hat f)^{\vee}}; then {Tf=f}. So we have at least {\|T\|_{2\rightarrow 2}\geq 1}. A direct calculation shows:

\displaystyle \begin{array}{rcl} (Tf)(x) & = & \int e^{2\pi i\xi x}\chi_F(\xi)\hat f(\xi)\chi_E(x)d\xi\\ & = & \int \int e^{2\pi i\xi(x-y)}f(y)\chi_F(\xi)\chi_E(x)dy d\xi\\ & \overset{Fubini}= &\int_{{\mathbb R}^d}\chi_E(x)\chi_F^{\vee}(x-y)f(y)dy \end{array}

So we can define the kernel of {T},

\displaystyle K(x,y)=\chi_E(x)\chi_F^{\vee}(x-y) \ \ \ \ \ (12)


By Fubini, we calculate the Hilbert-Schmidt norm:

\displaystyle \int_{{\mathbb R}^{2d}}|K(x,y)|^2dxdy=|E||F|=\sigma^2<+\infty \ \ \ \ \ (13)
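The Fubini step can be spelled out as follows; this is a sketch that assumes the kernel has the form {K(x,y)=\chi_E(x)\chi_F^{\vee}(x-y)} from (12) and uses Plancherel for {\chi_F}:

\displaystyle \begin{array}{rcl} \int_{{\mathbb R}^{2d}}|K(x,y)|^2dxdy & = & \int_{{\mathbb R}^d}\chi_E(x)(\int_{{\mathbb R}^d}|\chi_F^{\vee}(x-y)|^2dy)dx\\ & = & |E|\cdot\|\chi_F^{\vee}\|_2^2\\ & = & |E|\cdot\|\chi_F\|_2^2\\ & = & |E||F| \end{array}

Since the operator norm is bounded by the Hilbert-Schmidt norm, this gives {\|T\|_{2\rightarrow 2}\leq \sigma}.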


So {T} is a compact operator and its {L^2} operator norm satisfies {\|T\|\leq \min(\sigma,1)}. So if {\sigma<1} then we can conclude that there is no {f\neq 0} as in the original question.

The story is in fact more interesting: the answer to the question is no even for {\sigma\geq 1}, so in all cases. We have the following quantitative theorem:

Theorem 4 Let {E,F} be sets of finite measure in {{\mathbb R}^d}. Then

\displaystyle \|f\|_{L^2({\mathbb R}^d)}\leq c(\|f\|_{L^2(E^c)}+\|\hat f\|_{L^2(F^c)}) \ \ \ \ \ (14)

for some constant {c=c(E,F,d)}.

Remark 7 There is a naive approach to this theorem: the area formula trick, the shape of level sets. Obviously we have:

\displaystyle \|f\|_{L^2({\mathbb R}^d)}\leq \|f\|_{L^2(E)}+\|f\|_{L^2(E^c)} \ \ \ \ \ (15)


The key point is to prove:

\displaystyle \|f\|_{L^2(E)}\leq c(E,F,d)\|\hat f\|_{L^2(F^c)} \ \ \ \ \ (16)


Let us do some further (useless) calculation:

\displaystyle \begin{array}{rcl} \|f\|_{L^2(E)} & = & \|\chi_E \cdot f\|_{L^2({\mathbb R}^d)}\\ & = & \|\widehat {\chi_E\cdot f}\|_{L^2({\mathbb R}^d)}\\ & = & \|\hat \chi_E * \hat{f}\|_{L^2({\mathbb R}^d)} \\ & = & \|\chi_E^{\vee} * f^{\vee}\|_{L^2({\mathbb R}^d)} \end{array}

So it suffices to have:

\displaystyle \|\chi_E^{\vee} * f^{\vee}\|_{L^2({\mathbb R}^d)}\leq c(E,F,d)\|\hat f\|_{L^2(F^c)} \ \ \ \ \ (17)

But there is a counterexample given by a modified, rescaled Gaussian distribution… The point is that the passage from (15) to (16) is too lossy.

Below I give a correct approach, following my idea of level sets, the area formula argument, and discretization.

Proof: The story is the same for a discretized version. We need to point out that if we change the space {{\mathbb R}^d} to {{\mathbb Z}^d}, then everything becomes discrete, and the change can be justified by an approximation argument. What happens then? We have a naive picture in mind, which is:

\displaystyle \delta \rightarrow wave , \ wave \rightarrow \delta

What about the {L^2} norm? It becomes the one given by the standard inner product on {{\mathbb Z}^d}, and the scale is involved, i.e. we have the following basic estimate (Cauchy-Schwarz):

\displaystyle \|\chi_E f\|_1^2\leq \|\chi_E f\|_2^2 \cdot|E| \ \ \ \ \ (18)


Now imagine that the density of {f} is concentrated in a very small area. Then by a cut-off argument we may assume that the support of {f}, {supp f=E}, is very small. Then, using (18), we can conclude that the density of {\hat f} cannot be very concentrated in frequency space. The constant {c(E,F,d)} can be given precisely in this way, but I do not care about it. \Box


4. Logvinenko-Sereda theorem

Next we formulate some results that provide further evidence of the non-concentration property of functions with Fourier support in {B_1}.

4.1. A toy model

Theorem 5 Let {\alpha>0} and suppose that {S\subset {\mathbb R}^d} satisfies,

\displaystyle |S\cap B|<\alpha |B|, \ for \ all \ balls \ B \ of \ radius\ 1. \ \ \ \ \ (19)

If {f\in L^2({\mathbb R}^d)} satisfies {supp(\hat f)\subset B(0,1)} then

\displaystyle \|f\|_{L^2(S)}\leq \delta(\alpha)\|f\|_2 \ \ \ \ \ (20)

where {\delta(\alpha)\rightarrow 0} as {\alpha \rightarrow 0}.

Proof: This is an easy corollary of the argument I gave in the proof of the Amrein-Berthier theorem (Theorem 4). \Box

4.2. A refined version

Theorem 6 Suppose that a measurable set {E\subset {\mathbb R}^d} satisfies the following “thickness” condition: there exists {\gamma\in (0,1)} such that

\displaystyle |E\cap B|>\gamma |B| \ for \ all \ balls \ B \ of \ radius\ R^{-1}. \ \ \ \ \ (21)

where {R>0} is arbitrary but fixed. Assume that {supp(\hat f)\subset B(0,R)}. Then

\displaystyle \|f\|_{L^2({\mathbb R}^d)}\leq C\|f\|_{L^2(E)}. \ \ \ \ \ (22)

where the constant {C} depends only on {d} and {\gamma}.

Remark 8 The proof needs some very good estimates coming from several complex variables.

5. The Malgrange-Ehrenpreis theorem

Theorem 7 Let {\Omega} be a bounded domain in {{\mathbb R}^d} and let {p\neq 0} be a polynomial. Then, for all {g\in L^2(\Omega)}, there exists {f\in L^2(\Omega)} such that {p(D)f=g} in the distributional sense.


Pseudo-differential operators and singular integrals

I already understood this material 3 days ago, but it is a little difficult for me to type the LaTeX…


1. Introduction

There are two spaces in which to understand a function’s behaviour, physical space and frequency space (Why do things go like this? Why is there such a duality?). Namely, we have:

\displaystyle \hat f(\xi)=\int_{{\mathbb R}^n}e^{-2\pi i\xi x}f(x)dx \ \ \ \ \ (1)


The key point is that the waves {e^{2\pi i\xi x}} form a one-parameter group, obtained by rescaling a wave of constant frequency, so they connect multiplication and addition. Basically, {\hat f(\xi)} can be viewed as the correlation of the function with the rescaled wave, and these correlations carry all the information about {f}. A generalization of this observation is wavelet theory.

As is well known, the key idea of the Fourier transform is to view a function as a superposition of waves. A famous theorem of Mikhlin says that a translation-invariant operator {T} on {{\mathbb R}^n} can be represented by a multiplication operator on the Fourier transform side. Translation-invariant means that {h\circ T=T\circ h} for every translation {h}.

At a formal level, consider it in the sense of distributions (compactly supported distributions or tempered distributions are both OK). We have:

\displaystyle T(e^{2\pi ix\xi})=a(\xi)e^{2\pi ix\xi}, \forall \xi \in {\mathbb R}^n \ \ \ \ \ (2)


This means that if we consider {T} as an operator on the space of distributions, {T:S'\rightarrow S'}, then {\forall f\in S},

\displaystyle \int T(e^{2\pi ix\xi})f=\int a(\xi)e^{2\pi ix\xi}f

Since the linear combinations of {e^{2\pi ix\xi}} constitute a dense set, this extends to the whole space of distributions by duality and gives the definition of {T}, i.e.

\displaystyle (Tf)(x)=\int_{{\mathbb R}^n}a(\xi)e^{2\pi ix\xi}\hat f(\xi)d\xi \ \ \ \ \ (3)


Remark 1 {T} is bounded on {L^2({\mathbb R}^n)} when {a} is a bounded function, thanks to the Parseval theorem. When {a} and {b} are bounded functions, the composition of two such operators can be defined, and the symbol of the composition is the product of their symbols, i.e.

\displaystyle T_a\circ T_b(e^{2\pi ix\xi})=b(\xi)a(\xi)e^{2\pi ix\xi} \ \ \ \ \ (4)


Remark 2 For the Parseval theorem, i.e. {\|\hat f\|_2=\|f\|_2}, there are two approaches: heat kernel approximation and discretization.

We wish to investigate the operator given by a multiplier, i.e.

\displaystyle (Tf)(x)=\int_{{\mathbb R}^n}a(x,\xi)e^{2\pi ix\xi}\hat f(\xi)d\xi \ \ \ \ \ (5)

When does it satisfy {\|T\|_{p\rightarrow p}<\infty}?

For intuition, the following calculation is only morally true, not rigorous.

\displaystyle \begin{array}{rcl} \|Tf\|^p_p & = & \int_{{\mathbb R}^d }|\int_{{\mathbb R}^d}a(x,\xi) e^{2\pi ix\xi} \hat f(\xi)d\xi|^pdx \\ & \overset{\exists f}\sim & \int_{{\mathbb R}^d }\int_{{\mathbb R}^d}|a(x,\xi) e^{2\pi ix\xi} \hat f(\xi)|^p d\xi dx \\ & \overset{Fubini}\sim & \int_{{\mathbb R}^d}\int_{{\mathbb R}^d}|\widehat{a(x,\xi)}\hat f(\xi) |^pdxd\xi \\ & \sim & \int_{{\mathbb R}^n}\int_{{\mathbb R}^n}|{a(x,\xi)}^{\vee}*f(\xi)|^pdxd\xi \end{array}


So we need some restriction on { {a(x,\xi)}^{\vee}}, namely {\widehat{a(x,\xi)}}, and hence some decay condition on {|\partial_x^{\alpha}\partial_{\xi}^{\beta}a(x,\xi)|}. Why? Just consider integration by parts for {a(x,\xi)\in S}. Making this intuition rigorous leads us to the definition of the symbol class.

Definition 1 We say {a(x,\xi)} is in the symbol class {S^m} iff

\displaystyle |\partial_x^{\beta}\partial_{\xi}^{\alpha}a(x,\xi)|\leq A_{\alpha,\beta}(1+|\xi|)^{m-|\alpha|} \ \ \ \ \ (6)

for all multi-indices {\alpha,\beta}.

Remark 3

  1. We note that all partial differential operators whose coefficients, together with all their derivatives, are bounded belong to this class. In this particular circumstance, the symbol is a polynomial in {\xi}, essentially the “characteristic polynomial” of the operator.
  2. The general operators of this class have a parallel description in terms of their kernels. That is, in a suitable sense,

    \displaystyle (Tf)(x)=\int_{{\mathbb R}^n}K(x,y)f(y)dy \ \ \ \ \ (7)

    Besides enjoying a cancellation property, {K} is here characterized by differential inequalities “dual” to those for {a(x,\xi)}. In the key case where the order {m=0}, this kernel representation makes {T} a singular integral operator.

  3. The crucial {L^2} estimate, when {m=0}, is a relatively simple consequence of Plancherel’s theorem for the Fourier transform. With this, the {L^p} theory introduced in the previous note is therefore applicable.
  4. The product identity that holds in the translation-invariant case generalizes to the situation treated here as a symbolic calculus for the composition of operators. That is, there is an asymptotic formula for the composition of two such operators, whose main term is the point-wise product of their symbols.
  5. The succeeding terms of the formula are of decreasing orders. These orders measure not only the size of the symbols, but determine also the increasing smoothing properties of the corresponding operators. The smoothing properties are most neatly expressed in terms of the Sobolev space {W_k^p} and the Lipschitz space {\Lambda_{\alpha}}.


2. Pseudo-differential operator

“Freezing principle”: pass from a variable coefficient differential equation to a constant coefficient differential equation by approximation. This divides into 2 steps:

  1. Divide space into small cubes.
  2. Take the average of the coefficients of the differential equation in every cube.

Suppose we are interested in studying the solutions of the classical elliptic second order equation.

\displaystyle (Lu)(x)=\sum a_{ij}(x)\frac{\partial^2 u(x)}{\partial x_i\partial x_j}=f(x) \ \ \ \ \ (8)


where the coefficient matrix {\{a_{ij}(x)\}} is assumed to be real, symmetric, positive definite and smooth in {x}. We would like to find an operator {P} such that

\displaystyle PL=I \ \ \ \ \ (9)


More modestly, we look for a {P} such that {PL=I+E}, where {E} is an error term over which we have good control. To do this, fix an arbitrary point {x_0}, and freeze the operator {L} at {x_0}:

\displaystyle L_{x_0}=\sum a_{ij}(x_0)\frac{\partial^2}{\partial x_i\partial x_j} \ \ \ \ \ (10)


In the Fourier sense ({L^2} sense),

\displaystyle \begin{array}{rcl} L_{x_0}f(x) & = & \int e^{2\pi ix\xi}(\widehat{ \sum a_{ij}(x_0)\frac{\partial^2 f}{\partial x_i\partial x_j}})(\xi) d\xi\\ & = & \int e^{2\pi ix\xi}\int e^{-2\pi i \xi y}\sum a_{ij}(x_0)\frac{\partial^2}{\partial x_i\partial x_j}f(y)dyd\xi\\ & = & \int e^{2\pi ix\xi}(-4 \pi^2)\sum_{i,j}a_{ij}(x_0)\xi_i\xi_j\hat f(\xi)d\xi \end{array}
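
The factor {-4\pi^2\xi_i\xi_j} comes from differentiating a single wave. As a short worked step (assuming the inversion formula {f(x)=\int e^{2\pi ix\xi}\hat f(\xi)d\xi} and differentiation under the integral sign for {f\in S}):

\displaystyle \begin{array}{rcl} \frac{\partial^2}{\partial x_i\partial x_j}e^{2\pi ix\xi} & = & (2\pi i\xi_i)(2\pi i\xi_j)e^{2\pi ix\xi}\\ & = & -4\pi^2\xi_i\xi_je^{2\pi ix\xi}\\ L_{x_0}e^{2\pi ix\xi} & = & (-4\pi^2\sum_{i,j}a_{ij}(x_0)\xi_i\xi_j)e^{2\pi ix\xi} \end{array}

Since {\{a_{ij}(x_0)\}} is positive definite, this symbol vanishes only at {\xi=0}, so it can be inverted away from the origin.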

Remark 4 This remark concerns, morally speaking, the application of the Fourier transform to PDE. Morally, we can only solve problems with linear differential equations (although we can also consider the hyperbolic type). The obstacles to applying the Fourier transform to PDE are:

  1. It only makes sense for the Schwartz class or its dual. This is not the main obstacle; in principle it can be solved by rescaling.
  2. The main obstacle is that it is only compatible with linear differential equations.


Cut-off function: {\eta} vanishes near the origin and equals {1} for large {\xi},

\displaystyle (P_{x_0}f)(x)=\int_{{\mathbb R}^n}(-4\pi^2\sum_{i,j}a_{ij}(x_0)\xi_i\xi_j)^{-1}\eta(\xi)\hat f(\xi)e^{2\pi ix\xi}d\xi. \ \ \ \ \ (11)



\displaystyle L_{x_0}P_{x_0}=I+E_{x_0}.

{E_{x_0}} is actually a smoothing operator, because it is given by convolution with a fixed test function. It is reasonable to expect that, when {x} is near {x_0}, {(Pf)(x)} is well approximated by {(P_{x_0}f)(x)}. This is actually the case: define {(Pf)(x):=(P_xf)(x)}, i.e.

\displaystyle (Pf)(x)=\int_{{\mathbb R}^n}(-4\pi^2\sum_{i,j}a_{ij}(x)\xi_i\xi_j)^{-1}\eta(\xi)\hat f(\xi)e^{2\pi ix\xi}d\xi \ \ \ \ \ (12)


The operator {P} so given is a prototype of a pseudo-differential operator. Moreover, one has {LP=I+E}, where the error operator {E} is “smoothing of order 1”. That this is indeed the case is the main part of the symbolic calculus described.

Definition 2 (symbol class) A function {a(x,\xi)} belongs to {S^m}, and is said to be of order {m}, if {a(x,\xi)} is a {C^{\infty}} function of {(x,\xi)\in {\mathbb R}^n\times {\mathbb R}^n} and satisfies the differential inequality:

\displaystyle |\partial_x^{\beta}\partial_{\xi}^{\alpha}a(x,\xi)|\leq A_{\alpha,\beta}(1+|\xi|)^{m-|\alpha|} \ \ \ \ \ (13)


for all multi-indices {\alpha,\beta}.

Now we turn to the exact meaning of a pseudo-differential operator, i.e. how it acts on functions. Under suitable regularity conditions, for {a\in S^m}, we have {T_a:S\rightarrow S}, where

\displaystyle (T_af)(x)=\int_{{\mathbb R}^n}a(x,\xi)\hat f(\xi)e^{2\pi ix\xi}d\xi \ \ \ \ \ (14)


Remark 5 {T_a:S\rightarrow S} is continuous and for {a_k\rightarrow a} pointwise, {a_k\in S,\forall k\in {\mathbb N}^*}, {T_{a_k}(f)\rightarrow T_a(f)} in {S}.

Expanding {\hat f}, we get:

\displaystyle (T_af)(x)=\int\int a(x,\xi)e^{2\pi i\xi(x-y)}f(y)dyd\xi \ \ \ \ \ (15)


This can diverge, even when {f\in S}. The key point is that we have no control over the second integral. Morally speaking, this phenomenon is a weakness of the Lebesgue integral which would not happen with the Riemann integral, so sometimes we need ideas from the Riemann integral. The problem is settled by multiplying by a cut-off function and taking {\epsilon\rightarrow 0}; the same device occurs in the introduction of the P.V. integral for the Hilbert transform. The precise method to deal with the obstacle is the following: let {a_{\epsilon}(x,\xi)=a(x,\xi)\gamma(\epsilon x,\epsilon \xi)}, where {\gamma} is a smooth compactly supported function with {\gamma(0,0)=1}. If {a\in S^m}, then {a_{\epsilon}\in S^m}, and {T_{a_{\epsilon}}\rightarrow T_a} in the sense:

{\forall f\in S}, {T_{a_{\epsilon}}(f)\rightarrow T_a(f)},

\displaystyle (T_af)(x)=\lim_{\epsilon\rightarrow 0}\int\int a_{\epsilon}(x,\xi)e^{2\pi i\xi(x-y)}f(y)dyd\xi \ \ \ \ \ (16)

We also have:

\displaystyle <T_af,g>=<f,T_a^*g>, \forall f,g\in S. \ \ \ \ \ (17)


Then we have:

\displaystyle (T^*_ag)(y)=\lim_{\epsilon\rightarrow 0}\int\int \bar a_{\epsilon}(x,\xi)e^{2\pi i\xi(y-x)}g(x)dxd\xi \ \ \ \ \ (18)

and {<f,g>} denotes {\int_{{\mathbb R}^n}f(x)\bar g(x)dx}. Thus the pseudo-differential operator {T_a}, initially defined as a mapping from {S} to {S}, extends via the identity (17) to a mapping from the space of tempered distributions {S'} to itself. Notice also that {T_a} is automatically continuous in this space.

3. {L^p} boundedness theorem

We first introduce a powerful tool, called dyadic decomposition,

Lemma 3 (dyadic decomposition) In Euclidean space {{\mathbb R}^n} there exists a function {\phi\in C^{\infty}({\mathbb R}^n)} such that, for all {x\neq 0},

\displaystyle \sum_{i\in {\mathbb Z}}\phi(2^{-i}x)=1 \ \ \ \ \ (19)


and for every {x\neq 0} there are at most two {i\in {\mathbb Z}} such that {\phi(2^{-i}x)\neq 0}; moreover we can choose {\phi} to be radial with {\phi(x) \geq 0,\forall x\in {\mathbb R}^n}.

Remark 6

So for a given multiplier {a(x,\xi)}, we will have {a(x,\xi)=\sum_{i\in {\mathbb Z}}a_i(x,\xi)=\sum_{i\in {\mathbb Z}}\phi(2^{-i}\xi)a(x,\xi)} for {\xi\neq 0}.

Proof: The proof is easy: after rescaling, we just need to observe that there is a bump function satisfying all the conditions. \Box
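
One standard way to realize Lemma 3 (an assumption here, since the proof above only asserts that a suitable bump function exists) is to take a smooth radial cutoff {\psi} equal to 1 on {|x|\leq 1} and 0 on {|x|\geq 2}, and set {\phi(x)=\psi(x)-\psi(2x)}, so that the sum 19 telescopes. The following Python sketch checks this numerically in the radial variable {r=|x|}; the names `smooth_step`, `psi`, `phi` and `dyadic_terms` are illustrative only.

```python
import math

def h(t):
    # exp(-1/t) for t > 0, extended by 0: the usual building block of bump functions
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(t):
    # smooth, nondecreasing, equal to 0 for t <= 0 and to 1 for t >= 1
    return h(t) / (h(t) + h(1.0 - t))

def psi(r):
    # radial cutoff: 1 for r <= 1, 0 for r >= 2
    return 1.0 - smooth_step(r - 1.0)

def phi(r):
    # vanishes outside 1/2 <= r <= 2 and is nonnegative, since psi is nonincreasing
    return psi(r) - psi(2.0 * r)

def dyadic_terms(r, i_min=-60, i_max=60):
    # the values phi(2^{-i} r) for i_min <= i <= i_max
    return [phi(r * 2.0 ** (-i)) for i in range(i_min, i_max + 1)]

if __name__ == "__main__":
    for r in (1e-3, 0.3, 1.0, 1.7, 42.0, 1e4):
        terms = dyadic_terms(r)
        nonzero = sum(1 for v in terms if v != 0.0)
        # the sum should be 1 and at most two terms should be nonzero
        print(r, sum(terms), nonzero)
```

The finite range of {i} is enough for the sampled values of {r}; for {r} extremely close to 0 or extremely large one would have to widen `i_min` and `i_max`.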

Theorem 4 Suppose {a} is a symbol of order 0, i.e. {a\in S^0}. Then the operator {T_a}, initially defined on {S}, extends to a bounded operator from {L^2({\mathbb R}^n)} to itself.

Remark 7 It suffices to show {\|T_a(f)\|_{L^2}\leq A\|f\|_{L^2}, \forall f\in S}; the extension then follows by density and duality.

In fact we can directly prove a more general theorem:

Theorem 5 Let {m:{\mathbb R}^d-\{0\}\rightarrow {\mathbb C}} satisfy, for any multi-index {\gamma} of length {|\gamma|\leq d+2},

\displaystyle |\partial^{\gamma}m(\xi)|\leq B|\xi|^{-|\gamma|}

for all {\xi\neq 0}. Then, for any {1<p<\infty}, there is a constant {C=C(d,p)} such that,

\displaystyle \| (m\hat f)^{\vee}\|_p\leq C(p,d)\|f\|_p \ \ \ \ \ (20)


for all {f\in S}.

Proof: {a\in S^0}, so we have:

\displaystyle |\partial_x^{\beta}\partial_{\xi}^{\alpha}a(x,\xi)|\leq A_{\alpha,\beta}(1+|\xi|)^{-|\alpha|} \ \ \ \ \ (21)

for all multi-indices {\alpha,\beta}. Then we consider the dyadic decomposition: there is a function {\phi} satisfying the condition in 19, and we define {a_i(x,\xi)=\phi(2^{-i}\xi)a(x,\xi)}. Then {a_i} has compact support in {\xi} and {\|a_i\|<\infty}, so {a_i\in L^p({\mathbb R}^n)}, and we have,

\displaystyle \begin{array}{rcl} \|T_{a_i}f\|_p^p & = & \int_{{\mathbb R}^d}|K_i*f(x)|^pdx \\ & = & \int_{{\mathbb R}^d}|\int_{{\mathbb R}^d}K_i(x-y)f(y)dy|^pdx\\ & \leq &\int_{{\mathbb R}^d}\int_{{\mathbb R}^d}|K_i(x-y)f(y)|^pdydx\\ & = & \|K_i\|_p^p\|f\|_p^p \end{array}

{\|K_i\|_p} has a good decay estimate, thanks to {a_i=\phi(2^{-i}\xi)a(x,\xi)\in S^0}. Morally, this estimate is deduced with the same ingredients as “stationary phase”: it comes from combining a “counting points” argument with a rescaling argument. So,

\displaystyle \begin{array}{rcl} \|Tf\|_p & = & \|\sum T_if\|_p\\ & \leq & (\sum \|K_i\|_p^p)\|f\|_p \end{array}

But we have {\sum\|K_i\|_p^p< \infty}, ending the proof. \Box

Remark 8 This method still makes sense if the condition is restricted to:

\displaystyle |\partial_x^{\beta}\partial_{\xi}^{\alpha}a(x,\xi)|\leq A_{\alpha,\beta}(1+|\xi|)^{-|\alpha|}, \forall |\alpha|\leq d+2. \ \ \ \ \ (22)

Where {d} is the dimension of the space, and we could change {2} to {1+\epsilon}.

Remark 9

\displaystyle \widehat{\frac{\partial^2 u}{\partial x_i\partial x_j}}(\xi)=\frac{\xi_i\xi_j}{|\xi|^2}\widehat{\Delta u}(\xi), \ m(\xi)=\frac{\xi_i\xi_j}{|\xi|^2} \ \ \ \ \ (23)

is a counterexample for {p=1,\infty}.

Remark 10 The key point is the estimate

\displaystyle \int|\int e^{2\pi i\xi x}\phi(2^{-j}\xi)m(\xi)d\xi |^pdx \ \ \ \ \ (24)

Correlation of Taylor expansion and wavelet expansion. This is also crucial for the theory of stationary phase.

4. Calculus of symbols

This calculus of symbols implies that there is some algebraic structure on the set of symbols.

Theorem 6 Suppose {a,b} are symbols belonging to {S^{m_1}} and {S^{m_2}} respectively. Then there is a symbol {c} in {S^{m_1+m_2}} so that:

\displaystyle T_c=T_a\circ T_b


\displaystyle c\sim \sum_{\alpha}\frac{(2\pi i)^{-|\alpha|}}{\alpha !}(\partial_{\xi}^{\alpha}a)(\partial_x^{\alpha}b) \ \ \ \ \ (25)

in the sense that,

\displaystyle c-\sum_{|\alpha|<N}\frac{(2\pi i)^{-|\alpha|}}{\alpha !}(\partial_{\xi}^{\alpha}a)(\partial_x^{\alpha}b)\in S^{m_1+m_2-N} \ \ \ \ \ (26)

for all {N>0}.

The following “proof” is not rigorous: we just calculate formally, and we can believe it becomes rigorous through an approximation process. Proof: We assume {a,b} have compact support so that our manipulations are justified. We use the alternate formula 15 to write,

\displaystyle (T_bf)(y)=\int b(y,\xi)e^{2\pi i\xi(y-z)}f(z)dzd\xi \ \ \ \ \ (27)

Then we apply {T_a}, again in the form 15, but here with the variable {\eta} replacing {\xi} in the integration. The result is,

\displaystyle T_a(T_bf)(x)=\int a(x,\eta)b(y,\xi)e^{2\pi i\eta(x-y)}e^{2\pi i\xi(y-z)}f(z)dzd\xi dyd\eta. \ \ \ \ \ (28)

This calculation is easy to derive, but the following is more tricky. Now {e^{2\pi i\eta(x-y)}\cdot e^{2\pi i\xi(y-z)}=e^{2\pi i(x-y)(\eta-\xi)}\cdot e^{2\pi i(x-z)\xi}}, so

\displaystyle T_a(T_bf)(x)=\int c(x,\xi)e^{2\pi i(x-z)\xi}f(z)dz d\xi \ \ \ \ \ (29)


\displaystyle c(x,\xi)=\int a(x,\eta)b(y,\xi)e^{2\pi i(x-y)(\eta-\xi)}dyd\eta \ \ \ \ \ (30)


We can also carry out the integration in the {y}-variable. This leads to the Fourier transform of {b} in that variable, and allows us to rewrite 30 as,

\displaystyle c(x,\xi)=\int a(x,\xi+\eta)\hat b(\eta,\xi)e^{2\pi i x\eta}d\eta. \ \ \ \ \ (31)

With this form in hand, apply Taylor expansion to the symbol {a(x,\xi+\eta)}, i.e.

\displaystyle a(x,\xi+\eta)=\sum_{|\alpha|<N}\frac{1}{\alpha !}\partial_{\xi}^{\alpha}a(x,\xi)\eta^{\alpha}+R_N(x,\xi,\eta) \ \ \ \ \ (32)

with a suitable error term {R_N}. Since

\displaystyle \frac{1}{\alpha !}\int \partial_{\xi}^{\alpha}a(x,\xi)\eta^{\alpha}\hat b(\eta,\xi)e^{2\pi ix\eta}d\eta=\frac{(2\pi i)^{-|\alpha|}}{\alpha !}(\partial_{\xi}^{\alpha}a(x,\xi))(\partial_x^{\alpha}b(x,\xi)), \ \ \ \ \ (33)

we only need to prove that the contribution of {R_N} to 31 lies in {S^{m_1+m_2-N}}, and this is indeed the case, so we get the theorem. \Box

Remark 11 We need to replace {a,b} with {a_{\epsilon},b_{\epsilon}}, where

\displaystyle a_{\epsilon}(x,\xi)=a(x,\xi)\cdot \gamma(\epsilon x,\epsilon \xi), b_{\epsilon}(x,\xi)=b(x,\xi)\cdot \gamma(\epsilon x,\epsilon \xi). \ \ \ \ \ (34)

We note that {a_{\epsilon},b_{\epsilon}} satisfy the same differential inequalities that {a} and {b} do, uniformly in {\epsilon}, {0<\epsilon\leq 1}. Passage to the limit as {\epsilon\rightarrow 0} will then give us our desired result.

5. Estimates in {L^p}, Sobolev, and Lipschitz spaces

We now take up the regularity properties of our pseudo-differential operators as expressed in terms of the standard function spaces. We begin with the {L^p} boundedness of an operator of order {0}.

5.1. {L^p} estimate

Suppose {a} belongs to the symbol class {S^0}. Then, we can express {T=T_a} as

\displaystyle (Tf)(x)=\int K(x,y)f(y)dy=\int K(x,x-y)f(y)dy \ \ \ \ \ (35)

Since {a\in S^0}, we know, by an approximation argument (first working with a cutoff symbol {a_{\epsilon}} of {a}), that,

\displaystyle |K(x,y)|\leq A|x-y|^{-n} \ \ \ \ \ (36)

So the integral converges whenever {f\in S} and {x} is away from the support of {f}. Since we know that {T} is bounded on {L^2({\mathbb R}^n)}, this representation extends to all {f\in L^2({\mathbb R}^n)} for almost every {x\notin supp f}. More generally, we have,

\displaystyle |\partial_{x}^{\alpha}\partial_{y}^{\beta}K(x,y)|\leq A_{\alpha,\beta}|x-y|^{-n-|\alpha|-|\beta|} \ \ \ \ \ (37)

hence {K} satisfies,

\displaystyle \int_{|x-y|\geq 2\delta}|K(x,y)-K(x,\bar y)|dx\leq A, \ if\ |y-\bar y|\leq \delta, all \ \ \delta>0. \ \ \ \ \ (38)

Using the general singular integral theory, we get the following {L^p} estimate.

Theorem 7 Suppose {T_a} is the pseudo-differential operator corresponding to a symbol {a} in {S^0}. Then {T_a} extends to a bounded operator from {L^p({\mathbb R}^n)} to itself, for {1<p<\infty}.

5.2. Sobolev spaces

We first recall the definition of the Sobolev spaces {W_k^p}, where {k} is a positive integer. A function {f} belongs to {W_k^p({\mathbb R}^n)} if {f\in L^p({\mathbb R}^n)} and the partial derivatives {\partial_x^{\alpha}f}, taken in the sense of distribution, belong to {L^p({\mathbb R}^n)}, whenever {0\leq |\alpha|\leq k}. The norm in {W_k^p} is given by,

\displaystyle \|f\|_{W_k^p}=\sum_{|\alpha|\leq k}\|\partial_x^{\alpha}f\|_{L^p} \ \ \ \ \ (39)

The following result is a direct corollary of Theorem 7.

Theorem 8 Suppose {T_a} is a pseudo-differential operator whose symbol {a} belongs to {S^m}. If {m} is an integer and {k\geq m}, then {T_a} is a bounded mapping from {W_k^p} to {W_{k-m}^p}, whenever {1<p<\infty}.

Remark 12 This theorem remains valid for arbitrary real {k,m}.

5.3. Lipschitz spaces

Theorem 9 Suppose {a} is a symbol in {S^m}. Then the operator {T_a} is a bounded mapping from {\Lambda_{\gamma}} to {\Lambda_{\gamma-m}}, whenever {\gamma>m}.

Lemma 10 Suppose the symbol {a} belongs to {S^m}, and define {T_{a_j}=T_a\Delta_j}. Then, as operators from {L^{\infty}({\mathbb R}^n)} to itself, the {T_{a_j}} have norms that satisfy

\displaystyle \|T_{a_j}\|\leq A2^{jm} \ \ \ \ \ (40)

We shall now point out a very simple but useful alternative characterization of {\Lambda_{\gamma}}. This is in terms of approximation by smooth functions; it is also closely connected with the definition of the {\Lambda_{\gamma}} spaces as intermediate spaces, using the “real” method of interpolation.

Corollary 11 A function {f} belongs to {\Lambda_{\gamma}} if and only if there is a decomposition,

\displaystyle f=\sum_{j=0}^{\infty}f_j \ \ \ \ \ (41)

with {\|\partial_x^{\alpha}f_j\|_{L^{\infty}}\leq A2^{-j\gamma}\cdot 2^{j|\alpha|}}, for all {0\leq |\alpha|\leq l}, where {l} is the smallest integer {>\gamma}.

When {f\in \Lambda_{\gamma}}, the argument proving Lemma 10, with {T_a=I}, {f_j=F_j=\Delta_j(f)}, gives the required estimate for the {f_j}.

A second consequence of Theorem 9 is the following:

Corollary 12 The operator {(I-\Delta)^{\frac{m}{2}}} gives an isomorphism from {\Lambda_{\gamma}} to {\Lambda_{\gamma-m}}, whenever {\gamma>m}.

This is clear because {(I-\Delta)^{\frac{m}{2}}} is continuous from {\Lambda_{\gamma}} to {\Lambda_{\gamma-m}}, and its inverse, {(I-\Delta)^{\frac{-m}{2}}}, is continuous from {\Lambda_{\gamma-m}} to {\Lambda_{\gamma}}.

A glimpse to the general theory

1. Introduction

We have discussed a very basic result on singular integrals: under an additional condition, namely boundedness from {L^q} to {L^q}, the interpolation theorem shows that we only need to establish the weak {1-1} bound in order to obtain the {p-p} bound of {T} for all {1< p< q }.

The category of singular integrals is very general; in fact the singular integrals we are interested in always come equipped with more special structure. We discuss the following 3 types of results, which will play a central role in this series of notes.

  1. Approximation of the identity.
  2. Singular integral with {L^2} bounded translation invariant operator.
  3. Maximal function, singular integral, and square functions.

The underlying object in all three cases is some special singular integral. In the first case it looks like {Tf=\sup_{t>0} |\Phi_{t}*f|}, which, among other things, has a close relationship with the maximal operator {Mf}. This is discussed in Section 2. For singular integrals with an {L^2} bound, the Fourier transform, or its discrete version, Fourier series, is naturally involved. There is a “representation theorem”, similar in spirit to the Riesz representation theorem, which says, roughly speaking, that an {L^2} bounded operator which is also translation invariant coincides with what we imagine: the operator must behave as a Fourier multiplier. This is the content of the famous Mikhlin multiplier theorem, and we discuss some technical difficulties in the process of establishing such a theorem; this is the content of Section 3. At last we discuss some deep relationships between three basic underlying intuitions and objects in harmonic analysis: the maximal function, singular integrals, and square functions. They can all be understood as tools for understanding the various complications emerging in singular integrals, and there are definitely some common points. This is the theme of Section 4. Of course there are some further topics which are also interesting, but I do not want to discuss them here, maybe somewhere else.

2. Approximation of the identity

As the first topic, we discuss the approximation of the identity, which plays a central role in understanding solutions of PDE. Why? I think a key point is that this tool carries a lot of information about the scaling of the space. As is well known, analysis can roughly be divided into two parts, “hard analysis” and “soft analysis”; approximation of the identity supplies a way to transfer a result from the “hard analysis” side to the “soft analysis” side and back. It shows its whole power when combined with the following dominated convergence theorem:

Theorem 1 (DCT)

Let {\{f_n\}_{n=1}^{\infty}} be a sequence of measurable functions on a measure space {(X,\Sigma,\mu)} with {f_n \rightarrow f} for a.e. {x\in X}, and suppose {\{f_n\}_{n=1}^{\infty}} satisfies a domination condition, i.e. we can find an integrable function {g\in L^{1}(X)} such that {|f_n(x)|\leq |g(x)|} for a.e. {x\in X} and all {n\in {\mathbb N}^*}. Then,

\displaystyle \lim_{n\rightarrow \infty} \int_{X}f_n(x)d\mu= \int_{X} f(x)d\mu \ \ \ \ \ (1)


In fact we have even stronger,

\displaystyle \lim_{n\rightarrow \infty} \int_{X}|f_n(x)-f(x)|d\mu=0 \ \ \ \ \ (2)


This is a standard theorem in real analysis; we give the proof.

Proof: {f} is the pointwise limit of {f_n}, so {f} is measurable and also dominated by {g}, so by the triangle inequality we have:

\displaystyle |f-f_n|\leq 2|g|

Then 1 follows from 2, since {|\int_X f_n d\mu-\int_X fd\mu|\leq \int_X|f_n-f|d\mu}. For the more subtle result 2, we use the reverse Fatou lemma, which applies because {|f_n-f|} is dominated by the integrable function {2|g|}; roughly speaking we have,

\displaystyle \limsup_{n\rightarrow \infty}\int_{X}|f_n-f|\leq \int_{X}\limsup_{n\rightarrow \infty}|f_n-f|=0

The key point is that the first inequality above uses the reverse Fatou lemma. \Box

Now we discuss the main result on approximation of the identity. First we need to define what an approximation of the identity is. A key ingredient is scaling: given a function {\Phi}, we consider {\Phi_t(x)=t^{-n}\Phi(\frac{x}{t})}, and we wish that,

\displaystyle \lim_{t\rightarrow 0}(f*\Phi_t(x))=f(x), for a.e. x\ \in {\mathbb R}^n \ \ \ \ \ (3)


whenever {f\in L^p, 1\leq p\leq \infty}. Some technical assumptions are needed to make this intuition rigorous, which leads to the following definition.

Definition 2 (Approximation of the identity) Suppose {\Phi} is a fixed function on {{\mathbb R}^n} that is suitably small at infinity (has a good enough decay rate), for example,

\displaystyle |\Phi(x)|\leq A(1+|x|)^{-n-\epsilon} \ \ \ \ \ (4)


Then we define {\{\Phi_t:\Phi_t(x)=t^{-n}\Phi(\frac{x}{t})\}} to be an approximation of the identity.
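
To see the scaling in Definition 2 at work, here is a small numerical sketch in dimension {n=1}. It assumes the Gaussian {\Phi(x)=(4\pi)^{-1/2}e^{-x^2/4}} (which has integral 1, a normalization needed for the limit 3) and the test function {f(x)=1/(1+x^2)}; both choices, and the helper names, are only for illustration. It approximates {(f*\Phi_t)(x)=\int f(x-tu)\Phi(u)du} by a midpoint rule and shows the value approaching {f(x)} as {t\rightarrow 0}.

```python
import math

def Phi(u):
    # Gaussian with integral 1 (dimension n = 1)
    return math.exp(-u * u / 4.0) / math.sqrt(4.0 * math.pi)

def f(x):
    # bounded continuous test function
    return 1.0 / (1.0 + x * x)

def convolve_at(x, t, half_width=30.0, steps=60000):
    # (f * Phi_t)(x) = int f(x - t u) Phi(u) du,
    # approximated by the midpoint rule on [-half_width, half_width]
    du = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        u = -half_width + (k + 0.5) * du
        total += f(x - t * u) * Phi(u)
    return total * du

if __name__ == "__main__":
    x = 0.0
    for t in (1.0, 0.3, 0.1, 0.03):
        # the error |(f * Phi_t)(x) - f(x)| should shrink as t decreases
        print(t, abs(convolve_at(x, t) - f(x)))
```

Writing the convolution in the rescaled variable {u} keeps the integration window independent of {t}, which is exactly the scaling built into {\Phi_t}.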

The key theorem is the following, relating the approximation of the identity to the maximal operator.

Theorem 3

\displaystyle \sup_{t>0}|(\Phi_t*f)(x)|\leq c_{\Phi}Mf(x) \ \ \ \ \ (5)


For the heat kernel, things are more subtle.

Theorem 4 [Heat kernel estimate]

\displaystyle \|f-e^{t\Delta}f\|_2\leq \|\nabla f\|_2\sqrt{t} \ \ \ \ \ (6)


Remark 1 I learned this theorem from Lieb’s book. Theorem 4 combined with the Plancherel theorem can be used to establish the Sobolev inequality, at least for the index {p=2}.
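
With the Fourier convention used in these notes (the factor {e^{2\pi i\xi x}}), Plancherel reduces 6 to the pointwise bound {1-e^{-s}\leq \sqrt{s}} with {s=4\pi^2|\xi|^2t}, because {\widehat{e^{t\Delta}f}(\xi)=e^{-4\pi^2|\xi|^2t}\hat f(\xi)} and {|\widehat{\nabla f}(\xi)|=2\pi|\xi||\hat f(\xi)|}. This reduction is one way to read the estimate, not something spelled out above. The Python sketch below only checks the scalar inequality on a logarithmic grid; the function names are illustrative.

```python
import math

def multiplier_gap(s):
    # sqrt(s) - (1 - exp(-s)); a nonnegative value confirms 1 - exp(-s) <= sqrt(s)
    one_minus_exp = -math.expm1(-s)  # accurate form of 1 - exp(-s) for small s
    return math.sqrt(s) - one_minus_exp

def check_on_grid(lo_exp=-12, hi_exp=6, per_decade=50):
    # smallest gap over a logarithmic grid of s in [10^lo_exp, 10^hi_exp]
    worst = float("inf")
    for k in range((hi_exp - lo_exp) * per_decade + 1):
        s = 10.0 ** (lo_exp + k / per_decade)
        worst = min(worst, multiplier_gap(s))
    return worst

if __name__ == "__main__":
    # a nonnegative result means the inequality held at every grid point
    print(check_on_grid())
```

The inequality itself follows from {1-e^{-s}\leq \min(1,s)\leq \sqrt{s}}, so the grid check is only a sanity test.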

There are 3 ingredients which could be useful.

  1. The power of the rearrangement inequality in the approximation of the identity operator. We could consider the relationship between {f*\Phi_t} and {f*\overline \Phi_t}, where {\overline \Phi} is constructed by averaging {\Phi} over the level sets along the foliation given by scaling. Intuitively, some monotonicity property seems to emerge naturally.
  2. There is a discretized model, i.e. the toy model on a graph, which we may think of as correlation between particles; the key point is that the rescaling deformation could be replaced by a semigroup or renormalization property.
  3. We consider the more general case where there is not only one {\Phi} but a family of them, i.e. {\Phi_k, k\in A}; this will involve some amenability theory, I think.

We give two of the original and most important examples. First, if

\displaystyle \Phi(x)=c_n(1+|x|^2)^{\frac{-(n+1)}{2}}


\displaystyle c_n=\frac{\Gamma(\frac{n+1}{2})}{\pi^{\frac{n+1}{2}}}

then {\Phi_t(x)} is the Poisson kernel, and,

\displaystyle u(x,t)=(f*\Phi_t)(x)

gives the solution of the Dirichlet problem for the upper half space,

\displaystyle {\mathbb R}^{n+1}_{+}=\{(x,t):x\in {\mathbb R}^n,t>0\}


\displaystyle \Delta u=(\frac{\partial^2}{\partial t^2}+ \sum_{j=1}^n\frac{\partial^2}{\partial x_j^2})u(x,t)=0,\ u(x,0)\equiv f(x) \ \ \ \ \ (7)


The second example is the Gaussian kernel,

\displaystyle \Phi(x)=(4\pi)^{-\frac{n}{2}}e^{-\frac{|x|^2}{4}}.

This time, if {u(x,t)=(f*\Phi_{t^{\frac{1}{2}}})(x)}, then {u} is a solution of the heat equation,

\displaystyle (\frac{\partial}{\partial t}- \sum_{j=1}^n\frac{\partial^2}{\partial x_j^2})u(x,t)=0,\ u(x,0)\equiv f(x) \ \ \ \ \ (8)



3. Singular integral with {L^2} bounded translation invariant operator

The main result proved in the last note about singular integrals is a conditional one, guaranteeing boundedness on {L^p} for a range {1<p\leq q}, on the assumption that boundedness on {L^q} is already known; the most important instance of this occurs when {q=2}. In keeping with this, we consider bounded linear transformations {T} from {L^2({\mathbb R}^n)} to itself that commute with translations. As is well known, such operators are characterized by the existence of a bounded function {m} on {{\mathbb R}^n} (the “multiplier”), so that {T} can be realized as,

\displaystyle \widehat{Tf}(\xi)=m(\xi)\widehat f(\xi) \ \ \ \ \ (9)

where {\widehat{\ }} denotes the Fourier transform. Alternatively, at least on test functions {f\in S}, {T} can be realized in terms of convolution with a kernel {K},

\displaystyle Tf=f*K \ \ \ \ \ (10)


where {K} is the distribution given by {\hat K=m}. We shall now examine how the conditional theorem on singular integrals leads to results for this type of operator. Roughly speaking, boundedness on {L^2} is now known; as a technical condition, we need to assume that the distribution {K} agrees away from the origin with a function that is locally integrable away from the origin; in this case we denote the function by {K(x)}. Then 10 implies that,

\displaystyle Tf(x)=\int K(x-y)f(y)dy,\ for \ a.e. x\notin supp f. \ \ \ \ \ (11)

whenever {f} is in {L^2} and {f} has compact support. This is the representation of the singular integral in the present context. Next, the crucial Hormander condition is then equivalent to,

\displaystyle \int_{|x|\geq c|y|}|K(x-y)-K(x)|dx\leq A \ \ \ \ \ (12)


for all {y\neq 0}, where {c>1}. In this case, the condition 12 can be understood further; in fact,

Lemma 5

\displaystyle |(\frac{\partial}{\partial x})^{\alpha}K(x)|\leq A_{\alpha}|x|^{-n-|\alpha|},\ for\ all \ \ \alpha \ \ \ \ \ (13)


or its weaker form, (here {\gamma>0} is fixed )

\displaystyle |K(x-y)-K(x)|\leq A\frac{|y|^{\gamma}}{|x|^{n+\gamma}}, whenever \ |x|\geq c|y|. \ \ \ \ \ (14)

implies the Hormander condition 12.

Proof: By the mean value theorem, 13 with {|\alpha|=1} gives 14 with {\gamma=1}. Integrating 14 over {|x|\geq c|y|} in polar coordinates gives 12. \Box
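
As a concrete instance, assume the one-dimensional kernel {K(x)=1/x} (the Hilbert transform kernel up to a constant; this example is not taken from the text above). It satisfies 13 with {n=1}, and for {c=2} the integral in 12 can be computed by hand: it equals {\ln 2+\ln\frac{3}{2}=\ln 3} for every {y\neq 0}. The Python sketch below confirms this numerically; the helper name and the truncation parameters are illustrative.

```python
import math

def hormander_integral(y, c=2.0, decades=8.0, steps=40000):
    # int_{|x| >= c|y|} |K(x - y) - K(x)| dx for K(x) = 1/x,
    # truncated at |x| = c|y| * 10^decades.
    # The substitution x = +/- c|y| e^s turns each half-line into a
    # midpoint rule in s, with dx = |x| ds.
    r0 = c * abs(y)
    s_max = decades * math.log(10.0)
    ds = s_max / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * ds
        r = r0 * math.exp(s)
        for x in (r, -r):
            total += abs(1.0 / (x - y) - 1.0 / x) * r * ds
    return total

if __name__ == "__main__":
    for y in (0.1, 1.0, -3.0):
        # each value should be close to ln 3, independently of y
        print(y, hormander_integral(y), math.log(3.0))
```

That the value does not depend on {y} reflects the scaling invariance of the condition 12 for homogeneous kernels of degree {-n}.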

So now the key question is: how do kernels {K} satisfying such conditions come about? It turns out that, roughly speaking, such conditions on {K} have equivalent versions when stated in terms of the Fourier transform of {K}, namely the multiplier {m}. This transfers the difficulties from physical space to frequency space. In a future note, we will find a proof of the following theorem:

Theorem 6 For {m=\hat K}.

If we assume that,

\displaystyle |(\frac{\partial}{\partial \xi})^{\alpha}m(\xi)|\leq A'_{\alpha}|\xi|^{-|\alpha|} \ \ \ \ \ (15)

holds for all {\alpha}, then {K} satisfies 13 for all {\alpha}.

If we assume that {m} satisfies the above inequality for all {0\leq |\alpha| \leq l}, where {l} is the smallest integer {>\frac{n}{2}}, then {K} satisfies 12.

Remark 2 Multipliers {m} satisfying the condition in the second part of Theorem 6 are called Marcinkiewicz multipliers.


4. Maximal function, singular integral, and square functions.


Calderon-Zygmund theory of singular integrals.

1. Calderon-Zygmund decomposition

The Calderon-Zygmund decomposition is a key step in the real variable analysis of singular integrals. The idea behind this decomposition is that it is often useful to split an arbitrary integrable function into its “small” and “large” parts, and then use different techniques to analyze each part.

The scheme is roughly as follows. Given a function { f} and an altitude { \alpha}, we write { f=g+b}, where { |g|} is pointwise bounded by a constant multiple of {\alpha}. While { b} is large, it does enjoy two redeeming features: it is supported in a set of reasonably small measure, and its mean value is zero on each of the balls that constitute its support. To obtain the decomposition { f=g+b}, one might be tempted to “cut” { f} at the height { \alpha}; however, this is not what works. Instead, one bases the decomposition on the set where the maximal function of { f} has height { \alpha}.

Theorem 1 (Calderon-Zygmund decomposition)

Suppose we are given a function { f\in L^1} and a positive number { \alpha}, with {\alpha>\frac{1}{\mu(R^n)}\int_{R^n}|f|d\mu}. Then there exists a decomposition of { f}, {f=g+b}, with { b=\sum_{k}b_k}, and a sequence of balls {\{B_k^*\}}, so that,

  1. { |g(x)|\leq c\alpha}, for a.e. { x}.
  2. Each { b_k} is supported in {B_k^*}, { \int|b_k(x)|d\mu(x)\leq c\alpha\mu(B_k^*)}, and { \int b_k(x)d\mu(x)=0}.
  3. { \sum_k\mu(B_k^*)\leq \frac{c}{\alpha}\int|f(x)|d\mu(x)}.


Before proving this theorem, I first explain the geometric intuition for why it could be true. Roughly speaking, it is based on cutting the function into two parts, the part with high altitude and the part with low altitude, and extending the high-altitude part so that the extension satisfies conditions 2 and 3.

Proof: In fact this decomposition has a good geometric explanation: we just separate the part {\{x: |f(x)|>\alpha\}} and extend it carefully so that it behaves like a union of balls, to satisfy the special conditions on this part. \Box
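
The following Python sketch carries out a dyadic variant of the decomposition for a function sampled on {2^k} equal cells of {[0,1)}. Note that it uses a stopping-time selection of dyadic intervals instead of the balls and the maximal function of Theorem 1, so it is a simplified stand-in rather than the construction behind the theorem, and all names in it are hypothetical. In this variant the constants are explicit: {|g|\leq 2\alpha}, each {b_k} has mean zero, and the selected intervals have total length at most {\frac{1}{\alpha}\int|f|}.

```python
def cz_decompose(f, alpha):
    """Dyadic Calderon-Zygmund decomposition of samples f on 2^k equal cells of [0, 1)."""
    n = len(f)
    if n == 0 or n & (n - 1):
        raise ValueError("len(f) must be a power of two")
    if sum(abs(v) for v in f) / n > alpha:
        raise ValueError("alpha must be at least the mean of |f|")
    bad = []  # selected dyadic intervals [lo, hi), stored as index ranges

    def visit(lo, hi):
        # invariant: the mean of |f| over [lo, hi) is at most alpha
        if hi - lo == 1:
            return
        mid = (lo + hi) // 2
        for a, b in ((lo, mid), (mid, hi)):
            if sum(abs(v) for v in f[a:b]) / (b - a) > alpha:
                bad.append((a, b))  # stop at the first interval whose mean exceeds alpha
            else:
                visit(a, b)

    visit(0, n)
    g = list(f)
    for a, b in bad:
        mean = sum(f[a:b]) / (b - a)
        for i in range(a, b):
            g[i] = mean  # the good part is the average of f on each selected interval
    b_part = [fi - gi for fi, gi in zip(f, g)]
    return g, b_part, bad

if __name__ == "__main__":
    f = [0.1, -0.2, 0.0, 0.3, 5.0, 4.0, 0.1, 0.0,
         0.2, 0.1, -0.1, 0.0, 0.0, 3.0, 0.1, 0.2]
    g, b_part, bad = cz_decompose(f, alpha=1.0)
    print("selected intervals:", bad)
    print("max |g|:", max(abs(v) for v in g))
```

The factor 2 in {|g|\leq 2\alpha} comes from the parent of a selected interval: its mean is at most {\alpha} and it is only twice as long.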

Remark 1 A Calderon-Zygmund decomposition for {L^p} functions was done in Charlie Fefferman’s thesis; see Section II of ams.org/mathscinet-getitem?mr=257819. One can also find this in Loukas Grafakos’s Classical Fourier Analysis, page 303, exercise 4.3.8. The question is broken up into parts that should be easy to handle.

Several people have considered this question. An excellent paper that comes to mind is Anthony Carbery’s Variants of the Calderon–Zygmund theory for { L^p}-spaces which appeared in Revista Matematica Iberoamericana, Volume 2, Number 4 in 1986. There are also several useful references that appear in Carbery’s paper.

Remark 2 We could also consider a variant of the Calderon-Zygmund decomposition, for instance one equipped with a nontrivial weight function { w}, or find some different way to decompose for some special purpose.

Remark 3 Considering a suitable decomposition of the physical space, or even of both the physical space and the frequency space, in order to gain some reasonable estimate is a fundamental philosophy in harmonic analysis. Besides the Calderon-Zygmund decomposition, there are:

Whitney decomposition. This is an important trick in the proof of the Fefferman-Stein restriction theorem and in differential topology.

Wave packet decomposition. This decomposition underlies the proof of Carleson’s theorem (this is more explicit in Fefferman’s proof than in Carleson’s original proof), Lacey and Thiele’s proof of the boundedness of the bilinear Hilbert transform, as well as a host of follow-up work in multilinear harmonic analysis. The idea of the wave packet decomposition is to decompose a function/operator in terms of an overdetermined basis. This allows one to preserve symmetries (such as modulation symmetries) that aren’t preserved by a classical Calderon-Zygmund decomposition (which endows the frequency with a distinguished role). One might consider using a wave packet decomposition if one is working with an operator that has a modulation symmetry. This is discussed in more detail in Tao’s blog post on the trilinear Hilbert transform.

Polynomial decomposition. The application of polynomial decomposition to harmonic analysis is more recent, and its full potential still seems unclear. Applications include Dvir’s proof of the finite field Kakeya conjecture, Guth’s proof of the endpoint multilinear Kakeya conjecture (and, indirectly, the Bourgain-Guth restriction theorems), Katz and Guth’s proof of the joints problem and Erdos distance problem, among many other results. Generally, the idea behind the polynomial decomposition is to partition a subset of a vector space over a field into a finite number of cells each of which contains roughly the same fraction of the original set. One further wishes that no low degree algebraic variety can intersect too many of the cells. In Euclidean space, the polynomial ham sandwich decomposition does exactly this. This allows one to, for instance, control linear (or, more generally, `low algebraic degree’) interactions between points in distinct cells. This has so far proven the most useful in incidence-type problems, but many problems in harmonic analysis, thanks to the translation symmetry of the Fourier transform, are inextricably linked with such incidence-type problems. See (again) Tao’s survey of this topic for a more detailed account.


2. Singular integrals

With the Calderon-Zygmund decomposition in hand, we now prove a conditional boundedness result for singular integrals.

The singular integrals one is interested in are operators { T}, expressible in the form

\displaystyle (Tf)(x)=\int_{R^n}K(x,y)f(y)d\mu(y) \ \ \ \ \ (1)


where the kernel { K} is singular near { x=y}, and so the expression is meaningful only if { K} is treated as a distribution or in some limiting sense. Which particular regularization of { (Tf)(x)} is appropriate depends much on the context, and a complete treatment of the issues thereby raised would take us quite far afield.

Let us limit ourselves to two closely related ways of dealing with the questions concerning the definability of the operator. One is to prove estimates for the (dense) subspace where the operator is initially defined. The other is to regularize the given operator by replacing it with a suitable family, and to prove uniform estimates for this family. A similar idea occurs in spectral geometry: when we wish to investigate the spectrum of some operator, we consider some deformation, and so reduce to controlling the spectrum of a series of parametrices, for example, considering the wave kernel or heat kernel rather than the Poisson kernel itself. Common to both methods is an a priori approach: we assume some additional properties of the kernel, but then prove estimates that are independent of these “regularity” properties.

We now carry out the first approach in detail. There will be two kinds of assumptions made about the operator. The first is quantitative: we assume that we are given a bound { A}, so that the operator { T} is defined and bounded on { L^q} with norm { A}; that is,

\displaystyle \|T(f)\|_q\leq A\|f\|_q, \forall f, f\in L^q \ \ \ \ \ (2)


Moreover, we assume that there is associated to { T} a measurable function { K} (that plays the role of its kernel), so that for the same constant { A} and some constant { c>1},

\displaystyle \int_{R^n-B(y,c\delta)}|K(x,y)-K(x,\bar y)|d\mu(x)\leq A, \forall \bar y\in B(y,\delta)  \ \ \ \ \ (3)


for all { y\in R^n, \delta>0}.

The further regularity assumption on the kernel { K} is that for each { f} in {L^q} that has compact support, the integral 1 converges absolutely for almost all { x } in the complement of the support of { f}, and that equality 1 holds for these { x}.

Theorem 2 (Boundedness of singular integrals under conditions)

Under the assumptions 2 and 3 made above on { T} and { K}, the operator { T} is bounded in { L^p} norm on { L^p\cap L^q}, when { 1<p<q}. More precisely,

\displaystyle \|T(f)\|_p\leq A_p\|f\|_p

for { f\in L^p\cap L^q} with { 1<p<q}, where the bound { A_p} depends only on the constant { A} appearing in 2 and 3 and on { p}, but not on the assumed regularity of { K}, or on { f}.



Now let us begin to prove the conditional theorem. The key point is to exploit the fact that {T} is already a bounded operator from {L^q} to {L^q}. That is, it is already assumed that {\exists A>0} such that {\forall f\in L^q} we have {\|T(f)\|_q\leq A\|f\|_q}. Now let us look at the singular integral expression:

\displaystyle (Tf)(x)=\int_{R^n}K(x,y)f(y)d\mu(y). \ \ \ \ \ (4)



The key point is to prove that the mapping {f\rightarrow T(f)} is of weak-type {1-1}; that is,

\displaystyle \mu\{x:|Tf(x)|>\alpha\}\leq \frac{A'}{\alpha}\int |f|d\mu. \ \ \ \ \ (5)


Once we establish 5, the theorem follows by interpolation. Now we apply Theorem 1 to {f} to get {f=g+b}. We claim {g,b \in L^q}: indeed, write {R^n= A\amalg B} with {B=\cup_{k}B_k}, so that {g=\chi_A g+\chi_{B}g, b=\chi_A b+\chi_{B} b}; by the triangle inequality and {f=g+b}, to prove {g,b \in L^q} we only need to prove {\chi_A g, \chi_B g, \chi_A b, \chi_B b\in L^q}, and this is easy to prove.

Now that we know {g,b\in L^q}, we split the difficulty of establishing the weak 1-1 bound for {f} into that of establishing the weak 1-1 bound for {g} and for {b}, i.e.

\displaystyle \mu\{x:|Tf(x)|>\alpha\}\leq \mu \{x:|Tg(x)|>\frac{\alpha}{2}\} +\mu\{x:|Tb(x)|>\frac{\alpha}{2}\} \ \ \ \ \ (6)

Replacing {\alpha} by {2\alpha} only changes the constants, so below we keep writing {\alpha} for the level.


For {g}, suppose this weak 1-1 bound is not true, so that

\displaystyle \mu \{x:|Tg(x)|>\alpha\}\geq \frac{A'}{\alpha}\int |g|d\mu \ \ \ \ \ (7)


Since {|g|\leq c\alpha} a.e., we have the trivial estimate {\|g\|_q^q \leq (c\alpha)^{q-1}\|g\|_1}. Combining these two estimates with the {L^q} bound 2 and Chebyshev's inequality, we have:

\displaystyle (c\alpha)^{q-1}\|g\|_1\geq \|g\|^q_q\geq A^{-q}\|Tg\|^q_q \geq A^{-q}\alpha^q\mu\{x:|Tg(x)|>\alpha\}\geq A^{-q}A'\alpha^{q-1} \|g\|_1 \ \ \ \ \ (8)


Comparing the left and the right sides of 8 leads to a contradiction as soon as {A'>A^qc^{q-1}}, so 7 fails and the weak 1-1 bound for {g} follows. For {b}, things are more complicated and really involve the structure of the singular integral. The key point is controlling {K(x,y)} near the diagonal. We warm up with a more refined decomposition {b=\sum b_k}, {\forall k, b_k=b\cdot \chi_{B_k}}. For a large constant {c>>1} to be chosen later, define {B^*_k=c B_k}. We know {b\in L^q}, but the really difficult point is how to combine the following 5 conditions to reach a contradiction:

  1. {\|Tb\|_q\leq A\|b\|_q}.
  2. The properties coming from the Calderon-Zygmund decomposition: {\int_{B_k}|b|\leq c\alpha \mu(B_k),\forall k} and {\int_{B_k}b=0}.
  3. The Hormander condition 3, {\int_{R^n-B(y,c\delta)}|K(x,y)-K(x,\bar y)|d\mu(x)\leq A, \forall \bar y\in B(y,\delta)}.
  4. The failure of the weak 1-1 bound for {b}, {\mu\{x:|Tb(x)|>\alpha\}> \frac{A'}{\alpha}\|b\|_1}.
  5. The structure {Tb(x)=\int_{R^n} K(x,y)b(y)dy}.

The first step is to break {b} into the {b_k} and reduce the case of several balls to the case of only one ball; this can be done by the triangle inequality, or perhaps directly, but in any case it is not difficult.

Then things become interesting. We focus on {b_1} and let {\bar y} be the center of {B_1}. We split the output as {Tb_1=\chi_{B_1^*}Tb_1+\chi_{{\mathbb R}^n-B_1^*}Tb_1}. Thanks to the Hormander condition 3 we have good control on the second piece; in fact we can prove a weak 1-1 bound for it,

\displaystyle \mu\{x\in {\mathbb R}^n-B_1^*:|Tb_1(x)|>\alpha\}\leq \frac{A}{\alpha}\|b_1\|_1 \ \ \ \ \ (9)

Indeed, for {x\notin B_1^*}, since {b_1} is supported in {B_1} and {\int b_1=0},

\displaystyle \begin{array}{rcl} Tb_1(x) & = & \int_{B_1}K(x,y)b_1(y)dy\\ & = & \int_{B_1}[K(x,y)-K(x,\bar y)]b_1(y)dy+K(x,\bar y)\int_{B_1}b_1(y)dy\\ & = & \int_{B_1}[K(x,y)-K(x,\bar y)]b_1(y)dy \end{array}

So we conclude, by Fubini and 3,

\displaystyle \begin{array}{rcl} \int_{{\mathbb R}^n-B_1^*}|Tb_1(x)|dx & \leq & \int_{B_1}|b_1(y)|\int_{{\mathbb R}^n-B_1^*}|K(x,y)-K(x,\bar y)|dxdy\\ & \leq & A\int_{B_1}|b_1(y)|dy=A\|b_1\|_1 \end{array}

The last equality in the computation of {Tb_1(x)} used the condition {\int b_1=0}; 9 then follows from Chebyshev's inequality.



Almost orthogonality


Motivation and Cotlar’s lemma

We often need to study a linear transformation T on the Hilbert space l^2(\mathbb Z) (a discrete model) or on a finite-dimensional space V. If T is given by a diagonal matrix in some basis, the story is easy:

\displaystyle T = \begin{pmatrix} \lambda_1 & 0 & \ldots & 0 \\ 0 & \lambda_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \end{pmatrix} \ \ \ \ \ (5)

Then ||T||=\max_{i}|\lambda_i|.

In fact, suppose T is a transformation of a finite-dimensional space given by the matrix (a_{ij})_{n\times n}. By duality we have ||T||=\sup_{||x||=||y||=1}|\langle Tx,y\rangle|, so for unit vectors x,y,

|\sum_{i,j}a_{ij}x_jy_i|\leq \sum_{i,j}\frac{1}{2}|a_{ij}|(|x_j|^2+|y_i|^2)\leq M,

provided that \sum_{i}|a_{ij}|\leq M for all j and \sum_{j}|a_{ij}|\leq M for all i, where i,j\in \{1,2,...,n\}. Hence ||T||\leq M.

But in applications the orthogonality condition is usually too restrictive. For this reason we have the following lemma, which follows the same idea but replaces orthogonality by almost orthogonality.


Lemma (Cotlar) Let \{T_j\}_{j=1}^N be finitely many operators on some Hilbert space H, such that for some function \gamma : \mathbb Z\to \mathbb R^+ one has,

||T_j^*T_k||\leq \gamma^2(j-k),||T_jT_k^*||\leq \gamma^2(j-k)

for any 1\leq j,k\leq N. Let \sum_{l=-\infty}^{\infty}\gamma(l)=A<\infty. Then,

||\sum_{j=1}^NT_j||\leq A


Proof sketch: the tensor power trick combined with the duality identity ||T||=||TT^*||^{\frac{1}{2}}.
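The lemma can be checked numerically on a toy family. The following is only a sketch under stated assumptions: the rank-one operators T_j = u_j v_j^T, the vectors u_j, v_j and the helper op_norm_lower are illustrative choices that do not come from the text. For rank-one operators the norms ||T_j^*T_k|| and ||T_jT_k^*|| are known exactly, so \gamma and A can be computed without error, and the power iteration only ever underestimates ||\sum_j T_j||.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def matvec(M, x):
    return [dot(row, x) for row in M]

def op_norm_lower(M, iters=200):
    """Power iteration on M^T M. Returns |M x| for a unit vector x,
    which never exceeds the true operator norm of M."""
    Mt = [list(col) for col in zip(*M)]
    x = [1.0 + 0.01 * i for i in range(len(M[0]))]
    for _ in range(iters):
        y = matvec(Mt, matvec(M, x))
        s = norm(y)
        if s == 0.0:
            return 0.0
        x = [c / s for c in y]
    return norm(matvec(M, x))

# Hypothetical family: T_j = u_j v_j^T with u_j = e_j + 0.5 e_{j+1}, v_j = e_j.
N = 8
dim = N + 1

def e(i):
    return [1.0 if k == i else 0.0 for k in range(dim)]

u = [[a + 0.5 * b for a, b in zip(e(j), e(j + 1))] for j in range(N)]
v = [e(j) for j in range(N)]

def gamma_sq(l):
    """Smallest admissible gamma^2(l): max over j - k = l of the two norms."""
    best = 0.0
    for j in range(N):
        k = j - l
        if 0 <= k < N:
            a = abs(dot(u[j], u[k])) * norm(v[j]) * norm(v[k])  # ||T_j^* T_k||
            b = abs(dot(v[j], v[k])) * norm(u[j]) * norm(u[k])  # ||T_j T_k^*||
            best = max(best, a, b)
    return best

A = sum(math.sqrt(gamma_sq(l)) for l in range(-(N - 1), N))

# Matrix of sum_j T_j: entry (r, c) is sum_j u_j[r] * v_j[c].
S = [[sum(u[j][r] * v[j][c] for j in range(N)) for c in range(dim)]
     for r in range(dim)]

total = op_norm_lower(S)                                    # estimate of ||sum_j T_j||
triangle = sum(norm(u[j]) * norm(v[j]) for j in range(N))   # sum_j ||T_j||

print("||sum T_j|| ~", total, " Cotlar bound A =", A, " triangle bound =", triangle)
```

In this toy example the triangle inequality bound grows linearly in N, while the almost orthogonality bound A stays the same for every N.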

Singular integrals on L^2



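Let X and Y be measure spaces with positive measures \mu and \nu, and let X\times Y carry the product measure \mu\otimes \nu. Define the operator T via

(Tf)(x)=\int_{Y}K(x,y)f(y)\nu(dy),

where K is a measurable kernel on X\times Y. Then,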

1). ||T||_{1\to 1}\leq \sup_{y\in Y}\int_{X}|K(x,y)|\mu(dx)=:A.

2). ||T||_{\infty\to \infty}\leq \sup_{x\in X}\int_{Y}|K(x,y)|\nu(dy)=:B.

3). ||T||_{p\to p}\leq A^{\frac{1}{p}}B^{\frac{1}{p'}},  \forall 1\leq p\leq \infty.

4). ||T||_{1\to \infty}\leq ||K||_{L^{\infty}(X\times Y)}.


1), 2) and 4) follow directly from Fubini's theorem and the Bath lemma.


3) is proved by interpolating between 1) and 2).
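Bounds 1) to 3) (commonly known as Schur's test) can be illustrated in the discrete case, where X and Y are finite sets with counting measure and T is just a matrix. The sketch below makes several assumptions not in the text: the kernel K(i,j)=2^{-|i-2j|}, the sizes 12 and 6, and the helper op_norm_lower are all hypothetical choices. It checks the case p=2, where the bound reads ||T||_{2\to 2}\leq \sqrt{AB}.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, x):
    return [dot(row, x) for row in M]

def op_norm_lower(M, iters=300):
    """Power iteration on M^T M. Returns |M x| for a unit vector x,
    so the value is a lower estimate of the l^2 -> l^2 norm of M."""
    Mt = [list(col) for col in zip(*M)]
    x = [1.0 + 0.01 * i for i in range(len(M[0]))]
    for _ in range(iters):
        y = matvec(Mt, matvec(M, x))
        s = math.sqrt(dot(y, y))
        if s == 0.0:
            return 0.0
        x = [c / s for c in y]
    Mx = matvec(M, x)
    return math.sqrt(dot(Mx, Mx))

# Hypothetical kernel on X = {0..11}, Y = {0..5} with counting measure.
rows, cols = 12, 6
K = [[2.0 ** (-abs(i - 2 * j)) for j in range(cols)] for i in range(rows)]

# A = sup over y of sum over x of |K(x, y)|  (largest column sum)
A = max(sum(abs(K[i][j]) for i in range(rows)) for j in range(cols))
# B = sup over x of sum over y of |K(x, y)|  (largest row sum)
B = max(sum(abs(K[i][j]) for j in range(cols)) for i in range(rows))

def schur_bound(p):
    """The interpolated bound A^(1/p) * B^(1/p') with 1/p + 1/p' = 1."""
    return A ** (1.0 / p) * B ** (1.0 - 1.0 / p)

l2_norm = op_norm_lower(K)
print("A =", A, " B =", B, " ||T||_2->2 ~", l2_norm, " sqrt(AB) =", schur_bound(2))
```

The kernel is deliberately chosen so that A and B differ, which makes the geometric mean in 3) visible.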


Let T be a Calderón-Zygmund operator with kernel K, with the additional assumption that |\nabla K(x)|\leq B|x|^{-d-1}. Then

||T||_{2\to 2} \leq CB

with C = C(d).

Calderón–Vaillancourt theorem


Hardy’s inequality

Theorem (Hardy inequality)

For any 0 \leq s < \frac{d}{2} there is a constant C(s, d) with the property that,

|||x|^{-s} f||_2 \leq C(s,d)||f||_{H^s(R^d)}

for all f \in H^s(R^d).


Linear to Multi-linear

The technique of transforming a problem from a linear setting into a multilinear setting is very powerful.

Examples include:

1. The renormalization technique in complex dynamical systems, and its generalizations

This is mainly the Ostrowski representation, among other things.

2. Fourier analysis


When it is difficult to investigate a quantity attached to a function f directly, it is often easier to handle pieces of f, here given by \hat f(\xi),\xi \in R or \hat f(k),k\in Z: cut f into many small parts, deal with each part, and use some inequality (usually the triangle inequality or something similar) to glue the pieces into an estimate for the quantity attached to f.

3. Multi-scale theory


This is used in the improvement of the Minkowski dimension of Kakeya sets in dimension 3 by Katz-Tao.

4. The proof of the Bourgain-Sarnak-Ziegler theorem


Theorem (B-S-Z). Let F : N \to C with |F| \leq 1 and let \nu be a multiplicative function with |\nu| \leq 1. Let \tau > 0 be a small parameter and assume that for all primes p_1, p_2 \leq e^{1/\tau}, p_1 \neq p_2, we have that for M large enough

|\sum_{m\leq M} F(p_1m)\overline {F(p_2m)}| \leq \tau M.

Then for N large enough

|\sum_{n\leq N} \nu(n)F(n) | \leq 2 \sqrt{\tau \log(\frac{1}{\tau})}N.

This theorem is not difficult to prove by the bilinear method and Cauchy-Schwarz; the details are in https://arxiv.org/abs/1110.0992v1.
According to this theorem, to get a good estimate of \sum_{1\leq k\leq x}\mu(k)f(k) we only need a good estimate of \sum_{1\leq k\leq x}f(p_1k)\overline {f(p_2k)}, which should be much easier. However, the function f in question is very complicated, so I do not yet have a non-trivial estimate for \sum_{1\leq k\leq x}f(p_1k)\overline {f(p_2k)}.

5. The multilinear restriction theorem (Tao)


6. Some special constructions in additive combinatorics


For example, when we look for a set A\subset Z satisfying \frac{|A-A|}{|A+A|}\gg 1, it is convenient to work in a high-dimensional lattice Z^N rather than in Z.

Kakeya conjecture (Thomas Wolff 1995)

The main obstacle to progress on the Kakeya conjecture remains in dimension 3: the result established by Thomas Wolff in 1995 is still almost the best known result in R^3. The result established by Katz and Tao can be viewed as a corollary of Wolff's X-ray estimate.

For f\in L_{loc}^1(R^d) and 0<\delta<1, define

f_{\delta}^*:P^{d-1}\longrightarrow R, \quad f_{\delta}^*(e)=\sup_{T}\frac{1}{|T|}\int_{T}|f|,

where T ranges over all cylinders of length 1 and radius \delta with axis in the direction e, and

f_{\delta}^{**}:R^d\longrightarrow R, \quad f^{**}_{\delta}(x)=\sup_{T}\frac{1}{|T|}\int_{T}|f|,

where T ranges over all cylinders of length 1 and radius \delta containing x.

Keeping these two maximal functions in mind, we state the Kakeya maximal function conjecture: for every \epsilon>0,

||M_{\delta}f||_d\leq C_{\epsilon} \delta^{-\epsilon}||f||_d,

where M_{\delta}=f_{\delta}^* or M_{\delta}=f_{\delta}^{**}.

Because we have the obvious L^1\to L^{\infty} estimate (each cylinder T has volume |T|\sim \delta^{d-1}):

||M_{\delta}f||_{\infty}\leq C\delta^{-(d-1)}||f||_1.

So by Riesz-Thorin interpolation the conjecture would give:

||M_{\delta}f||_{q}\leq C_{\epsilon}\delta^{-(\frac{d}{p}-1+\epsilon)}||f||_p.           (*)

for 1\leq p\leq d, q\leq(d-1)p'. The task is to establish (*) for (p,q) as large as possible in this range.

For the 2-dimensional case the result is well known. The key estimate is:

\sum_{j}|T_i\cap T_j|\leq C\log(\frac{1}{\delta})|T_i|
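The logarithm in this estimate can be seen in a simple model computation. The sketch below rests on assumptions that are not in the text: the tubes are 1\times\delta rectangles through a common point with \delta-separated directions, so the angle between T_i and the j-th neighbour is about j\delta, and the standard intersection bound |T_i\cap T_j|\lesssim \delta^2/(\delta+\theta_{ij}) is used in place of the exact area. The function name overlap_ratio is hypothetical.

```python
import math

def overlap_ratio(M):
    """Model value of sum_j |T_i ∩ T_j| / |T_i| for delta = 1/M.

    Assumes |T_i ∩ T_j| is replaced by delta^2 / (delta + theta_ij)
    with theta_ij = j * delta for j = 1..M (delta-separated directions).
    """
    delta = 1.0 / M
    tube_area = delta  # area of a 1 x delta rectangle
    total = sum(delta ** 2 / (delta + j * delta) for j in range(1, M + 1))
    return total / tube_area

for M in (10, 100, 1000, 10000):
    # Compare the model sum with log(1/delta) = log(M).
    print(M, overlap_ratio(M), math.log(M))
```

In this model the ratio is a harmonic sum, which is why it grows like \log(\frac{1}{\delta}) and no faster.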

For the case d\geq 3, the main result of Wolff is that

||M_{\delta}f||_q\leq C_{\epsilon}\delta^{-(\frac{d}{p}-1+\epsilon)}||f||_p

holds for p=\frac{d+2}{2}, q=(d-1)p', with M_{\delta}=f_{\delta}^* or f_{\delta}^{**}.

Now we sketch the proof.

We prove the f_{\delta}^* and f_{\delta}^{**} cases together.

We can make some reductions:

The first is that we can assume the support of f lies in a fixed compact set.

The second is that instead of f_{\delta}^{**} we can consider f_{\delta}^{***}(x)=\sup_{T}\frac{1}{|T|}\int_T|f|,

where T ranges over all cylinders containing x of radius \delta and length 1 whose axis makes an angle of at most \frac{\pi}{100} with a fixed direction.

The first reduction is obvious (why?).

The second reduction relies on an observation:

||f_{\delta}^{***}||_q\leq A(\delta)||f||_p          \Longrightarrow    ||f_{\delta}^{**}||_q\leq CA(\delta)||f||_p

This is just a finite cover by rotations of the coordinates together with the triangle inequality.

Now we set up a framework into which both situations f_{\delta}^*, f_{\delta}^{***} fit.

Let M(d,1) be the set of all lines in R^d.

Then M(d,1)=R^d\times S^{d-1}/\sim is a (2d-2)-dimensional manifold.

M(d,1)\longrightarrow P^{d-1}

l \longmapsto  e_l

Here e_l is the line through the origin parallel to l.

dist(l_1,l_2)\sim \theta(l_1,l_2)+d_{mis}(l_1,l_2).


Wolff axioms:

(A,d) is a metric space with a measure \mu such that

\mu(D(\alpha,\delta)) \sim \delta^m for \alpha\in A and \delta \leq diam(A),

for a certain m\in R^+.

For every \alpha\in A a set F_{\alpha} \subset M(d,1) is given, and \overline{\bigcup_{\alpha}F_{\alpha}} is compact.

d(\alpha,\beta)\lesssim \inf_{l\in F_{\alpha},\, m\in F_{\beta}}dist(l,m) for all \alpha,\beta \in A.

If f:R^d\longrightarrow R then we define M_{\delta}f:A\longrightarrow R by

M_{\delta}f(\alpha)=\sup_{l\in F_{\alpha}}\frac{1}{|T_{l}^{\delta}|}\int_{T^{\delta}_l}|f|.

Property (**):

If l_0\in \bigcup_{\alpha} F_{\alpha}, \Pi is a 2-plane containing l_0, \sigma \geq \delta, \{\alpha_j\}_{j=0}^N is a \delta-separated subset of A, and for each j there is l_j\in F_{\alpha_j} with dist(l_j,M(\Pi,l))<\delta and dist(l,l_0)<\sigma, then

N\leq \frac{C\sigma}{\delta}


Fractal uncertainty principle

An article by Semyon Dyatlov

Semyon Dyatlov's article https://arxiv.org/pdf/1710.05430.pdf uses the fractal uncertainty principle to show that the zeta function induced by the geodesics on a hyperbolic surface has only finitely many zeros in Re(s)>1-\epsilon.



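1. The Riemann hypothesis over the p-adics. The proof in this article depends strongly on the Markov property, which closely resembles the p-adic structure, so it may be possible to push part of the argument further along the lines of the proof of the p-adic conjecture.


2. The billiard propagator. Here the situation is different: the Schottky groups in the article essentially serve to write the inverse of an operator as a series generated by the Schottky group, whereas for the billiard propagator the parametrices of all the heat or wave kernels involved do not merely have the Markov property. What plays the dominant role is a property that requires some kind of X-ray estimate, and the series is summed not over the whole space but over a truncated subspace, so this case is harder than the proof here. Establishing estimates for the billiard propagator is an important step toward the inverse spectral problem.


3. Interval exchange maps. The structure of an interval exchange map seems to contain only the translations that appear here; in the present setting the size of the images decays exponentially, which interval exchange maps lack. Interval exchange maps may also require a splitting estimate, which would be harder. This might lead to progress on the Sarnak conjecture for interval exchange maps.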


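Below I describe my understanding of the main idea of the proof in the article. First, for the hyperbolic surface H^2/{\Gamma} we use the Poincaré model and view it as D modulo a group action; the limit set \Lambda_{\Gamma} is then the set of limit points, on the boundary of D, of the fundamental domain under the fractional linear transformations.

The key is to establish the following estimate: |\int_{\Lambda_{\Gamma}}\exp (i\xi\phi(x)) g(x) d\mu(x)|\leq C|\xi|^{-\epsilon_1} \forall \xi, |\xi| > 1.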


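1. Study the structure of the limit set \Lambda_{\Gamma}. It essentially has a combinatorial tree structure: under fractional linear transformations the upper and lower parts of the tree are exchanged, and the images decay exponentially in size, much like the Ostrowski representation in continued fraction expansions. The volume distortion estimates for the action of fractional linear transformations and of the Schottky group are easy to obtain.

2. The Patterson–Sullivan measure \mu is an ergodic measure for the action of \Gamma. In particular it is compatible with \Gamma, so the change of variables formula holds:

 \int_{\Lambda_{\Gamma}} f(x)d\mu(x) =\int_{\Lambda_{\Gamma}}f(\gamma(x))|\gamma'(x)|_{B}^{\delta} d\mu(x) \forall \gamma \in \Gamma

This is important: once we can find I :=\amalg_{b\in \Omega} I_b, it actually yields an equation for f:

 L_Zf(x) = \sum_{a\in Z,a\to b} f(\gamma_{a'}(x))w_{a'}(x), x \in I_b.

The zeros of the Selberg zeta function that we care about correspond to eigenfunctions of this equation with eigenvalue 1: L_Zf(x)=f(x).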


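3. Section 3.1 of the article is Bourgain's main contribution, an application of the so-called sum-product phenomenon. To obtain the Fourier decay estimate we need to keep splitting intervals. In fact, at each level of the tree we know exactly how to split the integral at that level into the levels above and below, which can be viewed as a renormalization equation:

  \int_{\Lambda_{\Gamma}}f d\mu =\int_{\Lambda_{\Gamma}}L_{Z(\tau)}^{2k+1} f d\mu = \sum_{A,B,A\leftrightarrow B}\int_{I_b(A)}f(\gamma_{A*B}(x))w_{A*B}(x)d\mu(x).

The distortion estimate in 1 (one only needs to estimate the error coming from the cross terms) tells us:  | \int_{\Lambda_{\Gamma}}fd\mu|^2 \leq C\tau^{(2k-1)\delta}\sum_{A,B,A\leftrightarrow B} |\int_{I_b(A)} e^{i\xi\phi(\gamma_{A*B}(x))}w_{a'_k} (x)d\mu(x)| ^2 +C\tau^2 .


  |\int_{\Lambda_{\Gamma}} f d\mu|^2 \leq C\tau^{(2k+1)\delta}\sum_A \sup_{\eta\in J_{\tau}}| e^{2\pi i\eta\xi_{1,A} (b_1)\cdots\xi_{k,A} (b_k )}| + C\tau^{\delta/4}. (*)

Finally, to control the RHS of (*) with Lemma 3.3, which comes from Bourgain's sum-product phenomenon, we estimate separately over R \subset Z(\tau)^{k+1} and over Z(\tau)^{k+1}-R. The former uses the fast convergence that comes from regularity, and the latter uses the smallness of the Minkowski dimension. The former estimate is already established, so we only need an upper bound on the Minkowski dimension of Z(\tau)^{k+1}-R, which is obvious.


  |\int_{\Lambda_{\Gamma}}\exp ( i\xi\phi(x)) g(x) d\mu(x)| \leq C|\xi|^{-\epsilon_1} \forall \xi, |\xi| > 1.

This estimate is the key to establishing the fractal uncertainty principle. Once we have the fractal uncertainty principle, the fact that the zeta function induced by the geodesics on a hyperbolic surface has only finitely many zeros in Re(s)>\frac{1}{2}-\epsilon is just a fixed point theorem argument.

For another, extremely simple proof that does not need the sum-product phenomenon, see [1710.05430] Fractal uncertainty for transfer operators.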