Monotonically increasing intervals

Problem  A real sequence \{T_n\}_{n\ge1} satisfies T_{n+1}+T_{n}=2n^\alpha for all n\ge1, where 1<\alpha<2. Let \Delta_{n}:=T_{n+1}-T_n be the length of the nth interval. Show that there exists a unique T_1 such that \Delta_{n+1}>\Delta_n for all n\ge1.

Fig. 1. Nonlinearly mapped intervals (pdf version)

Remark  This is a problem that arises from nonlinear quantization in AAC. There, MDCT coefficient x is power-compressed to y=x^{3/4} before linear quantization. Then we face the problem of choosing boundaries T_n, T_{n+1} for the nth quantization interval where all x\in [T_n, T_{n+1}] are quantized to n and de-quantized to \bar{x}=n^{4/3}, such that the de-quantization noise \bar{x}-x is minimized statistically. Of course, the solution to this problem depends on the underlying probability distribution of x. In case of uniform distribution, the de-quantized value n^{4/3} should be at the center of the nth interval, or T_n + T_{n+1} = 2n^{4/3}. With this restriction, all the boundaries \{T_n\}_{n>1} are uniquely determined by T_1. But the length of the interval \Delta_n=T_{n+1}-T_n, as revealed by numerical experiments, almost always undulates and at the same time, asymptotically increases to \infty as n\to\infty (\sim \tfrac43n^{1/3} ), for a randomly chosen T_1. Then, is there a T_1 to make the length of the interval monotonically increase? The answer is yes, and this T_1 is uniquely determined by \alpha, as shown in the following.

Proof  Let us denote the difference of nth and (n+1)th intervals’ lengths by \Gamma_n:=\Delta_{n+1}-\Delta_n. We claim that \Gamma_n>\Gamma_{n+2} for all n\ge1. If so, to ensure \Gamma_n>0 for all n\ge1, then we need to show that there exists a unique T_1 such that both \lim_{n\to\infty}\Gamma_{2n} and \lim_{n\to\infty}\Gamma_{2n+1} exist and are non-negative. First, by the relation T_n+T_{n+1}=2n^\alpha, we have \Gamma_n=2(n+1)^\alpha+2n^\alpha-4T_{n+1}. Furthermore, let f(x):=(x+1)^\alpha-x^\alpha, then

\begin{array}{rcl} \Gamma_{n+2}-\Gamma_n  & = & 2(n+3)^\alpha+2(n+2)^\alpha-2(n+1)^\alpha-2n^\alpha-4T_{n+3}+4T_{n+1} \\  & = & 2(n+3)^\alpha-6(n+2)^\alpha+6(n+1)^\alpha-2n^\alpha \\  & = & 2(f(n+2)-2f(n+1)+f(n)) \\  & = & 2f''(\eta_n), \end{array}

where \eta_n\in(n,n+2) and the last equation is due to recursive application of Rolle’s theorem (first about f(x+1)-f(x), then about about f(x)).  Since f''(x)=\alpha(\alpha-1)[(x+1)^{\alpha-2}-x^{\alpha-2}]<0 for x>0 when 1<\alpha<2, we have f''(\eta_n)<0 thus \Gamma_{n+2}-\Gamma_n<0. Next, consider the limit of \Gamma_{2n}+\Gamma_{2n+1}:

\begin{array}{rcl} \Gamma_{2n}+\Gamma_{2n+1}  & = & \Delta_{n+2}-\Delta_n \\  & = & T_{n+3}-T_{n+2}-T_{n+1}+T_n \\  & = & T_{n+3}+T_{n+2}-2T_{n+2}-2T_{n+1}+T_{n+1}+T_n \\  & = & 2[f(n+1)-f(n)] \\  & \xrightarrow{n\to\infty} & 0, \end{array}

where the last equation is due to \lim_{x\to\infty}f'(x)=0 for 1<\alpha<2. Thus, the desired T_1, if it ever exits, must ensure \lim_{n\to\infty}\Gamma_{2n}=0. Conversely, if with some T_1, \lim_{n\to\infty}\Gamma_{2n}=0, then \Gamma_n>0 for all n\ge1.

Now, if suffices to show there exists a unique T_1 such that \lim_{n\to\infty}\Gamma_{2n}=0. Let S_n:=(n-1)^\alpha-(n-2)^\alpha+\cdots+(-1)^n, then T_n=2S_n-(-1)^nT_1. We claim that \xi_n:=(2n+\tfrac12)^\alpha-2S_{2n+1} converges as n\to\infty. In that case, we shall have

\begin{array}{rcl} \Gamma_{2n}  & = & 2(2n+1)^\alpha+2(2n)^\alpha-4T_{2n+1} \\  & = & 2(2n+1)^\alpha+2(2n)^\alpha-8S_{2n+1}-4T_1 \\  & = & 2(2n+1)^\alpha+2(2n)^\alpha-4(2n+\tfrac12)^\alpha+4\xi_n-4T_1 \\  & = & 2[g(2n+\tfrac12)-g(2n)]+4\xi_n-4T_1 \\  &\xrightarrow{n\to\infty} & 4\xi_n-4T_1, \end{array}

where g(x):=(x+\tfrac12)^\alpha-x^\alpha and the last equation above follows from \lim_{x\to\infty}g'(x)=0 when 1<\alpha<2. Therefore, the unique T_1 will be \lim_{n\to\infty}\xi_n. It remains to show that \{\xi_n\}_{n\ge1} does converge.

Let us investigate the difference between \xi_n and \xi_{n+1}:

\begin{array}{rcl} \xi_{n+1}-\xi_n  & = & (2n+\tfrac52)^\alpha-(2n+1)^\alpha-2[(2n+2)^\alpha-(2n+1)^\alpha] \\  & = & f(2n+\tfrac32)+f(2n+\tfrac12)-2f(2n+1) \\  & = & \tfrac14\alpha(\alpha-1)(\alpha-2)\theta_n^{\alpha-3}, \end{array}

where \theta_n\in(2n+\tfrac12,2n+\tfrac52) and the last equation is also due to recursive application of Rolle’s theorem. (First about f(x+\tfrac12)-f(x), then about f(x), and finally about x^\alpha.) Therefore, we have \xi_{n+1}-\xi_n=\mathcal{O}(n^{\alpha-3}), which implies that \xi_n converges.

This completes the proof.                                                                                                        \square

Posted in Analysis, mathematics, signal processing | Leave a comment

3D Matrix Multiplication Tricks

Recently, I came across times 3D matrix multiplication in Matlab. It is for frequency domain signal processing, specifically, frequency-dependent multi-channel Wiener filter. Here, 3D matrix multiplication means 2D matrix multiplication at each frequency (or other parameters):  Given \mathbf{A}(f) and \mathbf{B}(f) which are matrices (vectors) at each frequency f, derive \mathbf{C}(f)=\mathbf{A}(f)\mathbf{B}(f).

In Matlab, we hold \mathbf{A}(f) in a 3D matrix \mathtt{A(1:F,1:M,1:K)}, and hold \mathbf{B}(f) in \mathtt{B(1:F,1:K,1:N)}. Suppose frequency is along the first dimension, then one plain implementation of 3D matrix multiplication is

A1=permute(A,[2 3 1]);
B1=permute(B,[2 3 1]);
for f=1:F
    C(f,:,:)=A1(:,:,f)*B1(:,:,f);
end

But if \mathtt{F} is large while \mathtt{M*N} is small (for example, \mathtt{F=1024,M=2,N=5}), as in multi-channel Wiener filter, the loop along frequency drags speed down significantly.

The native power of Matlab lies in matrix and vector operation, which is lightning fast. (JIT helps a lot in some cases, but not always.) Here the strategy is to replace the loop along frequency by loop along the second and third dimensions, and vectorize the inner operations:

 A1=permute(A,[1 3 2]);
 for m=1:M
     for n=1:N
          C(:,m,n)=sum(A1(:,:,m).*B(:,:,n),2);
     end
 end

The total number of iterations now is \mathtt{M*N}, instead of \mathtt{F}. Using this strategy, I observed more than 10X speeding up for \mathtt{F=1024,M=2}, and \mathtt{N=5}.

In the special case of \mathtt{K==1}, the above code can be simplified further:

B1=squeeze(B);
for n=1:N
    C(:,:,n)=A.*repmat(B1(:,n),1,M);
end

Another important case is \mathtt{A(f,:,:)} is a diagonal matrix at each frequency, we can find \mathtt{C(1:F,1:M,1:N)} in one stroke:

C=repmat(A(:,1:M+1:M*M),[1 1 N]).*B;

It feels very good to see results jumping out right away.

Posted in coding, signal processing | Tagged , | 1 Comment

Dense Circle

Proposition 1. Let \alpha be an irrational number and denote \{\mathbb{N}\alpha\} the set of fractional parts of n\alpha for all n\in\mathbb{N}. Then \{\mathbb{N}\alpha\} is dense in [0,1].

Proof. We shall first prove that \{\mathbb{N}\alpha\} is dense at 0, then proceed to the whole range of [0,1]. Without loss of generality, \alpha is assumed to be within (0,1).

To show that \{\mathbb{N}\alpha\} is dense at 0, we claim that there exists a m\in\mathbb{N} such that \{m\alpha\} \leq \frac12\alpha. Denote n_0 the largest integer satisfying n_0\alpha < 1, then (n_0+1)\alpha \geq 1.  If \{(n_0+1)\alpha\}=(n_0+1)\alpha-1 \leq \frac12\alpha, then m=n_0+1 is what we need, else let \beta=1-\{n_0\alpha\}=1-n_0\alpha<\frac12\alpha. Denote n_1 the largest integer satisfying n_1\beta \leq 1, then (n_1+1)\beta > 1 and \{n_1n_0\alpha\}=\{n_1(1-\beta)\}=1-n_1\beta < \beta <\frac12\alpha, that is, m=n_1n_0 is what we need. Then apply this procedure to \{m\alpha\} and so on, which leads to a sequence approaching to 0 at least as fast as 2^{-n}\alpha.

Further, for any \epsilon > 0, we can find a N_\epsilon\in\mathbb{N} such that \{N_\epsilon\alpha\} < \epsilon by the above argument. Then for any x\in[0,1], |\{\lfloor\frac{x}{\epsilon}\rfloor N_\epsilon\alpha\} - x| < \epsilon, thus \{\mathbb{N}\alpha\} is dense at x. This completes the proof.                                                                                                                                           \square

Alternatively, we can prove Prop. 1 by the Dirichlet’s approximation theorem, which states that for any real number \alpha and positive integer N, there exit integers p and q such that 1\leq q \leq N and |q\alpha-p|<1/(N+1). Thus \{\mathbb{N}\alpha\} is dense at 0 and then dense everywhere on [0,1].

Remark 1. Prop. 1 can be formulated on the unit circle S^1, that is, the set e^{2\pi{i}\mathbb{N}\alpha} for any irrational number \alpha is dense on S^1.

Remark 2. The interval [0,1] can be replaced with [0,\beta] where \beta>0 as long as \alpha/\beta is irrational (\alpha might be rational).

Remark 3. If we replace \mathbb{N} with its subset \mathbb{N}q+r, where q,r\in\mathbb{N}, \{(\mathbb{N}q+r)\alpha\} will still be dense in [0,1]. This is because \{(\mathbb{N}q+r)\alpha\}=\{\mathbb{N}(q\alpha)+r\alpha\}, in which \{\mathbb{N}(q\alpha)\} is dense in [0,1] and r\alpha amounts to a constant (circular) shifting. On the other hand, for any infinite subset H \subset \mathbb{N}, \{H\alpha\} will be dense somewhere in [0,1], since [0,1] is compact, but \{H\alpha\} is not necessarily dense everywhere in [0,1], for example, H=\{n~|~\{n\alpha\}-\frac12|<\frac14\}.

Problem 1. Let d_\alpha(n) be the largest interval between two adjacent points from \{\alpha\}, \{2\alpha\}, \ldots, \{n\alpha\} on [0,1]. By Prop. 1, d_\alpha(n) monotonically goes to 0. But what is the speed of decaying of d_\alpha(n) as n goes to infinity? For example, could it be \mathcal{O}(n^{-1})?

Problem 2. Let \alpha be an irrational number. Then, for what kind of f:\mathbb{N}\to\mathbb{N}, \{f(\mathbb{N})\alpha\} is dense in [0,1]? Equivalently, we could ask for what kind of subset H \subset \mathbb{N} such that the set \{H\alpha\} is dense in [0,1]. Specifically, is \{\mathbb{N}^2\alpha\} dense on [0,1]? Let P be the set of prime numbers, is \{P\alpha\} dense on [0,1]?

Posted in Analysis, mathematics | Leave a comment

Surprisingly, I could understand most of this post. Tao’s clarity in exposition is always amazing, in addition to his power in mathematics.

What's new

I’ve just uploaded to the arXiv my paper “Every odd number greater than 1 is the sum of at most five primes“, submitted to Mathematics of Computation. The main result of the paper is as stated in the title, and is in the spirit of (though significantly weaker than) the even Goldbach conjecture (every even natural number is the sum of at most two primes) and odd Goldbach conjecture (every odd natural number greater than 1 is the sum of at most three primes). It also improves on a result of Ramaré that every even natural number is the sum of at most six primes. This result had previously also been established by Kaniecki under the additional assumption of the Riemann hypothesis, so one can view the main result here as an unconditional version of Kaniecki’s result.

The method used is the Hardy-Littlewood circle method, which…

View original post 1,690 more words

Posted in mathematics | Leave a comment

Blacklist IPs that attack SSH and FTP services on a Linux machine

I manage a Linux server (Redhat EL4, kernel version 2.6.9-34). Yesterday,  I found the server was attacked relentlessly from 121.101.223.211. It seemed to be a brutal force dictionary attack. What to do then? The obvious way is to manually add that IP to host.deny, that is, append “sshd:  121.101.223.211” to /etc/host.deny in a separate line.

But that is so un-Linux and so unsustainable.

It was time to excise my bash-fu and sed-fu. The attacks were logged in /var/log/messages and looked like this (part):

sshd(pam_unix)[6548]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=121.101.223.211  user=root

The idea is to periodically check (hourly by cron) the log file and find IPs with a large number (≥ 16) of unsuccessful SSH login attempts. With the following line, the IPs can be parsed out, sorted, and counted:

sed -n "/${SERVICE}.*authentication failure/ s/^.*rhost=\([0-9.]*\).*$/\1/p" "${LOGFILE}" | sort | uniq -c

where SERVICE=sshd and LOGFILE=/var/log/messages. The output is like this:

     19 112.187.163.188
    140 121.101.223.211
     45 211.139.10.219

The leading numbers are counts of unsuccessful SSH login attempts, the latter are corresponding IPs. Then echo the IPs to /etc/host.deny if they are not there yet and their counts are no less than 16. Of course, I should also protect FTP service (vsftpd).  Not too hard, ha!

The following is the complete bash script (ip_block). Put it into /etc/cron.hourly and it will work for you 24/7, for free!

#! /bin/bash
# add ips with more invalid login attempts than permitted to hosts.deny
# usage: ip_block [invalid login limit, default to 16]

if [ "$EUID" -ne 0 ]
then
    echo " You must be root to run this script!"
    exit -1
fi

SYSLOG_FILE="/var/log/messages"
HOSTDENY_LIST="/etc/hosts.deny"
[ "$#" -ge "1" ] && LOGIN_LIMIT="$1" || LOGIN_LIMIT=16

function add_to_list ()
{
    SERVICE="$1"
    LOGFILE="$2"
    LSTFILE="$3"
    MAXCOUNT="$4"

    sed -n "/${SERVICE}.*authentication failure/ s/^.*rhost=\([0-9.]*\).*$/\1/p" "${LOGFILE}" | sort | uniq -c | while read line
    do
        count=`echo $line | sed 's/  *.*$//'`
        ip=`echo $line | sed 's/[0-9]*  *//'`
        n=`grep -c -e "${SERVICE}: *$ip" "${LSTFILE}"`

        [ "$count" -ge "$MAXCOUNT" -a "$n" -eq "0" ] && echo "${SERVICE}: $ip" >> "${LSTFILE}"
    done
}

add_to_list sshd   "${SYSLOG_FILE}" "${HOSTDENY_LIST}" $LOGIN_LIMIT
add_to_list vsftpd "${SYSLOG_FILE}" "${HOSTDENY_LIST}" $LOGIN_LIMIT
Posted in coding | Tagged , , , , , | Leave a comment