22 Concentration Inequalities I
Concentration inequalities provide mathematical bounds on the probability that a random variable deviates from some value. They can answer questions such as “what is the largest probability that X deviates from its mean by more than t?”
Markov’s inequality
Markov’s inequality states that the probability that a non-negative random variable is at least t is at most \frac{\mu}{t}.
Theorem 22.1 (Markov’s inequality) Let X be a non-negative random variable with finite mean \mu. Then for any t > 0,
\mathbb{P} \left( X \geq t \right) \leq \frac{ \mu }{ t }.
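Markov’s bound can be sanity-checked numerically. Below is a minimal sketch (assuming NumPy is available; the exponential distribution, seed, and sample sizes are illustrative choices, not part of the notes) comparing the empirical tail probability of a non-negative random variable with \mu / t:

import numpy as np

rng = np.random.default_rng(0)
mu = 2.0                                            # mean of the chosen distribution
samples = rng.exponential(scale=mu, size=100_000)   # X >= 0 with E[X] = mu

for t in [2.0, 4.0, 8.0]:
    empirical = np.mean(samples >= t)               # estimate of P(X >= t)
    bound = mu / t                                  # Markov's bound
    print(f"t={t}: empirical={empirical:.4f} <= bound={bound:.4f}")

The bound is loose here (for t = 2 it is vacuous, since \mu / t = 1), which is typical: Markov’s inequality uses only the mean of X.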
Chebyshev’s inequality
Chebyshev’s inequality states that the probability that a random variable deviates from its mean by at least t is at most \frac{\sigma^{2}}{t^{2}}.
Theorem 22.2 (Chebyshev’s inequality) Let X be a random variable with finite mean \mu and finite variance \sigma^{2}. Then for any t > 0,
\mathbb{P} \left( \lvert X - \mu \rvert \geq t \right) \leq \frac{ \sigma^{2} }{ t^{2} }.
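Chebyshev’s inequality is Markov’s inequality applied to the non-negative random variable (X - \mu)^{2}, whose mean is \sigma^{2}:
\mathbb{P} \left( \lvert X - \mu \rvert \geq t \right) = \mathbb{P} \left( (X - \mu)^{2} \geq t^{2} \right) \leq \frac{ \mathbb{E} \left[ (X - \mu)^{2} \right] }{ t^{2} } = \frac{ \sigma^{2} }{ t^{2} }.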
Averages of random variables converge to their means
Let X_{1}, \dots, X_{n} be independent random variables with finite means \mu_{1}, \dots, \mu_{n} and finite variances \sigma_{1}^{2}, \dots, \sigma_{n}^{2}. Then for any t > 0,
\mathbb{P} \left( \left\lvert \sum_{i = 1}^{n} X_{i} - \sum_{i = 1}^{n} \mu_{i} \right\rvert \geq t \right) \leq \frac{ \sum_{i = 1}^{n} \sigma_{i}^{2} }{ t^{2} }.
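In particular, if the X_{i} are i.i.d. with mean \mu and variance \sigma^{2}, taking t = n \varepsilon and dividing inside the absolute value by n gives
\mathbb{P} \left( \left\lvert \frac{1}{n} \sum_{i = 1}^{n} X_{i} - \mu \right\rvert \geq \varepsilon \right) \leq \frac{ \sigma^{2} }{ n \varepsilon^{2} } \to 0 \quad \text{as } n \to \infty,
which is the weak law of large numbers: the average concentrates around the mean.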
Chernoff bound
Chernoff’s bounding method
Theorem 22.3 Let X be a random variable on \mathbb{R}. Then for all t > 0,
\mathbb{P} (X \geq t) \leq \inf_{s > 0} \frac{ M_{X} (s) }{ e^{s t} } = \inf_{s > 0} \frac{ \mathbb{E}_{X} [e^{s X}] }{ e^{s t} }
where M_{X} (s) is the moment generating function (MGF) of X.
Remark. A closed-form solution of the optimization problem
\inf_{s > 0} \frac{ \mathbb{E}_{X} [e^{s X}] }{ e^{s t} } = \inf_{s > 0} f (s)
can be obtained by solving f' (s) = 0 when f (s) is convex and differentiable, since any stationary point of a convex function is a global minimizer.
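As a worked example where the infimum has a closed form: if X \sim \mathcal{N} (0, \sigma^{2}), then M_{X} (s) = e^{\sigma^{2} s^{2} / 2}, so f (s) = e^{\sigma^{2} s^{2} / 2 - s t}. Solving f' (s) = 0 gives s = t / \sigma^{2}, and substituting back yields
\mathbb{P} (X \geq t) \leq \exp \left[ -\frac{ t^{2} }{ 2 \sigma^{2} } \right].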
Chernoff bound for Bernoulli random variables
Let X_{1}, \dots, X_{n} be independent Bernoulli random variables with X_{i} \sim \mathrm{Ber} (p_{i}) for all i. Let X = \sum_{i=1}^{n} X_{i} and \mu = \mathbb{E} [X] = \sum_{i=1}^{n} p_{i}.
Upper tail: for all \delta > 0
\mathbb{P} (X \geq (1 + \delta) \mu) \leq \exp \left[ -\frac{ \delta^{2} \mu }{ 2 + \delta } \right]
Lower tail: for all 0 < \delta < 1
\mathbb{P} (X \leq (1 - \delta) \mu) \leq \exp \left[ -\frac{ \delta^{2} \mu }{ 2 } \right]
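As a quick numerical illustration of the upper tail, here is a sketch (assuming NumPy; the values of n, p, and \delta are arbitrary, and identical p_{i} = p is a special case of the theorem):

import numpy as np

rng = np.random.default_rng(0)
n, p, delta = 200, 0.3, 0.3
mu = n * p                                        # E[X] for X = sum of n Ber(p)

trials = rng.binomial(n=n, p=p, size=100_000)     # samples of X
empirical = np.mean(trials >= (1 + delta) * mu)   # estimate of P(X >= (1+delta)mu)
bound = np.exp(-delta**2 * mu / (2 + delta))      # Chernoff upper-tail bound

print(f"empirical={empirical:.5f} <= bound={bound:.5f}")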
General Chernoff bound
Let X_{1}, \dots, X_{n} be independent random variables bounded in [a, b], i.e. X_{i} \in [a, b] for all i. Let X = \sum_{i=1}^{n} X_{i} and \mu = \mathbb{E} [X]. Then for any \delta > 0
Upper tail:
\mathbb{P} (X \geq (1 + \delta) \mu) \leq \exp \left[ -\frac{ 2 \delta^{2} \mu^{2} }{ n (b - a)^{2} } \right]
Lower tail:
\mathbb{P} (X \leq (1 - \delta) \mu) \leq \exp \left[ -\frac{ \delta^{2} \mu^{2} }{ n (b - a)^{2} } \right]
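The upper tail here is Hoeffding’s inequality with t = \delta \mu. A sketch of the same kind of numerical check (assuming NumPy; the Uniform[a, b] summands and the values of n and \delta are illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
n, a, b, delta = 100, 0.0, 1.0, 0.2
mu = n * (a + b) / 2                                   # E[X] for Unif[a, b] summands

X = rng.uniform(a, b, size=(100_000, n)).sum(axis=1)   # samples of X
empirical = np.mean(X >= (1 + delta) * mu)             # estimate of P(X >= (1+delta)mu)
bound = np.exp(-2 * delta**2 * mu**2 / (n * (b - a)**2))

print(f"empirical={empirical:.5f} <= bound={bound:.5f}")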