Statistika pro informatiku
prof. RNDr. Roman Kotecký, DrSc., Dr. Rudolf Blažek, PhD
Department of Theoretical Computer Science, FIT, Czech Technical University in Prague (ČVUT)

MI-SPI, ZS 2011/12, Lecture 4

European Social Fund. Prague & EU: Investing in your future.
Recapitulation

Random variable: a function X : Ω → R.
Probability function: p_X(x) = P(X = x).
Cumulative distribution function: F_X(x) = P(X ≤ x) = \sum_{x_k ≤ x} p_X(x_k).
Expectation: E(X) = \sum_{x : p_X(x) > 0} x p_X(x).
Linearity: for a, b ∈ R, E(aX + bY) = a E(X) + b E(Y).
Expectation of an indicator: E(I_A) = P(A).
k-th moment: m_k = E(X^k). k-th central moment: σ_k = E((X − m_1)^k).
Variance: var(X) = σ_2 = E((X − E(X))^2) = E(X^2) − (E(X))^2.
Bernoulli random variable: p_X(0) = 1 − p and p_X(1) = p.
Binomial random variable: p_X(k) = \binom{n}{k} p^k (1 − p)^{n−k}, k = 0, 1, …, n.
Geometric random variable: p_X(k) = (1 − p)^{k−1} p, k = 1, 2, ….
Poisson random variable: p_X(k) = (λ^k / k!) e^{−λ}, k = 0, 1, 2, ….
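The recap formulas for E(X) and var(X) can be checked mechanically on any finite pmf; a minimal Python sketch (the fair-die pmf is an illustrative choice, not from the lecture):

```python
from fractions import Fraction

def expectation(pmf):
    """E(X) = sum of x * p_X(x) over the support."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """var(X) = E(X^2) - (E(X))^2."""
    ex = expectation(pmf)
    ex2 = sum(x * x * p for x, p in pmf.items())
    return ex2 - ex * ex

# Fair six-sided die as an example pmf.
die = {k: Fraction(1, 6) for k in range(1, 7)}
print(expectation(die))  # 7/2
print(variance(die))     # 35/12
```

Using `Fraction` keeps the arithmetic exact, so the results match the closed-form values with no rounding.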
Expectations and variances of standard random variables

Constant random variable

Example. X(ω) = c for each ω ∈ Ω; i.e. p_X(x) = 1 for x = c, otherwise 0.

Expectation and variance:
E(X) = \sum_{x_k} x_k p_X(x_k) = c P(X = c) = c,
var(X) = E((X − E(X))^2) = E((c − c)^2) = 0.

For computations we use:
E(c) = c … the barycenter of a constant c is the constant c itself,
var(c) = 0 … "the width of a graph with a single value c" is 0.
Bernoulli random variable

X(head) = 1 and X(tail) = 0. The occurrence of "head" plays the role of "success".

Example. Bernoulli random variable:
p_X(1) = p ∈ (0, 1)   (head, success)
p_X(0) = 1 − p = q    (tail, failure)

Expectation and variance:
E(X) = \sum_{x_k} x_k p_X(x_k) = 1 · p + 0 · q = p,
E(X^2) = \sum_{x_k} x_k^2 p_X(x_k) = 1^2 · p + 0^2 · q = p,
var(X) = E(X^2) − E(X)^2 = p − p^2 = p(1 − p) = pq.
Binomial random variable

Number of successes during n identical and independent repetitions of a Bernoulli experiment (with P(success) = p).

Example. Binomial random variable X ~ Bin(n, p):

p_X(k) = P(X = k) = \binom{n}{k} p^k q^{n−k},   k = 0, 1, …, n.

E(X) = \sum_{x_k} x_k p_X(x_k) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n−k}.

The sum on the right-hand side of E(X) is reminiscent of the binomial theorem \sum_{k=0}^{n} \binom{n}{k} x^k y^{n−k} = (x + y)^n, up to the factor "k" in "k p^k".
Example (continuation). Differentiating this sum with respect to x and then multiplying by x yields the needed expression:

d/dx \sum_{k=0}^{n} \binom{n}{k} x^k y^{n−k} = d/dx (x + y)^n,
\sum_{k=0}^{n} \binom{n}{k} k x^{k−1} y^{n−k} = n (x + y)^{n−1},
\sum_{k=0}^{n} \binom{n}{k} k x^k y^{n−k} = x n (x + y)^{n−1}.
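The identity just derived can be verified exactly with rational arithmetic; a quick sketch (the values n = 7, x = 2/5, y = 1/3 are arbitrary test inputs):

```python
from fractions import Fraction
from math import comb

def lhs(n, x, y):
    # sum over k of k * C(n,k) * x^k * y^(n-k)
    return sum(k * comb(n, k) * x**k * y**(n - k) for k in range(n + 1))

def rhs(n, x, y):
    # the closed form x * n * (x + y)^(n-1)
    return x * n * (x + y)**(n - 1)

n, x, y = 7, Fraction(2, 5), Fraction(1, 3)
assert lhs(n, x, y) == rhs(n, x, y)
```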
Example (continuation). Substituting x = p and y = q we get (recall q = 1 − p, so p + q = 1)

E(X) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n−k} = p n (p + q)^{n−1} = n p.

Similarly, the second derivative of the generating function (x + y)^n yields

d^2/dx^2 \sum_{k=0}^{n} \binom{n}{k} x^k y^{n−k} = d^2/dx^2 (x + y)^n,
\sum_{k=0}^{n} \binom{n}{k} k(k − 1) x^{k−2} y^{n−k} = n(n − 1)(x + y)^{n−2},
\sum_{k=0}^{n} \binom{n}{k} k(k − 1) x^k y^{n−k} = x^2 n(n − 1)(x + y)^{n−2}.
Example (continuation). Thus (again with x = p and y = q, p + q = 1)

E(X(X − 1)) = \sum_{k=0}^{n} k(k − 1) \binom{n}{k} p^k q^{n−k} = p^2 n(n − 1)(p + q)^{n−2} = p^2 n(n − 1).

Hence (recall that E(X) = n p):
E(X(X − 1)) = p^2 n(n − 1),
E(X^2) − E(X) = n^2 p^2 − n p^2,
E(X^2) = n^2 p^2 − n p^2 + E(X) = n^2 p^2 − n p^2 + n p = (np)^2 + n p(1 − p).

Finally, var(X) = E(X^2) − E(X)^2 = n p(1 − p) = n p q.

For X ~ Bin(n, p) we thus have E(X) = n p and var(X) = n p(1 − p) = n p q.
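The binomial summary E(X) = np and var(X) = npq can be confirmed by summing the pmf directly; a minimal sketch (n = 10, p = 3/10 are arbitrary illustrative parameters):

```python
from fractions import Fraction
from math import comb

n, p = 10, Fraction(3, 10)
q = 1 - p
# Binomial pmf: p_X(k) = C(n,k) p^k q^(n-k), k = 0..n
pmf = {k: comb(n, k) * p**k * q**(n - k) for k in range(n + 1)}

ex = sum(k * pk for k, pk in pmf.items())
ex2 = sum(k * k * pk for k, pk in pmf.items())
var = ex2 - ex * ex

assert sum(pmf.values()) == 1   # pmf sums to one
assert ex == n * p              # E(X) = np
assert var == n * p * q         # var(X) = npq
```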
Geometric random variable

X = the order of the trial at which the first "success" occurs during identical and independent repetitions of a Bernoulli experiment (with P(success) = p ∈ (0, 1)).

Example. Geometric random variable: p_X(k) = (1 − p)^{k−1} p = q^{k−1} p, k = 1, 2, ….

E(X) = \sum_{x_k} x_k p_X(x_k) = \sum_{k=1}^{∞} k q^{k−1} p = p \sum_{k=1}^{∞} k q^{k−1}.

As in the binomial case, the sum on the right-hand side of E(X) is reminiscent of the derivative of the geometric series \sum_{k=0}^{∞} x^k = 1/(1 − x) for |x| < 1:

d/dx \sum_{k=0}^{∞} x^k = d/dx (1/(1 − x)),   or   \sum_{k=1}^{∞} k x^{k−1} = 1/(1 − x)^2.
Example (continuation). Using \sum_{k=1}^{∞} k x^{k−1} = 1/(1 − x)^2 (with x = q and 1 − q = p):

E(X) = \sum_{k=1}^{∞} k q^{k−1} p = p \sum_{k=1}^{∞} k q^{k−1} = p · 1/(1 − q)^2 = p/p^2 = 1/p.

Multiplying \sum_{k=1}^{∞} k x^{k−1} = 1/(1 − x)^2 by x we get \sum_{k=0}^{∞} k x^k = x/(1 − x)^2. Differentiating with respect to x then yields

d/dx \sum_{k=0}^{∞} k x^k = d/dx (x/(1 − x)^2),   hence   \sum_{k=0}^{∞} k^2 x^{k−1} = (1 + x)/(1 − x)^3,

implying

E(X^2) = \sum_{k=1}^{∞} k^2 q^{k−1} p = p \sum_{k=1}^{∞} k^2 q^{k−1} = p · (1 + q)/(1 − q)^3 = (1 + q)/p^2 = (2 − p)/p^2.
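Both series identities used above converge fast for |x| < 1, so they can be checked by truncating the sums; a minimal sketch (x = 0.4 and the truncation point N = 200 are arbitrary choices that make the tail negligible):

```python
x = 0.4
N = 200  # truncation; the tail beyond N is negligible for |x| < 1

# sum k x^(k-1) vs 1/(1-x)^2
s1 = sum(k * x**(k - 1) for k in range(1, N))
# sum k^2 x^(k-1) vs (1+x)/(1-x)^3
s2 = sum(k * k * x**(k - 1) for k in range(1, N))

assert abs(s1 - 1 / (1 - x)**2) < 1e-9
assert abs(s2 - (1 + x) / (1 - x)**3) < 1e-9
```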
Example (continuation). Finally,

var(X) = E(X^2) − E(X)^2 = (2 − p)/p^2 − 1/p^2 = (1 − p)/p^2 = q/p^2.

Summarizing, for a geometric random variable with P(success) = p ∈ (0, 1) we have

E(X) = 1/p,   var(X) = (1 − p)/p^2 = q/p^2.
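The geometric summary E(X) = 1/p and var(X) = q/p^2 can likewise be confirmed by truncated summation of the pmf; a minimal sketch (p = 0.25 and N = 400 are arbitrary illustrative choices):

```python
p = 0.25
q = 1 - p
N = 400  # truncation; q^N is astronomically small here

# Geometric pmf: p_X(k) = q^(k-1) p, k = 1, 2, ...
pmf = {k: q**(k - 1) * p for k in range(1, N)}
ex = sum(k * pk for k, pk in pmf.items())
ex2 = sum(k * k * pk for k, pk in pmf.items())

assert abs(ex - 1 / p) < 1e-9            # E(X) = 1/p = 4
assert abs(ex2 - ex**2 - q / p**2) < 1e-9  # var(X) = q/p^2 = 12
```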
Poisson random variable

The Poisson probability distribution is often used to model the number of random events during a given time period. For example, X = "number of attempts to connect to a server within 15 seconds".

Example. Poisson random variable with parameter λ > 0:

p_X(k) = P(X = k) = (λ^k / k!) e^{−λ},   k = 0, 1, 2, ….

E(X) = \sum_{k≥1} k (λ^k / k!) e^{−λ} = λ e^{−λ} \sum_{k≥1} λ^{k−1}/(k − 1)! = λ e^{−λ} \sum_{m≥0} λ^m/m! = λ e^{−λ} e^{λ} = λ.
Example (continuation).

E(X^2) = \sum_{k≥1} k^2 (λ^k / k!) e^{−λ} = λ e^{−λ} \sum_{k≥1} k λ^{k−1}/(k − 1)!
       = λ e^{−λ} ( \sum_{k≥1} (k − 1) λ^{k−1}/(k − 1)! + \sum_{k≥1} λ^{k−1}/(k − 1)! )
       = λ e^{−λ} ( λ \sum_{k≥2} λ^{k−2}/(k − 2)! + e^{λ} ) = λ e^{−λ} (λ e^{λ} + e^{λ}) = λ^2 + λ,

and thus var(X) = E(X^2) − (E(X))^2 = (λ^2 + λ) − λ^2 = λ.

For X ~ Poisson(λ) we thus have E(X) = var(X) = λ.
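The Poisson result E(X) = var(X) = λ can also be checked by truncated summation; a minimal sketch (λ = 3.5 and the cutoff N = 80 are arbitrary; the tail beyond N is negligible at this λ):

```python
from math import exp, factorial

lam = 3.5
N = 80  # terms beyond this are negligible for lambda = 3.5

# Poisson pmf: p_X(k) = lambda^k / k! * e^(-lambda)
pmf = {k: lam**k / factorial(k) * exp(-lam) for k in range(N)}
ex = sum(k * pk for k, pk in pmf.items())
ex2 = sum(k * k * pk for k, pk in pmf.items())

assert abs(ex - lam) < 1e-9             # E(X) = lambda
assert abs(ex2 - ex**2 - lam) < 1e-9    # var(X) = lambda
```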
Independence of random variables

Independence revisited

Definition. X and Y are independent if the events {X = x} and {Y = y} are independent for each x and y.

Lemma. If X and Y are independent, then E(XY) = E(X)E(Y).

Proof. Writing A_x = {X = x} and B_y = {Y = y}, we have XY = \sum_{x,y} x y I_{A_x} I_{B_y}, and independence implies

E(XY) = \sum_{x,y} x y P(A_x ∩ B_y) = \sum_{x,y} x y P(A_x) P(B_y) = \sum_{x} x P(A_x) \sum_{y} y P(B_y) = E(X)E(Y).
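The lemma can be verified by enumeration for any pair of independent discrete variables, since independence means the joint pmf factors into the marginals; a minimal sketch (the two marginal pmfs are arbitrary illustrative choices):

```python
from fractions import Fraction
from itertools import product

# Independent X, Y: joint pmf is the product of the marginals.
px = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
py = {-1: Fraction(1, 3), 1: Fraction(2, 3)}
joint = {(x, y): px[x] * py[y] for x, y in product(px, py)}

e_xy = sum(x * y * p for (x, y), p in joint.items())
e_x = sum(x * p for x, p in px.items())
e_y = sum(y * p for y, p in py.items())

assert e_xy == e_x * e_y  # E(XY) = E(X)E(Y) under independence
```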
Uncorrelated random variables

Definition. Random variables X and Y are uncorrelated if E(XY) = E(X)E(Y).

Example. Independence implies uncorrelatedness, but not conversely. Assume that X, Y ∈ {−1, 0, 1} are such that P(X = i, Y = j) = 1/4 if (i, j) ∈ {(0, 1), (0, −1), (1, 0), (−1, 0)} (and 0 in the remaining cases). Then X and Y have the same distribution, P(X = 0) = 1/2 and P(X = 1) = P(X = −1) = 1/4, and thus E(XY) = 0 = E(X)E(Y) since E(X) = E(Y) = 0, while

P(X = 0, Y = 1) = 1/4 ≠ P(X = 0) P(Y = 1) = 1/8.
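The counterexample above is finite, so both claims (uncorrelated, yet not independent) can be checked by direct enumeration of the joint pmf from the slide:

```python
from fractions import Fraction

quarter = Fraction(1, 4)
# Joint pmf from the example: mass 1/4 on (0,1), (0,-1), (1,0), (-1,0).
joint = {(0, 1): quarter, (0, -1): quarter, (1, 0): quarter, (-1, 0): quarter}

e_xy = sum(x * y * p for (x, y), p in joint.items())
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y1 = sum(p for (x, y), p in joint.items() if y == 1)
p_x0_y1 = joint[(0, 1)]

assert e_xy == 0               # uncorrelated: E(XY) = 0 = E(X)E(Y)
assert p_x0_y1 != p_x0 * p_y1  # not independent: 1/4 != 1/2 * 1/4
```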
Properties of the variance

Theorem. For any random variables X and Y and arbitrary a ∈ R:
a) var(aX) = a^2 var(X);
b) var(X + Y) = var(X) + var(Y) if X and Y are uncorrelated.

Proof. a) var(aX) = E((aX)^2) − (E(aX))^2 = a^2 (E(X^2) − (E(X))^2).
b) It suffices to use E((X + Y)^2) = E(X^2 + 2XY + Y^2) = E(X^2) + 2E(XY) + E(Y^2) = E(X^2) + 2E(X)E(Y) + E(Y^2) and (E(X + Y))^2 = (E(X) + E(Y))^2 = E(X)^2 + 2E(X)E(Y) + E(Y)^2.
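Part b) of the theorem can be verified exactly for a concrete independent (hence uncorrelated) pair by building the pmf of X + Y; a minimal sketch (the two marginal pmfs are arbitrary illustrative choices):

```python
from fractions import Fraction
from itertools import product

px = {0: Fraction(1, 2), 1: Fraction(1, 2)}               # Bernoulli(1/2)
py = {1: Fraction(1, 6), 2: Fraction(1, 3), 3: Fraction(1, 2)}

def var_from(pmf):
    """var = E(X^2) - (E(X))^2, computed from a finite pmf."""
    ex = sum(v * p for v, p in pmf.items())
    return sum(v * v * p for v, p in pmf.items()) - ex * ex

# pmf of X + Y under independence (joint pmf factors).
psum = {}
for x, y in product(px, py):
    psum[x + y] = psum.get(x + y, Fraction(0)) + px[x] * py[y]

assert var_from(psum) == var_from(px) + var_from(py)
```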
Uncorrelated random variables in computations

Example (computation of the expectation of a binomial random variable, revisited). Consider X = \sum_{i=1}^{n} X_i with independent Bernoulli variables X_i with parameter p. Then X has a binomial distribution:

P(X = k) = \binom{n}{k} p^k (1 − p)^{n−k}.

Thus E(X) = \sum_{i} E(X_i) = n p, and since (\sum_{i} X_i)^2 = \sum_{i,j} X_i X_j,

E(X^2) = \sum_{i} E(X_i^2) + \sum_{i≠j} E(X_i)E(X_j) = n p + n(n − 1) p^2,

yielding var(X) = n p(1 − p).
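That a sum of n independent Bernoulli variables is Bin(n, p) can be confirmed by convolving the Bernoulli pmf with itself n times and comparing against the binomial formula; a minimal sketch (n = 6, p = 1/3 are arbitrary illustrative parameters):

```python
from fractions import Fraction
from math import comb

n, p = 6, Fraction(1, 3)
bern = {0: 1 - p, 1: p}

# Convolve n independent Bernoulli pmfs to get the pmf of the sum.
dist = {0: Fraction(1)}
for _ in range(n):
    nxt = {}
    for s, ps in dist.items():
        for b, pb in bern.items():
            nxt[s + b] = nxt.get(s + b, Fraction(0)) + ps * pb
    dist = nxt

# Compare with the binomial pmf C(n,k) p^k (1-p)^(n-k).
for k in range(n + 1):
    assert dist[k] == comb(n, k) * p**k * (1 - p)**(n - k)
```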
Example (reliability). Consider a "network" with nodes 1, 2, 3, 4 and edges (1, 2), (1, 3), (2, 3), (3, 4). Assume that each connection e = (i, j), i, j ∈ {1, 2, 3, 4}, is working with probability p_e (and not working with probability 1 − p_e).

What is the reliability R(1, 4) = the probability that a path from node 1 to node 4 is open?

Let X_e be the indicator that edge e is working, and χ the indicator yielding 1 if an open path π from 1 to 4 exists and 0 otherwise. In general,

χ = 1 − \prod_{π} I_{π not working} = 1 − \prod_{π} ( 1 − \prod_{e∈π} X_e ).
Example (continuation). In our case,

χ = 1 − (1 − X_{1,3} X_{3,4})(1 − X_{1,2} X_{2,3} X_{3,4}) = X_{3,4}(X_{1,3} + X_{1,2} X_{2,3}) − X_{1,3} X_{1,2} X_{2,3} X_{3,4}^2,

and thus (using X_{3,4}^2 = X_{3,4} for an indicator and the independence of the edges)

R(1, 4) = P(χ = 1) = E(χ) = p_{3,4}(p_{1,3} + p_{1,2} p_{2,3}) − p_{1,3} p_{1,2} p_{2,3} p_{3,4}.

In particular, if the reliability of each edge connection is 90% (i.e., p_e = 0.9), we have R(1, 4) = 0.9^2 + 0.9^3 − 0.9^4 ≈ 0.88.
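The closed-form answer can be cross-checked by brute force: enumerate all 2^4 edge states, sum the probability of those in which some path from 1 to 4 is fully open, and compare with the formula from the slide:

```python
from itertools import product

edges = [(1, 3), (3, 4), (1, 2), (2, 3)]
paths = [[(1, 3), (3, 4)], [(1, 2), (2, 3), (3, 4)]]  # paths from 1 to 4
p_e = 0.9  # every edge works with probability 0.9

R = 0.0
for states in product([0, 1], repeat=len(edges)):
    works = dict(zip(edges, states))
    # Probability of this particular configuration of working/broken edges.
    prob = 1.0
    for e in edges:
        prob *= p_e if works[e] else 1 - p_e
    # Add it if at least one path is fully open.
    if any(all(works[e] for e in path) for path in paths):
        R += prob

assert abs(R - (0.9**2 + 0.9**3 - 0.9**4)) < 1e-9  # R(1,4) = 0.8829
```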
Probability on uncountable spaces

Examples.

Infinite sequences of zeros and ones: Ω = {0, 1}^N = {ω : ω = (ω_1, ω_2, …), ω_1, ω_2, … ∈ {0, 1}}. Needed to analyse, e.g., the following claim: a fair coin is tossed repeatedly; show that, with probability one, a head turns up eventually.

Darts: a continuous set of outcomes T ⊂ R^2. Here Ω = T ∪ {∗}, where {∗} is a one-point set representing the result "dart missed the target".

When defining probability on an uncountable Ω, one should be careful: one should consider only a subset F ⊂ P(Ω) of events.
Definition. Probability is a function P : F → [0, 1] such that
(N) P(Ω) = 1,
(A) if A_1, A_2, … ∈ F are pairwise disjoint (A_i ∩ A_j = ∅ for i ≠ j), then P(∪_{ℓ≥1} A_ℓ) = \sum_{ℓ≥1} P(A_ℓ).

(N) … normalisation; (A) … σ-additivity.

Remark. Here we tacitly assume that A_1, A_2, … ∈ F implies ∪_{ℓ≥1} A_ℓ ∈ F.

Example. For a countable Ω, by choosing p : Ω → [0, 1] such that \sum_{ω∈Ω} p(ω) = 1, a probability P is given by taking P(A) = \sum_{ω∈A} p(ω) for any A ∈ P(Ω).
The power set P(Ω) might be too large

Why not take P(Ω) instead of F in the definition of a probability distribution?

Theorem (Vitali, 1905). Let Ω = {0, 1}^N. Then there is no function P : P(Ω) → [0, 1] that satisfies the conditions (N), (A), and
(I) P(T_n A) = P(A) for all A ⊂ Ω and n ≥ 1.

Here T_n flips the n-th coordinate: T_n : ω = (ω_1, ω_2, …) → (ω_1, …, ω_{n−1}, ω̂_n, ω_{n+1}, …), where 0̂ = 1, 1̂ = 0, and T_n(A) = {T_n(ω) : ω ∈ A}.

Main idea of the proof: define an equivalence relation on Ω by ω ∼ ω′ iff they differ in only finitely many coordinates.
Proof (continuation). Take A containing one ω from each equivalence class (axiom of choice). Consider S = {S ⊂ N : S finite} and, for S = {n_1, …, n_k}, define T_S = T_{n_1} ∘ ⋯ ∘ T_{n_k}. Then:

Ω = ∪_{S∈S} T_S(A), and T_S(A) and T_{S′}(A) are disjoint for S ≠ S′.

This implies

1 = P(Ω) = \sum_{S∈S} P(T_S(A)) = \sum_{S∈S} P(A),

which is a contradiction (an infinite sum of the same number is either 0 or ∞).