Stat 430 HW 4

1. Suppose you own a factory in the land of bizarre regulations (Europe?) and the law mandates that
everyone who works for you gets the day off whenever one of them has a birthday. Further, assume
that a randomly selected person is equally likely to have a birthday on any day of the year. Also
February 29th doesn’t exist.
(a) On a given day, what is the probability the factory is working if you have n employees?
(b) What is the expected number of person-days worked on any given day?
(c) What is the expected number of person-days worked in a year?
(d) What number of employees maximizes the expected number of person-days worked in a year?
To do this, notice that for such a maximizing value n, the expected number can be no greater
with n − 1 employees or with n + 1 employees.
(e) Give an expression for the probability that the employees get day j off given that they got day
i off. How would this let you compute the variance of the total number of person-days worked
in a year?
Solution
(a) The factory is working exactly when no one has a birthday that day. The probability of no one
having a birthday on a given day is $\left(\frac{364}{365}\right)^n$, where $n$ is the number of
employees.
(b) On a given day, the probability that everyone is working is $\left(\frac{364}{365}\right)^n$. And if
there are $n$ employees, $n$ person-days are worked for each day the factory is open. So, if $X_i$ is
the number of person-days worked on day $i$, then $E(X_i) = n\left(\frac{364}{365}\right)^n$.
(c) The total number of person-days worked in a year is the sum of the number worked every day:
\[ \sum_{i=1}^{365} X_i \]
So, by linearity, the expectation is $365E(X_1) = 365n\left(\frac{364}{365}\right)^n$.
(d) If $n$ employees maximizes the expected number of person-days, then that expectation must be at
least as large as with $n - 1$ people and at least as large as with $n + 1$ people. So
\[ 365n\left(\frac{364}{365}\right)^{n} \ge 365(n-1)\left(\frac{364}{365}\right)^{n-1} \]
\[ 365n\left(\frac{364}{365}\right)^{n} \ge 365(n+1)\left(\frac{364}{365}\right)^{n+1} \]
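Simplifying these gives
\[ \frac{n-1}{n} \le \frac{364}{365} \]
\[ \frac{n+1}{n} \le \frac{365}{364} \]
The first inequality gives $n \le 365$ and the second gives $n \ge 364$. This implies that $n = 364$;
$n = 365$ gives exactly the same expected value.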
(e) Let $Y_i$ be 1 if they get day $i$ off and 0 otherwise. First, we know
$P(Y_j = 0 \mid Y_i = 0) = \left(\frac{363}{364}\right)^n$. Since conditional probabilities are
probabilities, we know $P(Y_j = 1 \mid Y_i = 0) = 1 - \left(\frac{363}{364}\right)^n$. Now, by Bayes' rule
\[ P(Y_i = 0 \mid Y_j = 1) = \frac{P(Y_j = 1 \mid Y_i = 0)\,P(Y_i = 0)}{P(Y_j = 1)} \]
\[ P(Y_i = 1 \mid Y_j = 1) = 1 - \frac{P(Y_j = 1 \mid Y_i = 0)\,P(Y_i = 0)}{P(Y_j = 1)} \]
Note that all of the terms are things we know! This lets us compute the variance because these
probabilities let us compute the E(Xi Xj ) term to get the covariance part in the variance of the
sum.
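The formulas in parts (c) through (e) can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names and the search range 1 to 1000 are assumptions made for illustration. It uses exact fractions, and the variance uses $E(X_i X_j) = n^2 P(Y_i = 0, Y_j = 0)$ with $P(Y_i = 0, Y_j = 0) = (363/364)^n (364/365)^n = (363/365)^n$.

```python
from fractions import Fraction

DAYS = 365  # February 29th is ignored, as in the problem statement


def expected_person_days_per_year(n: int) -> Fraction:
    """365 * n * (364/365)^n, the expectation from part (c)."""
    return DAYS * n * Fraction(DAYS - 1, DAYS) ** n


def variance_person_days_per_year(n: int) -> Fraction:
    """Variance of the yearly total: per-day variances plus all pairwise covariances."""
    p_open = Fraction(DAYS - 1, DAYS) ** n        # P(Y_i = 0)
    p_both_open = Fraction(DAYS - 2, DAYS) ** n   # P(Y_i = 0, Y_j = 0)
    var_one_day = n**2 * p_open * (1 - p_open)
    cov_two_days = n**2 * (p_both_open - p_open**2)
    return DAYS * var_one_day + DAYS * (DAYS - 1) * cov_two_days


# Assumed search range; max() returns the first maximizer it encounters.
best_n = max(range(1, 1001), key=expected_person_days_per_year)
print(best_n, float(expected_person_days_per_year(best_n)))
print(float(variance_person_days_per_year(best_n)))
```

Because the arithmetic is exact, comparing the values at 364 and 365 shows directly whether the two staffing levels tie.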
2. In a random sample of n people from a very large population, assume the sample is taken with
replacement so that observations are independent. Let p be the proportion in the population who
favor candidate A over candidate B in an upcoming election. If X denotes the number of people
favoring A in the sample of size n, and Y is the number of people favoring B in the sample, find
(a) E(X − Y )
(b) V ar(X − Y )
Hint: Can you write X in terms of Y ?
Solution
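(a) Since everyone supports either A or B, $X = n - Y$, where $Y$ is binomial with $n$ trials and
success probability $1 - p$. So $E(X - Y) = E(n - 2Y) = n - 2n(1 - p) = n(2p - 1)$.
(b) For the variance we have $\mathrm{Var}(n - 2Y) = \mathrm{Var}(2Y) = 4\mathrm{Var}(Y) = 4np(1 - p)$.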
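As a sanity check on both answers, one can sum directly over the binomial distribution of $X$. This is a sketch only: the helper name and the values n = 12 and p = 3/5 are hypothetical choices for illustration and do not come from the problem.

```python
from fractions import Fraction
from math import comb


def moments_of_difference(n: int, p: Fraction) -> tuple[Fraction, Fraction]:
    """Exact mean and variance of X - Y, where X ~ Binomial(n, p) and Y = n - X."""
    mean = Fraction(0)
    second_moment = Fraction(0)
    for x in range(n + 1):
        prob = comb(n, x) * p**x * (1 - p) ** (n - x)
        diff = x - (n - x)  # X - Y = 2X - n
        mean += prob * diff
        second_moment += prob * diff**2
    return mean, second_moment - mean**2


n, p = 12, Fraction(3, 5)  # hypothetical values chosen for illustration
mean, var = moments_of_difference(n, p)
print(mean, n - 2 * n * (1 - p))
print(var, 4 * n * p * (1 - p))
```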
3. Suppose you owe the robot mafia $500 and they’ll clamp you to death if you don’t pay it back. You
have $250 and are going to play roulette to try and earn the money to pay your debt. Consider the
following bets and their relative merits:
(a) Bet all $250 on black once
(b) Bet $50 on black 5 times (if you haven’t won by then, you’re dead anyway).
Extra Credit Bet in increments of $d until you’ve either made your $500 or lost it all.
Solution
(a) The expected value of this bet is $\frac{18}{38}\cdot 500 + \frac{20}{38}\cdot 0$ (if you're playing
American roulette). The variance is $500^2\cdot\frac{18}{38} - \left(500\cdot\frac{18}{38}\right)^2$.
(b) Here, notice the expected value is the same. But we have 5 bets, each with variance
$50^2\cdot\frac{18}{38} - \left(50\cdot\frac{18}{38}\right)^2$. So multiply by 5 to get the total variance.
The variance is lower. However, this is a bad thing, since we are much more likely to lose. You'd
need to win all 5 bets in a row to reach $500, which is much less likely than winning one bet.
Extra Credit Look up the “Gambler’s Ruin” problem.
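The comparison between the strategies comes down to the probability of reaching $500. The sketch below computes it for each strategy. For the extra credit bet it uses the standard gambler's ruin result for a biased walk, $P(\text{reach } N \text{ from } i) = (1 - r^i)/(1 - r^N)$ with $r = q/p$, which is not derived in this solution. The function names, and the assumption that $d$ divides both 250 and 500, are illustrative.

```python
from fractions import Fraction

P_BLACK = Fraction(18, 38)  # American wheel: 18 black pockets out of 38


def win_prob_single_bet() -> Fraction:
    """Strategy (a): one $250 bet on black."""
    return P_BLACK


def win_prob_five_bets() -> Fraction:
    """Strategy (b): five $50 bets, all of which must win to reach $500."""
    return P_BLACK**5


def win_prob_increments(d: int, start: int = 250, target: int = 500) -> Fraction:
    """Extra credit: bet $d each round until reaching `target` or going broke.

    Standard gambler's ruin formula (an assumption imported from outside
    this solution): (1 - r^i) / (1 - r^N) with r = q / p.
    """
    if start % d or target % d:
        raise ValueError("d must divide both the starting stake and the target")
    r = (1 - P_BLACK) / P_BLACK
    i, big_n = start // d, target // d
    return (1 - r**i) / (1 - r**big_n)


print(float(win_prob_single_bet()))
print(float(win_prob_five_bets()))
print(float(win_prob_increments(50)))
```

Taking d = 250 reduces the extra credit strategy to the single bet in (a), which is a quick consistency check on the formula.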
4. Roll a six sided die 10 times. Let N be the number of sides that never show up.
(a) Find E(N ).
(b) Find V ar(N ).
Solution
(a) Let
\[ X_i = \begin{cases} 1 & \text{number } i \text{ doesn't show up} \\ 0 & \text{otherwise} \end{cases} \]
So, $N = \sum_{i=1}^{6} X_i$. Note that $E(X_i) = P(X_i = 1)$ and that all of the $X_i$ have the same
distribution. So $P(X_i = 1) = (5/6)^{10}$ since for each number there is a 5/6 chance of not seeing
it on any given roll and all rolls are independent. Thus $E(N) = 6(5/6)^{10}$.
(b) $\mathrm{Var}(N) = E(N^2) - (E(N))^2$. Let's take care of the first term, since the second we will just
work out from part (a) at the end.
\begin{align*}
E(N^2) &= E\left(\left(\sum_{i=1}^{6} X_i\right)^2\right) \\
&= \sum_{i=1}^{6} E(X_i^2) + \sum_{i \ne j} E(X_i X_j) \\
&= 6E(X_1^2) + 30E(X_1 X_2)
\end{align*}
The second line follows from expanding the sum and applying linearity of expectation, and the
third line follows from the $X_i$'s all having the same distribution and the same covariance between
any two of them (there are $6 \cdot 5 = 30$ ordered pairs with $i \ne j$). Let's compute these.
\begin{align*}
E(X_1^2) &= 1 \times P(X_1^2 = 1) + 0 \times P(X_1^2 = 0) \\
&= 1 \times P(X_1 = 1) \\
&= (5/6)^{10} \\
E(X_1 X_2) &= 1 \times P(X_1 X_2 = 1) + 0 \times P(X_1 X_2 = 0) \\
&= P(X_1 = 1, X_2 = 1) \\
&= (2/3)^{10}
\end{align*}
The last line follows from the fact that in order to not see 1 and 2, or any two numbers for that
matter, only 4 of the 6 numbers are available on each roll, giving $(4/6)^{10} = (2/3)^{10}$. Finally,
we have
\[ \mathrm{Var}(N) = 6(5/6)^{10} + 30(2/3)^{10} - 36(5/6)^{20} \]
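The closed forms for $E(N)$ and $\mathrm{Var}(N)$ can be verified without indicator variables by tracking how many distinct faces have appeared after each roll. This is an independent check rather than the method used above, and the function name is an assumption for illustration.

```python
from fractions import Fraction


def missing_faces_moments(sides: int, rolls: int) -> tuple[Fraction, Fraction]:
    """Exact E(N) and Var(N), where N = number of faces never seen in `rolls` rolls."""
    # dist[d] = probability that exactly d distinct faces have appeared so far
    dist = [Fraction(0)] * (sides + 1)
    dist[0] = Fraction(1)
    for _ in range(rolls):
        new = [Fraction(0)] * (sides + 1)
        for d, prob in enumerate(dist):
            if prob == 0:
                continue
            new[d] += prob * Fraction(d, sides)  # roll repeats a face already seen
            if d < sides:
                new[d + 1] += prob * Fraction(sides - d, sides)  # a new face appears
        dist = new
    mean = sum(prob * (sides - d) for d, prob in enumerate(dist))
    second = sum(prob * (sides - d) ** 2 for d, prob in enumerate(dist))
    return mean, second - mean**2


mean, var = missing_faces_moments(sides=6, rolls=10)
print(float(mean), float(var))
```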
5. Suppose that $X_1, X_2, \ldots, X_n$ are independent random variables. Let
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. Assume that $E(X_i) = \mu$ and that $\mathrm{Var}(X_i) = \sigma^2$.
Find $\mathrm{Cov}(X_1 - \bar{X}, \bar{X})$.
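Solution First, notice that $E(\bar{X}) = E(X_1)$, so $E(X_1 - \bar{X}) = 0$ and
$\mathrm{Cov}(X_1 - \bar{X}, \bar{X}) = E\left((X_1 - \bar{X})\bar{X}\right)$. Next we do some algebra
\begin{align*}
E\left((X_1 - \bar{X})\bar{X}\right) &= E(X_1\bar{X}) - E(\bar{X}^2) \\
&= E(X_1\bar{X}) - \left(\mathrm{Var}(\bar{X}) + (E(\bar{X}))^2\right) \\
&= E(X_1\bar{X}) - (\sigma^2/n + \mu^2) \\
&= \frac{1}{n}\left(E(X_1^2) + \sum_{i=2}^{n} E(X_1 X_i)\right) - (\sigma^2/n + \mu^2) \\
&= \frac{1}{n}\left(\sigma^2 + \mu^2 + (n-1)\mu^2\right) - (\sigma^2/n + \mu^2) \\
&= 0
\end{align*}
Thus $\mathrm{Cov}(X_1 - \bar{X}, \bar{X}) = 0$. Here we have used the fact that $X_i$ and $X_j$ are
independent for $i \ne j$ and that $E(X^2) = \mathrm{Var}(X) + (E(X))^2$.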
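The zero covariance can be confirmed exactly on a small example by enumerating every outcome. In this sketch the common distribution `DIST` and the function name are hypothetical, chosen only for illustration. The variables are taken to be identically distributed, which is stronger than the problem requires.

```python
from fractions import Fraction
from itertools import product

# Hypothetical common distribution for each X_i: value -> probability
DIST = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)}


def cov_first_minus_mean_with_mean(n: int) -> Fraction:
    """Exact Cov(X_1 - Xbar, Xbar) for n i.i.d. draws from DIST, by enumeration."""
    e_a = e_b = e_ab = Fraction(0)
    for outcome in product(DIST, repeat=n):
        prob = Fraction(1)
        for value in outcome:
            prob *= DIST[value]  # independence: probabilities multiply
        xbar = Fraction(sum(outcome), n)
        a = outcome[0] - xbar  # X_1 - Xbar
        e_a += prob * a
        e_b += prob * xbar
        e_ab += prob * a * xbar
    return e_ab - e_a * e_b


for n in (2, 3, 4):
    print(n, cov_first_minus_mean_with_mean(n))
```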
6. (a) Define the correlation ρ between two random variables X and Y in terms of V ar(X), V ar(Y )
and Cov(X, Y ).
(b) Let f be the function given by f (t) = V ar(X + tY ). Find the value of t that minimizes f (t) in
terms of V ar(X), V ar(Y ) and Cov(X, Y ). Evaluate the function at this value of t and use the
fact that f (t) ≥ 0 to show that the correlation ρ between X and Y satisfies −1 ≤ ρ ≤ 1.
Solution
(a)
\[ \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} \]
(b) First, expand the right hand side to get
\[ f(t) = \mathrm{Var}(X) + t^2\,\mathrm{Var}(Y) + 2t\,\mathrm{Cov}(X, Y) \]
To minimize, recall that we want to find $t$ such that $f'(t) = 0$. We have
\[ f'(t) = 2t\,\mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y) \]
Setting this equal to 0 and solving for $t$ gives $t = -\frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(Y)}$. Plugging
this value in for $t$ in our expression for $f(t)$ above and using the fact that $f(t) \ge 0$ because
variance is nonnegative gives
\begin{align*}
0 &\le \mathrm{Var}(X) + \frac{(\mathrm{Cov}(X, Y))^2}{\mathrm{Var}(Y)} - 2\,\frac{(\mathrm{Cov}(X, Y))^2}{\mathrm{Var}(Y)} \\
&= \mathrm{Var}(X) - \frac{(\mathrm{Cov}(X, Y))^2}{\mathrm{Var}(Y)}
\end{align*}
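Then, we rearrange terms to get
\[ \frac{(\mathrm{Cov}(X, Y))^2}{\mathrm{Var}(Y)\,\mathrm{Var}(X)} \le 1 \]
Finally, notice that the left-hand side is $(\rho(X, Y))^2$, so $\rho^2 \le 1$, which is the same as
$-1 \le \rho \le 1$.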
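The argument can be traced on a concrete example. The sketch below uses a small hypothetical joint distribution (`JOINT` is an assumption for illustration, not data from the problem). It computes $f(t)$ directly as a variance, then checks the minimizer and the resulting bound on $\rho^2$.

```python
from fractions import Fraction

# Hypothetical joint distribution: (x, y) -> probability
JOINT = {
    (0, 0): Fraction(1, 4),
    (1, 2): Fraction(1, 4),
    (2, 1): Fraction(1, 3),
    (3, 5): Fraction(1, 6),
}


def expect(g) -> Fraction:
    """Expectation of g(X, Y) under JOINT."""
    return sum(prob * g(x, y) for (x, y), prob in JOINT.items())


mean_x, mean_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)
cov_xy = expect(lambda x, y: (x - mean_x) * (y - mean_y))


def f(t: Fraction) -> Fraction:
    """f(t) = Var(X + tY), computed directly from the joint distribution."""
    mean_sum = mean_x + t * mean_y
    return expect(lambda x, y: (x + t * y - mean_sum) ** 2)


t_star = -cov_xy / var_y                      # minimizer found in part (b)
rho_squared = cov_xy**2 / (var_x * var_y)
print(t_star, f(t_star), float(rho_squared))
```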
7. A 10 digit number is chosen randomly, where each digit is equally likely to be any one of the
digits 1 to 9 and is chosen independently of the other digits. Let N be the
number of digits missing from the randomly selected 10 digit number. For example, if the number is
1231452832, then we are missing the digits 6, 7 and 9 and so N = 3.
(a) Find E(N ).
(b) Find V ar(N ).
Solution
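(a) Let $X_i$ be 1 if digit $i$ is missing and 0 otherwise, so that $N = \sum_{i=1}^{9} X_i$ and
$E(X_i) = P(X_i = 1) = (8/9)^{10}$. By linearity of expectation, $E(N) = 9(8/9)^{10}$.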
(b) In this problem the $X_i$ are NOT all independent so we need to compute $E(X_i X_j)$. Note that
$X_i X_j = 1$ if and only if BOTH numbers are missing. To compute this probability, we can find
\[ P(X_i = 1, X_j = 1) = P(X_i = 1 \mid X_j = 1)\,P(X_j = 1) = (7/8)^{10}(8/9)^{10} = (7/9)^{10} \]
Next, we compute
\[ E(N^2) = \sum_{i=1}^{9} E(X_i^2) + \sum_{i \ne j} E(X_i X_j) = 9(8/9)^{10} + 72(7/9)^{10} \]
And finally,
\[ \mathrm{Var}(N) = E(N^2) - (E(N))^2 = 9(8/9)^{10} + 72(7/9)^{10} - 81(8/9)^{20} \]
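A quick simulation gives a rough check on both answers. This is a sketch under stated assumptions: the function name, the trial count of 20,000 and the seed are arbitrary choices, and the estimates will only approximate the exact values.

```python
import random


def simulate_missing_digits(trials: int, seed: int = 0) -> tuple[float, float]:
    """Estimate E(N) and Var(N) for 10 independent digits drawn uniformly from 1..9."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        seen = {rng.randint(1, 9) for _ in range(10)}  # distinct digits in one number
        counts.append(9 - len(seen))                   # N = digits that never appear
    mean = sum(counts) / trials
    var = sum((c - mean) ** 2 for c in counts) / trials
    return mean, var


exact_mean = 9 * (8 / 9) ** 10
exact_var = 9 * (8 / 9) ** 10 + 72 * (7 / 9) ** 10 - 81 * (8 / 9) ** 20
sim_mean, sim_var = simulate_missing_digits(20_000)
print(exact_mean, sim_mean)
print(exact_var, sim_var)
```

The same counting rule applied to the worked example, `9 - len(set("1231452832"))`, gives N = 3.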