A probability question regarding the expectation and variance of a random variable.
Suppose that we have an infinite supply of balls and $N$ boxes. Fix $k \le N$. The balls are thrown at random into the boxes until exactly $k$ boxes are occupied. Let $X$ denote the number of throws needed for this. Calculate $E(X)$ and $\mathrm{Var}(X)$.
If $X=i$ for some $i \geq k$, then success is obtained on the $i$-th throw. This means that exactly $(k-1)$ boxes are occupied after the $(i-1)$-th throw, and the $i$-th ball lands in some box other than those $(k-1)$ boxes. In how many ways can this happen?
First we choose the $(k-1)$ boxes from the $N$ boxes. Then we distribute the first $(i-1)$ balls among those boxes so that none of them is left empty. Finally, the $i$-th ball can be placed in any of the remaining $(N-k+1)$ boxes.
So, as I see it, for $i \geq k$,
$$P(X=i) = \frac{\binom{N}{k-1}\,S(i-1,k-1)\,\binom{N-k+1}{1}}{N^i},$$ where $S(m,n)$ denotes the number of surjective functions from an $m$-element set onto an $n$-element set.
But then the expressions for the expectation and variance become very complicated. Is everything I have done so far correct? If yes, can somebody please tell me an easier way to calculate the expectation and variance of the random variable $X$ above?
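As a numerical sanity check on this PMF (a minimal Python sketch, not part of the original post; the helper names `surjections`, `pmf`, and `simulate` are illustrative), the formula can be compared against a Monte Carlo simulation:

```python
import math
import random

def surjections(m, n):
    # Number of surjective functions from an m-set onto an n-set,
    # by inclusion-exclusion: sum_j (-1)^j C(n, j) (n - j)^m.
    return sum((-1) ** j * math.comb(n, j) * (n - j) ** m for j in range(n + 1))

def pmf(i, N, k):
    # P(X = i) = C(N, k-1) * S(i-1, k-1) * (N-k+1) / N^i, for i >= k.
    return math.comb(N, k - 1) * surjections(i - 1, k - 1) * (N - k + 1) / N ** i

def simulate(N, k, trials=100_000):
    # Empirical distribution of X: throw balls until k boxes are occupied.
    counts = {}
    for _ in range(trials):
        occupied, throws = set(), 0
        while len(occupied) < k:
            occupied.add(random.randrange(N))
            throws += 1
        counts[throws] = counts.get(throws, 0) + 1
    return {i: c / trials for i, c in counts.items()}

if __name__ == "__main__":
    N, k = 6, 4
    sim = simulate(N, k)
    for i in range(k, k + 8):
        print(i, round(pmf(i, N, k), 4), round(sim.get(i, 0.0), 4))
```

The exact and empirical probabilities agree closely, which supports the derivation even though, as noted, the resulting moment calculations are unwieldy.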
Thank you very much.
probability probability-theory random-variables expectation variance
edited yesterday
asked yesterday
Debabrata Chattopadhyay.
This is a variant of the Coupon Collector Problem; for $k=N$ it is literally that problem. Similar techniques should apply.
– lulu
yesterday
1 Answer
This is related to the coupon collector's problem. The time it takes to hit a new empty box when $j$ boxes are already occupied is geometrically distributed with success probability $1-\frac{j}{N}$. These times are independent. The expectation and variance of a geometric distribution with success probability $p$ are $\frac{1}{p}$ and $\frac{1-p}{p^2}$, respectively. The expectation and variance of a sum of independent random variables are the sum of the expectations and variances, respectively. Thus
$$
\mathsf{E}(X)=\sum_{j=0}^{k-1}\frac{1}{1-\frac{j}{N}}=N\sum_{j=0}^{k-1}\frac{1}{N-j}
$$
and
$$
\mathsf{Var}(X)=\sum_{j=0}^{k-1}\frac{\frac{j}{N}}{\left(1-\frac{j}{N}\right)^2}=N\sum_{j=0}^{k-1}\frac{j}{(N-j)^2}\;.
$$
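As a quick check of these formulas (again a minimal sketch; `moments` and `empirical_moments` are illustrative names, not part of the original answer), the exact sums can be compared against a simulation:

```python
import random

def moments(N, k):
    # E(X) = N * sum_{j=0}^{k-1} 1/(N-j),  Var(X) = N * sum_{j=0}^{k-1} j/(N-j)^2
    e = sum(N / (N - j) for j in range(k))
    v = sum(N * j / (N - j) ** 2 for j in range(k))
    return e, v

def empirical_moments(N, k, trials=200_000):
    # Sample X repeatedly and return the sample mean and sample variance.
    xs = []
    for _ in range(trials):
        occupied, throws = set(), 0
        while len(occupied) < k:
            occupied.add(random.randrange(N))
            throws += 1
        xs.append(throws)
    mean = sum(xs) / trials
    var = sum((x - mean) ** 2 for x in xs) / (trials - 1)
    return mean, var

if __name__ == "__main__":
    print(moments(10, 7))            # exact values from the sums
    print(empirical_moments(10, 7))  # should agree to about two decimal places
```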
answered yesterday
joriki
Where have I made a mistake, @joriki?
– Debabrata Chattopadhyay.
yesterday
@DebabrataChattopadhyay.: Your "mistake" was to derive a probability distribution for a problem that asks for an expectation. (A variance is a form of expectation.) More often than not, finding an entire probability distribution is orders of magnitude more work than finding an expectation, so taking the detour through the probability distribution is usually a big waste of effort.
– joriki
yesterday
@DebabrataChattopadhyay.: And this problem is a good demonstration of why this is so. You can just add the expectations and variances of the individual times of which $X$ is a sum, whereas finding the probability distribution from the probability distributions of the summands would have been a ton of work with complicated convolutions to calculate.
– joriki
yesterday
Is the PMF I wrote correct?
– Debabrata Chattopadhyay.
yesterday
Ok. Thanks @joriki for your help.
– Debabrata Chattopadhyay.
yesterday