A question of probability regarding expectation and variance of a random variable.

Suppose that we have an infinite supply of balls and $N$ boxes. Fix $k \le N$. The balls are thrown at random into the boxes until exactly $k$ boxes are occupied. Let $X$ denote the number of throws needed for this. Calculate $E(X)$ and $\mathrm{Var}(X)$.





If $X=i$ for some $i \geq k$, then the success occurs on the $i$-th trial. That means exactly $(k-1)$ boxes are occupied after the $(i-1)$-th throw, and the $i$-th ball lands in some box other than those $(k-1)$ boxes. How can we count this?



First we choose $(k-1)$ boxes from the $N$ boxes. Then we put $(i-1)$ balls into those boxes so that no box is left empty. Finally, the $i$-th ball can be placed in any of the remaining $(N-k+1)$ boxes.



So, by this reasoning, we have for $i \geq k$



$$P(X=i) = \frac{\binom{N}{k-1}\,S(i-1,\,k-1)\,\binom{N-k+1}{1}}{N^i},$$ where $S(m,n)$ denotes the number of surjective functions from an $m$-element set onto an $n$-element set.
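As a sanity check on this PMF, one can compute $S(m,n)$ by inclusion–exclusion and verify numerically that the probabilities sum to $1$, and that $P(X=k)$ agrees with the direct count of $k$ throws into $k$ distinct boxes. A minimal sketch (the values $N=6$, $k=4$ and the truncation point are arbitrary choices):

```python
from math import comb, prod

def surj(m, n):
    # Number of surjections from an m-element set onto an n-element set,
    # by inclusion-exclusion over the excluded target elements.
    return sum((-1) ** j * comb(n, j) * (n - j) ** m for j in range(n + 1))

def pmf(i, N, k):
    # P(X = i): choose the k-1 boxes filled during the first i-1 throws,
    # fill them with no box left empty, then the i-th ball opens a new box.
    return comb(N, k - 1) * surj(i - 1, k - 1) * (N - k + 1) / N ** i

N, k = 6, 4  # small example values, chosen arbitrarily
total = sum(pmf(i, N, k) for i in range(k, 200))  # tail beyond 200 is negligible
first = prod(N - j for j in range(k)) / N ** k    # direct P(X = k): all k throws distinct
print(total, pmf(k, N, k), first)
```

The truncated sum comes out indistinguishable from $1$, which supports the PMF as written.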



But then the expressions for the expectation and variance become very complicated. Is everything I have done so far correct? If so, can somebody please tell me an easier way to calculate the expectation and variance of the random variable $X$ above?



Thank you very much.







  • This is a variant of the Coupon Collector Problem; for $k=N$ it is literally that problem. Similar techniques should apply.
    – lulu
    yesterday














edited yesterday

asked yesterday

Debabrata Chattopadhyay.
1 Answer

















This is related to the coupon collector's problem. The time it takes to hit a new empty box when $j$ boxes are already occupied is geometrically distributed with success probability $1-\frac jN$. These times are independent. The expectation and variance of a geometric distribution with success probability $p$ are $\frac1p$ and $\frac{1-p}{p^2}$, respectively. The expectation and variance of a sum of independent random variables are the sum of the expectations and variances, respectively. Thus



\begin{eqnarray*}
\mathsf E(X)=\sum_{j=0}^{k-1}\frac1{1-\frac jN}=N\sum_{j=0}^{k-1}\frac1{N-j}
\end{eqnarray*}



and



$$
\mathsf{Var}(X)=\sum_{j=0}^{k-1}\frac{\frac jN}{\left(1-\frac jN\right)^2}=N\sum_{j=0}^{k-1}\frac{j}{(N-j)^2}\;.
$$
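These closed forms are easy to check by simulation: sample $X$ many times and compare the empirical mean and variance with the two sums. A quick Monte Carlo sketch (the parameters $N=10$, $k=7$ and the sample size are arbitrary choices):

```python
import random

def throws_until_k_occupied(N, k, rng):
    # Throw balls uniformly into N boxes until k distinct boxes are
    # occupied; return the number of throws X.
    occupied, throws = set(), 0
    while len(occupied) < k:
        occupied.add(rng.randrange(N))
        throws += 1
    return throws

N, k = 10, 7
e_theory = N * sum(1 / (N - j) for j in range(k))
v_theory = N * sum(j / (N - j) ** 2 for j in range(k))

rng = random.Random(0)
xs = [throws_until_k_occupied(N, k, rng) for _ in range(200_000)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(f"E(X): theory {e_theory:.3f}, simulated {mean:.3f}")
print(f"Var(X): theory {v_theory:.3f}, simulated {var:.3f}")
```

With $2\times10^5$ samples the empirical values land close to the theoretical sums, as the independence-and-sum argument predicts.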






answered yesterday

joriki
  • Where have I made a mistake, @joriki?
    – Debabrata Chattopadhyay.
    yesterday










  • @DebabrataChattopadhyay.: Your "mistake" was to derive a probability distribution for a problem that asks for an expectation. (A variance is a form of expectation.) More often than not, finding an entire probability distribution is orders of magnitude more work than finding an expectation, so taking the detour through the probability distribution is usually a big waste of effort.
    – joriki
    yesterday











  • @DebabrataChattopadhyay.: And this problem is a good demonstration of why this is so. You can just add the expectations and variances of the individual times of which $X$ is a sum, whereas finding the probability distribution from the probability distributions of the summands would have been a ton of work with complicated convolutions to calculate.
    – joriki
    yesterday











  • Is the PMF I wrote correct?
    – Debabrata Chattopadhyay.
    yesterday






  • Ok. Thanks @joriki for your help.
    – Debabrata Chattopadhyay.
    yesterday









