A probability question regarding the expectation and variance of a random variable.
Suppose that we have an infinite supply of balls and $N$ boxes. Fix $k \le N$. The balls are thrown at random into the boxes until exactly $k$ boxes are occupied. Let $X$ denote the number of throws needed for this. Calculate $E(X)$ and $\mathrm{Var}(X)$.
If $X=i$ for some $i \geq k$, then success is obtained on the $i$-th throw. This means that exactly $(k-1)$ boxes are occupied after the $(i-1)$-th throw, and the $i$-th ball lands in some box other than those $(k-1)$ boxes. In how many ways can this happen?
First we choose the $(k-1)$ boxes from the $N$ boxes. Then we distribute the first $(i-1)$ balls among those boxes so that none of them is left empty. Finally, the $i$-th ball can be placed in any of the remaining $(N-k+1)$ boxes.
So, as I see it, for $i \geq k$,
$$P(X=i) = \frac{\binom{N}{k-1}\,S(i-1,k-1)\,\binom{N-k+1}{1}}{N^i},$$ where $S(m,n)$ denotes the number of surjective functions from an $m$-element set onto an $n$-element set.
But then the expressions for the expectation and variance become very complicated. Is everything I have done so far correct? If yes, can somebody please tell me an easier way to calculate the expectation and variance of the random variable $X$ above?
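As a numerical sanity check on this PMF (a minimal Python sketch, not part of the original post; the helper names `surjections`, `pmf`, and `simulate` are illustrative), the formula can be compared against a Monte Carlo simulation:

```python
import math
import random

def surjections(m, n):
    # Number of surjective functions from an m-set onto an n-set,
    # by inclusion-exclusion: sum_j (-1)^j C(n, j) (n - j)^m.
    return sum((-1) ** j * math.comb(n, j) * (n - j) ** m for j in range(n + 1))

def pmf(i, N, k):
    # P(X = i) = C(N, k-1) * S(i-1, k-1) * (N-k+1) / N^i, for i >= k.
    return math.comb(N, k - 1) * surjections(i - 1, k - 1) * (N - k + 1) / N ** i

def simulate(N, k, trials=100_000):
    # Empirical distribution of X: throw balls until k boxes are occupied.
    counts = {}
    for _ in range(trials):
        occupied, throws = set(), 0
        while len(occupied) < k:
            occupied.add(random.randrange(N))
            throws += 1
        counts[throws] = counts.get(throws, 0) + 1
    return {i: c / trials for i, c in counts.items()}

if __name__ == "__main__":
    N, k = 6, 4
    sim = simulate(N, k)
    for i in range(k, k + 8):
        print(i, round(pmf(i, N, k), 4), round(sim.get(i, 0.0), 4))
```

The exact and empirical probabilities agree closely, which supports the derivation even though, as noted, the resulting moment calculations are unwieldy.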
Thank you very much.
probability probability-theory random-variables expectation variance
edited yesterday
asked yesterday
Debabrata Chattopadhyay.
This is a variant of the Coupon Collector Problem; for $k=N$ it is literally that problem. Similar techniques should apply.
– lulu
yesterday
1 Answer
This is related to the coupon collector's problem. The time it takes to hit a new empty box when $j$ boxes are already occupied is geometrically distributed with success probability $1-\frac{j}{N}$. These times are independent. The expectation and variance of a geometric distribution with success probability $p$ are $\frac{1}{p}$ and $\frac{1-p}{p^2}$, respectively. The expectation and variance of a sum of independent random variables are the sum of the expectations and variances, respectively. Thus
$$
\mathsf{E}(X)=\sum_{j=0}^{k-1}\frac{1}{1-\frac{j}{N}}=N\sum_{j=0}^{k-1}\frac{1}{N-j}
$$
and
$$
\mathsf{Var}(X)=\sum_{j=0}^{k-1}\frac{\frac{j}{N}}{\left(1-\frac{j}{N}\right)^2}=N\sum_{j=0}^{k-1}\frac{j}{(N-j)^2}\;.
$$
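As a quick check of these formulas (again a minimal sketch; `moments` and `empirical_moments` are illustrative names, not part of the original answer), the exact sums can be compared against a simulation:

```python
import random

def moments(N, k):
    # E(X) = N * sum_{j=0}^{k-1} 1/(N-j),  Var(X) = N * sum_{j=0}^{k-1} j/(N-j)^2
    e = sum(N / (N - j) for j in range(k))
    v = sum(N * j / (N - j) ** 2 for j in range(k))
    return e, v

def empirical_moments(N, k, trials=200_000):
    # Sample X repeatedly and return the sample mean and sample variance.
    xs = []
    for _ in range(trials):
        occupied, throws = set(), 0
        while len(occupied) < k:
            occupied.add(random.randrange(N))
            throws += 1
        xs.append(throws)
    mean = sum(xs) / trials
    var = sum((x - mean) ** 2 for x in xs) / (trials - 1)
    return mean, var

if __name__ == "__main__":
    print(moments(10, 7))            # exact values from the sums
    print(empirical_moments(10, 7))  # should agree to about two decimal places
```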
answered yesterday
joriki
Where have I made a mistake, @joriki?
– Debabrata Chattopadhyay.
yesterday
@DebabrataChattopadhyay.: Your "mistake" was to derive a probability distribution for a problem that asks for an expectation. (A variance is a form of expectation.) More often than not, finding an entire probability distribution is orders of magnitude more work than finding an expectation, so taking the detour through the probability distribution is usually a big waste of effort.
– joriki
yesterday
@DebabrataChattopadhyay.: And this problem is a good demonstration of why this is so. You can just add the expectations and variances of the individual times of which $X$ is a sum, whereas finding the probability distribution from the probability distributions of the summands would have been a ton of work with complicated convolutions to calculate.
– joriki
yesterday
Is the PMF I wrote correct?
– Debabrata Chattopadhyay.
yesterday
Ok. Thanks @joriki for your help.
– Debabrata Chattopadhyay.
yesterday