Mutual Information between dataset and output of algorithm
I am trying to understand how mutual information behaves when I have a sequence of i.i.d. samples $X^n$, where each $X$ takes values in an alphabet $\mathcal{X}$, and I have an algorithm $\mathcal{A}:\mathcal{X}^n\to\mathcal{Y}$. I am not imposing any constraint on $\mathcal{A}$, but I prefer to think of it as a randomized function.
More precisely, I am trying to lower bound $I(X^n;Y)$, where $Y=\mathcal{A}(X^n)$.
I believe that:
$$I(X^n;Y)\geq n\,I(X;Y)$$ and that $0\leq I(X;Y) \leq \log(|\mathcal{X}|)$.
Since $X$ and $Y$ are far from independent, I would argue that $I(X;Y)>0$. Do you think it is possible that $I(X^n;Y)$ is not polynomial in $n$? Could it be that $I(X;Y)\approx 1/n$? Is there something trivial that I am not considering?
Thanks for any tip!
probability information-theory upper-lower-bounds
I suspect you need some constraint on your algorithm, as it is always possible to construct an algorithm for which $I(X^n;Y)=0$, which necessarily implies that $I(X;Y)=0$.
– John Polcari
Jul 16 at 15:43
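John's point can be checked numerically. The sketch below (Python, fair binary source, and a hypothetical "constant" algorithm that ignores its input; purely illustrative, the `mutual_information` helper is defined here, not from any library) computes the exact mutual information and confirms it is zero:

```python
import itertools
import math

def mutual_information(joint):
    """Exact I(A;B) in bits from a joint pmf given as {(a, b): prob}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# A constant algorithm: A(x^n) = 0 regardless of the data.
n = 3
joint = {(xs, 0): 0.5 ** n                  # X_i ~ fair coin, i.i.d.
         for xs in itertools.product([0, 1], repeat=n)}

print(mutual_information(joint))  # 0.0 -- Y carries no information about X^n
```

Since $Y$ is independent of $X^n$ here, both $I(X^n;Y)$ and $I(X;Y)$ vanish, so some assumption on $\mathcal{A}$ is needed before any nontrivial lower bound can hold.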
@JohnPolcari, you are totally right. I did not properly formalize it, but ideally I am considering algorithms where this does not happen (more specifically, learning algorithms).
– user1868607
Jul 16 at 15:49
I would expect that how you might bound $I(X^n;Y)$ would intimately depend upon how you choose to constrain your algorithm.
– John Polcari
Jul 16 at 15:51
I'm not sure $I(X^n;Y) \ge n\,I(X;Y)$ follows - consider the case $\mathcal{A}(X^n) = X_1$. Also, you need to be a little less cavalier about writing $I(X;Y)$ - what is $X$? $Y$ is a stochastic function of a sequence $X^n$; is $X$ one of these? Some independent copy of one of these? This is of course just reiterating John's point, but really, at this level of generality, basically nothing can be said.
– stochasticboy321
Jul 17 at 23:03
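The case $\mathcal{A}(X^n)=X_1$ also answers the $1/n$ question concretely. A quick exact computation (Python, fair binary source; purely illustrative, with a locally defined `mutual_information` helper) shows that $I(X^n;Y)=H(X_1)=1$ bit for every $n$, so the per-sample quantity $I(X^n;Y)/n$ really does decay like $1/n$:

```python
import itertools
import math

def mutual_information(joint):
    """Exact I(A;B) in bits from a joint pmf given as {(a, b): prob}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# The algorithm just reports the first sample: Y = X_1.
for n in (1, 2, 4, 8):
    joint = {(xs, xs[0]): 0.5 ** n          # X_i ~ fair coin, i.i.d.
             for xs in itertools.product([0, 1], repeat=n)}
    mi = mutual_information(joint)
    print(n, mi, mi / n)  # I(X^n;Y) stays at 1 bit; I(X^n;Y)/n shrinks as 1/n
```

In particular, $I(X^n;Y)$ need not grow with $n$ at all, which is consistent with the comments that the claimed bound $I(X^n;Y)\ge n\,I(X;Y)$ cannot hold in general.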
asked Jul 16 at 15:04
user1868607