Mutual Information between dataset and output of algorithm

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
1
down vote

favorite












I am trying to understand how the Mutual Information behaves when I have a sequence of iid samples $X^n$, where each $X$ takes values in an alphabet $mathcalX$ and I have an algorithm $mathcalA:mathcalX^nto mathcalY$. I am not posing any constraint on $mathcalA$, but I prefer to think about it as a randomized function.



More precisely I am trying to lower bound $I(X^n;Y)$ where $Y=mathcalA(X^n)$.



I know that:
$$I(X^n;Y)geq n(I(X;Y)) $$ and that $0leq I(X;Y) leq log(|mathcalX|)$.



Since $X$ and $Y$ are far from independent I would argue that $I(X;Y)>0$. Do you think that it is possible that $I(X^n;Y)$ is not polynomial in $n$? Could it be that $I(X;Y)approx(1/n)$? Is there something trivial that I am not considering?



Thanks for any tip!







share|cite|improve this question



















  • I suspect you need some constraint on your algorithm, as it is always possible to construct an algorithm for which I(X^n;Y)=0, which necessarily implies that I(X;Y)=0.
    – John Polcari
    Jul 16 at 15:43










  • @JohnPolcari, you are totally right. I did not properly formalize it, but ideally I am considering algorithms where this does not happen (more specifically, Learning algorithms)
    – user1868607
    Jul 16 at 15:49






  • 1




    I would expect that how you might bound I(X^n;Y) would intimately depend upon how you choose to constrain your algorithm.
    – John Polcari
    Jul 16 at 15:51










  • I'm not sure $I(X^n;Y) ge nI(X;Y)$ follows - consider the case $mathcalA(X^n) = X_1.$ Also, you need to be a little less cavalier about writing $I(X;Y)$ - what is $X$? $Y$ is a stochastic function of a sequence $X^n,$ is $X$ one of these? Some independent copy of one of these? This is of course just reiterating John's point, but really at this level of generality basically nothing can be said.
    – stochasticboy321
    Jul 17 at 23:03















up vote
1
down vote

favorite












I am trying to understand how the Mutual Information behaves when I have a sequence of iid samples $X^n$, where each $X$ takes values in an alphabet $mathcalX$ and I have an algorithm $mathcalA:mathcalX^nto mathcalY$. I am not posing any constraint on $mathcalA$, but I prefer to think about it as a randomized function.



More precisely I am trying to lower bound $I(X^n;Y)$ where $Y=mathcalA(X^n)$.



I know that:
$$I(X^n;Y)geq n(I(X;Y)) $$ and that $0leq I(X;Y) leq log(|mathcalX|)$.



Since $X$ and $Y$ are far from independent I would argue that $I(X;Y)>0$. Do you think that it is possible that $I(X^n;Y)$ is not polynomial in $n$? Could it be that $I(X;Y)approx(1/n)$? Is there something trivial that I am not considering?



Thanks for any tip!







share|cite|improve this question



















  • I suspect you need some constraint on your algorithm, as it is always possible to construct an algorithm for which I(X^n;Y)=0, which necessarily implies that I(X;Y)=0.
    – John Polcari
    Jul 16 at 15:43










  • @JohnPolcari, you are totally right. I did not properly formalize it, but ideally I am considering algorithms where this does not happen (more specifically, Learning algorithms)
    – user1868607
    Jul 16 at 15:49






  • 1




    I would expect that how you might bound I(X^n;Y) would intimately depend upon how you choose to constrain your algorithm.
    – John Polcari
    Jul 16 at 15:51










  • I'm not sure $I(X^n;Y) ge nI(X;Y)$ follows - consider the case $mathcalA(X^n) = X_1.$ Also, you need to be a little less cavalier about writing $I(X;Y)$ - what is $X$? $Y$ is a stochastic function of a sequence $X^n,$ is $X$ one of these? Some independent copy of one of these? This is of course just reiterating John's point, but really at this level of generality basically nothing can be said.
    – stochasticboy321
    Jul 17 at 23:03













up vote
1
down vote

favorite









up vote
1
down vote

favorite











I am trying to understand how the Mutual Information behaves when I have a sequence of iid samples $X^n$, where each $X$ takes values in an alphabet $mathcalX$ and I have an algorithm $mathcalA:mathcalX^nto mathcalY$. I am not posing any constraint on $mathcalA$, but I prefer to think about it as a randomized function.



More precisely I am trying to lower bound $I(X^n;Y)$ where $Y=mathcalA(X^n)$.



I know that:
$$I(X^n;Y)geq n(I(X;Y)) $$ and that $0leq I(X;Y) leq log(|mathcalX|)$.



Since $X$ and $Y$ are far from independent I would argue that $I(X;Y)>0$. Do you think that it is possible that $I(X^n;Y)$ is not polynomial in $n$? Could it be that $I(X;Y)approx(1/n)$? Is there something trivial that I am not considering?



Thanks for any tip!







share|cite|improve this question











I am trying to understand how the Mutual Information behaves when I have a sequence of iid samples $X^n$, where each $X$ takes values in an alphabet $mathcalX$ and I have an algorithm $mathcalA:mathcalX^nto mathcalY$. I am not posing any constraint on $mathcalA$, but I prefer to think about it as a randomized function.



More precisely I am trying to lower bound $I(X^n;Y)$ where $Y=mathcalA(X^n)$.



I know that:
$$I(X^n;Y)geq n(I(X;Y)) $$ and that $0leq I(X;Y) leq log(|mathcalX|)$.



Since $X$ and $Y$ are far from independent I would argue that $I(X;Y)>0$. Do you think that it is possible that $I(X^n;Y)$ is not polynomial in $n$? Could it be that $I(X;Y)approx(1/n)$? Is there something trivial that I am not considering?



Thanks for any tip!









share|cite|improve this question










share|cite|improve this question




share|cite|improve this question









asked Jul 16 at 15:04









user1868607

1941110




1941110











  • I suspect you need some constraint on your algorithm, as it is always possible to construct an algorithm for which I(X^n;Y)=0, which necessarily implies that I(X;Y)=0.
    – John Polcari
    Jul 16 at 15:43










  • @JohnPolcari, you are totally right. I did not properly formalize it, but ideally I am considering algorithms where this does not happen (more specifically, Learning algorithms)
    – user1868607
    Jul 16 at 15:49






  • 1




    I would expect that how you might bound I(X^n;Y) would intimately depend upon how you choose to constrain your algorithm.
    – John Polcari
    Jul 16 at 15:51










  • I'm not sure $I(X^n;Y) ge nI(X;Y)$ follows - consider the case $mathcalA(X^n) = X_1.$ Also, you need to be a little less cavalier about writing $I(X;Y)$ - what is $X$? $Y$ is a stochastic function of a sequence $X^n,$ is $X$ one of these? Some independent copy of one of these? This is of course just reiterating John's point, but really at this level of generality basically nothing can be said.
    – stochasticboy321
    Jul 17 at 23:03

















  • I suspect you need some constraint on your algorithm, as it is always possible to construct an algorithm for which I(X^n;Y)=0, which necessarily implies that I(X;Y)=0.
    – John Polcari
    Jul 16 at 15:43










  • @JohnPolcari, you are totally right. I did not properly formalize it, but ideally I am considering algorithms where this does not happen (more specifically, Learning algorithms)
    – user1868607
    Jul 16 at 15:49






  • 1




    I would expect that how you might bound I(X^n;Y) would intimately depend upon how you choose to constrain your algorithm.
    – John Polcari
    Jul 16 at 15:51










  • I'm not sure $I(X^n;Y) ge nI(X;Y)$ follows - consider the case $mathcalA(X^n) = X_1.$ Also, you need to be a little less cavalier about writing $I(X;Y)$ - what is $X$? $Y$ is a stochastic function of a sequence $X^n,$ is $X$ one of these? Some independent copy of one of these? This is of course just reiterating John's point, but really at this level of generality basically nothing can be said.
    – stochasticboy321
    Jul 17 at 23:03
















I suspect you need some constraint on your algorithm, as it is always possible to construct an algorithm for which I(X^n;Y)=0, which necessarily implies that I(X;Y)=0.
– John Polcari
Jul 16 at 15:43




I suspect you need some constraint on your algorithm, as it is always possible to construct an algorithm for which I(X^n;Y)=0, which necessarily implies that I(X;Y)=0.
– John Polcari
Jul 16 at 15:43












@JohnPolcari, you are totally right. I did not properly formalize it, but ideally I am considering algorithms where this does not happen (more specifically, Learning algorithms)
– user1868607
Jul 16 at 15:49




@JohnPolcari, you are totally right. I did not properly formalize it, but ideally I am considering algorithms where this does not happen (more specifically, Learning algorithms)
– user1868607
Jul 16 at 15:49




1




1




I would expect that how you might bound I(X^n;Y) would intimately depend upon how you choose to constrain your algorithm.
– John Polcari
Jul 16 at 15:51




I would expect that how you might bound I(X^n;Y) would intimately depend upon how you choose to constrain your algorithm.
– John Polcari
Jul 16 at 15:51












I'm not sure $I(X^n;Y) ge nI(X;Y)$ follows - consider the case $mathcalA(X^n) = X_1.$ Also, you need to be a little less cavalier about writing $I(X;Y)$ - what is $X$? $Y$ is a stochastic function of a sequence $X^n,$ is $X$ one of these? Some independent copy of one of these? This is of course just reiterating John's point, but really at this level of generality basically nothing can be said.
– stochasticboy321
Jul 17 at 23:03





I'm not sure $I(X^n;Y) ge nI(X;Y)$ follows - consider the case $mathcalA(X^n) = X_1.$ Also, you need to be a little less cavalier about writing $I(X;Y)$ - what is $X$? $Y$ is a stochastic function of a sequence $X^n,$ is $X$ one of these? Some independent copy of one of these? This is of course just reiterating John's point, but really at this level of generality basically nothing can be said.
– stochasticboy321
Jul 17 at 23:03
















active

oldest

votes











Your Answer




StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "69"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);








 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2853491%2fmutual-information-between-dataset-and-output-of-algorithm%23new-answer', 'question_page');

);

Post as a guest



































active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes










 

draft saved


draft discarded


























 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2853491%2fmutual-information-between-dataset-and-output-of-algorithm%23new-answer', 'question_page');

);

Post as a guest













































































Comments

Popular posts from this blog

Color the edges and diagonals of a regular polygon

Relationship between determinant of matrix and determinant of adjoint?

What is the equation of a 3D cone with generalised tilt?