Data processing inequality and mutual information

Suppose that we have a family of probability mass functions $f_\theta(x)$ indexed by $\theta$, and let $x$ be a sample from this distribution. Information theory then gives the following relation (the data processing inequality):

$I(\theta, T(x)) \le I(\theta, x)$

where $T(x)$ is a function of the samples. The relation says that by processing the samples, we do not gain any new information. Now my question is: is this claim always true?
For example, consider the following scenario:



Again, assume that we have a family of probability mass functions $f_\theta(x)$. We are also given a matrix that encodes some correlations among the dimensions of $x$. Now, we process the samples $x$ with this matrix to get new samples,

$y = Ax$

where $A$ is a similarity matrix indicating the given correlations between the dimensions of $x$. Does $y$ contain new information in my example?
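
To see the inequality in action, here is a minimal numerical sketch in Python (the prior, the pmfs, and the matrices are made-up toy values, not part of the original question). It computes $I(\theta, x)$ and $I(\theta, Ax)$ exactly for a binary $\theta$ and a two-dimensional binary $x$:

```python
from collections import defaultdict
from math import log2

import numpy as np

# Prior on theta and conditional pmfs f_theta(x) over x in {0,1}^2
# (all numbers here are made up purely for illustration).
p_theta = {0: 0.5, 1: 0.5}
f = {
    0: {(0, 0): 0.4, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.1},
    1: {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4},
}

def mutual_information(joint):
    """Exact I(U, V) in bits from a dict {(u, v): p} of joint probabilities."""
    pu, pv = defaultdict(float), defaultdict(float)
    for (u, v), p in joint.items():
        pu[u] += p
        pv[v] += p
    return sum(p * log2(p / (pu[u] * pv[v]))
               for (u, v), p in joint.items() if p > 0)

def joint_theta_T(T):
    """Joint pmf of (theta, T(x)) induced by p_theta and f_theta."""
    joint = defaultdict(float)
    for th, pth in p_theta.items():
        for x, px in f[th].items():
            joint[(th, T(x))] += pth * px
    return dict(joint)

I_x = mutual_information(joint_theta_T(lambda x: x))  # T = identity

A_invertible = np.array([[1, 1], [0, 1]])  # invertible: y determines x
A_singular = np.array([[1, 1], [1, 1]])    # singular: distinct x's collide

for name, A in [("invertible", A_invertible), ("singular", A_singular)]:
    I_y = mutual_information(joint_theta_T(lambda x, A=A: tuple(A @ x)))
    print(f"A {name}: I(theta, x) = {I_x:.4f}  I(theta, Ax) = {I_y:.4f}")
    # Data processing inequality: processing x cannot create information.
    assert I_y <= I_x + 1e-12
```

With the invertible $A$, the two mutual informations coincide: an invertible transform merely relabels the samples, so no information about $\theta$ is lost, but none is gained either. With the singular $A$, distinct samples collide and $I(\theta, Ax)$ strictly drops.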







asked Jul 16 at 12:49 by user51780
  • The short answer is no, it does not, in a quantitative sense. The question of what that actually means on a practical basis is a much longer discussion that ultimately leads to the question of precisely what mutual information measures. In that regard, there are other accounting systems for information content that more clearly identify what is meant by quantitative "information measurement", but still exhibit equivalent data processing inequalities.
    – John Polcari, Jul 16 at 13:45

  • I should add that this assumes that the original $x$ is the same vector of samples from which you compute $y$. If you meant "can the vector $y$ provide additional information compared to any one sample in the vector $x$", then yes, it can (but doesn't necessarily), but only because of any additional information provided by the other samples in the $x$ vector.
    – John Polcari, Jul 16 at 14:19

  • Thank you very much for your response and explanation. What do you mean by "the original $x$ is the same vector of samples from which you compute $y$"?
    – user51780, Jul 16 at 14:55

  • At the very start of your question, you mention $x$ being "a sample", which might be taken to mean an individual sample or, alternatively, the vector of samples used to compute $y$. I initially assumed the latter, because $x$ represents a vector later in your question. It then dawned on me that you might have meant the former, which prompted the additional comment.
    – John Polcari, Jul 16 at 15:01

  • Sorry for the confusion. In my problem, I compute $y$ for each sample of $x$. Therefore, $y$ is a new random variable. My issue is whether this new random variable provides extra information about $x$.
    – user51780, Jul 16 at 15:46
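
A short chain-rule argument (standard information theory, not taken from the thread itself) makes the final comment precise. Since $y = Ax$ is computed deterministically from each sample $x$, we have $H(y \mid x) = 0$, and the chain rule for mutual information gives

$I(\theta, (x, y)) = I(\theta, x) + I(\theta, y \mid x) = I(\theta, x)$

because $I(\theta, y \mid x) = 0$ whenever $y$ is a function of $x$. So the pair $(x, y)$ carries exactly the information about $\theta$ that $x$ alone carries, and on its own $I(\theta, y) \le I(\theta, x)$, with equality whenever $A$ is invertible.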













