Amount of information in one experiment

Let $X$ be a random variable with two possible outcomes, occurring with probabilities $p$ and $1-p$. The experiment described, once carried out, answers a true/false question.



If I'm not mistaken about the definitions: one says that this experiment, once carried out, yields one bit of information.



So one bit of information quantifies the knowledge of the answer to a true/false question.



Now let $Z$ be a discrete random variable with values $(z_i)_{i\in I}$ and probabilities $(p_i)_{i\in I}$. Here I'm assuming that $I$ is finite.



If one performs the experiment, how do we quantify the amount of information gained?



My initial idea was to "decompose $Z$ into true/false questions". So for fixed $i$ we ask "is the value observed $z_i$?" This is a true/false question and the answer yields one bit of information.



This would suggest that observing $Z$ yields $|I|$ bits of information. But this is clearly wrong. The initial random variable $X$ has $I=\{0,1\}$ and hence $|I|=2$, while we know we should have 1 bit of information only.



Again, I might have the wrong definition of what information is, or of what one bit is.



So my question is: quantifying information in bits, if $Z$ is a random variable as described above, how many bits of information do we have after observing $Z$?







asked Jul 16 at 19:46 by user1620696
  • " one defines that this experiment after carried out yields one bit of information." No, the experiment yields $h(p)$ bits of information (one bit if $p=frac12$). Furthermore, that's the average amount of information.
    – leonbloy
    Jul 17 at 13:42
















  • " one defines that this experiment after carried out yields one bit of information." No, the experiment yields $h(p)$ bits of information (one bit if $p=frac12$). Furthermore, that's the average amount of information.
    – leonbloy
    Jul 17 at 13:42















" one defines that this experiment after carried out yields one bit of information." No, the experiment yields $h(p)$ bits of information (one bit if $p=frac12$). Furthermore, that's the average amount of information.
– leonbloy
Jul 17 at 13:42
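
For reference, $h(p)$ here presumably denotes the binary entropy function $h(p) = -p\log_2 p - (1-p)\log_2(1-p)$, which equals exactly one bit only when $p = \frac12$.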




" one defines that this experiment after carried out yields one bit of information." No, the experiment yields $h(p)$ bits of information (one bit if $p=frac12$). Furthermore, that's the average amount of information.
– leonbloy
Jul 17 at 13:42










1 Answer
The instantaneous information content (measured in terms of Shannon entropy) of your binary experiment (yielding one exemplar of $X$) in bits is $-\log_2 p$ if you get true and $-\log_2(1-p)$ if you get false. This equals one bit only if $p = 0.5$. The equivalent instantaneous information content (also measured in terms of Shannon entropy) of your multi-outcome experiment (yielding one exemplar of $Z$) in bits is $-\log_2 p_i$, which depends upon which particular outcome you observe. Needless to say, rarer outcomes carry more information content.
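
To make the per-outcome quantities concrete, here is a minimal Python sketch (not part of the original post) computing the surprisal $-\log_2 p_i$ of each outcome and its expectation, the Shannon entropy; the probability values are hypothetical, chosen only for illustration.

    import math

    def surprisal_bits(p_outcome):
        # Instantaneous information content (surprisal) of an outcome
        # occurring with probability p_outcome, in bits.
        return -math.log2(p_outcome)

    def entropy_bits(probs):
        # Shannon entropy: the probability-weighted average surprisal, in bits.
        return sum(p * surprisal_bits(p) for p in probs if p > 0)

    # Binary experiment X (hypothetical p, chosen for illustration)
    p = 0.25
    print(surprisal_bits(p))         # 2.0 bits if the "true" outcome occurs
    print(surprisal_bits(1 - p))     # ~0.415 bits if the "false" outcome occurs
    print(entropy_bits([p, 1 - p]))  # ~0.811 bits on average, i.e. h(0.25)

    # Multi-outcome experiment Z (hypothetical probabilities p_i)
    p_z = [0.5, 0.25, 0.125, 0.125]
    print([surprisal_bits(pi) for pi in p_z])  # [1.0, 2.0, 3.0, 3.0]
    print(entropy_bits(p_z))                   # 1.75 bits on average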



Your intuition of breaking things down into binary questions is a (VERY) good one, but doing it using the Shannon information measurement scale can be quite cumbersome. A much more intuitive way is to use the discriminating information measurement scale, where the instantaneous information content (in bits) from a binary experiment is $\log_2\left(\frac{p}{1-p}\right)$ if you get true and $\log_2\left(\frac{1-p}{p}\right) = -\log_2\left(\frac{p}{1-p}\right)$ if you get false, although we typically use natural logs, leading to units of "nats". The discriminating information scale can be thought of as an alternative scale for measuring information (like the Celsius and Fahrenheit temperature scales, except that the relationship between entropy and discriminating information is not linear). Using the discriminating information scale, the amount of information the exemplar of $Z$ provides is precisely the same as the amount of information it would provide for the binary experiment were the result true, namely $\log_2\left(\frac{p_i}{1-p_i}\right)$. This makes it very easy to cast more complicated experiments into equivalent sets of binary experiments, and suggests (I don't know how to prove this yet) that information measurement for any problem of arbitrary complexity can always be cast in terms of the discriminating information content of all the underlying binary experiments.
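
As a companion to the sketch above, here is a hypothetical illustration (again not from the original post) of the log-odds quantity the answer calls discriminating information, evaluated for each outcome of $Z$:

    import math

    def discriminating_info_bits(p_outcome):
        # Log-odds of an outcome occurring with probability p_outcome, in bits:
        # positive when the outcome is more likely than not, negative otherwise.
        return math.log2(p_outcome / (1 - p_outcome))

    # Hypothetical probabilities for Z, as in the previous sketch
    p_z = [0.5, 0.25, 0.125, 0.125]

    # Observing z_i is treated as answering "true" to the binary question
    # "is the value z_i?", which succeeds with probability p_i.
    for i, p_i in enumerate(p_z):
        print(f"z_{i}: {discriminating_info_bits(p_i):+.3f} bits")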






answered Jul 16 at 20:41 by John Polcari






















             
