The number of combinations of Fi to F255 that satisfies the following inequality.

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
0
down vote

favorite
1












Consider the following equation:



$H(x) = - sum_i=0^255 P_i log_2 (P_i) $



When $ P_0 = P_1 = ... =P_255 = frac 1256$



$H(x) = - sum_i=0^255 frac 1256 log_2 (frac 1256) = 8 $



Where $p_i$ is the probability of the character $x_i$ to occur in the file.



The file Size is 256 characters.



We will refer to the frequency of the character $x_i$ as $f_i$. ($p_i$ = $f_i / 256)$.



The sum of the frequencies of the characters will always be $256$.



My question is:



Given that $H(x)$ is $ge 7$, How to calculate the number of possible combinations of characters frequencies (from $f_0$ to $f_255$) that makes the inequality true?







share|cite|improve this question

















  • 2




    Better if you can write out the equation. The help menu has instructions for formatting mathematics here.
    – Gerry Myerson
    Jul 21 at 4:56










  • This seems difficult. Could you give us some context how you encountered this problem? Do you have reason to believe that there's a solution in closed form?
    – joriki
    Jul 21 at 5:01










  • What inequality are you talking about?
    – Ali
    Jul 21 at 9:29






  • 1




    @GerryMyerson Done it.
    – Hesham H.
    Jul 21 at 14:41










  • @joriki I encountered this problem while I was thinking about the entropy of data and information theory, it is not in a textbook, I was wondering how many files of size 256 bytes and has a data entropy of 7 or higher, or in general how to calculate the number of files with an entropy ranging between 2 values. Can you explain what do mean by "a solution in closed form"?
    – Hesham H.
    Jul 21 at 14:56















up vote
0
down vote

favorite
1












Consider the following equation:



$H(x) = - sum_i=0^255 P_i log_2 (P_i) $



When $ P_0 = P_1 = ... =P_255 = frac 1256$



$H(x) = - sum_i=0^255 frac 1256 log_2 (frac 1256) = 8 $



Where $p_i$ is the probability of the character $x_i$ to occur in the file.



The file Size is 256 characters.



We will refer to the frequency of the character $x_i$ as $f_i$. ($p_i$ = $f_i / 256)$.



The sum of the frequencies of the characters will always be $256$.



My question is:



Given that $H(x)$ is $ge 7$, How to calculate the number of possible combinations of characters frequencies (from $f_0$ to $f_255$) that makes the inequality true?







share|cite|improve this question

















  • 2




    Better if you can write out the equation. The help menu has instructions for formatting mathematics here.
    – Gerry Myerson
    Jul 21 at 4:56










  • This seems difficult. Could you give us some context how you encountered this problem? Do you have reason to believe that there's a solution in closed form?
    – joriki
    Jul 21 at 5:01










  • What inequality are you talking about?
    – Ali
    Jul 21 at 9:29






  • 1




    @GerryMyerson Done it.
    – Hesham H.
    Jul 21 at 14:41










  • @joriki I encountered this problem while I was thinking about the entropy of data and information theory, it is not in a textbook, I was wondering how many files of size 256 bytes and has a data entropy of 7 or higher, or in general how to calculate the number of files with an entropy ranging between 2 values. Can you explain what do mean by "a solution in closed form"?
    – Hesham H.
    Jul 21 at 14:56













up vote
0
down vote

favorite
1









up vote
0
down vote

favorite
1






1





Consider the following equation:



$H(x) = - sum_i=0^255 P_i log_2 (P_i) $



When $ P_0 = P_1 = ... =P_255 = frac 1256$



$H(x) = - sum_i=0^255 frac 1256 log_2 (frac 1256) = 8 $



Where $p_i$ is the probability of the character $x_i$ to occur in the file.



The file Size is 256 characters.



We will refer to the frequency of the character $x_i$ as $f_i$. ($p_i$ = $f_i / 256)$.



The sum of the frequencies of the characters will always be $256$.



My question is:



Given that $H(x)$ is $ge 7$, How to calculate the number of possible combinations of characters frequencies (from $f_0$ to $f_255$) that makes the inequality true?







share|cite|improve this question













Consider the following equation:



$H(x) = - sum_i=0^255 P_i log_2 (P_i) $



When $ P_0 = P_1 = ... =P_255 = frac 1256$



$H(x) = - sum_i=0^255 frac 1256 log_2 (frac 1256) = 8 $



Where $p_i$ is the probability of the character $x_i$ to occur in the file.



The file Size is 256 characters.



We will refer to the frequency of the character $x_i$ as $f_i$. ($p_i$ = $f_i / 256)$.



The sum of the frequencies of the characters will always be $256$.



My question is:



Given that $H(x)$ is $ge 7$, How to calculate the number of possible combinations of characters frequencies (from $f_0$ to $f_255$) that makes the inequality true?









share|cite|improve this question












share|cite|improve this question




share|cite|improve this question








edited Jul 21 at 22:41
























asked Jul 21 at 1:14









Hesham H.

33




33







  • 2




    Better if you can write out the equation. The help menu has instructions for formatting mathematics here.
    – Gerry Myerson
    Jul 21 at 4:56










  • This seems difficult. Could you give us some context how you encountered this problem? Do you have reason to believe that there's a solution in closed form?
    – joriki
    Jul 21 at 5:01










  • What inequality are you talking about?
    – Ali
    Jul 21 at 9:29






  • 1




    @GerryMyerson Done it.
    – Hesham H.
    Jul 21 at 14:41










  • @joriki I encountered this problem while I was thinking about the entropy of data and information theory, it is not in a textbook, I was wondering how many files of size 256 bytes and has a data entropy of 7 or higher, or in general how to calculate the number of files with an entropy ranging between 2 values. Can you explain what do mean by "a solution in closed form"?
    – Hesham H.
    Jul 21 at 14:56













  • 2




    Better if you can write out the equation. The help menu has instructions for formatting mathematics here.
    – Gerry Myerson
    Jul 21 at 4:56










  • This seems difficult. Could you give us some context how you encountered this problem? Do you have reason to believe that there's a solution in closed form?
    – joriki
    Jul 21 at 5:01










  • What inequality are you talking about?
    – Ali
    Jul 21 at 9:29






  • 1




    @GerryMyerson Done it.
    – Hesham H.
    Jul 21 at 14:41










  • @joriki I encountered this problem while I was thinking about the entropy of data and information theory, it is not in a textbook, I was wondering how many files of size 256 bytes and has a data entropy of 7 or higher, or in general how to calculate the number of files with an entropy ranging between 2 values. Can you explain what do mean by "a solution in closed form"?
    – Hesham H.
    Jul 21 at 14:56








2




2




Better if you can write out the equation. The help menu has instructions for formatting mathematics here.
– Gerry Myerson
Jul 21 at 4:56




Better if you can write out the equation. The help menu has instructions for formatting mathematics here.
– Gerry Myerson
Jul 21 at 4:56












This seems difficult. Could you give us some context how you encountered this problem? Do you have reason to believe that there's a solution in closed form?
– joriki
Jul 21 at 5:01




This seems difficult. Could you give us some context how you encountered this problem? Do you have reason to believe that there's a solution in closed form?
– joriki
Jul 21 at 5:01












What inequality are you talking about?
– Ali
Jul 21 at 9:29




What inequality are you talking about?
– Ali
Jul 21 at 9:29




1




1




@GerryMyerson Done it.
– Hesham H.
Jul 21 at 14:41




@GerryMyerson Done it.
– Hesham H.
Jul 21 at 14:41












@joriki I encountered this problem while I was thinking about the entropy of data and information theory, it is not in a textbook, I was wondering how many files of size 256 bytes and has a data entropy of 7 or higher, or in general how to calculate the number of files with an entropy ranging between 2 values. Can you explain what do mean by "a solution in closed form"?
– Hesham H.
Jul 21 at 14:56





@joriki I encountered this problem while I was thinking about the entropy of data and information theory, it is not in a textbook, I was wondering how many files of size 256 bytes and has a data entropy of 7 or higher, or in general how to calculate the number of files with an entropy ranging between 2 values. Can you explain what do mean by "a solution in closed form"?
– Hesham H.
Jul 21 at 14:56











1 Answer
1






active

oldest

votes

















up vote
0
down vote



accepted










You want



begineqnarray*
H&=&-sum_ip_ilog_2p_i\
&=&
-sum_ifracf_i256log_2fracf_i256\
&=&-frac1256sum_ileft(f_ilog_2f_i-f_ilog_2256right)\
&=&8-frac1256sum_if_ilog_2f_i\
&ge&7
endeqnarray*



and thus



$$
sum_if_ilog_2f_ile256;,
$$



or (with $0^0=1$)



$$
prod_if_i^f_ile2^256;,
$$



which is a good form of the condition for computer purposes, as it allows us to do everything with integer computations without rounding errors.



This Java code performs the computation.



The number of admissible frequency assignments with the characters treated as indistinguishable is $83497348$.



The number of admissible frequency assignments with the characters treated as distinguishable is




24377117336293373068077194760018608890164075844989457788709896790123975565346885123594568979784410896832589991747589003175962663714550060904409876659.




The number of admissible files is




32294749681296695275407031178128991719155449234352428397819516042408274669320014204971984810980113839713404991542286248511856534533725621770720443204257942757841685726869445418911988343694467610597716668644890003072566046313323943032907344235616323318993946853455308766090267464412982787428679760943574604833815718162734374857821826072728560557821111484218984087078219375690986647992988689743586158787866836166182263828747140499381872352135857955196142673766290736723267626340276801993482047192554123014017270771149942354934825395952764480929741003537616825060105246264524800000000000000000000000000000000000000000000.




There are $256^256$ files in total, so approximately $99.93%$ of the files fulfil the condition.



The results should be reasonably reliable, as the total number of files is correctly reproduced when the entropy condition is omitted, and this depends on all three counts being performed correctly.



Here's a $log_2$-plot of the proportion of files with $n$ characters chosen from $n$ symbols for which



$$
prod_if_i^f_igt2^n;,
$$



that is, the proportion of inadmissible files. The line is a linear fit to the last $40$ data points, with slope approximately $-0.0317$. I don't know how this constant could be calculated. Note the interesting odd/even structure. (Click on the image to see the full high-resolution version.)



log_2 plot of the proportion of inadmissible files






share|cite|improve this answer





















    Your Answer




    StackExchange.ifUsing("editor", function ()
    return StackExchange.using("mathjaxEditing", function ()
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    );
    );
    , "mathjax-editing");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "69"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    noCode: true, onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );








     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2858152%2fthe-number-of-combinations-of-fi-to-f255-that-satisfies-the-following-inequality%23new-answer', 'question_page');

    );

    Post as a guest






























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    0
    down vote



    accepted










    You want



    begineqnarray*
    H&=&-sum_ip_ilog_2p_i\
    &=&
    -sum_ifracf_i256log_2fracf_i256\
    &=&-frac1256sum_ileft(f_ilog_2f_i-f_ilog_2256right)\
    &=&8-frac1256sum_if_ilog_2f_i\
    &ge&7
    endeqnarray*



    and thus



    $$
    sum_if_ilog_2f_ile256;,
    $$



    or (with $0^0=1$)



    $$
    prod_if_i^f_ile2^256;,
    $$



    which is a good form of the condition for computer purposes, as it allows us to do everything with integer computations without rounding errors.



    This Java code performs the computation.



    The number of admissible frequency assignments with the characters treated as indistinguishable is $83497348$.



    The number of admissible frequency assignments with the characters treated as distinguishable is




    24377117336293373068077194760018608890164075844989457788709896790123975565346885123594568979784410896832589991747589003175962663714550060904409876659.




    The number of admissible files is




    32294749681296695275407031178128991719155449234352428397819516042408274669320014204971984810980113839713404991542286248511856534533725621770720443204257942757841685726869445418911988343694467610597716668644890003072566046313323943032907344235616323318993946853455308766090267464412982787428679760943574604833815718162734374857821826072728560557821111484218984087078219375690986647992988689743586158787866836166182263828747140499381872352135857955196142673766290736723267626340276801993482047192554123014017270771149942354934825395952764480929741003537616825060105246264524800000000000000000000000000000000000000000000.




    There are $256^256$ files in total, so approximately $99.93%$ of the files fulfil the condition.



    The results should be reasonably reliable, as the total number of files is correctly reproduced when the entropy condition is omitted, and this depends on all three counts being performed correctly.



    Here's a $log_2$-plot of the proportion of files with $n$ characters chosen from $n$ symbols for which



    $$
    prod_if_i^f_igt2^n;,
    $$



    that is, the proportion of inadmissible files. The line is a linear fit to the last $40$ data points, with slope approximately $-0.0317$. I don't know how this constant could be calculated. Note the interesting odd/even structure. (Click on the image to see the full high-resolution version.)



    log_2 plot of the proportion of inadmissible files






    share|cite|improve this answer

























      up vote
      0
      down vote



      accepted










      You want



      begineqnarray*
      H&=&-sum_ip_ilog_2p_i\
      &=&
      -sum_ifracf_i256log_2fracf_i256\
      &=&-frac1256sum_ileft(f_ilog_2f_i-f_ilog_2256right)\
      &=&8-frac1256sum_if_ilog_2f_i\
      &ge&7
      endeqnarray*



      and thus



      $$
      sum_if_ilog_2f_ile256;,
      $$



      or (with $0^0=1$)



      $$
      prod_if_i^f_ile2^256;,
      $$



      which is a good form of the condition for computer purposes, as it allows us to do everything with integer computations without rounding errors.



      This Java code performs the computation.



      The number of admissible frequency assignments with the characters treated as indistinguishable is $83497348$.



      The number of admissible frequency assignments with the characters treated as distinguishable is




      24377117336293373068077194760018608890164075844989457788709896790123975565346885123594568979784410896832589991747589003175962663714550060904409876659.




      The number of admissible files is




      32294749681296695275407031178128991719155449234352428397819516042408274669320014204971984810980113839713404991542286248511856534533725621770720443204257942757841685726869445418911988343694467610597716668644890003072566046313323943032907344235616323318993946853455308766090267464412982787428679760943574604833815718162734374857821826072728560557821111484218984087078219375690986647992988689743586158787866836166182263828747140499381872352135857955196142673766290736723267626340276801993482047192554123014017270771149942354934825395952764480929741003537616825060105246264524800000000000000000000000000000000000000000000.




      There are $256^256$ files in total, so approximately $99.93%$ of the files fulfil the condition.



      The results should be reasonably reliable, as the total number of files is correctly reproduced when the entropy condition is omitted, and this depends on all three counts being performed correctly.



      Here's a $log_2$-plot of the proportion of files with $n$ characters chosen from $n$ symbols for which



      $$
      prod_if_i^f_igt2^n;,
      $$



      that is, the proportion of inadmissible files. The line is a linear fit to the last $40$ data points, with slope approximately $-0.0317$. I don't know how this constant could be calculated. Note the interesting odd/even structure. (Click on the image to see the full high-resolution version.)



      log_2 plot of the proportion of inadmissible files






      share|cite|improve this answer























        up vote
        0
        down vote



        accepted







        up vote
        0
        down vote



        accepted






        You want



        begineqnarray*
        H&=&-sum_ip_ilog_2p_i\
        &=&
        -sum_ifracf_i256log_2fracf_i256\
        &=&-frac1256sum_ileft(f_ilog_2f_i-f_ilog_2256right)\
        &=&8-frac1256sum_if_ilog_2f_i\
        &ge&7
        endeqnarray*



        and thus



        $$
        sum_if_ilog_2f_ile256;,
        $$



        or (with $0^0=1$)



        $$
        prod_if_i^f_ile2^256;,
        $$



        which is a good form of the condition for computer purposes, as it allows us to do everything with integer computations without rounding errors.



        This Java code performs the computation.



        The number of admissible frequency assignments with the characters treated as indistinguishable is $83497348$.



        The number of admissible frequency assignments with the characters treated as distinguishable is




        24377117336293373068077194760018608890164075844989457788709896790123975565346885123594568979784410896832589991747589003175962663714550060904409876659.




        The number of admissible files is




        32294749681296695275407031178128991719155449234352428397819516042408274669320014204971984810980113839713404991542286248511856534533725621770720443204257942757841685726869445418911988343694467610597716668644890003072566046313323943032907344235616323318993946853455308766090267464412982787428679760943574604833815718162734374857821826072728560557821111484218984087078219375690986647992988689743586158787866836166182263828747140499381872352135857955196142673766290736723267626340276801993482047192554123014017270771149942354934825395952764480929741003537616825060105246264524800000000000000000000000000000000000000000000.




        There are $256^256$ files in total, so approximately $99.93%$ of the files fulfil the condition.



        The results should be reasonably reliable, as the total number of files is correctly reproduced when the entropy condition is omitted, and this depends on all three counts being performed correctly.



        Here's a $log_2$-plot of the proportion of files with $n$ characters chosen from $n$ symbols for which



        $$
        prod_if_i^f_igt2^n;,
        $$



        that is, the proportion of inadmissible files. The line is a linear fit to the last $40$ data points, with slope approximately $-0.0317$. I don't know how this constant could be calculated. Note the interesting odd/even structure. (Click on the image to see the full high-resolution version.)



        log_2 plot of the proportion of inadmissible files






        share|cite|improve this answer













        You want



        begineqnarray*
        H&=&-sum_ip_ilog_2p_i\
        &=&
        -sum_ifracf_i256log_2fracf_i256\
        &=&-frac1256sum_ileft(f_ilog_2f_i-f_ilog_2256right)\
        &=&8-frac1256sum_if_ilog_2f_i\
        &ge&7
        endeqnarray*



        and thus



        $$
        sum_if_ilog_2f_ile256;,
        $$



        or (with $0^0=1$)



        $$
        prod_if_i^f_ile2^256;,
        $$



        which is a good form of the condition for computer purposes, as it allows us to do everything with integer computations without rounding errors.



        This Java code performs the computation.



        The number of admissible frequency assignments with the characters treated as indistinguishable is $83497348$.



        The number of admissible frequency assignments with the characters treated as distinguishable is




        24377117336293373068077194760018608890164075844989457788709896790123975565346885123594568979784410896832589991747589003175962663714550060904409876659.




        The number of admissible files is




        32294749681296695275407031178128991719155449234352428397819516042408274669320014204971984810980113839713404991542286248511856534533725621770720443204257942757841685726869445418911988343694467610597716668644890003072566046313323943032907344235616323318993946853455308766090267464412982787428679760943574604833815718162734374857821826072728560557821111484218984087078219375690986647992988689743586158787866836166182263828747140499381872352135857955196142673766290736723267626340276801993482047192554123014017270771149942354934825395952764480929741003537616825060105246264524800000000000000000000000000000000000000000000.




        There are $256^256$ files in total, so approximately $99.93%$ of the files fulfil the condition.



        The results should be reasonably reliable, as the total number of files is correctly reproduced when the entropy condition is omitted, and this depends on all three counts being performed correctly.



        Here's a $log_2$-plot of the proportion of files with $n$ characters chosen from $n$ symbols for which



        $$
        prod_if_i^f_igt2^n;,
        $$



        that is, the proportion of inadmissible files. The line is a linear fit to the last $40$ data points, with slope approximately $-0.0317$. I don't know how this constant could be calculated. Note the interesting odd/even structure. (Click on the image to see the full high-resolution version.)



        log_2 plot of the proportion of inadmissible files







        share|cite|improve this answer













        share|cite|improve this answer



        share|cite|improve this answer











        answered Jul 22 at 8:06









        joriki

        164k10180328




        164k10180328






















             

            draft saved


            draft discarded


























             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2858152%2fthe-number-of-combinations-of-fi-to-f255-that-satisfies-the-following-inequality%23new-answer', 'question_page');

            );

            Post as a guest













































































            Comments

            Popular posts from this blog

            What is the equation of a 3D cone with generalised tilt?

            Color the edges and diagonals of a regular polygon

            Relationship between determinant of matrix and determinant of adjoint?