Equivalence test for binomial data

I want to apply an equivalence test to my two samples to infer whether they are equivalent or not.



Since my data are binomial [0, 1], I don’t know whether the TOST procedure (tost() in R) can handle my problem.



My data consist of two groups (G1 and G2) with unequal sample sizes, e.g. G1 = 164 people and G2 = 280 people. The data are binomial: for each person we record whether they finished (1) or failed to finish (0) a game in our case study. For example, G1 includes 55 players who finished the game, and the rest failed; the same goes for G2, with different numbers. My question is: what is the best equivalence test for this type of data to infer whether the groups are equivalent? I ran tost() in R and got a result, but I am not sure the test is correct, since the function automatically calculates the mean and SD. For example, tost() calculates the mean for G1 above as m = 0.35, but when I calculate the mean with the n*p formula (p = 0.5), I obtain m = 82.
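
For reference, the two numbers are on different scales: the mean of a 0/1 vector is the success proportion, whereas n*p is an expected count. A minimal base-R illustration with the G1 numbers above (the 0/1 vector is a hypothetical reconstruction for illustration only):

g1 <- c(rep(1, 55), rep(0, 164 - 55)) # hypothetical 0/1 coding of G1: 55 finishers out of 164
mean(g1)                              # 0.335..., the success proportion (the mean of binary data)
length(g1) * 0.5                      # 82, the expected count under p = 0.5 -- a different quantity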







  • Can you show your output from tost? And give the number of successes for G2? Are you testing $H_0: p_1 = p_2$ vs $H_a: p_1 \ne p_2$? Unclear whether this is mainly about how to use tost or mainly about testing a hypothesis.
    – BruceET
    7 hours ago

















1 Answer
While one can use the t test to test for proportion difference, the z test is a tad more precise, since it uses an estimate of the standard deviation formulated specifically for binomial (i.e. dichotomous, nominal, etc.) data. The same applies to the z test for proportion equivalence.



First, the z test for difference in proportions of two independent samples is pretty straightforward:






About z tests for unpaired proportion difference

The null hypothesis is H$_{0}$: $p_1 - p_2 = 0$ (i.e. H$_{0}$: $p_1 = p_2$), with H$_{\text{A}}$: $p_1 - p_2 \ne 0$.



$z = \frac{\hat{p}_1-\hat{p}_2}{\sqrt{\hat{p}\left(1-\hat{p}\right)\left[\frac{1}{n_1} + \frac{1}{n_2}\right]}}$,

where:

$\hat{p}_1$ and $\hat{p}_2$ are the sample proportions in group 1 and group 2;
$n_1$ and $n_2$ are the sample sizes in group 1 and group 2; and

$\hat{p}$ is the estimate of the common proportion if H$_0$ is true, the best guess of which is simply the overall sample proportion (i.e. computed from all the data, ignoring which group an observation is from).



You might want to consider a continuity correction. For example, Hauck and Anderson's (1986) correction gives:



$c_{\text{HA}} = \frac{1}{2\min(n_1,n_2)}$, and a redefined $s_{\hat{p}}$:



$s_{\hat{p}} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1-1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2-1}}$, so that



$z = \frac{\left|\hat{p}_1 - \hat{p}_2\right| - c_{\text{HA}}}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1-1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2-1}}}$



The appropriate $p$-value for this $z$-statistic is then calculated or looked up in a table, and compared to $\alpha/2$ (two-tailed test).






About z tests for unpaired proportion equivalence

Because all differences are "statistically significant" given a large enough sample size, it is a good idea to decide beforehand what the smallest relevant difference in proportions is to you, and then look for evidence of such relevance. You find such evidence by combining the inferences from the test for difference just described, with a test for equivalence.



Suppose you decide beforehand that a meaningful difference in proportions for your purposes is one that is at least 0.05 (i.e. $|p_1 - p_2| \ge 0.05$); then the corresponding test for equivalence of proportions for two independent groups is:



H$^{-}_{0}$: $|p_1 - p_2| \ge 0.05$, which translates into two one-sided null hypotheses:



  1. H$^{-}_{01}$: $p_1 - p_2 \ge 0.05$

  2. H$^{-}_{02}$: $p_1 - p_2 \le -0.05$

These two one-sided null hypotheses can be tested with the following statistics (both have been constructed as upper-tail one-sided tests):



  1. $z_1 = \frac{0.05 - \left(\hat{p}_1-\hat{p}_2\right)}{\sqrt{\hat{p}\left(1-\hat{p}\right)\left[\frac{1}{n_1} + \frac{1}{n_2}\right]}}$, and

  2. $z_2 = \frac{\left(\hat{p}_1-\hat{p}_2\right)+0.05}{\sqrt{\hat{p}\left(1-\hat{p}\right)\left[\frac{1}{n_1} + \frac{1}{n_2}\right]}}$.

With a continuity correction $z_1$ and $z_2$ instead become (see Tu, 1997):



  1. $z_1 = \frac{0.05 - \left(\hat{p}_1-\hat{p}_2\right) + c_{\text{HA}}}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1-1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2-1}}}$, and

  2. $z_2 = \frac{\left(\hat{p}_1-\hat{p}_2\right)+0.05-c_{\text{HA}}}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1-1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2-1}}}$.

If you reject both H$^{-}_{01}$ and H$^{-}_{02}$ (both tested at $\alpha$, not $\alpha/2$, and both with right-tail rejection regions), then you can conclude that you have evidence of equivalence.




About relevance tests
Finally... if you combine inference from the tests of H$_0$ and H$^{-}_{0}$ (i.e. the test for difference and the test for equivalence), you get one of the following possibilities (a small R sketch combining these decisions appears after the equivalence code below):



  1. reject H$_0$ and reject H$^{-}_{0}$: conclude trivial difference between proportions (i.e. yes there is a difference, but it's too small for you to care about because it is smaller than 0.05);

  2. reject H$_0$ and not reject H$^{-}_{0}$: conclude relevant difference between proportions (i.e. larger than 0.05);

  3. not reject H$_0$ and reject H$^{-}_{0}$: conclude equivalence of proportions; or

  4. not reject H$_0$ and not reject H$^{-}_{0}$: conclude indeterminate (i.e. underpowered tests).




R code



First the test for difference:



Assume g1 and g2 are vectors containing the binomial data for group 1 and group 2 respectively.



n1 <- length(g1) #sample size group 1
n2 <- length(g2) #sample size group 2
p1 <- sum(g1)/n1 #p1 hat
p2 <- sum(g2)/n2 #p2 hat
n <- n1 + n2 #overall sample size
p <- sum(g1,g2)/n #p hat
cHA <- 1/(2*min(n1,n2))

# without continuity correction
z <- (p1 - p2)/sqrt(p*(1-p)*(1/n1 + 1/n2)) #test statistic
pval <- 1 - pnorm(abs(z)) #p-value reject H0 if it is <= alpha/2 (two-tailed)

# with continuity correction
zHA <- (abs(p1 - p2) - cHA)/sqrt(p1*(1-p1)/(n1-1) + p2*(1-p2)/(n2-1)) #test statistic with Hauck-Anderson continuity correction
pvalHA <- 1 - pnorm(abs(zHA)) #p-value reject H0 if it is <= alpha/2 (two-tailed)
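
As a quick cross-check, base R's prop.test() without continuity correction runs the same pooled two-sample proportion test; its X-squared statistic should equal the square of the uncorrected z above:

prop.test(x = c(sum(g1), sum(g2)), n = c(n1, n2), correct = FALSE) # X-squared = z^2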


Next the test for equivalence:



Delta <- 0.05 #Equivalence threshold of +/- 5%.
# You will want to carefully think about and select your own
# value for Delta before you conduct your test.


Again, assume g1 and g2 are vectors containing the binomial data for group 1 and group 2 respectively.



n1 <- length(g1) #sample size group 1
n2 <- length(g2) #sample size group 2
p1 <- sum(g1)/n1 #p1 hat
p2 <- sum(g2)/n2 #p2 hat
n <- n1 + n2 #overall sample size
p <- sum(g1,g2)/n #p hat
cHAeq <- sign(p1-p2)* (1/(2*min(n1,n2)))

# without continuity correction
z1 <- (Delta - (p1 - p2))/sqrt(p*(1-p)*(1/n1 + 1/n2)) #test statistic for H01
z2 <- ((p1 - p2) + Delta)/sqrt(p*(1-p)*(1/n1 + 1/n2)) #test statistic for H02
pval1 <- 1 - pnorm(z1) #p-value (upper tail) reject H0 if it is <= alpha (one tail)
pval2 <- 1 - pnorm(z2) #p-value (upper tail) reject H0 if it is <= alpha (one tail)

# with continuity correction
zHA1 <- (Delta - abs(p1 - p2) + cHAeq)/sqrt(p1*(1-p1)/(n1-1) + p2*(1-p2)/(n2-1)) #test statistic for H01 with continuity correction
zHA2 <- (abs(p1 - p2) + Delta - cHAeq)/sqrt(p1*(1-p1)/(n1-1) + p2*(1-p2)/(n2-1)) #test statistic for H02 with continuity correction
pvalHA1 <- 1 - pnorm(zHA1) #p-value (upper tail) reject H0 if it is <= alpha (one tail)
pvalHA2 <- 1 - pnorm(zHA2) #p-value (upper tail) reject H0 if it is <= alpha (one tail)
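
Putting the two tests together as in the four possibilities listed earlier, here is a minimal sketch of the decision logic. It uses the uncorrected p-values computed above (pval, pval1, pval2) and an illustrative alpha of 0.05; choose your own alpha.

alpha <- 0.05
difference  <- (pval  <= alpha/2)                   # two-tailed difference test rejects H0
equivalence <- (pval1 <= alpha) & (pval2 <= alpha)  # both one-sided equivalence tests reject

conclusion <- if (difference & equivalence) {
  "trivial difference (significant, but smaller than Delta)"  # possibility 1
} else if (difference & !equivalence) {
  "relevant difference (larger than Delta)"                   # possibility 2
} else if (!difference & equivalence) {
  "equivalence of proportions"                                # possibility 3
} else {
  "indeterminate (underpowered)"                              # possibility 4
}
conclusion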





References



Hauck, W. W. and Anderson, S. (1986). A comparison of large-sample confidence interval methods for the difference of two binomial probabilities. The American Statistician, 40(4):318–322.



Tu, D. (1997). Two one-sided tests procedures in establishing therapeutic equivalence with binary clinical endpoints: fixed sample performances and sample size determination. Journal of Statistical Computation and Simulation, 59(3):271–290.





