Does comparing two p-values make sense?


Does comparing two p-values make sense?



For example, the p-value for the association between willingness to pay and the number of owned cars is 0.3.



The p-value for willingness to pay and the number of owned pets is 0.6.



Can I claim that the number of owned cars has a stronger relationship with willingness to pay, and that the number of owned cars explains willingness to pay more than the number of owned pets does?



I know that a p-value less than 0.05 is considered significant, but I am not sure whether two p-values can be compared when both are larger than 0.05.






















  • When you write "p-value" between two factors, do you mean the correlation $\rho$? If so, your final comment about $0.05$ is something else.
    – Henry
    Jul 28 at 23:27











  • I’m talking something like this: medcalc.org/manual/chi-square-table.php
    – Marcus Thornton
    Jul 29 at 1:11










  • Pondering your question. I guess I know what 'number of cars owned' looks like: (0, 1, 3, 1, 2, 1, 0, 0, 1, ...). And similarly for pets. But what do data for 'willingness to pay' look like? Likert scale (ordinal) or some sort of numeric scale? // And how are the chi-squared statistics computed? // I don't think P-values should be used in the way you propose, but I'd like to give meaningful examples why not. And maybe suggest an alternative that would work.
    – BruceET
    Jul 31 at 6:42










  • Willingness to pay is a class having high, medium, and low.
    – Marcus Thornton
    Jul 31 at 23:32










  • Are numbers of cars and pets also expressed as high, medium, and low? If so, you can do chi-sq tests of independence to compare 'Nr. Cars' and 'Willingness', etc. // You can't "prove" that one connection "explains" another, but you might collect evidence to make speculation worthwhile. // As an alternative to looking at P-values (not a good idea, as I hope I've explained in my Answer), you may want to look at correlations as measured by Kendall's $\tau$ or Spearman's $\rho$.
    – BruceET
    Aug 1 at 1:16














asked Jul 28 at 22:53
Marcus Thornton







1 Answer
Absent requested clarifications, I can only make generic comments on
the proper uses of P-values.



If a chi-squared goodness-of-fit test or test for independence has a
statistic $Q$ that is approximately distributed as $\mathsf{Chisq}(\text{df} = 5),$
then the critical values for tests at the 5% and 1% levels, respectively, are $c = 11.07$ and $c = 15.09.$ You can find these values
in row 5 of the table to which you linked; below I have computed them using R:



qchisq(c(.95, .99), 5)
[1] 11.07050 15.08627


So if your computed value of the test statistic is $Q = 12.33,$ you can
reject the null hypothesis at the 5% level, but not at the 1% level.



Nowadays, most statistical software gives P-values instead of dealing
with specific fixed levels of significance. Software can do that because it
can find more detailed information about a particular distribution
(for example, $\mathsf{Chisq}(\text{df} = 5)$) than is convenient to print
in a published table.



Specifically, the P-value 0.0305 corresponding to $Q = 12.33$ is the area under
the density function of $\mathsf{Chisq}(\text{df} = 5)$ to the right
of 12.33. You would reject at the 5% level because $0.0305 < 0.05,$ but not
at the 1% level because $0.0305 > 0.01.$



1 - pchisq(12.33, 5)
[1] 0.03053538


Thus given the P-value, a person can choose their own significance level, and
make a determination whether the test shows a significant result at that level.
So it is fair to say that small P-values are useful to determine the result
of a test, and that a tiny P-value such as 0.0003 indicates stronger evidence
against $H_0$ than does a larger one such as 0.045--even though both P-values lead
to rejection at the 5% level.
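As a small supplementary sketch of this point (in Python for illustration; the P-value 0.0305 is taken from the chi-squared example above), the same P-value leads to different decisions at different significance levels:

```python
# Given a fixed P-value, each reader can apply his or her own significance level.
p = 0.0305  # P-value for Q = 12.33 from the chi-squared example above

for alpha in (0.10, 0.05, 0.01):
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
# rejects H0 at the 0.10 and 0.05 levels; fails to reject at the 0.01 level
```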



However, it is not generally useful to make distinctions between the
'information contained' in larger P-values such as 0.3 and 0.6. That is
because, assuming $H_0$ to be true, the P-value is a random variable
that is approximately uniform on the interval $(0,1).$ For a continuous
test statistic, such as $Z$ in a normal test or $T$ in a t test, one can
prove that P-values are precisely $\mathsf{Unif}(0,1).$ For most discrete
test statistics P-values are roughly, but not exactly uniform. (One
usually explores the distributions of such P-values through simulation.)
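As a supplement to the answer's R simulation below (this is a sketch in Python, not part of the original answer), one can check the continuous case directly: simulate many Z-tests with the null hypothesis true and verify that the two-sided P-values behave like draws from $\mathsf{Unif}(0,1)$, so that about 5% of them fall below 0.05:

```python
import math
import random

random.seed(1)
n, m = 30, 20_000          # sample size per test, number of simulated tests
pvals = []
for _ in range(m):
    xs = [random.gauss(0, 1) for _ in range(n)]  # H0 is true: mean 0, sd 1
    z = sum(xs) / math.sqrt(n)                   # Z statistic for H0: mu = 0
    p = 1 - math.erf(abs(z) / math.sqrt(2))      # two-sided P-value = 2*(1 - Phi(|z|))
    pvals.append(p)

# Under H0 the P-values are Unif(0,1): roughly 5% fall below 0.05
print(sum(p <= 0.05 for p in pvals) / m)
```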



The test statistic $Q$ for a chi-squared goodness-of-fit statistic is discrete,
because its values are based on integer counts. A simple example is to
see what happens in repeated tests of whether a die is fair. If a die is rolled
$n = 600$ times, then we ought to see each of the six faces "about 100" times.
The purpose of the chi-squared statistic is to assess whether the actual
face counts are sufficiently close to the expected 100 to say results are
consistent with a fair die.



The R code below simulates 100,000 such 600-roll experiments and finds the test
statistic
$Q = \sum_{i=1}^{6} \frac{(X_i - 100)^2}{100}$ for each experiment. Then we can
make a histogram of the 100,000 values of $Q$ and also a histogram of the
corresponding 100,000 P-values.



set.seed(1234)
m = 10^5; n = 600; E = n/6; die = 1:6; q = numeric(m)
for (i in 1:m) {
  faces = sample(die, n, rep=T)    # simulate n rolls of a fair die
  X = rle(sort(faces))$lengths     # count of each face
  q[i] = sum((X-E)^2/E)            # chi-squared statistic
}

mean(q >= 11.07)
[1] 0.04864

pv = 1 - pchisq(q, 5)
mean(pv <= .05)
[1] 0.04864


Because rolls of fair dice are simulated, it is not surprising to see that
$Q > 11.07$ for about 5% of the 600-roll experiments. Equivalently, about 5% of the P-values are below 0.05.



From the histogram we can see that $Q$ has approximately the target chi-squared
distribution, rejecting for values to the right of the vertical broken line.
Also, the P-values are approximately uniformly distributed, rejecting for
values to the left of the vertical line.



[Figure: histograms of the 100,000 simulated values of $Q$ and of the corresponding P-values]



The point of this demonstration is that the uniform distribution of P-values
makes it difficult to say that particular P-values such as .3 and .6 are
more remarkable or meaningful than others. Ordinarily, we only care about whether P-values
are small enough to lead to rejection at our chosen significance level.
































    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote













    Absent requested clarifications, I can only make generic comments on
    the proper uses of P-values.



    If a chi-squared goodness-of-fit test or test for independence has a
    statistic $Q$ that is approximately distributed as $mathsfChisq(textdf = 5),$
    then the critical critical values for tests at the 5% and 1% levels, respectively, are $c = 11.07$ and $c = 15.07.$ You can find these values
    on row 5 of the table to which you linked; I have found them using R statistical
    software below:



    qchisq(c(.95, .99), 5)
    [1] 11.07050 15.08627


    So if your computed value of the test statistic is $Q = 12.33,$ you can
    reject the null hypothesis at the 5% level, but not at the 1% level.



    Nowadays, most statistical software gives P-values instead of dealing
    with specific fixed levels of significance. Software can do that because it
    can find more detailed information about a particular distribution
    (for example, $mathsfChisq(textdf = 5)$) than is convenient to print
    in a published table.



    Specifically, the P-value 0.0305 corresponding to $Q = 12.33$ is the area under
    the density function for $mathsfChisq(textdf = 5)$ to the right of
    of 12.33. You would reject at the 5% level because $0.0305 < 0.05,$ but not
    at the 1% level because $0.0305 > 0.01.$



    1 - pchisq(12.33, 5)
    [1] 0.03053538


    Thus given the P-value, a person can choose their own significance level, and
    make a determination whether the test shows a significant result at that level.
    So it is fair to say that small P-values are useful to determine the result
    of a test, and that a tiny P-value such as 0.0003 indicates stronger evidence
    against $H_0$ than does a larger one such as 0.045--even though both P-values lead
    to rejection at the 5% level.



    However, it is not generally useful to make distinctions between the
    'information contained' in larger P-values such as 0.3 and 0.6. That is
    because, assuming $H_0$ to be true, the P-value is a random variable
    that is approximately uniform on the interval $(0,1).$ For a continuous
    test statistic, such as $Z$ in a normal test or $T$ in a t test, one can
    prove that P-values are precisely $mathsfUnif(0,1).$ For most discrete
    test statistics P-values are roughly, but not exactly uniform. (One
    usually explores the distributions of such P-values through simulation.)



    The test statistic $Q$ for a chi-squared goodness-of-fit statistic is discrete,
    because its values are based on integer counts. A simple example is to
    see what happens in repeated tests whether a die is fair. If a die is rolled
    $n = 600$ times, then we ought to see each of the six faces "about 100" times.
    The purpose of the chi-squared statistic is to assess whether the actual
    face counts are sufficiently close to the expected 100 to say results are
    consistent with a fair die.



    The R code below simulates 100,000 such 600-roll experiments and finds the test
    statistic
    $Q = sum_i=1^6 frac(X_i-100)^2100$ for each experiment. Then we can
    make a histogram of the 100,000 values of $Q$ and also a histogram of the
    corresponding 100,000 P-values.



    set.seed(1234)
    m = 10^5; n = 600; E = n/6; die = 1:6; q = numeric(m)
    for (i in 1:m)
    faces = sample(die, 600, rep=T)
    X = rle(sort(faces))$lengths
    q[i] = sum((X-E)^2/E)

    mean(q >= 11.07)
    [1] 0.04864

    pv = 1 - pchisq(q, 5)
    mean(pv <= .05)
    [1] 0.04864


    Because rolls of fair dice are simulated, it is not surprising to see that
    $Q > 11.07$ for about 5% of the 600-roll experiments. Equivalently, about 5% of the P-values are below 0.05.



    From the histogram we can see that $Q$ has approximately the target chi-squared
    distribution, rejecting for values to the right of the vertical broken line.
    Also, the P-values are approximately normally distributed, rejecting for
    values to the left of the vertical line.



    enter image description here



    The point of this demonstration is that the uniform distribution of P-values
    makes it difficult to say that particular P-values such as .3 and .6 are
    more remarkable or meaningful than others. Ordinarily, we only care about whether P-values
    are small enough to lead to rejection at our chosen significance level.






    share|cite|improve this answer



























      up vote
      1
      down vote













      Absent requested clarifications, I can only make generic comments on
      the proper uses of P-values.



      If a chi-squared goodness-of-fit test or test for independence has a
      statistic $Q$ that is approximately distributed as $mathsfChisq(textdf = 5),$
      then the critical critical values for tests at the 5% and 1% levels, respectively, are $c = 11.07$ and $c = 15.07.$ You can find these values
      on row 5 of the table to which you linked; I have found them using R statistical
      software below:



      qchisq(c(.95, .99), 5)
      [1] 11.07050 15.08627


      So if your computed value of the test statistic is $Q = 12.33,$ you can
      reject the null hypothesis at the 5% level, but not at the 1% level.



      Nowadays, most statistical software gives P-values instead of dealing
      with specific fixed levels of significance. Software can do that because it
      can find more detailed information about a particular distribution
      (for example, $mathsfChisq(textdf = 5)$) than is convenient to print
      in a published table.



      Specifically, the P-value 0.0305 corresponding to $Q = 12.33$ is the area under
      the density function for $mathsfChisq(textdf = 5)$ to the right of
      of 12.33. You would reject at the 5% level because $0.0305 < 0.05,$ but not
      at the 1% level because $0.0305 > 0.01.$



      1 - pchisq(12.33, 5)
      [1] 0.03053538


      Thus given the P-value, a person can choose their own significance level, and
      make a determination whether the test shows a significant result at that level.
      So it is fair to say that small P-values are useful to determine the result
      of a test, and that a tiny P-value such as 0.0003 indicates stronger evidence
      against $H_0$ than does a larger one such as 0.045--even though both P-values lead
      to rejection at the 5% level.



      However, it is not generally useful to make distinctions between the
      'information contained' in larger P-values such as 0.3 and 0.6. That is
      because, assuming $H_0$ to be true, the P-value is a random variable
      that is approximately uniform on the interval $(0,1).$ For a continuous
      test statistic, such as $Z$ in a normal test or $T$ in a t test, one can
      prove that P-values are precisely $mathsfUnif(0,1).$ For most discrete
      test statistics P-values are roughly, but not exactly uniform. (One
      usually explores the distributions of such P-values through simulation.)



      The test statistic $Q$ for a chi-squared goodness-of-fit statistic is discrete,
      because its values are based on integer counts. A simple example is to
      see what happens in repeated tests whether a die is fair. If a die is rolled
      $n = 600$ times, then we ought to see each of the six faces "about 100" times.
      The purpose of the chi-squared statistic is to assess whether the actual
      face counts are sufficiently close to the expected 100 to say results are
      consistent with a fair die.



      The R code below simulates 100,000 such 600-roll experiments and finds the test
      statistic
      $Q = sum_i=1^6 frac(X_i-100)^2100$ for each experiment. Then we can
      make a histogram of the 100,000 values of $Q$ and also a histogram of the
      corresponding 100,000 P-values.



      set.seed(1234)
      m = 10^5; n = 600; E = n/6; die = 1:6; q = numeric(m)
      for (i in 1:m)
      faces = sample(die, 600, rep=T)
      X = rle(sort(faces))$lengths
      q[i] = sum((X-E)^2/E)

      mean(q >= 11.07)
      [1] 0.04864

      pv = 1 - pchisq(q, 5)
      mean(pv <= .05)
      [1] 0.04864


      Because rolls of fair dice are simulated, it is not surprising to see that
      $Q > 11.07$ for about 5% of the 600-roll experiments. Equivalently, about 5% of the P-values are below 0.05.



      From the histogram we can see that $Q$ has approximately the target chi-squared
      distribution, rejecting for values to the right of the vertical broken line.
      Also, the P-values are approximately normally distributed, rejecting for
      values to the left of the vertical line.



      enter image description here



      The point of this demonstration is that the uniform distribution of P-values
      makes it difficult to say that particular P-values such as .3 and .6 are
      more remarkable or meaningful than others. Ordinarily, we only care about whether P-values
      are small enough to lead to rejection at our chosen significance level.






      share|cite|improve this answer

























        up vote
        1
        down vote










        up vote
        1
        down vote









        Absent requested clarifications, I can only make generic comments on
        the proper uses of P-values.



        If a chi-squared goodness-of-fit test or test for independence has a
        statistic $Q$ that is approximately distributed as $mathsfChisq(textdf = 5),$
        then the critical critical values for tests at the 5% and 1% levels, respectively, are $c = 11.07$ and $c = 15.07.$ You can find these values
        on row 5 of the table to which you linked; I have found them using R statistical
        software below:



        qchisq(c(.95, .99), 5)
        [1] 11.07050 15.08627


        So if your computed value of the test statistic is $Q = 12.33,$ you can
        reject the null hypothesis at the 5% level, but not at the 1% level.



        Nowadays, most statistical software gives P-values instead of dealing
        with specific fixed levels of significance. Software can do that because it
        can find more detailed information about a particular distribution
        (for example, $mathsfChisq(textdf = 5)$) than is convenient to print
        in a published table.



        Specifically, the P-value 0.0305 corresponding to $Q = 12.33$ is the area under
        the density function for $mathsfChisq(textdf = 5)$ to the right of
        of 12.33. You would reject at the 5% level because $0.0305 < 0.05,$ but not
        at the 1% level because $0.0305 > 0.01.$



        1 - pchisq(12.33, 5)
        [1] 0.03053538


        Thus given the P-value, a person can choose their own significance level, and
        make a determination whether the test shows a significant result at that level.
        So it is fair to say that small P-values are useful to determine the result
        of a test, and that a tiny P-value such as 0.0003 indicates stronger evidence
        against $H_0$ than does a larger one such as 0.045--even though both P-values lead
        to rejection at the 5% level.



        However, it is not generally useful to make distinctions between the
        'information contained' in larger P-values such as 0.3 and 0.6. That is
        because, assuming $H_0$ to be true, the P-value is a random variable
        that is approximately uniform on the interval $(0,1).$ For a continuous
        test statistic, such as $Z$ in a normal test or $T$ in a t test, one can
        prove that P-values are precisely $mathsfUnif(0,1).$ For most discrete
        test statistics P-values are roughly, but not exactly uniform. (One
        usually explores the distributions of such P-values through simulation.)



        The test statistic $Q$ for a chi-squared goodness-of-fit statistic is discrete,
        because its values are based on integer counts. A simple example is to
        see what happens in repeated tests whether a die is fair. If a die is rolled
        $n = 600$ times, then we ought to see each of the six faces "about 100" times.
        The purpose of the chi-squared statistic is to assess whether the actual
        face counts are sufficiently close to the expected 100 to say results are
        consistent with a fair die.



        The R code below simulates 100,000 such 600-roll experiments and finds the test
        statistic
        $Q = sum_i=1^6 frac(X_i-100)^2100$ for each experiment. Then we can
        make a histogram of the 100,000 values of $Q$ and also a histogram of the
        corresponding 100,000 P-values.



        set.seed(1234)
        m = 10^5; n = 600; E = n/6; die = 1:6; q = numeric(m)
        for (i in 1:m)
        faces = sample(die, 600, rep=T)
        X = rle(sort(faces))$lengths
        q[i] = sum((X-E)^2/E)

        mean(q >= 11.07)
        [1] 0.04864

        pv = 1 - pchisq(q, 5)
        mean(pv <= .05)
        [1] 0.04864


        Because rolls of fair dice are simulated, it is not surprising to see that
        $Q > 11.07$ for about 5% of the 600-roll experiments. Equivalently, about 5% of the P-values are below 0.05.



        From the histogram we can see that $Q$ has approximately the target chi-squared
        distribution, rejecting for values to the right of the vertical broken line.
        Also, the P-values are approximately normally distributed, rejecting for
        values to the left of the vertical line.



        enter image description here



        The point of this demonstration is that the uniform distribution of P-values
        makes it difficult to say that particular P-values such as .3 and .6 are
        more remarkable or meaningful than others. Ordinarily, we only care about whether P-values
        are small enough to lead to rejection at our chosen significance level.






        share|cite|improve this answer















        Absent requested clarifications, I can only make generic comments on
        the proper uses of P-values.



        If a chi-squared goodness-of-fit test or test for independence has a
        statistic $Q$ that is approximately distributed as $\mathsf{Chisq}(\text{df} = 5),$
        then the critical values for tests at the 5% and 1% levels, respectively, are $c = 11.07$ and $c = 15.09.$ You can find these values
        on row 5 of the table to which you linked; I have found them using R statistical
        software below:



        qchisq(c(.95, .99), 5)
        [1] 11.07050 15.08627


        So if your computed value of the test statistic is $Q = 12.33,$ you can
        reject the null hypothesis at the 5% level, but not at the 1% level.
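
        As a sketch (the value $Q = 12.33$ is the hypothetical one above), the
        decision rule at each level can be checked directly in R:

```r
# Compare the assumed test statistic against the chi-squared critical values
Q = 12.33
Q > qchisq(.95, 5)   # TRUE: Q exceeds 11.07, so reject at the 5% level
Q > qchisq(.99, 5)   # FALSE: Q is below 15.09, so do not reject at the 1% level
```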



        Nowadays, most statistical software gives P-values instead of dealing
        with specific fixed levels of significance. Software can do that because it
        can find more detailed information about a particular distribution
        (for example, $\mathsf{Chisq}(\text{df} = 5)$) than is convenient to print
        in a published table.



        Specifically, the P-value 0.0305 corresponding to $Q = 12.33$ is the area under
        the density function of $\mathsf{Chisq}(\text{df} = 5)$ to the right
        of 12.33. You would reject at the 5% level because $0.0305 < 0.05,$ but not
        at the 1% level because $0.0305 > 0.01.$



        1 - pchisq(12.33, 5)
        [1] 0.03053538


        Thus, given the P-value, a person can choose their own significance level and
        determine whether the test shows a significant result at that level.
        So it is fair to say that small P-values are useful for determining the result
        of a test, and that a tiny P-value such as 0.0003 indicates stronger evidence
        against $H_0$ than does a larger one such as 0.045, even though both P-values lead
        to rejection at the 5% level.



        However, it is not generally useful to make distinctions between the
        'information contained' in larger P-values such as 0.3 and 0.6. That is
        because, assuming $H_0$ to be true, the P-value is a random variable
        that is approximately uniform on the interval $(0,1).$ For a continuous
        test statistic, such as $Z$ in a normal test or $T$ in a t test, one can
        prove that P-values are precisely $\mathsf{Unif}(0,1).$ For most discrete
        test statistics, P-values are roughly, but not exactly, uniform. (One
        usually explores the distributions of such P-values through simulation.)
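
        This uniformity is easy to see by simulation. The sketch below (not part of
        the original answer) simulates many two-sided normal tests with $H_0$ true and
        checks that the resulting P-values are spread evenly over $(0,1)$:

```r
# Simulate m two-sided z-tests under H0 and inspect the P-value distribution
set.seed(2021)
m = 10^5
z = rnorm(m)                 # m standard normal test statistics under H0
pv = 2 * pnorm(-abs(z))      # two-sided P-values
mean(pv <= 0.05)             # about 0.05: 5% of P-values fall below 0.05
hist(pv, prob = TRUE)        # flat histogram: P-values are Unif(0,1) under H0
```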



        The test statistic $Q$ for a chi-squared goodness-of-fit test is discrete,
        because its values are based on integer counts. A simple example is to
        see what happens in repeated tests of whether a die is fair. If a die is rolled
        $n = 600$ times, then we ought to see each of the six faces "about 100" times.
        The purpose of the chi-squared statistic is to assess whether the actual
        face counts are sufficiently close to the expected 100 to say the results are
        consistent with a fair die.



        The R code below simulates 100,000 such 600-roll experiments and finds the test
        statistic
        $Q = \sum_{i=1}^6 \frac{(X_i-100)^2}{100}$ for each experiment. Then we can
        make a histogram of the 100,000 values of $Q$ and also a histogram of the
        corresponding 100,000 P-values.



        set.seed(1234)
        m = 10^5; n = 600; E = n/6; die = 1:6; q = numeric(m)
        for (i in 1:m) {
          faces = sample(die, n, rep=T)     # 600 rolls of a fair die
          X = rle(sort(faces))$lengths      # counts of the six faces
          q[i] = sum((X - E)^2/E)
        }

        mean(q >= 11.07)
        [1] 0.04864

        pv = 1 - pchisq(q, 5)
        mean(pv <= .05)
        [1] 0.04864


        Because rolls of fair dice are simulated, it is not surprising to see that
        $Q > 11.07$ for about 5% of the 600-roll experiments. Equivalently, about 5% of the P-values are below 0.05.



        From the histogram we can see that $Q$ has approximately the target chi-squared
        distribution, rejecting for values to the right of the vertical broken line.
        Also, the P-values are approximately uniformly distributed, rejecting for
        values to the left of the vertical line.
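
        The original plotting code was not shown; the following sketch would produce
        histograms like those described (it regenerates the simulated statistics with
        a smaller $m$ so the block stands alone and runs quickly):

```r
# Regenerate simulated chi-squared statistics for 600-roll fair-die experiments
set.seed(1234)
m = 10^4; n = 600; E = n/6
q = replicate(m, {
  X = tabulate(sample(1:6, n, replace = TRUE), nbins = 6)  # face counts
  sum((X - E)^2 / E)                                       # chi-squared statistic
})
pv = 1 - pchisq(q, 5)                                      # corresponding P-values

par(mfrow = c(1, 2))
hist(q, prob = TRUE, breaks = 30, main = "Simulated Q", xlab = "Q")
curve(dchisq(x, 5), add = TRUE, lwd = 2)   # target Chisq(df = 5) density
abline(v = qchisq(.95, 5), lty = 2)        # reject to the right of this line
hist(pv, prob = TRUE, breaks = 20, main = "P-values", xlab = "P-value")
abline(v = .05, lty = 2)                   # reject to the left of this line
par(mfrow = c(1, 1))
```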



        [Figure: side-by-side histograms of the 100,000 simulated values of $Q$ and of the corresponding P-values, with vertical broken lines marking the rejection regions]



        The point of this demonstration is that the uniform distribution of P-values
        makes it difficult to say that particular P-values such as .3 and .6 are
        more remarkable or meaningful than others. Ordinarily, we only care about whether P-values
        are small enough to lead to rejection at our chosen significance level.







        edited Aug 1 at 1:05

        answered Jul 31 at 19:33

        BruceET