Can two distinct datasets of the same size have the same median and the same deviation from a real number?
Can two distinct datasets of the same size have the same median and the same deviation from any real number?
For example, let $a_1 < a_2 < \cdots < a_n$ and $b_1 < b_2 < \cdots < b_n$ be real numbers such that $\sum_{i=1}^n |a_i-x| = \sum_{i=1}^n |b_i-x|$, where $x$ is any real number.
Now can it be proved that $a_i=b_i$?
statistics standard-deviation
Is the deviation to be from one real number, as the first sentence suggests, or all real numbers, as the next to last suggests?
– Ross Millikan
Jul 24 at 3:06
Any real number $x$
– Legend Killer
Jul 24 at 3:07
edited Jul 24 at 3:07
asked Jul 24 at 3:00
Legend Killer
1 Answer
accepted
The total deviation constraint is enough without the requirement on the median, and we can even prove that the two sets have the same size. Let $c \lt a_1, b_1$ and $d = c-1$. The total deviation of the $a$s from $d$ is greater than the total deviation of the $a$s from $c$ by the number of $a$s, because $c$ is $1$ closer to each one. If the total deviation of the $a$s from $d$ equals the total deviation of the $b$s from $d$, and similarly for $c$, there must be the same number of $a$s and $b$s. As you have done, call that number $n$.
Now assume $a_1 \neq b_1$. WLOG we can assume $a_1 \lt b_1$ and define $e = \min\left(\frac 12 (b_1-a_1), \frac 12 (a_2-a_1)\right) \gt 0$. We are given that the total deviation of both the $a$s and the $b$s from $a_1$ is some number $f$. The total deviation of the $a$s from $a_1+e$ is $f+e-(n-1)e = f-(n-2)e$, because we are going away from $a_1$ and towards all the other $a$s. The total deviation of the $b$s from $a_1+e$ is $f-ne$, because we are going towards all of them. The two totals differ, which contradicts the assumption that they agree for every $x$, so $a_1=b_1$. We can now repeat the argument for $a_2$ and $b_2$, and so on up the line.
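The perturbation step can be checked numerically. The sketch below is only an illustration: the datasets `a` and `b` are made-up values (chosen so that their totals agree at $x = a_1$), and `total_deviation` is a helper name introduced here, not something from the thread.

```python
import math

def total_deviation(values, x):
    """Sum of absolute deviations of `values` from the point x."""
    return sum(abs(v - x) for v in values)

# Hypothetical sorted datasets of equal size with a_1 < b_1.
a = [1.0, 4.0, 6.0]
b = [2.0, 3.0, 6.0]
n = len(a)

# Step size from the answer: small enough not to pass a_2 or b_1.
e = min((b[0] - a[0]) / 2, (a[1] - a[0]) / 2)

# f is the common total deviation from a_1 (equal for these example sets).
f_a = total_deviation(a, a[0])
f_b = total_deviation(b, a[0])

# Moving from a_1 to a_1 + e: away from a_1, towards every other point.
change_a = total_deviation(a, a[0] + e) - f_a
change_b = total_deviation(b, a[0] + e) - f_b

print(f_a, f_b)                      # both totals at x = a_1
print(change_a, -(n - 2) * e)        # change for the a's vs. -(n-2)e
print(change_b, -n * e)              # change for the b's vs. -ne
```

For these example values the two totals coincide at $a_1$ but change by different amounts at $a_1 + e$, so the totals cannot agree for every $x$.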
How can you calculate the sum of deviations of $a$ from $a_1+e$?
– Legend Killer
Jul 24 at 3:58
Once I have the sum of deviations from $a_1$, which I defined as $f$, I look at how each term in the sum of deviations changes. When you go to $a_1+e$ you get $e$ further from $a_1$ and $e$ closer to all the rest. I chose $e$ small enough to ensure you don't move past any of the others to make this work. You also get $e$ closer to all the $b$s. That is the heart of the argument. Take the $a$s as $1,2,3$ and $e=0.1$ and do the calculation by hand to see how it works.
– Ross Millikan
Jul 24 at 4:02
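Written out, the hand calculation suggested in the comment above (the $a$s equal to $1,2,3$, so $n=3$ and $a_1=1$, with $e=0.1$) would look roughly like this:

```latex
\begin{align*}
f &= |1-1| + |2-1| + |3-1| = 0 + 1 + 2 = 3,\\
\sum_{i=1}^{3} |a_i - 1.1| &= 0.1 + 0.9 + 1.9 = 2.9
  = f + e - 2e = f - (n-2)e .
\end{align*}
```

The term for $a_1$ grows by $e$ while each of the other two terms shrinks by $e$, which is where $f + e - (n-1)e$ comes from.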
It is a brilliant proof, sir. I actually proved the two datasets are equal in size by taking $x$ to be the median of the set $a$.
– Legend Killer
Jul 24 at 4:11
I don't see how the median can prove the datasets are equal in size. Take a dataset and add one value above the median and one below. You have a new dataset with the same median and more values. Maybe you have more information than that. It is true that the median minimizes the absolute deviation, and that the slope of the absolute deviation reflects the size of the dataset. If the slopes are equal, the sizes of the datasets are equal.
– Ross Millikan
Jul 24 at 4:19
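Both points in the comment above can be illustrated with a small sketch. The datasets `small` and `large` are hypothetical, and `total_deviation` is a helper name introduced only for this example.

```python
import statistics

def total_deviation(values, x):
    """Sum of absolute deviations of `values` from the point x."""
    return sum(abs(v - x) for v in values)

# Hypothetical data: `large` is `small` plus one value below
# and one value above the median.
small = [2, 5, 9]
large = [1, 2, 5, 9, 12]

median_small = statistics.median(small)
median_large = statistics.median(large)
print(median_small, median_large)  # same median, different sizes

# Pick c below every value, then step one unit further left to d = c - 1.
# Each point gets exactly 1 further away, so the total grows by the size.
c = min(small + large) - 1
d = c - 1
slope_small = total_deviation(small, d) - total_deviation(small, c)
slope_large = total_deviation(large, d) - total_deviation(large, c)
print(slope_small, slope_large)  # equals len(small), len(large)
```

Equal medians alone do not force equal sizes here, while the one-unit change to the left of all the data recovers each dataset's size.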
My approach was: $\sum |a_i-x|$ is least when $x=a_m$, that is, the median of set $a$. As the two sums are equal, $\sum |b_i-x|$ is also least when $x=a_m$. But we know $\sum |b_i-x|$ is least when $x=b_m$. So in fact the two medians are the same, and so they have an equal number of observations below them.
– Legend Killer
Jul 24 at 4:25
edited Jul 24 at 3:19
answered Jul 24 at 3:05


Ross Millikan