$n$-sphere enclosing the Birkhoff polytope

up vote
4
down vote

favorite

I am not a mathematician by training, so please feel free to correct my logic or descriptions when necessary:

Let $P$ denote a $textitpermutation matrix$:
$$
beginequation
P := X in 0,1^ntimes n : X mathbf1_n = mathbf1_n,,,mathbf1_n^T X = mathbf1_n^T.
endequation
$$
where $mathbf1_n$ denotes an $n$-dimensional ones vector.

When doing optimizations, to circumvent the discrete nature, such matrices are generally relaxed using the doubly-stochastic matrices $mathcalDP_n$, that live on the Birkhoff Polytope, an $(n - 1)^2$-dimensional convex submanifold of the ambient $mathbbR^ntimes n$, defined as:
$$
beginalign
mathcalDP_n = ,,X :& X_ij>0 ,,, i,j in1,...,n ,,, wedge sumlimits_i=1^n X_ij=1 ,wedge, sumlimits_j=1^n X_ij=1 ,,
endalign
$$
In other words, row and column sums equate to $1$ and entries can take real values between $0$ and $1$.

The permutation matrices live on the vertices of Birkhoff Polytope. One can also describe the discrete set of permutation matrices as the intersection of Birkhoff Polytope $mathcalDP_n$ with the orthogonal group (special case of Stiefel manifold when $m=n$):
$$
P=X in mathcalDP_n : X X^T=mathbfI
$$
The orthogonal group $O(n)$ can be regarded as an $(n(nÃ¢Â€Â”1)/2)$-dimensional surface in the $n^2$-dimensional Euclidean space. This surface is a subset of the sphere of radius $n^1/2$.

So, the orthogonal group touches the polytope exactly on the permutation matrices. In low-dimensions, one can probably think this as a tessellated shape, approximating the sphere. Or, in the other way around, we can speak of an encapsulating sphere of the polytope (though for higher dimensions this might not be true).

Now comes my question: As $n rightarrow infty$, does the sphere become a good approximation of the polytope, or does the gap grow? In other words, instead of using the Birkhoff polytope itself as a relaxation of the discrete permutations, can we live with using just the sphere? And, if possible, would it be possible to bound the error of approximation?

Thanks!

edited Jul 22 at 15:50

Rodrigo de Azevedo

12.6k41751

asked Jul 21 at 22:49

Tolga Birdal

20128

By "error of approximation", do you mean the Frobenius norm of the difference between a doubly schoastic matrix and the nearest point on the sphere? And if so, isn't the maximum distance attained at the matrix with constant entries $frac1n$?
â€“Â joriki
Jul 21 at 23:24

I meant the largest (or mean) error one attains by walking on the sphere rather than on the polytope (assuming that Birkhoff is the correct relaxation of course).
â€“Â Tolga Birdal
Jul 21 at 23:29

"largest (or mean)" in the sense that you'd be interested in knowing either? What do you want to average over? The sphere and/or the polytope? And what do you mean by "the correct relaxation" -- correct in what sense?
â€“Â joriki
Jul 21 at 23:39

Let's say we walk from $P_0$ to $P_1$. On Birkhoff the path would be a straight line, an edge rather, if they are neighbors. On $O(n)$, that would be a geodesic of the Stiefel, and takes longer to travel. Therefore, I am assuming that Birkhoff is a better relaxation. However, how better? As dimension grows, do we gain or do we lose? Any measure would be fine in that case, as soon as it can give a hint on the quality of approximation. Does that make sense? You might also be right that $1/n$ becomes the farthest point of $mathcalDP_n$ to $mathcalS_n$. Maybe it is possible to show this?
â€“Â Tolga Birdal
Jul 21 at 23:44

add a commentÂ |Â

up vote
4
down vote

favorite

I am not a mathematician by training, so please feel free to correct my logic or descriptions when necessary:

Thanks!

edited Jul 22 at 15:50

Rodrigo de Azevedo

12.6k41751

asked Jul 21 at 22:49

Tolga Birdal

20128

By "error of approximation", do you mean the Frobenius norm of the difference between a doubly schoastic matrix and the nearest point on the sphere? And if so, isn't the maximum distance attained at the matrix with constant entries $frac1n$?
â€“Â joriki
Jul 21 at 23:24

I meant the largest (or mean) error one attains by walking on the sphere rather than on the polytope (assuming that Birkhoff is the correct relaxation of course).
â€“Â Tolga Birdal
Jul 21 at 23:29

"largest (or mean)" in the sense that you'd be interested in knowing either? What do you want to average over? The sphere and/or the polytope? And what do you mean by "the correct relaxation" -- correct in what sense?
â€“Â joriki
Jul 21 at 23:39

Let's say we walk from $P_0$ to $P_1$. On Birkhoff the path would be a straight line, an edge rather, if they are neighbors. On $O(n)$, that would be a geodesic of the Stiefel, and takes longer to travel. Therefore, I am assuming that Birkhoff is a better relaxation. However, how better? As dimension grows, do we gain or do we lose? Any measure would be fine in that case, as soon as it can give a hint on the quality of approximation. Does that make sense? You might also be right that $1/n$ becomes the farthest point of $mathcalDP_n$ to $mathcalS_n$. Maybe it is possible to show this?
â€“Â Tolga Birdal
Jul 21 at 23:44

add a commentÂ |Â

up vote
4
down vote

favorite

I am not a mathematician by training, so please feel free to correct my logic or descriptions when necessary:

Thanks!

edited Jul 22 at 15:50

Rodrigo de Azevedo

12.6k41751

asked Jul 21 at 22:49

Tolga Birdal

20128

I am not a mathematician by training, so please feel free to correct my logic or descriptions when necessary:

Thanks!

edited Jul 22 at 15:50

Rodrigo de Azevedo

12.6k41751

asked Jul 21 at 22:49

Tolga Birdal

20128

edited Jul 22 at 15:50

Rodrigo de Azevedo

12.6k41751

edited Jul 22 at 15:50

Rodrigo de Azevedo

12.6k41751

edited Jul 22 at 15:50

Rodrigo de Azevedo

12.6k41751

asked Jul 21 at 22:49

Tolga Birdal

20128

asked Jul 21 at 22:49

Tolga Birdal

20128

asked Jul 21 at 22:49

Tolga Birdal

20128

By "error of approximation", do you mean the Frobenius norm of the difference between a doubly schoastic matrix and the nearest point on the sphere? And if so, isn't the maximum distance attained at the matrix with constant entries $frac1n$?
â€“Â joriki
Jul 21 at 23:24

I meant the largest (or mean) error one attains by walking on the sphere rather than on the polytope (assuming that Birkhoff is the correct relaxation of course).
â€“Â Tolga Birdal
Jul 21 at 23:29

"largest (or mean)" in the sense that you'd be interested in knowing either? What do you want to average over? The sphere and/or the polytope? And what do you mean by "the correct relaxation" -- correct in what sense?
â€“Â joriki
Jul 21 at 23:39

Let's say we walk from $P_0$ to $P_1$. On Birkhoff the path would be a straight line, an edge rather, if they are neighbors. On $O(n)$, that would be a geodesic of the Stiefel, and takes longer to travel. Therefore, I am assuming that Birkhoff is a better relaxation. However, how better? As dimension grows, do we gain or do we lose? Any measure would be fine in that case, as soon as it can give a hint on the quality of approximation. Does that make sense? You might also be right that $1/n$ becomes the farthest point of $mathcalDP_n$ to $mathcalS_n$. Maybe it is possible to show this?
â€“Â Tolga Birdal
Jul 21 at 23:44

add a commentÂ |Â

By "error of approximation", do you mean the Frobenius norm of the difference between a doubly schoastic matrix and the nearest point on the sphere? And if so, isn't the maximum distance attained at the matrix with constant entries $frac1n$?
â€“Â joriki
Jul 21 at 23:24

I meant the largest (or mean) error one attains by walking on the sphere rather than on the polytope (assuming that Birkhoff is the correct relaxation of course).
â€“Â Tolga Birdal
Jul 21 at 23:29

"largest (or mean)" in the sense that you'd be interested in knowing either? What do you want to average over? The sphere and/or the polytope? And what do you mean by "the correct relaxation" -- correct in what sense?
â€“Â joriki
Jul 21 at 23:39

Let's say we walk from $P_0$ to $P_1$. On Birkhoff the path would be a straight line, an edge rather, if they are neighbors. On $O(n)$, that would be a geodesic of the Stiefel, and takes longer to travel. Therefore, I am assuming that Birkhoff is a better relaxation. However, how better? As dimension grows, do we gain or do we lose? Any measure would be fine in that case, as soon as it can give a hint on the quality of approximation. Does that make sense? You might also be right that $1/n$ becomes the farthest point of $mathcalDP_n$ to $mathcalS_n$. Maybe it is possible to show this?
â€“Â Tolga Birdal
Jul 21 at 23:44

By "error of approximation", do you mean the Frobenius norm of the difference between a doubly schoastic matrix and the nearest point on the sphere? And if so, isn't the maximum distance attained at the matrix with constant entries $frac1n$?
â€“Â joriki
Jul 21 at 23:24

I meant the largest (or mean) error one attains by walking on the sphere rather than on the polytope (assuming that Birkhoff is the correct relaxation of course).
â€“Â Tolga Birdal
Jul 21 at 23:29

"largest (or mean)" in the sense that you'd be interested in knowing either? What do you want to average over? The sphere and/or the polytope? And what do you mean by "the correct relaxation" -- correct in what sense?
â€“Â joriki
Jul 21 at 23:39

Let's say we walk from $P_0$ to $P_1$. On Birkhoff the path would be a straight line, an edge rather, if they are neighbors. On $O(n)$, that would be a geodesic of the Stiefel, and takes longer to travel. Therefore, I am assuming that Birkhoff is a better relaxation. However, how better? As dimension grows, do we gain or do we lose? Any measure would be fine in that case, as soon as it can give a hint on the quality of approximation. Does that make sense? You might also be right that $1/n$ becomes the farthest point of $mathcalDP_n$ to $mathcalS_n$. Maybe it is possible to show this?
â€“Â Tolga Birdal
Jul 21 at 23:44

add a commentÂ |Â

1 Answer
1

active

oldest

votes

up vote
1
down vote

Let $J$ be the $n times n$ matrix each of whose entries are $1$. Then the ray $mathbbR_geq 0 J$ exits $DP_n$ at $tfrac1n J$ but exits the sphere at $tfrac1sqrtn J$.

Let $lambda$ be the linear function $lambda(X) = sum_i,j X_ij$. Then $lambda$ is optimized on $DP_n$ at $tfrac1n J$ and on the sphere at $tfrac1sqrtn J$, so the ratio between the true optimum and the spherical approximation is $sqrtn$, which goes to $infty$ as $n$ grows.

answered Aug 1 at 14:05

David E Speyer

44.2k4120197

Good thought, I have two questions though: 1. How can one prove that we are not shooting the ray at a location close to one of the vertices, but to the center of the face? And would that ray be orthogonal to the face? Maybe this was encoded in the second part but wasn't clear to me.
â€“Â Tolga Birdal
Aug 2 at 8:18

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "69"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2858947%2fn-sphere-enclosing-the-birkhoff-polytope%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
1
down vote

Let $J$ be the $n times n$ matrix each of whose entries are $1$. Then the ray $mathbbR_geq 0 J$ exits $DP_n$ at $tfrac1n J$ but exits the sphere at $tfrac1sqrtn J$.

answered Aug 1 at 14:05

David E Speyer

44.2k4120197

Good thought, I have two questions though: 1. How can one prove that we are not shooting the ray at a location close to one of the vertices, but to the center of the face? And would that ray be orthogonal to the face? Maybe this was encoded in the second part but wasn't clear to me.
â€“Â Tolga Birdal
Aug 2 at 8:18

add a commentÂ |Â

up vote
1
down vote

Let $J$ be the $n times n$ matrix each of whose entries are $1$. Then the ray $mathbbR_geq 0 J$ exits $DP_n$ at $tfrac1n J$ but exits the sphere at $tfrac1sqrtn J$.

answered Aug 1 at 14:05

David E Speyer

44.2k4120197

Good thought, I have two questions though: 1. How can one prove that we are not shooting the ray at a location close to one of the vertices, but to the center of the face? And would that ray be orthogonal to the face? Maybe this was encoded in the second part but wasn't clear to me.
â€“Â Tolga Birdal
Aug 2 at 8:18

add a commentÂ |Â

up vote
1
down vote

Let $J$ be the $n times n$ matrix each of whose entries are $1$. Then the ray $mathbbR_geq 0 J$ exits $DP_n$ at $tfrac1n J$ but exits the sphere at $tfrac1sqrtn J$.

answered Aug 1 at 14:05

David E Speyer

44.2k4120197

Let $J$ be the $n times n$ matrix each of whose entries are $1$. Then the ray $mathbbR_geq 0 J$ exits $DP_n$ at $tfrac1n J$ but exits the sphere at $tfrac1sqrtn J$.

answered Aug 1 at 14:05

David E Speyer

44.2k4120197

answered Aug 1 at 14:05

David E Speyer

44.2k4120197

answered Aug 1 at 14:05

David E Speyer

44.2k4120197

answered Aug 1 at 14:05

David E Speyer

44.2k4120197

Good thought, I have two questions though: 1. How can one prove that we are not shooting the ray at a location close to one of the vertices, but to the center of the face? And would that ray be orthogonal to the face? Maybe this was encoded in the second part but wasn't clear to me.
â€“Â Tolga Birdal
Aug 2 at 8:18

add a commentÂ |Â

Good thought, I have two questions though: 1. How can one prove that we are not shooting the ray at a location close to one of the vertices, but to the center of the face? And would that ray be orthogonal to the face? Maybe this was encoded in the second part but wasn't clear to me.
â€“Â Tolga Birdal
Aug 2 at 8:18

Good thought, I have two questions though: 1. How can one prove that we are not shooting the ray at a location close to one of the vertices, but to the center of the face? And would that ray be orthogonal to the face? Maybe this was encoded in the second part but wasn't clear to me.
â€“Â Tolga Birdal
Aug 2 at 8:18

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

Search This Blog

ukmuiik