The gradient of the Frobenius norm under a similarity transformation
I am looking for the gradient of the following cost function
$\|T^{-1}AT\| + \|T^{-1}BT\|$
with respect to $T$. $A$ and $B$ are real square matrices, and $T$ is a coordinate change of the corresponding dimension. The norm can be either Frobenius or L2 (Euclidean or spectral).
My problem is that the diagonalizing transformation of $A$, or the Jordan canonical transformation, which minimizes its norm (under certain assumptions), can cause $\|B\|$ to explode, and vice versa. I believe finding the global minimum is very hard, perhaps impossible, so I am looking for any mathematical insight and algorithms that could help me consistently find a similar local minimum.
Thank you
calculus linear-algebra matrices control-theory numerical-optimization
asked Jul 31 at 14:40
Alex Pacheco
1 Answer
accepted
Let's use a colon to denote the trace/Frobenius product,
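$$\eqalign{A:B = {\rm tr}(A^TB)}$$
And let's define the variables
$$\eqalign{
X &= T^{-1}AT, \qquad \alpha^2 = \|X\|_F^2 = X:X \cr
Y &= T^{-1}BT, \qquad \beta^2 = \|Y\|_F^2 = Y:Y \cr
}$$
Find the differential and then the gradient of $\alpha$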
$$\eqalign{
2\alpha\,d\alpha
&= 2X:dX \cr
&= 2X:T^{-1}\big(A\,dT-dT\,X\big) \cr
&= 2T^{-T}X:\big(A\,dT-dT\,X\big) \cr
&= 2\big(A^TT^{-T}X-T^{-T}XX^T\big):dT \cr
&= 2T^{-T}\big(X^TX-XX^T\big):dT \cr
\frac{\partial\alpha}{\partial T}
&= \alpha^{-1}T^{-T}\big(X^TX-XX^T\big) \cr
}$$
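The last step is not a division by $dT$. As a sketch of the convention being used, assume the gradient is defined entrywise and that $\alpha\neq 0$; the symbol $G$ below is introduced only for illustration. Whatever matrix multiplies $dT$ in the Frobenius product is, by definition, the gradient:
$$\eqalign{
d\alpha &= G:dT = \sum_{i,j} G_{ij}\,dT_{ij}
\quad\implies\quad \frac{\partial\alpha}{\partial T_{ij}} = G_{ij} \cr
\alpha\,d\alpha &= T^{-T}\big(X^TX-XX^T\big):dT
\quad\implies\quad G = \alpha^{-1}\,T^{-T}\big(X^TX-XX^T\big) \cr
}$$
Here the scalar $\alpha$ is the only thing being divided out; $dT$ itself is simply dropped once the differential has the form $G:dT$.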
The calculation for $\beta$ is similar and yields
$$\eqalign{
\frac{\partial\beta}{\partial T}
&= \beta^{-1}T^{-T}\big(Y^TY-YY^T\big) \cr
}$$
So, if we choose the Frobenius norm, then your cost function $(\phi)$ and its gradient are given by
$$\eqalign{
\phi &= \alpha + \beta \cr
\frac{\partial\phi}{\partial T}
&= \frac{\partial\alpha}{\partial T} + \frac{\partial\beta}{\partial T} \cr
&= T^{-T}\Bigg(\frac{X^TX-XX^T}{\alpha} + \frac{Y^TY-YY^T}{\beta}\Bigg) \cr\cr
}$$
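A formula like this is easy to sanity-check numerically. The following is a minimal sketch, assuming NumPy is available and using random test matrices; the function names `cost`, `grad` and `numerical_grad` are made up for this illustration. It compares the closed-form gradient with a central finite difference taken entry by entry.

```python
import numpy as np

def cost(T, A, B):
    """phi(T) = ||T^{-1} A T||_F + ||T^{-1} B T||_F."""
    X = np.linalg.solve(T, A @ T)  # X = T^{-1} A T without forming the inverse
    Y = np.linalg.solve(T, B @ T)  # Y = T^{-1} B T
    return np.linalg.norm(X, "fro") + np.linalg.norm(Y, "fro")

def grad(T, A, B):
    """Closed-form gradient: T^{-T} ((X^T X - X X^T)/alpha + (Y^T Y - Y Y^T)/beta)."""
    X = np.linalg.solve(T, A @ T)
    Y = np.linalg.solve(T, B @ T)
    alpha = np.linalg.norm(X, "fro")  # assumed nonzero
    beta = np.linalg.norm(Y, "fro")   # assumed nonzero
    M = (X.T @ X - X @ X.T) / alpha + (Y.T @ Y - Y @ Y.T) / beta
    return np.linalg.solve(T.T, M)    # applies T^{-T} to M

def numerical_grad(T, A, B, h=1e-6):
    """Central finite differences, one entry of T at a time."""
    G = np.zeros_like(T)
    for i in range(T.shape[0]):
        for j in range(T.shape[1]):
            E = np.zeros_like(T)
            E[i, j] = h
            G[i, j] = (cost(T + E, A, B) - cost(T - E, A, B)) / (2 * h)
    return G

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
# Small perturbation of the identity, chosen so that T is comfortably invertible.
T = np.eye(n) + 0.1 * rng.standard_normal((n, n))

max_err = np.max(np.abs(grad(T, A, B) - numerical_grad(T, A, B)))
print("largest entrywise discrepancy:", max_err)
```

If the two gradients agree up to finite-difference error, the same `grad` function could be handed to a gradient-based local optimizer. Since the question already expects only local minima, the result would still depend on the starting $T$.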
Note that the cyclic property of the trace gives us several ways to rearrange the terms in a Frobenius product. For example, all of the following are equivalent
$$\eqalign{
A:BC &= A^T:(BC)^T \cr
&= BC:A \cr
&= AC^T:B \cr
&= B^TA:C \cr
}$$
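These rearrangements can also be confirmed on random matrices. A short sketch, again assuming NumPy, with `frob` as a hypothetical helper name for the colon product:

```python
import numpy as np

def frob(P, Q):
    """Frobenius product P:Q = tr(P^T Q)."""
    return np.trace(P.T @ Q)

rng = np.random.default_rng(1)
n = 3
A, B, C = (rng.standard_normal((n, n)) for _ in range(3))

values = [
    frob(A, B @ C),          # A : BC
    frob(A.T, (B @ C).T),    # A^T : (BC)^T
    frob(B @ C, A),          # BC : A
    frob(A @ C.T, B),        # AC^T : B
    frob(B.T @ A, C),        # B^T A : C
]
# All five numbers should coincide up to rounding error.
spread = max(values) - min(values)
print(values, spread)
```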
edited Jul 31 at 17:13
answered Jul 31 at 16:27
greg
I don't understand how the last step in finding the differential and then the gradient of $\alpha$ holds. How can $dT$ inside a Frobenius product (i.e., inside a trace) come to the left-hand side as a division? Thank you so very much, this is a great help!
– Alex Pacheco
Aug 1 at 21:45