The gradient of the Frobenius norm under a similarity transformation

I am looking for the gradient of the following cost function

$\|T^{-1}AT\| + \|T^{-1}BT\|$

with respect to $T$. $A$ and $B$ are real square matrices, and $T$ is an invertible change of coordinates of the corresponding dimension. The norm can be either Frobenius or L2 (Euclidean or spectral).

My difficulty is that the transformation that diagonalizes $A$, or brings it to Jordan canonical form, and thereby minimizes its norm (under certain assumptions), can cause the norm of the transformed $B$ to explode, and vice versa. I believe that finding the global minimum is a very hard, perhaps impossible, problem, so I am looking for any mathematical insight or algorithms that could help me consistently find a similar local minimum.

Thank you
















      asked Jul 31 at 14:40









      Alex Pacheco





















          1 Answer



          accepted










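Let's use a colon to denote the trace/Frobenius product,
$$\eqalign{A:B = {\rm tr}(A^TB)}$$
And let's define the variables
$$\eqalign{
X &= T^{-1}AT, \qquad \alpha^2 = \|X\|_F^2 = X:X \cr
Y &= T^{-1}BT, \qquad \beta^2 = \|Y\|_F^2 = Y:Y \cr
}$$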
Find the differential and then the gradient of $\alpha$
$$\eqalign{
2\alpha\,d\alpha
&= 2X:dX \cr
&= 2X:T^{-1}\big(A\,dT-dT\,X\big) \cr
&= 2T^{-T}X:\big(A\,dT-dT\,X\big) \cr
&= 2\big(A^TT^{-T}X-T^{-T}XX^T\big):dT \cr
&= 2T^{-T}\big(X^TX-XX^T\big):dT \cr
\frac{\partial\alpha}{\partial T}
&= \alpha^{-1}T^{-T}\big(X^TX-XX^T\big) \cr
}$$
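As a hedged aside on the last step (the symbol $G$ is introduced here only for illustration): no division by $dT$ takes place. Dividing the final differential by the scalar $2\alpha$ leaves an expression of the form $d\alpha = G:dT$, and since the Frobenius product is just a sum over matching entries, the matrix $G$ can be read off as the gradient:

$$\eqalign{
d\alpha &= G:dT = \sum_{i,j} G_{ij}\,dT_{ij}
\quad\implies\quad
\frac{\partial\alpha}{\partial T_{ij}} = G_{ij} \cr
G &= \alpha^{-1}T^{-T}\big(X^TX-XX^T\big) \cr
}$$

The step from the fourth to the fifth line of the derivation uses $A^TT^{-T} = (T^{-1}A)^T = (XT^{-1})^T = T^{-T}X^T$, which follows from $T^{-1}A = XT^{-1}$.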
The calculation for $\beta$ is similar and yields
$$\eqalign{
\frac{\partial\beta}{\partial T}
&= \beta^{-1}T^{-T}\big(Y^TY-YY^T\big) \cr
}$$
So, if we choose the Frobenius norm, then your cost function $(\phi)$ and its gradient are given by

$$\eqalign{
\phi &= \alpha + \beta \cr
\frac{\partial\phi}{\partial T}
&= \frac{\partial\alpha}{\partial T} + \frac{\partial\beta}{\partial T} \cr
&= T^{-T}\Bigg(\frac{X^TX-XX^T}{\alpha} + \frac{Y^TY-YY^T}{\beta}\Bigg) \cr
}$$
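A quick way to gain confidence in the gradient formula is to compare it with central finite differences. The following is a minimal sketch, assuming NumPy and random test matrices; the function names `cost_and_grad` and `finite_difference_grad`, the size `n = 4`, and the step `h` are illustrative choices, not part of the derivation.

```python
import numpy as np

def cost_and_grad(T, A, B):
    """Return phi = ||T^-1 A T||_F + ||T^-1 B T||_F and its gradient w.r.t. T."""
    T_inv = np.linalg.inv(T)
    X = T_inv @ A @ T
    Y = T_inv @ B @ T
    alpha = np.linalg.norm(X, "fro")
    beta = np.linalg.norm(Y, "fro")
    # Terms X^T X - X X^T and Y^T Y - Y Y^T from the derivation.
    GX = X.T @ X - X @ X.T
    GY = Y.T @ Y - Y @ Y.T
    grad = T_inv.T @ (GX / alpha + GY / beta)
    return alpha + beta, grad

def finite_difference_grad(T, A, B, h=1e-6):
    """Entry-by-entry central-difference approximation of d(phi)/dT."""
    grad = np.zeros_like(T)
    for i in range(T.shape[0]):
        for j in range(T.shape[1]):
            E = np.zeros_like(T)
            E[i, j] = h
            f_plus, _ = cost_and_grad(T + E, A, B)
            f_minus, _ = cost_and_grad(T - E, A, B)
            grad[i, j] = (f_plus - f_minus) / (2 * h)
    return grad

# Assumed test data: random A, B and a T close to the identity (so it is invertible).
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
T = np.eye(n) + 0.1 * rng.standard_normal((n, n))

phi, grad = cost_and_grad(T, A, B)
max_err = np.max(np.abs(grad - finite_difference_grad(T, A, B)))
print(f"phi = {phi:.6f}, max |analytic - numeric| = {max_err:.2e}")
```

The same `cost_and_grad` function could be handed to a general-purpose optimizer to search for a local minimum. Note that $\phi$ is unchanged when $T$ is rescaled by a nonzero constant, so the gradient is always orthogonal to $T$ in the Frobenius product, and a descent method may want to normalize $T$ between steps.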
Note that the cyclic property of the trace gives us several ways to rearrange the terms in a Frobenius product. For example, all of the following are equivalent
$$\eqalign{
A:BC &= A^T:(BC)^T \cr
&= BC:A \cr
&= AC^T:B \cr
&= B^TA:C \cr
}$$
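These rearrangements are easy to spot-check numerically. The sketch below assumes NumPy and random $3\times 3$ matrices; the helper name `frob` is made up for illustration.

```python
import numpy as np

def frob(P, Q):
    """Frobenius product P:Q = tr(P^T Q), computed as an elementwise sum."""
    return float(np.sum(P * Q))

rng = np.random.default_rng(1)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

# Each entry is one of the equivalent forms listed above.
values = [
    frob(A, B @ C),           # A : BC
    frob(A.T, (B @ C).T),     # A^T : (BC)^T
    frob(B @ C, A),           # BC : A
    frob(A @ C.T, B),         # AC^T : B
    frob(B.T @ A, C),         # B^T A : C
]
trace_form = float(np.trace(A.T @ B @ C))  # tr(A^T B C), the underlying trace
print(values, trace_form)
```

All five values should agree with each other and with the trace up to floating-point rounding.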





























          • I don't understand how the last step in finding the differential and then the gradient of $\alpha$ holds. How can $dT$ inside a Frobenius product (i.e., inside a trace) come to the left-hand side as a division? Thank you so very much, this is a great help!
            – Alex Pacheco
            Aug 1 at 21:45



















          edited Jul 31 at 17:13


























          answered Jul 31 at 16:27









          greg











