How do I find the minimum of this function?

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
2
down vote

favorite












This might seem trivial to some of you, but I can't for the life of me figure out how to solve this.



$$undersetxarg min (x - b)^T Ax$$
$$x in mathbbR^n$$
We may assume A to be invertable, but it is not symmetric.



My idea was to calculate the first and second derivative.



I know that $fracdx^Tdx = (fracdxdx)^T$, but when I try to apply the chain rule, I get
$$fracddx = Ax + (x-b)^Tx$$
which doesn't make sense, as it's a vector plus a scalar.
Even if there is another way to find the x for which the function is minimal, I am now more interested in how to derive this kind of formula.







share|cite|improve this question





















  • The argument is a vector, so you need to take the gradient, not the derivative.
    – Yves Daoust
    Jul 31 at 15:02














up vote
2
down vote

favorite












This might seem trivial to some of you, but I can't for the life of me figure out how to solve this.



$$undersetxarg min (x - b)^T Ax$$
$$x in mathbbR^n$$
We may assume A to be invertable, but it is not symmetric.



My idea was to calculate the first and second derivative.



I know that $fracdx^Tdx = (fracdxdx)^T$, but when I try to apply the chain rule, I get
$$fracddx = Ax + (x-b)^Tx$$
which doesn't make sense, as it's a vector plus a scalar.
Even if there is another way to find the x for which the function is minimal, I am now more interested in how to derive this kind of formula.







share|cite|improve this question





















  • The argument is a vector, so you need to take the gradient, not the derivative.
    – Yves Daoust
    Jul 31 at 15:02












up vote
2
down vote

favorite









up vote
2
down vote

favorite











This might seem trivial to some of you, but I can't for the life of me figure out how to solve this.



$$undersetxarg min (x - b)^T Ax$$
$$x in mathbbR^n$$
We may assume A to be invertable, but it is not symmetric.



My idea was to calculate the first and second derivative.



I know that $fracdx^Tdx = (fracdxdx)^T$, but when I try to apply the chain rule, I get
$$fracddx = Ax + (x-b)^Tx$$
which doesn't make sense, as it's a vector plus a scalar.
Even if there is another way to find the x for which the function is minimal, I am now more interested in how to derive this kind of formula.







share|cite|improve this question













This might seem trivial to some of you, but I can't for the life of me figure out how to solve this.



$$undersetxarg min (x - b)^T Ax$$
$$x in mathbbR^n$$
We may assume A to be invertable, but it is not symmetric.



My idea was to calculate the first and second derivative.



I know that $fracdx^Tdx = (fracdxdx)^T$, but when I try to apply the chain rule, I get
$$fracddx = Ax + (x-b)^Tx$$
which doesn't make sense, as it's a vector plus a scalar.
Even if there is another way to find the x for which the function is minimal, I am now more interested in how to derive this kind of formula.









share|cite|improve this question












share|cite|improve this question




share|cite|improve this question








edited Jul 31 at 15:11
























asked Jul 31 at 14:56









empty-barrel

133




133











  • The argument is a vector, so you need to take the gradient, not the derivative.
    – Yves Daoust
    Jul 31 at 15:02
















  • The argument is a vector, so you need to take the gradient, not the derivative.
    – Yves Daoust
    Jul 31 at 15:02















The argument is a vector, so you need to take the gradient, not the derivative.
– Yves Daoust
Jul 31 at 15:02




The argument is a vector, so you need to take the gradient, not the derivative.
– Yves Daoust
Jul 31 at 15:02










4 Answers
4






active

oldest

votes

















up vote
2
down vote



accepted










$$f(x) = (x-b)^TAx = x^TAx - b^TAx$$
$$fracpartialpartial xf(x) = (A+A^T)x - A^Tb = 0$$
that is the minimizer should satisfy
$$(A+A^T)x^* = A^Tb $$
If $A$ is invertible then
$$x^* = (A+A^T)^-1A^Tb = big(A^-T(A+A^T)big)^-1b = (I + A^-TA)^-1 b$$






share|cite|improve this answer























  • Taking the inverse is not a linear operation, so the last line is wrong. Take $A$ symmetric, for example. Then you would have $x^* = 2b$...
    – Roberto Rastapopoulos
    Jul 31 at 15:15











  • @RobertoRastapopoulos totally right !! thanks
    – Ahmad Bazzi
    Jul 31 at 15:16










  • @RobertoRastapopoulos: let the factor $A^T$ enter the parenthesis... $(A+A^T)^-1A^T=(A^-T(A+A^T))^-1=(A^-TA+I)^-1$.
    – Yves Daoust
    Jul 31 at 15:16











  • @RobertoRastapopoulos: right. I still had the time to fix.
    – Yves Daoust
    Jul 31 at 15:21










  • Thanks for the fast answer. Give me some time to work through it.
    – empty-barrel
    Jul 31 at 15:32

















up vote
0
down vote













You can rewrite it to a standard quadratic program and use corresponding methods as follows:



$(x-b)^T A x = x^T A x - b^T A x = x^T A x - c^T x$



for $c := A^T b$.



Your method can work too but your derivative calculation was wrong, it would be:



$ fracddx (x-b)^T A x = (x-b)^T A + x^T A = 0 Leftrightarrow 2A^T x = A^Tb $






share|cite|improve this answer




























    up vote
    0
    down vote













    As mentioned in the comments, you need to take the gradient, not the derivative.



    I usually get confused when trying to take the gradient of a function written in vector/matrix form (as opposed to coordinate form), so I use the following method. Let $f(x)=(x-b)^top Ax$, and consider $f(x+epsilon)-f(x)$ for a small vector $epsilon$. The result is
    $$
    epsilon^top Ax + x^top Aepsilon+requirecancelcancelepsilon^top Aepsilon - epsilon^top Ab=epsilon^top (A(x-b)+A^top x)
    $$
    Note that the $epsilon^top Aepsilon$ is canceled because it goes to zero quadratically quickly as $|epsilon|to 0$, whereas the other terms converge to $0$ linearly.



    Since $f(x+epsilon)-f(x)approx epsilon^T(A(x-b)+A^top x)$, the gradient is $A(x-b)+A^top x$. Setting this equal to zero, you get $$x=(A+A^top)^-1Ab.$$
    To determine if this is indeed a minimum, you need to look at the second derivative. This works out to be $A+A^top$, which will be a minimum as long as this is positive definite.






    share|cite|improve this answer























    • Thanks a lot. I like this approach with the $epsilon$.
      – empty-barrel
      Jul 31 at 15:39

















    up vote
    0
    down vote













    To differentiate this sort of functions, it is easier to develop them as sums:
    Let
    $$
    f(x) = Ax + (x - b)^T x = sum_i=1^n sum_j=1^n a_ij , (x_i - b_i) , x_j,
    $$
    where $a_ij, 1 leq i, j leq n$ are the elements of $A$
    and $b_i$, $1 leq i leq n$ are the elements of $b$.
    Differentiating w.r.t. $x_k$,
    beginalign
    fracpartial fpartial x_k &= sum_i=1^n sum_j=1^n a_ij , left( delta_ik , x_j + (x_i - b_i) , delta_jk right), \
    &= sum_j=1^n left( a_kj, x_j right) + sum_i=1^n left( a_ik , (x_i - b_i) right),
    endalign
    so if you gather the derivatives in a vector $(nabla f)_k = partial_x_k f$,
    beginalign
    nabla f = Ax + A^T(x - b)
    endalign
    The gradient is 0 when
    $$
    (A + A^T) , x = A^T , b
    $$






    share|cite|improve this answer























    • Where does the x in your last line come from? I guess you multiply $A^T$ into $(x-b)$, but shouldn't it be $A^T b$ then?
      – empty-barrel
      Jul 31 at 15:32











    • It should, thanks :)
      – Roberto Rastapopoulos
      Jul 31 at 16:12










    Your Answer




    StackExchange.ifUsing("editor", function ()
    return StackExchange.using("mathjaxEditing", function ()
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    );
    );
    , "mathjax-editing");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "69"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    noCode: true, onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );








     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2868134%2fhow-do-i-find-the-minimum-of-this-function%23new-answer', 'question_page');

    );

    Post as a guest






























    4 Answers
    4






    active

    oldest

    votes








    4 Answers
    4






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    2
    down vote



    accepted










    $$f(x) = (x-b)^TAx = x^TAx - b^TAx$$
    $$fracpartialpartial xf(x) = (A+A^T)x - A^Tb = 0$$
    that is the minimizer should satisfy
    $$(A+A^T)x^* = A^Tb $$
    If $A$ is invertible then
    $$x^* = (A+A^T)^-1A^Tb = big(A^-T(A+A^T)big)^-1b = (I + A^-TA)^-1 b$$






    share|cite|improve this answer























    • Taking the inverse is not a linear operation, so the last line is wrong. Take $A$ symmetric, for example. Then you would have $x^* = 2b$...
      – Roberto Rastapopoulos
      Jul 31 at 15:15











    • @RobertoRastapopoulos totally right !! thanks
      – Ahmad Bazzi
      Jul 31 at 15:16










    • @RobertoRastapopoulos: let the factor $A^T$ enter the parenthesis... $(A+A^T)^-1A^T=(A^-T(A+A^T))^-1=(A^-TA+I)^-1$.
      – Yves Daoust
      Jul 31 at 15:16











    • @RobertoRastapopoulos: right. I still had the time to fix.
      – Yves Daoust
      Jul 31 at 15:21










    • Thanks for the fast answer. Give me some time to work through it.
      – empty-barrel
      Jul 31 at 15:32














    up vote
    2
    down vote



    accepted










    $$f(x) = (x-b)^TAx = x^TAx - b^TAx$$
    $$fracpartialpartial xf(x) = (A+A^T)x - A^Tb = 0$$
    that is the minimizer should satisfy
    $$(A+A^T)x^* = A^Tb $$
    If $A$ is invertible then
    $$x^* = (A+A^T)^-1A^Tb = big(A^-T(A+A^T)big)^-1b = (I + A^-TA)^-1 b$$






    share|cite|improve this answer























    • Taking the inverse is not a linear operation, so the last line is wrong. Take $A$ symmetric, for example. Then you would have $x^* = 2b$...
      – Roberto Rastapopoulos
      Jul 31 at 15:15











    • @RobertoRastapopoulos totally right !! thanks
      – Ahmad Bazzi
      Jul 31 at 15:16










    • @RobertoRastapopoulos: let the factor $A^T$ enter the parenthesis... $(A+A^T)^-1A^T=(A^-T(A+A^T))^-1=(A^-TA+I)^-1$.
      – Yves Daoust
      Jul 31 at 15:16











    • @RobertoRastapopoulos: right. I still had the time to fix.
      – Yves Daoust
      Jul 31 at 15:21










    • Thanks for the fast answer. Give me some time to work through it.
      – empty-barrel
      Jul 31 at 15:32












    up vote
    2
    down vote



    accepted







    up vote
    2
    down vote



    accepted






    $$f(x) = (x-b)^TAx = x^TAx - b^TAx$$
    $$fracpartialpartial xf(x) = (A+A^T)x - A^Tb = 0$$
    that is the minimizer should satisfy
    $$(A+A^T)x^* = A^Tb $$
    If $A$ is invertible then
    $$x^* = (A+A^T)^-1A^Tb = big(A^-T(A+A^T)big)^-1b = (I + A^-TA)^-1 b$$






    share|cite|improve this answer















    $$f(x) = (x-b)^TAx = x^TAx - b^TAx$$
    $$fracpartialpartial xf(x) = (A+A^T)x - A^Tb = 0$$
    that is the minimizer should satisfy
    $$(A+A^T)x^* = A^Tb $$
    If $A$ is invertible then
    $$x^* = (A+A^T)^-1A^Tb = big(A^-T(A+A^T)big)^-1b = (I + A^-TA)^-1 b$$







    share|cite|improve this answer















    share|cite|improve this answer



    share|cite|improve this answer








    edited Jul 31 at 15:20









    Roberto Rastapopoulos

    635321




    635321











    answered Jul 31 at 15:12









    Ahmad Bazzi

    2,245417




    2,245417











    • Taking the inverse is not a linear operation, so the last line is wrong. Take $A$ symmetric, for example. Then you would have $x^* = 2b$...
      – Roberto Rastapopoulos
      Jul 31 at 15:15











    • @RobertoRastapopoulos totally right !! thanks
      – Ahmad Bazzi
      Jul 31 at 15:16










    • @RobertoRastapopoulos: let the factor $A^T$ enter the parenthesis... $(A+A^T)^-1A^T=(A^-T(A+A^T))^-1=(A^-TA+I)^-1$.
      – Yves Daoust
      Jul 31 at 15:16











    • @RobertoRastapopoulos: right. I still had the time to fix.
      – Yves Daoust
      Jul 31 at 15:21










    • Thanks for the fast answer. Give me some time to work through it.
      – empty-barrel
      Jul 31 at 15:32
















    • Taking the inverse is not a linear operation, so the last line is wrong. Take $A$ symmetric, for example. Then you would have $x^* = 2b$...
      – Roberto Rastapopoulos
      Jul 31 at 15:15











    • @RobertoRastapopoulos totally right !! thanks
      – Ahmad Bazzi
      Jul 31 at 15:16










    • @RobertoRastapopoulos: let the factor $A^T$ enter the parenthesis... $(A+A^T)^-1A^T=(A^-T(A+A^T))^-1=(A^-TA+I)^-1$.
      – Yves Daoust
      Jul 31 at 15:16











    • @RobertoRastapopoulos: right. I still had the time to fix.
      – Yves Daoust
      Jul 31 at 15:21










    • Thanks for the fast answer. Give me some time to work through it.
      – empty-barrel
      Jul 31 at 15:32















    Taking the inverse is not a linear operation, so the last line is wrong. Take $A$ symmetric, for example. Then you would have $x^* = 2b$...
    – Roberto Rastapopoulos
    Jul 31 at 15:15





    Taking the inverse is not a linear operation, so the last line is wrong. Take $A$ symmetric, for example. Then you would have $x^* = 2b$...
    – Roberto Rastapopoulos
    Jul 31 at 15:15













    @RobertoRastapopoulos totally right !! thanks
    – Ahmad Bazzi
    Jul 31 at 15:16




    @RobertoRastapopoulos totally right !! thanks
    – Ahmad Bazzi
    Jul 31 at 15:16












    @RobertoRastapopoulos: let the factor $A^T$ enter the parenthesis... $(A+A^T)^-1A^T=(A^-T(A+A^T))^-1=(A^-TA+I)^-1$.
    – Yves Daoust
    Jul 31 at 15:16





    @RobertoRastapopoulos: let the factor $A^T$ enter the parenthesis... $(A+A^T)^-1A^T=(A^-T(A+A^T))^-1=(A^-TA+I)^-1$.
    – Yves Daoust
    Jul 31 at 15:16













    @RobertoRastapopoulos: right. I still had the time to fix.
    – Yves Daoust
    Jul 31 at 15:21




    @RobertoRastapopoulos: right. I still had the time to fix.
    – Yves Daoust
    Jul 31 at 15:21












    Thanks for the fast answer. Give me some time to work through it.
    – empty-barrel
    Jul 31 at 15:32




    Thanks for the fast answer. Give me some time to work through it.
    – empty-barrel
    Jul 31 at 15:32










    up vote
    0
    down vote













    You can rewrite it to a standard quadratic program and use corresponding methods as follows:



    $(x-b)^T A x = x^T A x - b^T A x = x^T A x - c^T x$



    for $c := A^T b$.



    Your method can work too but your derivative calculation was wrong, it would be:



    $ fracddx (x-b)^T A x = (x-b)^T A + x^T A = 0 Leftrightarrow 2A^T x = A^Tb $






    share|cite|improve this answer

























      up vote
      0
      down vote













      You can rewrite it to a standard quadratic program and use corresponding methods as follows:



      $(x-b)^T A x = x^T A x - b^T A x = x^T A x - c^T x$



      for $c := A^T b$.



      Your method can work too but your derivative calculation was wrong, it would be:



      $ fracddx (x-b)^T A x = (x-b)^T A + x^T A = 0 Leftrightarrow 2A^T x = A^Tb $






      share|cite|improve this answer























        up vote
        0
        down vote










        up vote
        0
        down vote









        You can rewrite it to a standard quadratic program and use corresponding methods as follows:



        $(x-b)^T A x = x^T A x - b^T A x = x^T A x - c^T x$



        for $c := A^T b$.



        Your method can work too but your derivative calculation was wrong, it would be:



        $ fracddx (x-b)^T A x = (x-b)^T A + x^T A = 0 Leftrightarrow 2A^T x = A^Tb $






        share|cite|improve this answer













        You can rewrite it to a standard quadratic program and use corresponding methods as follows:



        $(x-b)^T A x = x^T A x - b^T A x = x^T A x - c^T x$



        for $c := A^T b$.



        Your method can work too but your derivative calculation was wrong, it would be:



        $ fracddx (x-b)^T A x = (x-b)^T A + x^T A = 0 Leftrightarrow 2A^T x = A^Tb $







        share|cite|improve this answer













        share|cite|improve this answer



        share|cite|improve this answer











        answered Jul 31 at 15:11









        til

        694




        694




















            up vote
            0
            down vote













            As mentioned in the comments, you need to take the gradient, not the derivative.



            I usually get confused when trying to take the gradient of a function written in vector/matrix form (as opposed to coordinate form), so I use the following method. Let $f(x)=(x-b)^top Ax$, and consider $f(x+epsilon)-f(x)$ for a small vector $epsilon$. The result is
            $$
            epsilon^top Ax + x^top Aepsilon+requirecancelcancelepsilon^top Aepsilon - epsilon^top Ab=epsilon^top (A(x-b)+A^top x)
            $$
            Note that the $epsilon^top Aepsilon$ is canceled because it goes to zero quadratically quickly as $|epsilon|to 0$, whereas the other terms converge to $0$ linearly.



            Since $f(x+epsilon)-f(x)approx epsilon^T(A(x-b)+A^top x)$, the gradient is $A(x-b)+A^top x$. Setting this equal to zero, you get $$x=(A+A^top)^-1Ab.$$
            To determine if this is indeed a minimum, you need to look at the second derivative. This works out to be $A+A^top$, which will be a minimum as long as this is positive definite.






            share|cite|improve this answer























            • Thanks a lot. I like this approach with the $epsilon$.
              – empty-barrel
              Jul 31 at 15:39














            up vote
            0
            down vote













            As mentioned in the comments, you need to take the gradient, not the derivative.



            I usually get confused when trying to take the gradient of a function written in vector/matrix form (as opposed to coordinate form), so I use the following method. Let $f(x)=(x-b)^top Ax$, and consider $f(x+epsilon)-f(x)$ for a small vector $epsilon$. The result is
            $$
            epsilon^top Ax + x^top Aepsilon+requirecancelcancelepsilon^top Aepsilon - epsilon^top Ab=epsilon^top (A(x-b)+A^top x)
            $$
            Note that the $epsilon^top Aepsilon$ is canceled because it goes to zero quadratically quickly as $|epsilon|to 0$, whereas the other terms converge to $0$ linearly.



            Since $f(x+epsilon)-f(x)approx epsilon^T(A(x-b)+A^top x)$, the gradient is $A(x-b)+A^top x$. Setting this equal to zero, you get $$x=(A+A^top)^-1Ab.$$
            To determine if this is indeed a minimum, you need to look at the second derivative. This works out to be $A+A^top$, which will be a minimum as long as this is positive definite.






            share|cite|improve this answer























            • Thanks a lot. I like this approach with the $epsilon$.
              – empty-barrel
              Jul 31 at 15:39












            up vote
            0
            down vote










            up vote
            0
            down vote









            As mentioned in the comments, you need to take the gradient, not the derivative.



            I usually get confused when trying to take the gradient of a function written in vector/matrix form (as opposed to coordinate form), so I use the following method. Let $f(x)=(x-b)^top Ax$, and consider $f(x+epsilon)-f(x)$ for a small vector $epsilon$. The result is
            $$
            epsilon^top Ax + x^top Aepsilon+requirecancelcancelepsilon^top Aepsilon - epsilon^top Ab=epsilon^top (A(x-b)+A^top x)
            $$
            Note that the $epsilon^top Aepsilon$ is canceled because it goes to zero quadratically quickly as $|epsilon|to 0$, whereas the other terms converge to $0$ linearly.



            Since $f(x+epsilon)-f(x)approx epsilon^T(A(x-b)+A^top x)$, the gradient is $A(x-b)+A^top x$. Setting this equal to zero, you get $$x=(A+A^top)^-1Ab.$$
            To determine if this is indeed a minimum, you need to look at the second derivative. This works out to be $A+A^top$, which will be a minimum as long as this is positive definite.






            share|cite|improve this answer















            As mentioned in the comments, you need to take the gradient, not the derivative.



            I usually get confused when trying to take the gradient of a function written in vector/matrix form (as opposed to coordinate form), so I use the following method. Let $f(x)=(x-b)^top Ax$, and consider $f(x+epsilon)-f(x)$ for a small vector $epsilon$. The result is
            $$
            epsilon^top Ax + x^top Aepsilon+requirecancelcancelepsilon^top Aepsilon - epsilon^top Ab=epsilon^top (A(x-b)+A^top x)
            $$
            Note that the $epsilon^top Aepsilon$ is canceled because it goes to zero quadratically quickly as $|epsilon|to 0$, whereas the other terms converge to $0$ linearly.



            Since $f(x+epsilon)-f(x)approx epsilon^T(A(x-b)+A^top x)$, the gradient is $A(x-b)+A^top x$. Setting this equal to zero, you get $$x=(A+A^top)^-1Ab.$$
            To determine if this is indeed a minimum, you need to look at the second derivative. This works out to be $A+A^top$, which will be a minimum as long as this is positive definite.







            share|cite|improve this answer















            share|cite|improve this answer



            share|cite|improve this answer








            edited Jul 31 at 15:22


























            answered Jul 31 at 15:09









            Mike Earnest

            14.7k11644




            14.7k11644











            • Thanks a lot. I like this approach with the $epsilon$.
              – empty-barrel
              Jul 31 at 15:39
















            • Thanks a lot. I like this approach with the $epsilon$.
              – empty-barrel
              Jul 31 at 15:39















            Thanks a lot. I like this approach with the $epsilon$.
            – empty-barrel
            Jul 31 at 15:39




            Thanks a lot. I like this approach with the $epsilon$.
            – empty-barrel
            Jul 31 at 15:39










            up vote
            0
            down vote













            To differentiate this sort of functions, it is easier to develop them as sums:
            Let
            $$
            f(x) = Ax + (x - b)^T x = sum_i=1^n sum_j=1^n a_ij , (x_i - b_i) , x_j,
            $$
            where $a_ij, 1 leq i, j leq n$ are the elements of $A$
            and $b_i$, $1 leq i leq n$ are the elements of $b$.
            Differentiating w.r.t. $x_k$,
            beginalign
            fracpartial fpartial x_k &= sum_i=1^n sum_j=1^n a_ij , left( delta_ik , x_j + (x_i - b_i) , delta_jk right), \
            &= sum_j=1^n left( a_kj, x_j right) + sum_i=1^n left( a_ik , (x_i - b_i) right),
            endalign
            so if you gather the derivatives in a vector $(nabla f)_k = partial_x_k f$,
            beginalign
            nabla f = Ax + A^T(x - b)
            endalign
            The gradient is 0 when
            $$
            (A + A^T) , x = A^T , b
            $$






            share|cite|improve this answer























            • Where does the x in your last line come from? I guess you multiply $A^T$ into $(x-b)$, but shouldn't it be $A^T b$ then?
              – empty-barrel
              Jul 31 at 15:32











            • It should, thanks :)
              – Roberto Rastapopoulos
              Jul 31 at 16:12














            up vote
            0
            down vote













            To differentiate this sort of functions, it is easier to develop them as sums:
            Let
            $$
            f(x) = Ax + (x - b)^T x = sum_i=1^n sum_j=1^n a_ij , (x_i - b_i) , x_j,
            $$
            where $a_ij, 1 leq i, j leq n$ are the elements of $A$
            and $b_i$, $1 leq i leq n$ are the elements of $b$.
            Differentiating w.r.t. $x_k$,
            beginalign
            fracpartial fpartial x_k &= sum_i=1^n sum_j=1^n a_ij , left( delta_ik , x_j + (x_i - b_i) , delta_jk right), \
            &= sum_j=1^n left( a_kj, x_j right) + sum_i=1^n left( a_ik , (x_i - b_i) right),
            endalign
            so if you gather the derivatives in a vector $(nabla f)_k = partial_x_k f$,
            beginalign
            nabla f = Ax + A^T(x - b)
            endalign
            The gradient is 0 when
            $$
            (A + A^T) , x = A^T , b
            $$






            share|cite|improve this answer























            • Where does the x in your last line come from? I guess you multiply $A^T$ into $(x-b)$, but shouldn't it be $A^T b$ then?
              – empty-barrel
              Jul 31 at 15:32











            • It should, thanks :)
              – Roberto Rastapopoulos
              Jul 31 at 16:12












            up vote
            0
            down vote










            up vote
            0
            down vote









            To differentiate this sort of functions, it is easier to develop them as sums:
            Let
            $$
            f(x) = Ax + (x - b)^T x = sum_i=1^n sum_j=1^n a_ij , (x_i - b_i) , x_j,
            $$
            where $a_ij, 1 leq i, j leq n$ are the elements of $A$
            and $b_i$, $1 leq i leq n$ are the elements of $b$.
            Differentiating w.r.t. $x_k$,
            beginalign
            fracpartial fpartial x_k &= sum_i=1^n sum_j=1^n a_ij , left( delta_ik , x_j + (x_i - b_i) , delta_jk right), \
            &= sum_j=1^n left( a_kj, x_j right) + sum_i=1^n left( a_ik , (x_i - b_i) right),
            endalign
            so if you gather the derivatives in a vector $(nabla f)_k = partial_x_k f$,
            beginalign
            nabla f = Ax + A^T(x - b)
            endalign
            The gradient is 0 when
            $$
            (A + A^T) , x = A^T , b
            $$






            share|cite|improve this answer















            To differentiate this sort of functions, it is easier to develop them as sums:
            Let
            $$
            f(x) = Ax + (x - b)^T x = sum_i=1^n sum_j=1^n a_ij , (x_i - b_i) , x_j,
            $$
            where $a_ij, 1 leq i, j leq n$ are the elements of $A$
            and $b_i$, $1 leq i leq n$ are the elements of $b$.
            Differentiating w.r.t. $x_k$,
            beginalign
            fracpartial fpartial x_k &= sum_i=1^n sum_j=1^n a_ij , left( delta_ik , x_j + (x_i - b_i) , delta_jk right), \
            &= sum_j=1^n left( a_kj, x_j right) + sum_i=1^n left( a_ik , (x_i - b_i) right),
            endalign
            so if you gather the derivatives in a vector $(nabla f)_k = partial_x_k f$,
            beginalign
            nabla f = Ax + A^T(x - b)
            endalign
            The gradient is 0 when
            $$
            (A + A^T) , x = A^T , b
            $$







            share|cite|improve this answer















            share|cite|improve this answer



            share|cite|improve this answer








            edited Jul 31 at 16:12


























            answered Jul 31 at 15:11









            Roberto Rastapopoulos

            635321




            635321











            • Where does the x in your last line come from? I guess you multiply $A^T$ into $(x-b)$, but shouldn't it be $A^T b$ then?
              – empty-barrel
              Jul 31 at 15:32











            • It should, thanks :)
              – Roberto Rastapopoulos
              Jul 31 at 16:12
















            • Where does the x in your last line come from? I guess you multiply $A^T$ into $(x-b)$, but shouldn't it be $A^T b$ then?
              – empty-barrel
              Jul 31 at 15:32











            • It should, thanks :)
              – Roberto Rastapopoulos
              Jul 31 at 16:12















            Where does the x in your last line come from? I guess you multiply $A^T$ into $(x-b)$, but shouldn't it be $A^T b$ then?
            – empty-barrel
            Jul 31 at 15:32





            Where does the x in your last line come from? I guess you multiply $A^T$ into $(x-b)$, but shouldn't it be $A^T b$ then?
            – empty-barrel
            Jul 31 at 15:32













            It should, thanks :)
            – Roberto Rastapopoulos
            Jul 31 at 16:12




            It should, thanks :)
            – Roberto Rastapopoulos
            Jul 31 at 16:12












             

            draft saved


            draft discarded


























             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2868134%2fhow-do-i-find-the-minimum-of-this-function%23new-answer', 'question_page');

            );

            Post as a guest













































































            Comments

            Popular posts from this blog

            What is the equation of a 3D cone with generalised tilt?

            Color the edges and diagonals of a regular polygon

            Relationship between determinant of matrix and determinant of adjoint?