Proof of why Lagrange multipliers work
Suppose $\vec{x} \in \mathbb{R}^n$ and we wish to optimize $f(\vec{x})$ subject to $k$ constraints: $g_i(\vec{x}) = c_i \quad \forall i \in \{1, \dots, k\}$.

Now it is clear that at a local max/min, say $\vec{x}^*$, $\nabla f(\vec{x}^*)$ is orthogonal to any curve $r(x)$ that satisfies the $g_i$ constraints, i.e.
$$\nabla f(\vec{x}^*) \cdot r'(\vec{x}^*) = 0. \qquad \textbf{(1)}$$

Therefore, it is also clear to me that if $\nabla f(\vec{x}^*) = \sum_{i=1}^{k} \lambda_i \nabla g_i(\vec{x}^*)$, then $\textbf{(1)}$ will hold (since each of the $\nabla g_i(\vec{x}^*)$ is itself orthogonal to the curve). However, I am confused as to why the converse is true.

Why is it that if $\nabla f(\vec{x}^*) \cdot r'(\vec{x}^*) = 0$ for every such curve, $\nabla f(\vec{x}^*)$ MUST be in the span of the gradients of the $g_i$?
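For clarity, $\textbf{(1)}$ follows from the chain rule (a sketch, parameterizing such a curve by $t$ with $r(t^*) = \vec{x}^*$): since $t^*$ is an interior extremum of the single-variable function $t \mapsto f(r(t))$,
$$0 = \frac{d}{dt}\bigg|_{t=t^*} f(r(t)) = \nabla f(r(t^*)) \cdot r'(t^*) = \nabla f(\vec{x}^*) \cdot r'(t^*).$$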
calculus lagrange-multiplier
asked Jul 15 at 2:31 by Richard Groenewald
What do you mean by curve $r(x)$?
– S.t.a.l.k.e.r, Jul 15 at 5:29
"Curve" is probably the wrong word to use since I just mean something analogous to a space curve in 3D or a plane curve in 2D (as in $r(vecx(t))$ is any parameterization that satisfies $g_i(r(vecx(t))) = c_i$)
â Richard Groenewald
Jul 15 at 5:55
This is just duality: if $A, B$ are (closed) subspaces then $A \subset B$ iff $B^\perp \subset A^\perp$. (That is, if whenever $r$ is in the orthogonal complement of the $\nabla g_i(x)$, we have $r$ in the orthogonal complement of $\nabla f(x)$, then $\nabla f(x)$ is in the span of the constraint gradients.)
– copper.hat, Jul 15 at 6:06
This is the key to your answer: en.m.wikipedia.org/wiki/Implicit_function_theorem
– S.t.a.l.k.e.r, Jul 15 at 8:18
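To spell out the duality argument from the comments above (a sketch, assuming the $\nabla g_i(\vec{x}^*)$ are linearly independent, so that by the implicit function theorem every vector orthogonal to all of them is the velocity $r'(t^*)$ of some curve through $\vec{x}^*$ inside the constraint set): let
$$V = \operatorname{span}\{\nabla g_1(\vec{x}^*), \dots, \nabla g_k(\vec{x}^*)\}, \qquad T = V^\perp = \{v \in \mathbb{R}^n : \nabla g_i(\vec{x}^*) \cdot v = 0 \ \forall i\}.$$
Condition $\textbf{(1)}$ holding for every admissible curve says exactly that $\nabla f(\vec{x}^*) \in T^\perp$. Since $V$ is a finite-dimensional (hence closed) subspace, $T^\perp = (V^\perp)^\perp = V$, so $\nabla f(\vec{x}^*) \in \operatorname{span}\{\nabla g_i(\vec{x}^*)\}$.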
1 Answer
The Lagrange multiplier method is based on the implicit function theorem and the inverse function theorem.

I will use 3 variables and 1 constraint to demonstrate why it works; the idea can be extended to any finite number of variables and constraints.

Let's assume your functions have partial derivatives that allow application of the implicit function theorem, which they must if the Lagrange multiplier method is to be used:

$$f(x_1, x_2, x_3)$$
$$g(x_1, x_2, x_3) = 0$$

Because of the constraint you have $x_1 = h(x_2, x_3)$, so that

$$f(h, x_2, x_3)$$
$$g(h, x_2, x_3) = 0$$

Stationary points of this reduced function occur where $\frac{df}{dx_2} = \frac{df}{dx_3} = 0$. In particular,

$$0 = \frac{df}{dx_2} = \frac{\partial f}{\partial x_2} + \frac{\partial f}{\partial h}\frac{\partial h}{\partial x_2}.$$

From the constraint, $0 = \frac{dg}{dx_2} = \frac{\partial g}{\partial x_2} + \frac{\partial g}{\partial h}\frac{\partial h}{\partial x_2}$.

Now using the two equations (solve the constraint equation for $\frac{\partial h}{\partial x_2}$ and substitute into the first):

$$0 = \frac{\partial f}{\partial x_2} + \lambda \frac{\partial g}{\partial x_2},$$

where $\lambda = -\frac{\partial f}{\partial h} \Big/ \frac{\partial g}{\partial h}$.

For $x_3$ we have $0 = \frac{\partial f}{\partial x_3} + \lambda \frac{\partial g}{\partial x_3}$.

Using the same trick with $x_2(x_1, x_3)$ you should observe that

$$\lambda = -\frac{\partial f}{\partial h}\Big/\frac{\partial g}{\partial h} = -\frac{\partial f}{\partial x_2}\Big/\frac{\partial g}{\partial x_2} = -\frac{\partial f}{\partial x_3}\Big/\frac{\partial g}{\partial x_3}$$

and get

$$0 = \frac{\partial f}{\partial x_1} + \lambda \frac{\partial g}{\partial x_1}.$$
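In vector form, the three scalar equations above combine into (with $\lambda$ the common ratio just derived)
$$\nabla f(x^*) + \lambda\, \nabla g(x^*) = 0 \quad \Longleftrightarrow \quad \nabla f(x^*) = -\lambda\, \nabla g(x^*),$$
i.e. $\nabla f(x^*)$ lies in the span of $\nabla g(x^*)$, which is the one-constraint case of the identity asked about in the question.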
To extend the proof to a finite number of variables and constraints:

$$f(x_1, \dots, x_n)$$

write $x_1(x_2, \dots, x_n)$, ..., $x_k(x_{k+1}, \dots, x_n)$,

where $k$ is the number of constraints; write out the partial derivatives at the stationary point, set them to zero, and you end up with the equations you asked about in your question.

answered Jul 15 at 7:22, edited Jul 15 at 8:16 by S.t.a.l.k.e.r
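As a quick numerical sanity check of the multiplier condition (a minimal sketch with a made-up example, not taken from the answer: maximize $f(x,y) = x + y$ on the unit circle $g(x,y) = x^2 + y^2 - 1 = 0$; the finite-difference helper `grad` is my own):

```python
import numpy as np

def f(p):
    x, y = p
    return x + y

def g(p):
    x, y = p
    return x**2 + y**2 - 1.0

def grad(func, p, eps=1e-6):
    """Central-difference gradient of func at point p."""
    p = np.asarray(p, dtype=float)
    out = np.empty_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = eps
        out[i] = (func(p + step) - func(p - step)) / (2.0 * eps)
    return out

# Constrained maximizer found by hand: x = y = 1/sqrt(2).
p_star = np.array([1.0, 1.0]) / np.sqrt(2.0)

gf = grad(f, p_star)  # should be (1, 1)
gg = grad(g, p_star)  # should be (sqrt(2), sqrt(2))

# grad f should be a scalar multiple of grad g; the ratio is lambda.
lam = gf[0] / gg[0]
print("lambda   =", lam)                             # ~0.7071 = 1/sqrt(2)
print("residual =", np.linalg.norm(gf - lam * gg))   # ~0
```

The residual being (numerically) zero confirms that $\nabla f$ is in the span of $\nabla g$ at the constrained optimum, as the derivation predicts.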
I know how to implement the method; I was asking for an explanation of the proof. This is NOT helpful at all.
– Richard Groenewald, Jul 15 at 17:20