Proof of why Lagrangian Multipliers work
























Suppose $\vec{x} \in \mathbb{R}^n$ and we wish to optimize $f(\vec{x})$ subject to $k$ constraints: $g_i(\vec{x}) = c_i \quad \forall i \in \{1, \dots, k\}$.



Now it is clear that at a local max/min, say $\vec{x}^*$, $\nabla f(\vec{x}^*)$ is orthogonal to any curve $r$ that satisfies the $g_i$ constraints (i.e. $\nabla f(\vec{x}^*) \cdot r'(\vec{x}^*) = 0$). $\quad\quad \textbf{(1)}$



Therefore, it is also clear to me that if $\nabla f(\vec{x}^*) = \sum_{i=1}^k \lambda_i \nabla g_i(\vec{x}^*)$ then $\textbf{(1)}$ will hold (since each of the $\nabla g_i(\vec{x}^*)$ is itself orthogonal to the curve). However, I am confused as to why the converse is true.



Why is it that if $\nabla f(\vec{x}^*) \cdot r'(\vec{x}^*) = 0$, $\nabla f(\vec{x}^*)$ MUST be in the span of the gradients of the $g_i$?
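For concreteness, condition $\textbf{(1)}$ and the multiplier equation can be checked symbolically on a toy problem. A minimal sympy sketch; the objective and constraint below are invented for illustration and are not from the question itself:

```python
# Toy instance of the question's setup: n = 2, k = 1.
# f and g are invented for illustration.
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x + y                    # objective
g = x**2 + y**2 - 1          # constraint g = 0 (the unit circle)

# First-order conditions: grad f = lam * grad g, together with g = 0.
eqs = [sp.diff(f, x) - lam * sp.diff(g, x),
       sp.diff(f, y) - lam * sp.diff(g, y),
       g]
print(sp.solve(eqs, [x, y, lam], dict=True))
# Two critical points (±1/√2, ±1/√2); at each, grad f is a scalar multiple
# of grad g, hence orthogonal to the tangent r'(t) of any curve in {g = 0}.
```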







asked Jul 15 at 2:31 by Richard Groenewald



















  • What do you mean by curve $r(x)$?
    – S.t.a.l.k.e.r
    Jul 15 at 5:29










  • "Curve" is probably the wrong word to use since I just mean something analogous to a space curve in 3D or a plane curve in 2D (as in $r(\vec{x}(t))$ is any parameterization that satisfies $g_i(r(\vec{x}(t))) = c_i$)
    – Richard Groenewald
    Jul 15 at 5:55










  • This is just duality: if $A, B$ are (closed) subspaces then $A \subset B$ iff $B^\perp \subset A^\perp$. (That is, if whenever $r$ is in the orthogonal complement of the $\nabla g_i(x)$, we have $r$ in the orthogonal complement of $\nabla f(x)$, then $\nabla f(x)$ is in the span of the constraint gradients.) (A numerical sketch of this duality appears after these comments.)
    – copper.hat
    Jul 15 at 6:06











  • This is the key to your answer: en.m.wikipedia.org/wiki/Implicit_function_theorem
    – S.t.a.l.k.e.r
    Jul 15 at 8:18
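copper.hat's duality comment can be checked numerically as well. A minimal numpy sketch, with random vectors standing in for the constraint gradients $\nabla g_i$ (all data here are invented):

```python
# Duality check: a vector orthogonal to the orthogonal complement of
# span{g1, g2} must lie in span{g1, g2}. Vectors are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((2, 4))   # rows play the role of grad g_1, grad g_2 in R^4

# Orthonormal basis of the complement of the row span (the "tangent space"):
_, _, Vt = np.linalg.svd(G)
tangent = Vt[2:]                  # last n - k = 2 right-singular vectors

# Make v orthogonal to the tangent space by projecting out its tangent part.
v = rng.standard_normal(4)
v -= tangent.T @ (tangent @ v)    # now tangent @ v is (numerically) zero

# Duality: v must then be a combination of the rows of G.
coef, *_ = np.linalg.lstsq(G.T, v, rcond=None)
print(np.allclose(G.T @ coef, v))  # True: v = sum_i coef_i * grad g_i
```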
























1 Answer






























The Lagrange multiplier method is based on the implicit function theorem and the inverse function theorem.

I will use 3 variables and 1 constraint to demonstrate why it works; the idea can be extended to any finite number of variables and constraints.

Let's assume your functions have partial derivatives that allow application of the implicit function theorem, which they must if the Lagrange multiplier method is to be used:

$f(x_1, x_2, x_3)$

$g(x_1, x_2, x_3) = 0$

Because of the constraint you have $x_1 = h(x_2, x_3)$, so

$f(h, x_2, x_3)$

$g(h, x_2, x_3) = 0$

Stationary points occur at $\frac{df}{dx_1} = \frac{df}{dx_2} = \frac{df}{dx_3} = 0$:

$0 = \frac{df}{dx_2} = \frac{\partial f}{\partial x_2} + \frac{\partial f}{\partial h}\frac{\partial h}{\partial x_2}$

From the constraint, $0 = \frac{dg}{dx_2} = \frac{\partial g}{\partial x_2} + \frac{\partial g}{\partial h}\frac{\partial h}{\partial x_2}$

Now, combining the two equations,

$0 = \frac{\partial f}{\partial x_2} + \lambda \frac{\partial g}{\partial x_2}$

where $\lambda = -\frac{\partial f}{\partial h} \big/ \frac{\partial g}{\partial h}$.

For $x_3$ we have $0 = \frac{\partial f}{\partial x_3} + \lambda \frac{\partial g}{\partial x_3}$.

Using the same trick with $x_2(x_1, x_3)$ you should observe that

$\lambda = -\frac{\partial f}{\partial h} \big/ \frac{\partial g}{\partial h} = -\frac{\partial f}{\partial x_2} \big/ \frac{\partial g}{\partial x_2} = -\frac{\partial f}{\partial x_3} \big/ \frac{\partial g}{\partial x_3}$

and get

$0 = \frac{\partial f}{\partial x_1} + \lambda \frac{\partial g}{\partial x_1}$
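A minimal sympy check of this elimination argument, using an illustrative $f$ and $g$ chosen so that $h$ has a closed form (this example is not part of the original answer):

```python
# Verify the 3-variable, 1-constraint elimination argument symbolically.
# f and g are invented so that x1 = h(x2, x3) is available in closed form.
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3', real=True)
f = x1 + x2 + x3                # objective
g = x1 + x2**2 + x3**2 - 1      # constraint: x1 = h(x2, x3) = 1 - x2^2 - x3^2

h = sp.solve(sp.Eq(g, 0), x1)[0]   # the h that the implicit function theorem gives
F = f.subs(x1, h)                  # f restricted to the constraint surface

# Reduced stationarity: dF/dx2 = dF/dx3 = 0, as in the answer.
crit = sp.solve([sp.diff(F, x2), sp.diff(F, x3)], [x2, x3], dict=True)[0]
pt = {x1: h.subs(crit), **crit}    # {x1: 1/2, x2: 1/2, x3: 1/2}

# The answer's multiplier: lam = -(df/dx1) / (dg/dx1) at the stationary point.
lam = (-sp.diff(f, x1) / sp.diff(g, x1)).subs(pt)   # -1 here

# All three Lagrange equations df/dxi + lam * dg/dxi = 0 then hold at pt:
for xi in (x1, x2, x3):
    print(sp.simplify((sp.diff(f, xi) + lam * sp.diff(g, xi)).subs(pt)))  # 0, 0, 0
```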



To extend the proof to a finite number of variables and constraints:

$f(x_1, \dots, x_n)$

Write $x_1(x_2, \dots, x_n)$, ..., $x_k(x_{k+1}, \dots, x_n)$,

where $k$ is the number of constraints. Write out the partial derivatives at the stationary point, set them to zero, and you end up with the equations you asked about in your question.
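And a numerical version of the general statement, with $n = 4$ variables and $k = 2$ constraints. The problem data are invented, and scipy's SLSQP stands in for "find the stationary point":

```python
# At a constrained optimum, grad f should lie in the span of the constraint
# gradients. Invented problem: min ||x||^2 over two affine equality
# constraints in R^4.
import numpy as np
from scipy.optimize import minimize

f      = lambda x: float(x @ x)
grad_f = lambda x: 2 * x
cons = [{'type': 'eq', 'fun': lambda x: x[0] + x[1] + x[2] + x[3] - 1},
        {'type': 'eq', 'fun': lambda x: x[0] - x[3] - 2}]

res = minimize(f, np.zeros(4), jac=grad_f, constraints=cons, method='SLSQP')

# Constraint gradients (constant, since the constraints are affine):
G = np.array([[1.0, 1.0, 1.0, 1.0],
              [1.0, 0.0, 0.0, -1.0]])

# Solve grad f = sum_i lam_i * grad g_i by least squares; a near-zero
# residual confirms grad f is in the span, the converse direction asked about.
lam, *_ = np.linalg.lstsq(G.T, grad_f(res.x), rcond=None)
print(lam, np.linalg.norm(G.T @ lam - grad_f(res.x)))  # residual ~ 0
```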






answered Jul 15 at 7:22 (edited Jul 15 at 8:16) by S.t.a.l.k.e.r























  • I know how to implement the method; I was asking for an explanation of the proof. This is NOT helpful at all.
    – Richard Groenewald
    Jul 15 at 17:20







































